RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy1544
(1331 letters)
>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
chain; Provisional.
Length = 1033
Score = 524 bits (1351), Expect = e-166
Identities = 257/588 (43%), Positives = 367/588 (62%), Gaps = 35/588 (5%)
Query: 497 DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEIVTEQASILVNGK 556
D+ + K K G + +K + ED+EY K + ++ + + I GK
Sbjct: 116 DQSASAKKAKGRGRHASKLTEEEEDEEYLKEEEDGLGGSG----GTRLLVQPSCI--KGK 169
Query: 557 LKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVPLS 616
+++YQ+ GL W++ L+ N +NGILADEMGLGKT+QTI+L+ YL E + + GP +++ P S
Sbjct: 170 MRDYQLAGLNWLIRLYENGINGILADEMGLGKTLQTISLLGYLHEYRGITGPHMVVAPKS 229
Query: 617 TLSNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQ-MKASKFNVLLTTYEYVIKDKGPLAK 675
TL NW E R+ P + V + G+P R + + + A KF+V +T++E IK+K L +
Sbjct: 230 TLGNWMNEIRRFCPVLRAVKFHGNPEERAHQREELLVAGKFDVCVTSFEMAIKEKTALKR 289
Query: 676 LHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWALLNFLLPSI 735
W+Y+IIDE HR+KN + L+ + F +RLL+TGTPLQN L ELWALLNFLLP I
Sbjct: 290 FSWRYIIIDEAHRIKNENSLLSKTMRLF-STNYRLLITGTPLQNNLHELWALLNFLLPEI 348
Query: 736 FKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRRLKKEVESQLPDKV 795
F S TF++WF +GE N+++ + +++LHKVLRPFLLRRLK +VE LP K
Sbjct: 349 FSSAETFDEWF----QISGE----NDQQEV--VQQLHKVLRPFLLRRLKSDVEKGLPPKK 398
Query: 796 EYIIKCDMSGLQKVLYRHMHTKGILLTDGSEKGKQGKGGAKALMNTIVQLRKLCNHPFMF 855
E I+K MS +QK Y+ + K + + + G K L+N +QLRK CNHP++F
Sbjct: 399 ETILKVGMSQMQKQYYKALLQKDLDVVNAG-------GERKRLLNIAMQLRKCCNHPYLF 451
Query: 856 QNIEEKFSDHVGGSGIVSGPDLYRVSGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILE 915
Q E G +G L SGK LLD++LPKLK RVL+F QMT+L++ILE
Sbjct: 452 QGAEP-------GPPYTTGEHLVENSGKMVLLDKLLPKLKERDSRVLIFSQMTRLLDILE 504
Query: 916 DYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVII 975
DY YRG++Y R+DG T EDR + FN P SE F+F+LSTRAGGLG+NL TAD VI+
Sbjct: 505 DYLMYRGYQYCRIDGNTGGEDRDASIDAFNKPGSEKFVFLLSTRAGGLGINLATADIVIL 564
Query: 976 FDSDWNPHQDLQAQDRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAG- 1034
+DSDWNP DLQAQDRAHRIGQK EV+V R T ++EE+++ A KL +D VIQ G
Sbjct: 565 YDSDWNPQVDLQAQDRAHRIGQKKEVQVFRFCTEYTIEEKVIERAYKKLALDALVIQQGR 624
Query: 1035 MFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEE 1082
+ +QK+ + + LQ + + + +++ DE +++++A+ EE
Sbjct: 625 LAEQKTVNKD--ELLQMVRYGAEMVFSSKDSTITDEDIDRIIAKGEEA 670
>gnl|CDD|215770 pfam00176, SNF2_N, SNF2 family N-terminal domain. This domain is
found in proteins involved in a variety of processes
including transcription regulation (e.g., SNF2, STH1,
brahma, MOT1), DNA repair (e.g. ERCC6, RAD16, RAD5), DNA
recombination (e.g. RAD54), and chromatin unwinding
(e.g. ISWI) as well as a variety of other proteins with
little functional information (e.g. lodestar, ETL1).
Length = 301
Score = 389 bits (1000), Expect = e-125
Identities = 147/302 (48%), Positives = 197/302 (65%), Gaps = 7/302 (2%)
Query: 560 YQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALI-TYLMEKKKVNGPFLIIVPLSTL 618
YQ++G+ W++SL +N L GILADEMGLGKT+QTIAL+ TYL E K GP L++ PLSTL
Sbjct: 1 YQLEGVNWLISLESNGLGGILADEMGLGKTLQTIALLATYLKEGKDRRGPTLVVCPLSTL 60
Query: 619 SNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQM--KASKFNVLLTTYEYVIKDK---GPL 673
NW EFE+WAP++ VV Y G R L+ M + ++V++TTYE + KDK L
Sbjct: 61 HNWLNEFEKWAPALRVVVYHGDGRERSKLRQSMAKRLDTYDVVITTYEVLRKDKKLLSLL 120
Query: 674 AKLHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWALLNFLLP 733
K+ W +++DE HR+KN KL L +RLLLTGTP+QN L ELWALLNFL P
Sbjct: 121 NKVEWDRVVLDEAHRLKNSKSKLYKALKKLK-TRNRLLLTGTPIQNNLEELWALLNFLRP 179
Query: 734 SIFKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRRLKKEVESQLPD 793
F S FE+WFN P A T + N E+ I RLHK+L+PFLLRR K +VE LP
Sbjct: 180 GPFGSFKVFEEWFNIPIANTADNKNKNLEKGKEGINRLHKLLKPFLLRRTKDDVEKSLPP 239
Query: 794 KVEYIIKCDMSGLQKVLYRHMHTKGILLTDGSEKGKQGKGGAKALMNTIVQLRKLCNHPF 853
K E+++ C++S Q+ LY+ + TK L + +G + G +L+N I+QLRK+CNHP+
Sbjct: 240 KTEHVLYCNLSDEQRKLYKKLLTKRRLALSFAVEGGEKNVGIASLLNLIMQLRKICNHPY 299
Query: 854 MF 855
+F
Sbjct: 300 LF 301
>gnl|CDD|223627 COG0553, HepA, Superfamily II DNA/RNA helicases, SNF2 family
[Transcription / DNA replication, recombination, and
repair].
Length = 866
Score = 353 bits (906), Expect = e-105
Identities = 204/520 (39%), Positives = 290/520 (55%), Gaps = 38/520 (7%)
Query: 555 GKLKEYQIKGLEWMVSLFN-NNLNGILADEMGLGKTIQTIALITYLMEKKKVN-GPFLII 612
+L+ YQ++G+ W+ L N L GILAD+MGLGKT+QTIAL+ L+E KV GP LI+
Sbjct: 337 AELRPYQLEGVNWLSELLRSNLLGGILADDMGLGKTVQTIALLLSLLESIKVYLGPALIV 396
Query: 613 VPLSTLSNWSLEFERWAPSVN-VVAYKGSPHL----RKTLQAQMKASK---FNVLLTTYE 664
VP S LSNW EFE++AP + V+ Y G R+ L+ +K F+V++TTYE
Sbjct: 397 VPASLLSNWKREFEKFAPDLRLVLVYHGEKSELDKKREALRDLLKLHLVIIFDVVITTYE 456
Query: 665 YVIK---DKGPLAKLHWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKL 721
+ + D G L K+ W +++DE HR+KN L A +RL LTGTPL+N+L
Sbjct: 457 LLRRFLVDHGGLKKIEWDRVVLDEAHRIKNDQSSEGKALQFL-KALNRLDLTGTPLENRL 515
Query: 722 PELWALLN-FLLPSIF-KSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFL 779
ELW+LL FL P + S + F + F P E+ E L I L K+L PF+
Sbjct: 516 GELWSLLQEFLNPGLLGTSFAIFTRLFEKPIQA--EEDIGPLEARELGIELLRKLLSPFI 573
Query: 780 LRRLKKEVE--SQLPDKVEYIIKCDMSGLQKVLY-RHMHTKGILLTD------GSEKGKQ 830
LRR K++VE +LP K+E +++C++S Q+ LY + +
Sbjct: 574 LRRTKEDVEVLKELPPKIEKVLECELSEEQRELYEALLEGAEKNQQLLEDLEKADSDENR 633
Query: 831 GKGGAKALMNTIVQLRKLCNHPFMFQNIEEKFSDHVGGSGIVSGPDLYRV-------SGK 883
++ + +LR++CNHP + E D + Y GK
Sbjct: 634 IGDSELNILALLTRLRQICNHPALVDEGLEATFDRIVLLLREDKDFDYLKKPLIQLSKGK 693
Query: 884 FELLDRIL-PKLKSTGH--RVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
+ LD +L KL GH +VL+F Q T ++++LEDY G KY+RLDG+T A+ R +L
Sbjct: 694 LQALDELLLDKLLEEGHYHKVLIFSQFTPVLDLLEDYLKALGIKYVRLDGSTPAKRRQEL 753
Query: 941 LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQKNE 1000
+ +FNA D E +F+LS +AGGLGLNL ADTVI+FD WNP +LQA DRAHRIGQK
Sbjct: 754 IDRFNA-DEEEKVFLLSLKAGGLGLNLTGADTVILFDPWWNPAVELQAIDRAHRIGQKRP 812
Query: 1001 VRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKS 1040
V+V RL+T ++EE+IL K + + +I A + S
Sbjct: 813 VKVYRLITRGTIEEKILELQEKKQELLDSLIDAEGEKELS 852
>gnl|CDD|99947 cd05516, Bromo_SNF2L2, Bromodomain, SNF2L2-like subfamily, specific
to animals. SNF2L2 (SNF2-alpha) or SWI/SNF-related
matrix-associated actin-dependent regulator of chromatin
subfamily A member 2 is a global transcriptional
activator, which cooperates with nuclear hormone
receptors to boost transcriptional activation.
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 107
Score = 153 bits (389), Expect = 1e-43
Identities = 64/107 (59%), Positives = 85/107 (79%)
Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
+L K + KI+ VVIKY DSDGR L+E FI+LPSRKELP+YYE+I +P+D KKI RI +
Sbjct: 1 ELTKKMNKIVDVVIKYKDSDGRQLAEVFIQLPSRKELPEYYELIRKPVDFKKIKERIRNH 60
Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRV 1323
KY S+++L+KD LC+NAQ +N E SLI+EDS+VL+SVF ARQ++
Sbjct: 61 KYRSLEDLEKDVMLLCQNAQTFNLEGSLIYEDSIVLQSVFKSARQKI 107
>gnl|CDD|214692 smart00487, DEXDc, DEAD-like helicases superfamily.
Length = 201
Score = 123 bits (310), Expect = 8e-32
Identities = 51/200 (25%), Positives = 90/200 (45%), Gaps = 16/200 (8%)
Query: 556 KLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQ-TIALITYLMEKKKVNGPFLIIVP 614
L+ YQ + +E ++S + ILA G GKT+ + + L K G L++VP
Sbjct: 8 PLRPYQKEAIEALLS---GLRDVILAAPTGSGKTLAALLPALEALKRGKG--GRVLVLVP 62
Query: 615 LSTL-SNWSLEFERWAPS--VNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKD-- 669
L W+ E ++ PS + VV G R+ L+ ++++ K ++L+TT ++
Sbjct: 63 TRELAEQWAEELKKLGPSLGLKVVGLYGGDSKREQLR-KLESGKTDILVTTPGRLLDLLE 121
Query: 670 KGPLAKLHWKYMIIDEGHRMKN--HHCKLTHILNTFYVAPHRLLLTGTPLQNKLPELWAL 727
L+ + +I+DE HR+ + +L +L LLL+ TP + L
Sbjct: 122 NDKLSLSNVDLVILDEAHRLLDGGFGDQLEKLLKLLPKNVQLLLLSATPPEEIENLLELF 181
Query: 728 LN--FLLPSIFKSVSTFEQW 745
LN + F + EQ+
Sbjct: 182 LNDPVFIDVGFTPLEPIEQF 201
>gnl|CDD|238034 cd00079, HELICc, Helicase superfamily c-terminal domain; associated
with DEXDc-, DEAD-, and DEAH-box proteins, yeast
initiation factor 4A, Ski2p, and Hepatitis C virus NS3
helicases; this domain is found in a wide variety of
helicases and helicase related proteins; may not be an
autonomously folding unit, but an integral part of the
helicase; 4 helicase superfamilies at present according
to the organization of their signature motifs; all
helicases share the ability to unwind nucleic acid
duplexes with a distinct directional polarity; they
utilize the free energy from nucleoside triphosphate
hydrolysis to fuel their translocation along DNA,
unwinding the duplex in the process.
Length = 131
Score = 113 bits (286), Expect = 2e-29
Identities = 36/124 (29%), Positives = 58/124 (46%), Gaps = 3/124 (2%)
Query: 881 SGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
K E L +L + G +VL+FC ++++ L + G K L G E+R ++
Sbjct: 11 DEKLEALLELLKEHLKKGGKVLIFCPSKKMLDELAELLRKPGIKVAALHGDGSQEEREEV 70
Query: 941 LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQKNE 1000
LK F + + +++T G++L VI +D W+P LQ RA R GQK
Sbjct: 71 LKDFREGE---IVVLVATDVIARGIDLPNVSVVINYDLPWSPSSYLQRIGRAGRAGQKGT 127
Query: 1001 VRVL 1004
+L
Sbjct: 128 AILL 131
>gnl|CDD|238005 cd00046, DEXDc, DEAD-like helicases superfamily. A diverse family
of proteins involved in ATP-dependent RNA or DNA
unwinding. This domain contains the ATP-binding region.
Length = 144
Score = 108 bits (272), Expect = 2e-27
Identities = 35/146 (23%), Positives = 60/146 (41%), Gaps = 9/146 (6%)
Query: 577 NGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVPLSTLSNWSLEFERWAPS--VNV 634
+ +LA G GKT+ + I L++ K G L++ P L+N E + + V
Sbjct: 2 DVLLAAPTGSGKTLAALLPILELLDSLK-GGQVLVLAPTRELANQVAERLKELFGEGIKV 60
Query: 635 VAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLAKL--HWKYMIIDEGHRMKNH 692
G Q ++ + K ++++ T ++ + L +I+DE HR+ N
Sbjct: 61 GYLIGG--TSIKQQEKLLSGKTDIVVGTPGRLLDELERLKLSLKKLDLLILDEAHRLLNQ 118
Query: 693 HCKLT--HILNTFYVAPHRLLLTGTP 716
L IL LLL+ TP
Sbjct: 119 GFGLLGLKILLKLPKDRQVLLLSATP 144
>gnl|CDD|99946 cd05515, Bromo_polybromo_V, Bromodomain, polybromo repeat V.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 105
Score = 106 bits (266), Expect = 4e-27
Identities = 41/105 (39%), Positives = 68/105 (64%)
Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
+++ L ++ V YTD GR LS F++LPS+ E PDYY+VI +P+D++KI +IE +
Sbjct: 1 MQQKLWELYNAVKNYTDGRGRRLSLIFMRLPSKSEYPDYYDVIKKPIDMEKIRSKIEGNQ 60
Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQR 1322
Y S+D++ DF + NA YNE S I++D++ L+ V + ++
Sbjct: 61 YQSLDDMVSDFVLMFDNACKYNEPDSQIYKDALTLQKVLLETKRE 105
>gnl|CDD|197636 smart00297, BROMO, bromo domain.
Length = 107
Score = 105 bits (265), Expect = 6e-27
Identities = 46/107 (42%), Positives = 70/107 (65%), Gaps = 2/107 (1%)
Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
KL+K L+++++ V+ DS LS PF+K SRKE PDYY++I +PMD+K I ++E+G
Sbjct: 3 KLQKKLQELLKAVLDKLDSHP--LSWPFLKPVSRKEAPDYYDIIKKPMDLKTIKKKLENG 60
Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRV 1323
KYSSV+E DF + NA+ YN S +++D+ LE F K + +
Sbjct: 61 KYSSVEEFVADFNLMFSNARTYNGPDSEVYKDAKKLEKFFEKKLREL 107
>gnl|CDD|99922 cd04369, Bromodomain, Bromodomain. Bromodomains are found in many
chromatin-associated proteins and in nuclear histone
acetyltransferases. They interact specifically with
acetylated lysine.
Length = 99
Score = 102 bits (255), Expect = 1e-25
Identities = 36/105 (34%), Positives = 59/105 (56%), Gaps = 8/105 (7%)
Query: 1214 DQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRI 1273
+ KL+ L + ++ SEPF++ KE PDYYEVI PMD+ I ++
Sbjct: 1 LKKKLRSLLDALKKLKRDL--------SEPFLEPVDPKEAPDYYEVIKNPMDLSTIKKKL 52
Query: 1274 EDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
++G+Y S++E + D + + NA+ YN S I++D+ LE +F K
Sbjct: 53 KNGEYKSLEEFEADVRLIFSNAKTYNGPGSPIYKDAKKLEKLFEK 97
>gnl|CDD|99950 cd05519, Bromo_SNF2, Bromodomain, SNF2-like subfamily, specific to
fungi. SNF2 is a yeast protein involved in
transcriptional activation, it is the catalytic component
of the SWI/SNF ATP-dependent chromatin remodeling
complex. The protein is essential for the regulation of
gene expression (both positive and negative) of a large
number of genes. The SWI/SNF complex changes chromatin
structure by altering DNA-histone contacts within the
nucleosome, which results in a re-positioning of the
nucleosome and facilitates or represses the binding of
gene-specific transcription factors. Bromodomains are 110
amino acid long domains, that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 103
Score = 97.4 bits (243), Expect = 5e-24
Identities = 42/102 (41%), Positives = 63/102 (61%)
Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
LK + +I V+ D GR LSE F++ PS+K PDYY +I RP+ + +I RIE
Sbjct: 1 LKAAMLEIYDAVLNCEDETGRKLSELFLEKPSKKLYPDYYVIIKRPIALDQIKRRIEGRA 60
Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKA 1319
Y S++E +DF + NA+ YN+E S+++ED+V +E F K
Sbjct: 61 YKSLEEFLEDFHLMFANARTYNQEGSIVYEDAVEMEKAFKKK 102
>gnl|CDD|201125 pfam00271, Helicase_C, Helicase conserved C-terminal domain. The
Prosite family is restricted to DEAD/H helicases,
whereas this domain family is found in a wide variety of
helicases and helicase related proteins. It may be that
this is not an autonomously folding unit, but an
integral part of the helicase.
Length = 78
Score = 95.3 bits (238), Expect = 1e-23
Identities = 25/81 (30%), Positives = 37/81 (45%), Gaps = 3/81 (3%)
Query: 916 DYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVII 975
G K RL G E+R ++L+ F S +++T G G++L + VI
Sbjct: 1 KLLRKPGIKVARLHGGLSQEEREEILEDFRNGKS---KVLVATDVAGRGIDLPDVNLVIN 57
Query: 976 FDSDWNPHQDLQAQDRAHRIG 996
+D WNP +Q RA R G
Sbjct: 58 YDLPWNPASYIQRIGRAGRAG 78
>gnl|CDD|197757 smart00490, HELICc, helicase superfamily c-terminal domain.
Length = 82
Score = 94.6 bits (236), Expect = 3e-23
Identities = 28/84 (33%), Positives = 39/84 (46%), Gaps = 3/84 (3%)
Query: 913 ILEDYFSYRGFKYMRLDGTTKAEDRGDLLKKFNAPDSEYFIFVLSTRAGGLGLNLQTADT 972
L + G K RL G E+R ++L KFN +++T GL+L D
Sbjct: 2 ELAELLKELGIKVARLHGGLSQEEREEILDKFNNGKI---KVLVATDVAERGLDLPGVDL 58
Query: 973 VIIFDSDWNPHQDLQAQDRAHRIG 996
VII+D W+P +Q RA R G
Sbjct: 59 VIIYDLPWSPASYIQRIGRAGRAG 82
>gnl|CDD|99949 cd05518, Bromo_polybromo_IV, Bromodomain, polybromo repeat IV.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 103
Score = 95.2 bits (237), Expect = 3e-23
Identities = 42/103 (40%), Positives = 67/103 (65%)
Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
KK + + V++Y + GR L + F++ PS+K+ PDYY++I P+D+K I I + K
Sbjct: 1 RKKRMLALFLYVLEYREGSGRRLCDLFMEKPSKKDYPDYYKIILEPIDLKTIEHNIRNDK 60
Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
Y++ +EL DFK + RNA+ YNEE S ++ED+ +LE V + R
Sbjct: 61 YATEEELMDDFKLMFRNARHYNEEGSQVYEDANILEKVLKEKR 103
>gnl|CDD|99948 cd05517, Bromo_polybromo_II, Bromodomain, polybromo repeat II.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 103
Score = 90.2 bits (224), Expect = 2e-21
Identities = 40/103 (38%), Positives = 69/103 (66%)
Query: 1218 LKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
LK+ L++++ V+ TD GR++SE F KLPS+ PDYY VI P+D+K I RI+ G
Sbjct: 1 LKQILEQLLEAVMTATDPSGRLISELFQKLPSKVLYPDYYAVIKEPIDLKTIAQRIQSGY 60
Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
Y S+++++KD + +NA+ +NE S +++D+ ++ +FT +
Sbjct: 61 YKSIEDMEKDLDLMVKNAKTFNEPGSQVYKDANAIKKIFTAKK 103
>gnl|CDD|99954 cd05524, Bromo_polybromo_I, Bromodomain, polybromo repeat I.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 113
Score = 83.2 bits (206), Expect = 6e-19
Identities = 39/104 (37%), Positives = 61/104 (58%)
Query: 1225 IMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDEL 1284
+ + Y DGR+L E FI++P R+ P+YYEV+ P+D+ KI +++ +Y VD+L
Sbjct: 10 LYDTIRNYKSEDGRILCESFIRVPKRRNEPEYYEVVSNPIDLLKIQQKLKTEEYDDVDDL 69
Query: 1285 QKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRVESGED 1328
DF+ L NA+ Y + S H+D+ L +F AR V SG +
Sbjct: 70 TADFELLINNAKAYYKPDSPEHKDACKLWELFLSARNEVLSGGE 113
>gnl|CDD|214727 smart00573, HSA, domain in helicases and associated with SANT
domains.
Length = 73
Score = 81.3 bits (201), Expect = 1e-18
Identities = 31/73 (42%), Positives = 49/73 (67%)
Query: 297 QKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQ 356
QK+E ER+++Q + ++ H KDFKE H+ A ++ KAVM+YH N EKE+++ +
Sbjct: 1 QKLEEERRRKQHWDHLLEEMIWHAKDFKEEHKWKIAAAKKMAKAVMDYHQNKEKEEERRE 60
Query: 357 ERIEKERMRRLMA 369
E+ EK R+R+L A
Sbjct: 61 EKNEKRRLRKLAA 73
>gnl|CDD|215921 pfam00439, Bromodomain, Bromodomain. Bromodomains are 110 amino acid
long domains, that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 84
Score = 80.5 bits (199), Expect = 2e-18
Identities = 32/77 (41%), Positives = 46/77 (59%)
Query: 1233 TDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLC 1292
D L+EPF++ +E PDYYEVI PMD+ I +++ GKY S+ E KD + +
Sbjct: 6 EDLMEHPLAEPFLEPVDPEEYPDYYEVIKEPMDLSTIRQKLKSGKYKSLAEFLKDVELIF 65
Query: 1293 RNAQIYNEELSLIHEDS 1309
NA YN E S I++D+
Sbjct: 66 SNAITYNGEDSDIYKDA 82
>gnl|CDD|227408 COG5076, COG5076, Transcription factor involved in chromatin
remodeling, contains bromodomain [Chromatin structure and
dynamics / Transcription].
Length = 371
Score = 87.2 bits (216), Expect = 6e-18
Identities = 58/265 (21%), Positives = 97/265 (36%), Gaps = 31/265 (11%)
Query: 1057 DEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAER---RKEQGKKSRLIEVSELPDWL 1113
+V +E N++L + + +A K +K E
Sbjct: 7 SYSQLGRPSVLKEEFGNELLRL--VDNDSSPFPNAPEEEGSKNLFQKQLKRMPKEY-ITS 63
Query: 1114 IKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE 1173
I +D L G T S EK +++ +D+
Sbjct: 64 IVDDR----EPGSMANVNDDLENVGG--------ITYSPFEKNRPESLR----FDEIVFL 107
Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
E V + + +K K +++ D K I + +
Sbjct: 108 AIESVTPESGLGSLLMAHL--KTSVKKRKTPKIEDELLYADN-------KAIAKFKKQLF 158
Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCR 1293
DGR LS F+ LPS++E PDYYE+I PMD+ I ++++G+Y S +E D +
Sbjct: 159 LRDGRFLSSIFLGLPSKREYPDYYEIIKSPMDLLTIQKKLKNGRYKSFEEFVSDLNLMFD 218
Query: 1294 NAQIYNEELSLIHEDSVVLESVFTK 1318
N ++YN S ++ D+ LE F K
Sbjct: 219 NCKLYNGPDSSVYVDAKELEKYFLK 243
Score = 31.7 bits (72), Expect = 2.4
Identities = 19/89 (21%), Positives = 38/89 (42%), Gaps = 3/89 (3%)
Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELP 1254
E+ + +E + ++ R + T+S V + PF++ S +E+P
Sbjct: 238 EKYFLKLIEEIPEEMLEL---SIKPGREEREERESVLITNSQAHVGAWPFLRPVSDEEVP 294
Query: 1255 DYYEVIDRPMDIKKILGRIEDGKYSSVDE 1283
DYY+ I PMD+ ++ + Y +
Sbjct: 295 DYYKDIRDPMDLSTKELKLRNNYYRPEET 323
>gnl|CDD|99953 cd05522, Bromo_Rsc1_2_II, Bromodomain, repeat II in Rsc1/2_like
subfamily, specific to fungi. Rsc1 and Rsc2 are
components of the RSC complex (remodeling the structure
of chromatin), are essential for transcriptional control,
and have a specific domain architecture including two
bromodomains. The RSC complex has also been linked to
homologous recombination and nonhomologous end-joining
repair of DNA double strand breaks. Bromodomains are 110
amino acid long domains, that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 104
Score = 79.6 bits (197), Expect = 1e-17
Identities = 32/96 (33%), Positives = 54/96 (56%)
Query: 1223 KKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVD 1282
K I++ + K D +GR+L+ F KLP + P+YY+ I P+ + I +++ KY S D
Sbjct: 7 KNILKGLRKERDENGRLLTLHFEKLPDKAREPEYYQEISNPISLDDIKKKVKRRKYKSFD 66
Query: 1283 ELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
+ D + NA++YNE S ++D+V+LE
Sbjct: 67 QFLNDLNLMFENAKLYNENDSQEYKDAVLLEKEARL 102
>gnl|CDD|99951 cd05520, Bromo_polybromo_III, Bromodomain, polybromo repeat III.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 103
Score = 76.2 bits (188), Expect = 1e-16
Identities = 29/83 (34%), Positives = 59/83 (71%)
Query: 1233 TDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLC 1292
++ G++L+EPF+KLPS+++ PDYY+ I P+ +++I ++++G+Y +++EL+ D +
Sbjct: 16 RNNQGQLLAEPFLKLPSKRKYPDYYQEIKNPISLQQIRTKLKNGEYETLEELEADLNLMF 75
Query: 1293 RNAQIYNEELSLIHEDSVVLESV 1315
NA+ YN S I++D+ L+ +
Sbjct: 76 ENAKRYNVPNSRIYKDAEKLQKL 98
>gnl|CDD|99955 cd05525, Bromo_ASH1, Bromodomain; ASH1_like sub-family. ASH1 (absent,
small, or homeotic 1) is a member of the trithorax-group
in Drosophila melanogaster, an epigenetic transcriptional
regulator of HOX genes. Drosophila ASH1 has been shown to
methylate specific lysines in histones H3 and H4.
Mammalian ASH1 has been shown to methylate histone H3.
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 106
Score = 74.3 bits (183), Expect = 6e-16
Identities = 40/106 (37%), Positives = 59/106 (55%)
Query: 1216 AKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIED 1275
A+L + LK+I +I Y DS+G+ L+ PFI LPS+K+ PDYYE I P+D+ I +I
Sbjct: 1 ARLAQVLKEICDAIITYKDSNGQSLAIPFINLPSKKKNPDYYERITDPVDLSTIEKQILT 60
Query: 1276 GKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQ 1321
G Y + + D + RNA+ Y S I D L + +A+
Sbjct: 61 GYYKTPEAFDSDMLKVFRNAEKYYGRKSPIGRDVCRLRKAYYQAKH 106
>gnl|CDD|219455 pfam07529, HSA, HSA. This domain is predicted to bind DNA and is
often found associated with helicases.
Length = 73
Score = 71.2 bits (175), Expect = 3e-15
Identities = 22/73 (30%), Positives = 42/73 (57%)
Query: 297 QKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQ 356
Q++E E++++ + + KDF+E + A+ +L +AV YH EKE+++ +
Sbjct: 1 QRLEEEQREKTHWDHLLEEMKWMSKDFREERKWKIAKAKKLARAVAQYHKYIEKEEQRRK 60
Query: 357 ERIEKERMRRLMA 369
ER K+R++ L A
Sbjct: 61 EREAKQRLKALKA 73
>gnl|CDD|99941 cd05509, Bromo_gcn5_like, Bromodomain; Gcn5_like subfamily. Gcn5p is
a histone acetyltransferase (HAT) which mediates
acetylation of histones at lysine residues; such
acetylation is generally correlated with the activation
of transcription. Bromodomains are 110 amino acid long
domains, that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 101
Score = 68.7 bits (169), Expect = 6e-14
Identities = 25/77 (32%), Positives = 46/77 (59%)
Query: 1243 PFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEEL 1302
PF++ ++E PDYY+VI +PMD+ + ++E+G Y +++E D K + N ++YN
Sbjct: 21 PFLEPVDKEEAPDYYDVIKKPMDLSTMEEKLENGYYVTLEEFVADLKLIFDNCRLYNGPD 80
Query: 1303 SLIHEDSVVLESVFTKA 1319
+ ++ + LE F K
Sbjct: 81 TEYYKCANKLEKFFWKK 97
>gnl|CDD|99952 cd05521, Bromo_Rsc1_2_I, Bromodomain, repeat I in Rsc1/2_like
subfamily, specific to fungi. Rsc1 and Rsc2 are
components of the RSC complex (remodeling the structure
of chromatin), are essential for transcriptional control,
and have a specific domain architecture including two
bromodomains. The RSC complex has also been linked to
homologous recombination and nonhomologous end-joining
repair of DNA double strand breaks. Bromodomains are 110
amino acid long domains, that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 106
Score = 59.6 bits (145), Expect = 1e-10
Identities = 29/98 (29%), Positives = 53/98 (54%), Gaps = 2/98 (2%)
Query: 1217 KLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDG 1276
KL K LK + + + +G + F LP RK+ PDYY++I P+ + + R+
Sbjct: 1 KLSKKLKPLYDGIYTLKEENGIEIHPIFNVLPLRKDYPDYYKIIKNPLSLNTVKKRLP-- 58
Query: 1277 KYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLES 1314
Y++ E D + NA++YN + S+I++ +++LE
Sbjct: 59 HYTNAQEFVNDLAQIPWNARLYNTKGSVIYKYALILEK 96
>gnl|CDD|99936 cd05504, Bromo_Acf1_like, Bromodomain; Acf1_like or BAZ1A_like
subfamily. Bromo adjacent to zinc finger 1A (BAZ1A) was
identified as a novel human bromodomain gene by cDNA
library screening. The Drosophila homologue, Acf1, is
part of the CHRAC (chromatin accessibility complex) and
regulates ISWI-induced nucleosome remodeling.
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 115
Score = 59.3 bits (144), Expect = 1e-10
Identities = 25/78 (32%), Positives = 45/78 (57%)
Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
S PF++ S+ E+PDYY++I +PMD+ I ++ G+Y +E D + + N +YN
Sbjct: 30 SWPFLRPVSKIEVPDYYDIIKKPMDLGTIKEKLNMGEYKLAEEFLSDIQLVFSNCFLYNP 89
Query: 1301 ELSLIHEDSVVLESVFTK 1318
E + +++ L+ F K
Sbjct: 90 EHTSVYKAGTRLQRFFIK 107
>gnl|CDD|99945 cd05513, Bromo_brd7_like, Bromodomain, brd7_like subgroup. The BRD7
gene encodes a nuclear protein that has been shown to
inhibit cell growth and the progression of the cell cycle
by regulating cell-cycle genes at the transcriptional
level. BRD7 has been identified as a gene involved in
nasopharyngeal carcinoma. The protein interacts with
acetylated histone H3 via its bromodomain. Bromodomains
are 110 amino acid long domains that are found in many
chromatin associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 98
Score = 55.5 bits (134), Expect = 2e-09
Identities = 18/46 (39%), Positives = 27/46 (58%)
Query: 1254 PDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
P Y +I PMD + +I++ Y S++E + DFK +C NA YN
Sbjct: 32 PGYSSIIKHPMDFSTMKEKIKNNDYQSIEEFKDDFKLMCENAMKYN 77
>gnl|CDD|203672 pfam07533, BRK, BRK domain. The function of this domain is
unknown. It is often found associated with helicases and
transcription factors.
Length = 45
Score = 52.3 bits (126), Expect = 8e-09
Identities = 16/45 (35%), Positives = 26/45 (57%)
Query: 446 SQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVAD 490
S + + V +GK L G+DAP L++W+Q++PG+EV
Sbjct: 1 SLDGEERVPVINRKTGKRLTGDDAPKLKDLERWLQENPGYEVDPR 45
>gnl|CDD|99937 cd05505, Bromo_WSTF_like, Bromodomain; Williams syndrome
transcription factor-like subfamily (WSTF-like). The
Williams-Beuren syndrome deletion transcript 9 is a
putative transcriptional regulator. WSTF was found to
play a role in vitamin D-mediated transcription as part
of two chromatin remodeling complexes, WINAC and WICH.
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 97
Score = 54.1 bits (130), Expect = 8e-09
Identities = 26/76 (34%), Positives = 38/76 (50%), Gaps = 6/76 (7%)
Query: 1225 IMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDEL 1284
I+ ++KY S PF + + E DY +VI PMD++ + + G YSSV E
Sbjct: 8 ILSKILKYRFS------WPFREPVTADEAEDYKKVITNPMDLQTMQTKCSCGSYSSVQEF 61
Query: 1285 QKDFKTLCRNAQIYNE 1300
D K + NA+ Y E
Sbjct: 62 LDDMKLVFSNAEKYYE 77
>gnl|CDD|99942 cd05510, Bromo_SPT7_like, Bromodomain; SPT7_like subfamily. SPT7 is a
yeast protein that functions as a component of the
transcription regulatory histone acetylation (HAT)
complexes SAGA, SALSA, and SLIK. SAGA is involved in the
RNA polymerase II-dependent transcriptional regulation of
about 10% of all yeast genes. The SPT7 bromodomain has
been shown to weakly interact with acetylated histone H3,
but not H4. The human representative of this subfamily is
cat eye syndrome critical region protein 2 (CECR2).
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 112
Score = 54.0 bits (130), Expect = 1e-08
Identities = 22/63 (34%), Positives = 39/63 (61%)
Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
S PF+ S++E PDYY++I +PMD+ +L ++++ +Y S E D + +N +YN
Sbjct: 26 STPFLTKVSKREAPDYYDIIKKPMDLGTMLKKLKNLQYKSKAEFVDDLNLIWKNCLLYNS 85
Query: 1301 ELS 1303
+ S
Sbjct: 86 DPS 88
>gnl|CDD|99943 cd05511, Bromo_TFIID, Bromodomain, TFIID-like subfamily. Human
TAFII250 (or TAF250) is the largest subunit of TFIID, a
large multi-domain complex, which initiates the assembly
of the transcription machinery. TAFII250 contains two
bromodomains that specifically bind to acetylated histone
H4. Bromodomains are 110 amino acid long domains, that
are found in many chromatin associated proteins.
Bromodomains can interact specifically with acetylated
lysine.
Length = 112
Score = 53.8 bits (130), Expect = 1e-08
Identities = 27/83 (32%), Positives = 46/83 (55%), Gaps = 11/83 (13%)
Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
S PF ++K++PDYY++I RPMD++ I +I KY S +E +D + + N+ +YN
Sbjct: 18 SWPFHTPVNKKKVPDYYKIIKRPMDLQTIRKKISKHKYQSREEFLEDIELIVDNSVLYNG 77
Query: 1301 ELSLIHEDSVVLESVFTKARQRV 1323
+SV+TK + +
Sbjct: 78 P-----------DSVYTKKAKEM 89
>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1. Members
of this family are necessary for accurate chromosome
transmission during cell division.
Length = 804
Score = 59.0 bits (143), Expect = 1e-08
Identities = 43/209 (20%), Positives = 59/209 (28%), Gaps = 35/209 (16%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGS-----------GPPGSPGPSPGQAPGQN 49
S+ S + P Q + P G P P P QAP
Sbjct: 90 DSDLSQKTSTFSPCQSGYEAST------DPEYIPDLQPDPSLWGTAPKPEPQPPQAPESQ 143
Query: 50 PQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQV--QQLRFQIMAY 107
PQ + K LEE + + + + +Q F
Sbjct: 144 PQP-------QTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGP 196
Query: 108 RLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPN--QAQPMPLQ 165
P PQ Q + + +P+ Q P P+P Q P Q Q L
Sbjct: 197 PEQPPGYPQPPQGHPE-QVQPQQFLPAPSQAPAQP---PLPPQLPQQPPPLQQPQFPGLS 252
Query: 166 QQPPPQPHQQQGHISSQIKQSKLTNIPKP 194
QQ PP P Q Q + + P P
Sbjct: 253 QQMPPPPPQPPQQ---QQQPPQPQAQPPP 278
Score = 57.5 bits (139), Expect = 4e-08
Identities = 41/209 (19%), Positives = 50/209 (23%), Gaps = 39/209 (18%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
P P P QQ P + P GPP P P G Q A Q
Sbjct: 170 PQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQ 229
Query: 68 GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGK 127
+ + Q L Q+ P PQ
Sbjct: 230 -----------PPLPPQLPQQPPPLQQPQFPGLSQQMPP------PPPQPPQQQ------ 266
Query: 128 RMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSK 187
Q PP P P P P Q PP QP Q +Q
Sbjct: 267 ---------QQPPQPQAQPPPQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRG 317
Query: 188 LTNIPKPEGLDPLIILQERENRVALNIER 216
+ L ++ R AL+ E
Sbjct: 318 PQFREQLVQL-------SQQQREALSQEE 339
Score = 41.3 bits (97), Expect = 0.004
Identities = 29/180 (16%), Positives = 47/180 (26%), Gaps = 12/180 (6%)
Query: 29 APGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEI 88
+ +P Q+ E T + D L+ DP +
Sbjct: 85 PSVGPDSDLSQKTSTFSPCQSGYEASTDPEYIPD------LQPDPSLWGTAPKPEPQPPQ 138
Query: 89 KHAFTSAQVQQLRFQIMAYRLLA--RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGP 146
+ + + A + + PQL Q ++ P+ GP
Sbjct: 139 APESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQ--QVLPQGMPPRQAAFPQQGP 196
Query: 147 MPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQER 206
PP P P Q P Q QP + +L P P L ++
Sbjct: 197 PEQPPGYPQPPQ--GHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQ 254
Score = 32.8 bits (75), Expect = 1.3
Identities = 19/92 (20%), Positives = 31/92 (33%), Gaps = 4/92 (4%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
PP PQ Q PP N PG + P Q P P Q+
Sbjct: 260 PQPPQQQQQPPQPQAQPPPQNQPTPH---PGLPQGQNAPLPPPQQPQLLPLVQQPQGQQR 316
Query: 61 IDSMKEQGLEEDPRYQKLIEMKAN-RTEIKHA 91
+EQ ++ + ++ + + R + +H
Sbjct: 317 GPQFREQLVQLSQQQREALSQEEAKRAKRRHK 348
Score = 30.9 bits (70), Expect = 5.4
Identities = 14/59 (23%), Positives = 16/59 (27%), Gaps = 7/59 (11%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQ-------NPQENLTALQ 58
+ P PPQ Q P + Q P P P Q PQ T
Sbjct: 228 AQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHP 286
Score = 30.5 bits (69), Expect = 8.7
Identities = 19/84 (22%), Positives = 28/84 (33%), Gaps = 14/84 (16%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAP------------GSGPP-GSPGPSPGQAPGQNPQEN 53
PP Q Q P L+ P PP P P PG GQN
Sbjct: 238 QQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAPLP 297
Query: 54 LTALQRAIDSMK-EQGLEEDPRYQ 76
+ + ++ QG + P+++
Sbjct: 298 PPQQPQLLPLVQQPQGQQRGPQFR 321
>gnl|CDD|220309 pfam09606, Med15, ARC105 or Med15 subunit of Mediator complex
non-fungal. The approx. 70 residue Med15 domain of the
ARC-Mediator co-activator is a three-helix bundle with
marked similarity to the KIX domain. The sterol
regulatory element binding protein (SREBP) family of
transcription activators use the ARC105 subunit to
activate target genes in the regulation of cholesterol
and fatty acid homeostasis. In addition, Med15 is a
critical transducer of gene activation signals that
control early metazoan development.
Length = 768
Score = 58.9 bits (142), Expect = 2e-08
Identities = 56/202 (27%), Positives = 70/202 (34%), Gaps = 18/202 (8%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
M P Q P + Q+P G G G PG P P Q PG PQ A+Q+
Sbjct: 275 MQQQPPQQQPQQSQLGMLPNQMQQMPGG--GQGGPGQPMGPPPQRPGAVPQ-GGQAVQQG 331
Query: 61 IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
+ S +Q L + KL M+ + T Q QQ A NQ +
Sbjct: 332 VMSAGQQQL----KQMKLRNMRGQQQ------TQQQQQQQGGNHPAAHQQQMNQQVGQGG 381
Query: 121 AMG-VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ-QQGH 178
M + ++G G PM P M P+P PPQP G
Sbjct: 382 QMVALGYLNIQGNQGGLGANPMQQGQPGMMSSPSPVPQVQTNQ--SMPQPPQPSVPSPGG 439
Query: 179 ISSQIKQSKLTN-IPKPEGLDP 199
SQ QS IP P L P
Sbjct: 440 PGSQPPQSVSGGMIPSPPALMP 461
Score = 53.9 bits (129), Expect = 6e-07
Identities = 47/202 (23%), Positives = 61/202 (30%), Gaps = 17/202 (8%)
Query: 1 MSNSSTSPNPPPPQQ-QQPPLNVGQLPMGAPGSGPPGSP-GPSPGQAPGQ--NPQENLTA 56
MS + Q P+N Q G GP P GP PG+ GQ +
Sbjct: 53 MSKKAAQQQVLQGGQGMPDPINALQNLTGQGTRGPQMGPMGPGPGRPMGQQMGGPGTASN 112
Query: 57 LQ-----RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLA 111
L R M G+ + + +S Q Q + M +
Sbjct: 113 LLQSLNVRGQMPMGAAGMGPHQMSRVGTMQPGGQAGGMMQQSSGQPQSQQPNQMGPQQ-G 171
Query: 112 RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
+ Q + G QG P G Q PP MP Q Q QQQ PQ
Sbjct: 172 QAQGQAGGMNQGQQG------PVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQQQQNPQ 225
Query: 172 PHQQ-QGHISSQIKQSKLTNIP 192
QQ Q Q+ Q +
Sbjct: 226 MQQQLQNQQQQQMDQQQGPADA 247
Score = 51.2 bits (122), Expect = 3e-06
Identities = 40/169 (23%), Positives = 49/169 (28%), Gaps = 10/169 (5%)
Query: 12 PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ-GLE 70
P Q + Q G P S P GP GQA GQ N Q G
Sbjct: 143 PGGQAGGMM---QQSSGQPQSQQPNQMGPQQGQAQGQAGGMNQGQQGPVGQQQPPQMGQP 199
Query: 71 EDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRME 130
P +M+ + Q+QQ ++ + P Q MG Q
Sbjct: 200 GMPGGGGQGQMQQQGQPGGQQQQNPQMQQQLQNQQQQQMDQQQGPADAQAQMGQQ----- 254
Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHI 179
M P + G P Q P Q QP Q P QQ
Sbjct: 255 -QQGQGGMQPQQMQGGQMQVPMQQQPPQQQPQQSQLGMLPNQMQQMPGG 302
Score = 50.0 bits (119), Expect = 8e-06
Identities = 43/199 (21%), Positives = 59/199 (29%), Gaps = 12/199 (6%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
S P Q Q GQ A G G GP Q P Q Q + M++
Sbjct: 155 SGQPQSQQPNQMGPQQGQAQGQAGGMNQ-GQQGPVGQQQPPQMGQPGMPGGGGQ-GQMQQ 212
Query: 67 QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
QG + Q + + + + Q Q + + PQ G Q
Sbjct: 213 QGQPGGQQQQNPQMQQQLQNQQQQQMDQQQGPADA-QAQMGQQQQGQGGMQPQQMQGGQM 271
Query: 127 KR-MEGVPSGPQM---PPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ-----PHQQQG 177
+ M+ P Q L M P QPM Q P QQG
Sbjct: 272 QVPMQQQPPQQQPQQSQLGMLPNQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQG 331
Query: 178 HISSQIKQSKLTNIPKPEG 196
+S+ +Q K + G
Sbjct: 332 VMSAGQQQLKQMKLRNMRG 350
Score = 44.6 bits (105), Expect = 3e-04
Identities = 43/190 (22%), Positives = 52/190 (27%), Gaps = 31/190 (16%)
Query: 12 PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEE 71
P QQQPP G G G G GQ QNPQ + M +Q
Sbjct: 187 PVGQQQPPQMGQPGMPGGGGQGQMQQQGQPGGQ-QQQNPQMQQQLQNQQQQQMDQQQGPA 245
Query: 72 DPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQ------ 125
D + Q + + Q+Q Q + QP QL M
Sbjct: 246 DAQAQMGQQQQGQGGMQPQQMQGGQMQVPMQQ-----QPPQQQPQQSQLGMLPNQMQQMP 300
Query: 126 --GKRMEGVPSGPQ------MPPMSLHGPMP-MPPSQP---------MPNQAQPMPLQQQ 167
G+ G P GP +P M Q M Q Q QQQ
Sbjct: 301 GGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQQLKQMKLRNMRGQQQTQQ-QQQ 359
Query: 168 PPPQPHQQQG 177
H
Sbjct: 360 QQGGNHPAAH 369
Score = 43.5 bits (102), Expect = 0.001
Identities = 42/217 (19%), Positives = 59/217 (27%), Gaps = 46/217 (21%)
Query: 10 PPPPQQQQPPLNVGQLP---MGAPGSGPPGSPGPSPGQAPGQ-------NPQENLTALQR 59
PP Q QQ L + M G G PG P P Q PG Q ++A Q+
Sbjct: 279 PPQQQPQQSQLGMLPNQMQQMPGGGQGGPGQPMGPPPQRPGAVPQGGQAVQQGVMSAGQQ 338
Query: 60 AIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQV--QQLRFQIMAYRLLARNQ--- 114
+ MK + + + Q+ + + H Q Q + + Y + NQ
Sbjct: 339 QLKQMKLRNMRGQQQTQQQQQQQGGNHPAAHQQQMNQQVGQGGQMVALGYLNIQGNQGGL 398
Query: 115 ---------------PLTPQLAMGVQGKRMEGVPSGPQ----------------MPPMSL 143
P Q PS P +P
Sbjct: 399 GANPMQQGQPGMMSSPSPVPQVQTNQSMPQPPQPSVPSPGGPGSQPPQSVSGGMIPSPPA 458
Query: 144 HGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
P P P P + + P P G S
Sbjct: 459 LMPSPSPQMSQSPASQRTIQQDMVSPGGPLNTPGQSS 495
Score = 33.4 bits (76), Expect = 1.0
Identities = 22/85 (25%), Positives = 35/85 (41%), Gaps = 12/85 (14%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN------LTALQRA 60
SP+P Q + Q + GP +PG S +P NPQE L +
Sbjct: 462 SPSPQMSQSPASQRTIQQDMVS--PGGPLNTPGQSSVNSPA-NPQEEQLYREKYKQLSKY 518
Query: 61 IDSMKE--QGLEEDP-RYQKLIEMK 82
I+ ++ ++ D R + L +MK
Sbjct: 519 IEPLRRMIAKIDNDEGRIKDLSKMK 543
>gnl|CDD|99957 cd05528, Bromo_AAA, Bromodomain; sub-family co-occurring with AAA
domains. Bromodomains are 110 amino acid long domains
that are found in many chromatin associated proteins.
Bromodomains can interact specifically with acetylated
lysine. The structure(2DKW) in this alignment is an
uncharacterized protein predicted from analysis of cDNA
clones from human fetal liver.
Length = 112
Score = 53.5 bits (129), Expect = 2e-08
Identities = 28/78 (35%), Positives = 44/78 (56%), Gaps = 2/78 (2%)
Query: 1222 LKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSV 1281
L+ +R V+K SD R F K +E+PDYYE+I +PMD++ IL +++ +Y +
Sbjct: 4 LRLFLRDVLKRLASDKRFN--AFTKPVDEEEVPDYYEIIKQPMDLQTILQKLDTHQYLTA 61
Query: 1282 DELQKDFKTLCRNAQIYN 1299
+ KD + NA YN
Sbjct: 62 KDFLKDIDLIVTNALEYN 79
>gnl|CDD|99958 cd05529, Bromo_WDR9_I_like, Bromodomain; WDR9 repeat I_like
subfamily. WDR9 is a human gene located in the Down
Syndrome critical region-2 of chromosome 21. It encodes
for a nuclear protein containing WD40 repeats and two
bromodomains, which may function as a transcriptional
regulator involved in chromatin remodeling and play a
role in embryonic development. Bromodomains are 110 amino
acid long domains that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 128
Score = 53.9 bits (130), Expect = 2e-08
Identities = 29/115 (25%), Positives = 56/115 (48%), Gaps = 8/115 (6%)
Query: 1205 KEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFI-KLPSRKELPDYYEVIDRP 1263
E+ R++++ +L L K++ S ++E F + R PDY+ + P
Sbjct: 16 WEQPHIRDEERERLISGLDKLLL-------SLQLEIAEYFEYPVDLRAWYPDYWNRVPVP 68
Query: 1264 MDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTK 1318
MD++ I R+E+ Y S++ L+ D + + NA+ +NE S I + + L +
Sbjct: 69 MDLETIRSRLENRYYRSLEALRHDVRLILSNAETFNEPNSEIAKKAKRLSDWLLR 123
>gnl|CDD|204086 pfam08880, QLQ, QLQ. The QLQ domain is named after the conserved
Gln, Leu, Gln motif. The QLQ domain is found at the
N-terminus of SWI2/SNF2 protein, which has been shown to
be involved in protein-protein interactions. This domain
has thus been postulated to be involved in mediating
protein interactions.
Length = 37
Score = 50.1 bits (121), Expect = 4e-08
Identities = 18/36 (50%), Positives = 25/36 (69%)
Query: 91 AFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
FT AQ+Q+L+ QI+AY+ LA NQP+ P L +Q
Sbjct: 2 PFTPAQLQELKAQILAYKYLAANQPVPPHLQQPIQK 37
>gnl|CDD|99929 cd05497, Bromo_Brdt_I_like, Bromodomain, Brdt_like subfamily, repeat
I. Human Brdt is a testis-specific member of the BET
subfamily of bromodomain proteins; the first bromodomain
in Brdt has been shown to be essential for male germ cell
differentiation. Bromodomains are 110 amino acid long
domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 107
Score = 52.0 bits (125), Expect = 4e-08
Identities = 26/76 (34%), Positives = 40/76 (52%), Gaps = 7/76 (9%)
Query: 1252 ELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVV 1311
LPDY+++I PMD+ I R+E+ Y S E +DF T+ N IYN+ D VV
Sbjct: 36 NLPDYHKIIKTPMDLGTIKKRLENNYYWSASECIQDFNTMFTNCYIYNKP-----GDDVV 90
Query: 1312 L--ESVFTKARQRVES 1325
L +++ Q++
Sbjct: 91 LMAQTLEKLFLQKLAQ 106
>gnl|CDD|99931 cd05499, Bromo_BDF1_2_II, Bromodomain. BDF1/BDF2 like subfamily,
restricted to fungi, repeat II. BDF1 and BDF2 are yeast
transcription factors involved in the expression of a
wide range of genes, including snRNAs; they are required
for sporulation and DNA repair and protect histone H4
from deacetylation. Bromodomains are 110 amino acid long
domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 102
Score = 51.1 bits (123), Expect = 8e-08
Identities = 26/99 (26%), Positives = 54/99 (54%), Gaps = 9/99 (9%)
Query: 1220 KTLKKIMRVVIKYTDSDGRVLSEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGK 1277
+ LK++M+ K++ + PF+ P +P+Y+ +I +PMD+ I ++++G+
Sbjct: 7 EVLKELMKP--KHSA-----YNWPFLDPVDPVALNIPNYFSIIKKPMDLGTISKKLQNGQ 59
Query: 1278 YSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVF 1316
Y S E ++D + + +N +N E + ++ LE VF
Sbjct: 60 YQSAKEFERDVRLIFKNCYTFNPEGTDVYMMGHQLEEVF 98
>gnl|CDD|99938 cd05506, Bromo_plant1, Bromodomain, uncharacterized subfamily
specific to plants. Might function as a global
transcription factor. Bromodomains are 110 amino acid
long domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 99
Score = 50.8 bits (122), Expect = 1e-07
Identities = 25/80 (31%), Positives = 39/80 (48%), Gaps = 8/80 (10%)
Query: 1222 LKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSV 1281
L+K+M G V + P LPDY+++I +PMD+ + ++E G+YSS
Sbjct: 9 LRKLM------KHKWGWVFNAPVD--VVALGLPDYFDIIKKPMDLGTVKKKLEKGEYSSP 60
Query: 1282 DELQKDFKTLCRNAQIYNEE 1301
+E D + NA YN
Sbjct: 61 EEFAADVRLTFANAMRYNPP 80
>gnl|CDD|99930 cd05498, Bromo_Brdt_II_like, Bromodomain, Brdt_like subfamily, repeat
II. Human Brdt is a testis-specific member of the BET
subfamily of bromodomain proteins; the first bromodomain
in Brdt has been shown to be essential for male germ cell
differentiation. Bromodomains are 110 amino acid long
domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 102
Score = 50.4 bits (121), Expect = 1e-07
Identities = 19/71 (26%), Positives = 34/71 (47%)
Query: 1248 PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHE 1307
P L DY+++I PMD+ I ++++ +Y+ E D + + N YN +H
Sbjct: 30 PEALGLHDYHDIIKHPMDLSTIKKKLDNREYADAQEFAADVRLMFSNCYKYNPPDHPVHA 89
Query: 1308 DSVVLESVFTK 1318
+ L+ VF
Sbjct: 90 MARKLQDVFED 100
>gnl|CDD|99944 cd05512, Bromo_brd1_like, Bromodomain; brd1_like subfamily. BRD1 is a
mammalian gene which encodes for a nuclear protein
assumed to be a transcriptional regulator. BRD1 has been
implicated in brain development and susceptibility to
schizophrenia and bipolar affective disorder.
Bromodomains are 110 amino acid long domains that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 98
Score = 50.1 bits (120), Expect = 2e-07
Identities = 18/63 (28%), Positives = 32/63 (50%), Gaps = 4/63 (6%)
Query: 1237 GRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQ 1296
+ SEP E+PDY + I +PMD + ++E +Y ++++ + DF + N
Sbjct: 19 AEIFSEPV----DLSEVPDYLDHIKQPMDFSTMRKKLESQRYRTLEDFEADFNLIINNCL 74
Query: 1297 IYN 1299
YN
Sbjct: 75 AYN 77
>gnl|CDD|197800 smart00592, BRK, domain in transcription and CHROMO domain
helicases.
Length = 45
Score = 48.1 bits (115), Expect = 2e-07
Identities = 15/45 (33%), Positives = 26/45 (57%)
Query: 448 LTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSD 492
+ + V +GK L G+DAP A L++W++++P +EV S
Sbjct: 1 DGEERVPVINRETGKKLTGDDAPKAKDLERWLEENPEYEVAPRSA 45
>gnl|CDD|99927 cd05495, Bromo_cbp_like, Bromodomain, cbp_like subfamily. Cbp (CREB
binding protein or CREBBP) is an acetyltransferase acting
on histone, which gives a specific tag for
transcriptional activation and also acetylates
non-histone proteins. CREBBP binds specifically to
phosphorylated CREB protein and augments the activity of
phosphorylated CREB to activate transcription of
cAMP-responsive genes. Bromodomains are 110 amino acid
long domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 108
Score = 49.0 bits (117), Expect = 5e-07
Identities = 22/78 (28%), Positives = 40/78 (51%), Gaps = 2/78 (2%)
Query: 1241 SEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIY 1298
S PF + P +PDY++++ PMD+ I +++ G+Y + D + NA +Y
Sbjct: 22 SLPFRQPVDPKLLGIPDYFDIVKNPMDLSTIRRKLDTGQYQDPWQYVDDVWLMFDNAWLY 81
Query: 1299 NEELSLIHEDSVVLESVF 1316
N + S +++ L VF
Sbjct: 82 NRKTSRVYKYCTKLAEVF 99
>gnl|CDD|99935 cd05503, Bromo_BAZ2A_B_like, Bromodomain, BAZ2A/BAZ2B_like subfamily.
Bromo adjacent to zinc finger 2A (BAZ2A) and 2B (BAZ2B)
were identified as novel human bromodomain genes by cDNA
library screening. BAZ2A is also known as Tip5
(Transcription termination factor I-interacting protein
5) and hWALp3. The proteins may play roles in
transcriptional regulation. Human Tip5 is part of a
complex termed NoRC (nucleolar remodeling complex), which
induces nucleosome sliding and may play a role in the
regulation of the rDNA locus. Bromodomains are 110 amino
acid long domains that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 97
Score = 48.5 bits (116), Expect = 7e-07
Identities = 21/76 (27%), Positives = 42/76 (55%)
Query: 1243 PFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEEL 1302
PF++ + K +P Y ++I +PMD I ++E G+Y +++E +D + + N + +NE+
Sbjct: 20 PFLEPVNTKLVPGYRKIIKKPMDFSTIREKLESGQYKTLEEFAEDVRLVFDNCETFNEDD 79
Query: 1303 SLIHEDSVVLESVFTK 1318
S + + F K
Sbjct: 80 SEVGRAGHNMRKFFEK 95
>gnl|CDD|99932 cd05500, Bromo_BDF1_2_I, Bromodomain. BDF1/BDF2 like subfamily,
restricted to fungi, repeat I. BDF1 and BDF2 are yeast
transcription factors involved in the expression of a
wide range of genes, including snRNAs; they are required
for sporulation and DNA repair and protect histone H4
from deacetylation. Bromodomains are 110 amino acid long
domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 103
Score = 48.1 bits (115), Expect = 1e-06
Identities = 19/71 (26%), Positives = 35/71 (49%)
Query: 1248 PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEELSLIHE 1307
P + +P Y +I +PMD+ I +++ Y+SV+E DF + N +N + +
Sbjct: 31 PVKLNIPHYPTIIKKPMDLGTIERKLKSNVYTSVEEFTADFNLMVDNCLTFNGPEHPVSQ 90
Query: 1308 DSVVLESVFTK 1318
L++ F K
Sbjct: 91 MGKRLQAAFEK 101
>gnl|CDD|99928 cd05496, Bromo_WDR9_II, Bromodomain; WDR9 repeat II_like subfamily.
WDR9 is a human gene located in the Down Syndrome
critical region-2 of chromosome 21. It encodes for a
nuclear protein containing WD40 repeats and two
bromodomains, which may function as a transcriptional
regulator involved in chromatin remodeling and play a
role in embryonic development. Bromodomains are 110 amino
acid long domains that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 119
Score = 46.3 bits (110), Expect = 6e-06
Identities = 27/99 (27%), Positives = 49/99 (49%), Gaps = 7/99 (7%)
Query: 1219 KKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKY 1278
KK K+++ ++ DS EPF + + PDY ++ID PMD+ + + G Y
Sbjct: 7 KKQCKELVNLMWDCEDS------EPFRQPVDLLKYPDYRDIIDTPMDLGTVKETLFGGNY 60
Query: 1279 SSVDELQKDFKTLCRNAQIYN-EELSLIHEDSVVLESVF 1316
E KD + + N++ Y + S I+ ++ L ++F
Sbjct: 61 DDPMEFAKDVRLIFSNSKSYTPNKRSRIYSMTLRLSALF 99
>gnl|CDD|214931 smart00951, QLQ, QLQ is named after the conserved Gln, Leu, Gln
motif. QLQ is found at the N-terminus of SWI2/SNF2
protein, which has been shown to be involved in
protein-protein interactions. QLQ has been postulated to
be involved in mediating protein interactions.
Length = 36
Score = 43.7 bits (104), Expect = 7e-06
Identities = 19/33 (57%), Positives = 25/33 (75%), Gaps = 1/33 (3%)
Query: 91 AFTSAQVQQLRFQIMAYR-LLARNQPLTPQLAM 122
FT AQ++ LR QI+AY+ LLARNQP+ P+L
Sbjct: 2 PFTPAQLELLRAQILAYKYLLARNQPVPPELLQ 34
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, which is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 49.7 bits (118), Expect = 1e-05
Identities = 45/203 (22%), Positives = 68/203 (33%), Gaps = 26/203 (12%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSP---GQAPGQNPQENLTALQRA 60
S + P P QQ PL++ AP P P P P Q Q + R
Sbjct: 208 SPIAAQPAPQPQQPSPLSLIS----APSLHPQRLPSPHPPLQPQTASQQSPQPPAPSSRH 263
Query: 61 IDSMKEQGLEEDPR--YQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
S P Q + ++ + F AQ Q + + + P +
Sbjct: 264 PQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQ 323
Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
Q R + +P P MP + PP+ P+P Q P Q H+ H
Sbjct: 324 SALQPQQPPREQPLPPAPSMPHIK------PPPTTPIP----------QLPNQSHKHPPH 367
Query: 179 ISSQIKQSKL-TNIPKPEGLDPL 200
+ ++ +N+P P L PL
Sbjct: 368 LQGPSPFPQMPSNLPPPPALKPL 390
Score = 47.4 bits (112), Expect = 5e-05
Identities = 44/200 (22%), Positives = 62/200 (31%), Gaps = 32/200 (16%)
Query: 3 NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAID 62
N S+SP+ P PQ + + GPP P PG A + + Q
Sbjct: 147 NRSSSPSIPSPQDNESDSDSSAQQQLLQPQGPPSIQVP-PGAALAPSAPPPTPSAQAVPP 205
Query: 63 SMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAM 122
+ P+ Q+ + H RL + + PL PQ A
Sbjct: 206 QGSPIAAQPAPQPQQPSPLSLISAPSLHP---------------QRLPSPHPPLQPQTA- 249
Query: 123 GVQGKRMEGVPSGPQMPPMSLHGPMP------------MPPSQPMPNQAQPMPLQQQPP- 169
+ + + P S HGP P + P Q + Q PP
Sbjct: 250 --SQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPPL 307
Query: 170 PQPHQQQGHISSQIKQSKLT 189
P P Q Q H + QS L
Sbjct: 308 PLPSQAQPHSHTPPSQSALQ 327
Score = 36.6 bits (84), Expect = 0.11
Identities = 49/205 (23%), Positives = 69/205 (33%), Gaps = 47/205 (22%)
Query: 1 MSNSSTSPNPPPPQQQQP-------------PLNVGQLPMGAPGSGPP---------GSP 38
+ S SP PP P + P L G + + P S PP P
Sbjct: 247 QTASQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSSNPPQPFGLAQSQVPP 306
Query: 39 GPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQ 98
P P QA + + + +EQ L P + IK T+ +
Sbjct: 307 LPLPSQAQPHSHTPPSQSALQPQQPPREQPLPPAP----------SMPHIKPPPTTP-IP 355
Query: 99 QLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMS---LHGP--MPMPPSQ 153
QL Q +++ Q +P M +P P + P+S H P PP Q
Sbjct: 356 QLPNQ--SHKHPPHLQGPSPFPQMP------SNLPPPPALKPLSSLPTHHPPSAHPPPLQ 407
Query: 154 PMPNQAQPMPLQQQPPPQPHQQQGH 178
MP Q+QP+ PP Q Q
Sbjct: 408 LMP-QSQPLQSVPAQPPVLTQSQSL 431
Score = 33.9 bits (77), Expect = 0.76
Identities = 70/320 (21%), Positives = 110/320 (34%), Gaps = 37/320 (11%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPS-PGQAPGQNPQENLTALQRA 60
S S+ P PP +Q PP P PP +P P P Q+ P +
Sbjct: 322 SQSALQPQQPPREQPLPPA-----PSMPHIKPPPTTPIPQLPNQSHKHPPHLQGPSPFPQ 376
Query: 61 IDSMKEQGLEEDPRYQKLIEMKANRTEIKHA---FTSAQVQQLRFQIMAYRLLARNQPLT 117
+ S L P + L + + H Q Q L+ +L ++Q L
Sbjct: 377 MPS----NLPPPPALKPLSSLPTHHPPSAHPPPLQLMPQSQPLQSVPAQPPVLTQSQSLP 432
Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLH--GPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQ 175
P+ + G+ SGP P + H +P P P+ P P
Sbjct: 433 PKAS----THPHSGLHSGPPQSPFAQHPFTSGGLPAIGPPPSLPTSTP----AAPPRASS 484
Query: 176 QGHISSQIKQSKLTNIPKPEGLDPLIILQERENRVALNIERRIEELNGSLTSTLPEHLRV 235
S L P+ I +E L+ E S PE V
Sbjct: 485 GSQPPGSALPSSGGCAGPGPPLPPIQIKEE-----PLDEAEEPESPPPPPRSPSPEPTVV 539
Query: 236 KAEIELRALKVLNFQRQLRAEVIACARRD---TTLETAVNVK----AYKRTKRQGLKEAR 288
A + F + L +CAR D T L ++ K A ++ KR+ ++AR
Sbjct: 540 NTPSH--ASQSARFYKHLDRGYNSCARTDLYFTPLASSKLAKKREEAVEKAKREAEQKAR 597
Query: 289 ATEKLEKQQKVEAERKKRQK 308
+ EK+++ E ER++ ++
Sbjct: 598 EEREREKEKEKERERERERE 617
>gnl|CDD|99934 cd05502, Bromo_tif1_like, Bromodomain; tif1_like subfamily. Tif1
(transcription intermediary factor 1) is a member of the
tripartite motif (TRIM) protein family, which is
characterized by a particular domain architecture. It
functions by recruiting coactivators and/or corepressors
to modulate transcription. Vertebrate Tif1-gamma, also
labeled E3 ubiquitin-protein ligase TRIM33, plays a role
in the control of hematopoiesis. Its homologue in Xenopus
laevis, Ectodermin, has been shown to function in
germ-layer specification and control of cell growth
during embryogenesis. Bromodomains are 110 amino acid
long domains that are found in many chromatin associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 109
Score = 45.4 bits (108), Expect = 1e-05
Identities = 28/80 (35%), Positives = 40/80 (50%), Gaps = 4/80 (5%)
Query: 1240 LSEPFIKLPSRKELPDYYEVIDRPMD---IKKILGRIEDGKYSSVDELQKDFKTLCRNAQ 1296
LS PF P +P+YY++I PMD I+K L YSS +E D + + +N
Sbjct: 21 LSLPF-HEPVSPSVPNYYKIIKTPMDLSLIRKKLQPKSPQHYSSPEEFVADVRLMFKNCY 79
Query: 1297 IYNEELSLIHEDSVVLESVF 1316
+NEE S + + LE F
Sbjct: 80 KFNEEDSEVAQAGKELELFF 99
>gnl|CDD|221124 pfam11496, HDA2-3, Class II histone deacetylase complex subunits 2
and 3. This family of class II histone deacetylase
complex subunits HDA2 and HDA3 is found in fungi. The
member from S. pombe is referred to as Ccq1. These
proteins associate with HDA1 to generate the activity of
the HDA1 histone deacetylase complex. HDA1 interacts with
itself and with the HDA2-HDA3 subcomplex to form a
probable tetramer and these interactions are necessary
for catalytic activity. The HDA1 histone deacetylase
complex is responsible for the deacetylation of lysine
residues on the N-terminal part of the core histones
(H2A, H2B, H3 and H4). Histone deacetylation gives a tag
for epigenetic repression and plays an important role in
transcriptional regulation, cell cycle progression and
developmental events. HDA2 and HDA3 have a conserved
coiled-coil domain towards their C-terminus.
Length = 279
Score = 47.4 bits (113), Expect = 3e-05
Identities = 38/178 (21%), Positives = 73/178 (41%), Gaps = 21/178 (11%)
Query: 881 SGKFELLDRILPKL--KSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRG 938
SGKF +L+ ++ L VL+ + + ++++E +G Y RL G + E+
Sbjct: 94 SGKFLVLNDLINLLIRSERDLHVLIISRSVKTLDLVEALLLGKGLNYKRLSGESLYEE-- 151
Query: 939 DLLKKFNAPDSEYFIFVLSTRAGGL------GLNLQTADTVIIFDSDWNPH----QDLQA 988
+ +I + ++ GL L+ D +I FD + + L+
Sbjct: 152 NHKVSDKKGSLSLWIHLTTSD--GLTNTDSSLLSNYKFDLIISFDPSLDTSLPSIESLRT 209
Query: 989 QDRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKSTGSERH 1046
Q+R N ++RL+ VNS+E L + N + ++QA + + G
Sbjct: 210 QNRRG-----NLTPIIRLVVVNSIEHVELCFPKKYPNRLDYLVQASVVLRDIVGDLPP 262
>gnl|CDD|99956 cd05526, Bromo_polybromo_VI, Bromodomain, polybromo repeat VI.
Polybromo is a nuclear protein of unknown function, which
contains 6 bromodomains. The human ortholog BAF180 is
part of a SWI/SNF chromatin-remodeling complex, and it
may carry out the functions of Yeast Rsc-1 and Rsc-2. It
was shown that polybromo bromodomains bind to histone H3
at specific acetyl-lysine positions. Bromodomains are
found in many chromatin-associated proteins and in
nuclear histone acetyltransferases. They interact
specifically with acetylated lysine, but not all the
bromodomains in polybromo may bind to acetyl-lysine.
Length = 110
Score = 43.5 bits (103), Expect = 5e-05
Identities = 26/106 (24%), Positives = 50/106 (47%), Gaps = 2/106 (1%)
Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIE 1274
Q +++ L + V+ + D +GR S+ +LP + P+ + I ++
Sbjct: 1 QLLVQELLATLFVSVMNHQDEEGRCYSDSLAELPELAVDGVGPK--KIPLTLDIIKRNVD 58
Query: 1275 DGKYSSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKAR 1320
G+Y +D+ Q+D + A+ + S I+ED+V L+ F K R
Sbjct: 59 KGRYRRLDKFQEDMFEVLERARRLSRTDSEIYEDAVELQQFFIKIR 104
>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 576
Score = 45.9 bits (109), Expect = 1e-04
Identities = 33/133 (24%), Positives = 44/133 (33%), Gaps = 21/133 (15%)
Query: 45 APGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTS--AQVQQLRF 102
A + Q +LT L RA M PR ++EM +K A + A V +L
Sbjct: 322 ADAADLQRHLTLLLRAEGEMA---HASFPRL--VLEM----ALLKMATLAPGAPVSELLD 372
Query: 103 QIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPM 162
++ A L R P P A G P PP P + P A+P
Sbjct: 373 RLEA---LERGAPAPPSAAWGAPTPAAPAAPPPAAAPP-------VPPAAPARPAAARPA 422
Query: 163 PLQQQPPPQPHQQ 175
P P
Sbjct: 423 PAPAPPAAAAPPA 435
>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421). This
family represents a conserved region approximately 350
residues long within a number of plant proteins of
unknown function.
Length = 357
Score = 45.3 bits (107), Expect = 1e-04
Identities = 44/191 (23%), Positives = 65/191 (34%), Gaps = 23/191 (12%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPG-QNPQENLTALQRAID 62
S P+ PPQQ Q P PP P P P Q P Q PQ Q
Sbjct: 95 SHQYPSQLPPQQVQSVPQQPTPQQ-EPYYPPPSQPQPPPAQQPQAQQPQPPPQVPQ---- 149
Query: 63 SMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAM 122
+Q + P+ + + + + + ++ +Q +Y N+PL +AM
Sbjct: 150 ---QQQYQSPPQQPQYQQNPPPQAQSAPQVSGLYPEESPYQPQSY---PPNEPLPSSMAM 203
Query: 123 GVQGKRMEGVPS----GPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-----PQPH 173
Q PS GP P ++G P+ P+ QP P Q Q P P
Sbjct: 204 --QPPYSGAPPSQQFYGPPQPSPYMYGGPGGRPNSGFPSGQQPPPSQGQEGYGYSGPPPS 261
Query: 174 QQQGHISSQIK 184
+ +
Sbjct: 262 KGNHGSVASYA 272
Score = 40.7 bits (95), Expect = 0.005
Identities = 38/198 (19%), Positives = 52/198 (26%), Gaps = 23/198 (11%)
Query: 11 PPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLE 70
PPP Q QPP Q P PP P Q+P Q PQ +A + + GL
Sbjct: 123 PPPSQPQPP--PAQQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLY 180
Query: 71 -EDPRYQKL----IEMKANRTEIKHAF-TSAQVQQL------RFQIMAYRLLARNQPLTP 118
E+ YQ E + ++ + + QQ + N
Sbjct: 181 PEESPYQPQSYPPNEPLPSSMAMQPPYSGAPPSQQFYGPPQPSPYMYGGPGGRPNSGFPS 240
Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ---------PMPNQAQPMPLQQQPP 169
+ SGP + P P A +P
Sbjct: 241 GQQPPPSQGQEGYGYSGPPPSKGNHGSVASYAPQGSSQSYSTAYPSLPAATVLPQALPMS 300
Query: 170 PQPHQQQGHISSQIKQSK 187
P G S Q
Sbjct: 301 SAPMSGGGSGSPQSGNRV 318
Score = 37.6 bits (87), Expect = 0.043
Identities = 31/121 (25%), Positives = 40/121 (33%), Gaps = 23/121 (19%)
Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ---------------- 153
A Q + L + + P S P +PP Q
Sbjct: 63 DAPLQQVNAALPPAPAPQSPQPDQQQQSQAPPSHQYPSQLPPQQVQSVPQQPTPQQEPYY 122
Query: 154 PMPNQAQPMPLQQ------QPPPQ-PHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQER 206
P P+Q QP P QQ QPPPQ P QQQ Q Q + P+ + + L
Sbjct: 123 PPPSQPQPPPAQQPQAQQPQPPPQVPQQQQYQSPPQQPQYQQNPPPQAQSAPQVSGLYPE 182
Query: 207 E 207
E
Sbjct: 183 E 183
Score = 34.5 bits (79), Expect = 0.37
Identities = 43/193 (22%), Positives = 49/193 (25%), Gaps = 43/193 (22%)
Query: 5 STSPNPPPPQQQQPPLNVG---QLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
ST P P Q + L Q+ P + P SP P Q Q P + Q
Sbjct: 46 STKQPPAPEQVAKHELADAPLQQVNAALPPAPAPQSPQPDQ-QQQSQAPPSHQYPSQLP- 103
Query: 62 DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLA 121
+ QQ Q Y P PQ
Sbjct: 104 ----------------------------PQQVQSVPQQPTPQQEPYY----PPPSQPQPP 131
Query: 122 MGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISS 181
Q + PQ PP P Q Q P P Q Q PQ S
Sbjct: 132 PAQQPQ-----AQQPQPPPQVPQQQQYQSPPQQPQYQQNPPP-QAQSAPQVSGLYPEESP 185
Query: 182 QIKQSKLTNIPKP 194
QS N P P
Sbjct: 186 YQPQSYPPNEPLP 198
Score = 31.4 bits (71), Expect = 3.1
Identities = 19/106 (17%), Positives = 32/106 (30%), Gaps = 11/106 (10%)
Query: 89 KHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG-----KRMEGVPSGPQMPPMSL 143
K Q + + Q+ ++ ++ + + Q + P ++
Sbjct: 14 KQEIAETQKELSKLQL-SHEEAQSSEAHSFHVDSTKQPPAPEQVAKHELADAPL-QQVNA 71
Query: 144 HGPMPMPPSQPMPNQAQPM--PLQQQPPPQ--PHQQQGHISSQIKQ 185
P P P P+Q Q P Q P Q P Q Q Q
Sbjct: 72 ALPPAPAPQSPQPDQQQQSQAPPSHQYPSQLPPQQVQSVPQQPTPQ 117
>gnl|CDD|223989 COG1061, SSL2, DNA or RNA helicases of superfamily II
[Transcription / DNA replication, recombination, and
repair].
Length = 442
Score = 45.1 bits (107), Expect = 2e-04
Identities = 53/278 (19%), Positives = 100/278 (35%), Gaps = 33/278 (11%)
Query: 553 VNGKLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLII 612
+L+ YQ + L+ +V G++ G GKT+ I ++ L++
Sbjct: 33 FEFELRPYQEEALDALVKNRRTERRGVIVLPTGAGKTVVAAEAI------AELKRSTLVL 86
Query: 613 VPLSTL-SNWSLEFERWA-PSVNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDK 670
VP L W+ +++ + + Y G K L V + T + + + +
Sbjct: 87 VPTKELLDQWAEALKKFLLLNDEIGIYGGG---EKEL------EPAKVTVATVQTLARRQ 137
Query: 671 GPLAKL--HWKYMIIDEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTP---LQNKLPELW 725
L + +I DE H + IL A RL LT TP ++ +
Sbjct: 138 LLDEFLGNEFGLIIFDEVHHLPAP--SYRRILELLSAAYPRLGLTATPEREDGGRIGD-- 193
Query: 726 ALLNFLLPSI---FKSVSTFEQWFNAPFATTGEKVELNEEETILIIRRLHKVLRPFLLRR 782
L L+ I ++ + AP+ KV L E+E + R L R
Sbjct: 194 --LFDLIGPIVYEVSLKELIDEGYLAPYKYVEIKVTLTEDE-EREYAKESARFRELLRAR 250
Query: 783 LKKEVESQLPDKVEYIIKCDMSGLQKVLYRHMHTKGIL 820
E++ ++ + ++ ++ +L +H L
Sbjct: 251 GTLRAENEAR-RIAIASERKIAAVRGLLLKHARGDKTL 287
>gnl|CDD|220950 pfam11029, DAZAP2, DAZ associated protein 2 (DAZAP2). DAZ
associated protein 2 has a highly conserved sequence
throughout evolution including a conserved polyproline
region and several SH2/SH3 binding sites. It occurs as a
single copy gene with a four-exon organisation and is
located on chromosome 12. It encodes a ubiquitously
expressed protein and binds to DAZ and DAZL1 through DAZ
repeats.
Length = 136
Score = 42.1 bits (99), Expect = 2e-04
Identities = 13/51 (25%), Positives = 17/51 (33%), Gaps = 5/51 (9%)
Query: 131 GVPSGPQMPPMSLHGP-----MPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
VP QMP S P +PM + Q+ P+ P P
Sbjct: 17 VVPPQAQMPQASAPYPGPSMYLPMAQVMAVGPQSSHPPMAYYPIGAPPPVY 67
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 44.9 bits (106), Expect = 4e-04
Identities = 40/200 (20%), Positives = 55/200 (27%), Gaps = 21/200 (10%)
Query: 11 PPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLT--ALQRAIDSMKEQG 68
PP P G P + G P P+P AP P LT A+ +S +
Sbjct: 2741 PPAVPAGPATPGGPARPARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLP 2799
Query: 69 LEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG---VQ 125
DP + A + + P P L +G
Sbjct: 2800 SPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPP-GPPPPSLPLGGSVAP 2858
Query: 126 G---------KRMEGVPSGPQMPPMS-LHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQ 175
G + P+ P PP+ L P ++ A P P Q + PPQP
Sbjct: 2859 GGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES---FALP-PDQPERPPQPQAP 2914
Query: 176 QGHISSQIKQSKLTNIPKPE 195
P P
Sbjct: 2915 PPPQPQPQPPPPPQPQPPPP 2934
Score = 41.8 bits (98), Expect = 0.003
Identities = 38/191 (19%), Positives = 52/191 (27%), Gaps = 33/191 (17%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDS 63
+ P P Q PP G P P G G + P ++P A R
Sbjct: 2825 AGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPAR---- 2880
Query: 64 MKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG 123
P ++L +R+ +F Q R P PQ
Sbjct: 2881 ---------PPVRRLARPAVSRS--TESFALPPDQPER-------------PPQPQAPPP 2916
Query: 124 VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQI 183
Q + P PQ PP P P P P P + P G +
Sbjct: 2917 PQPQPQPPPPPQPQPPPP----PPPRPQPPLAP-TTDPAGAGEPSGAVPQPWLGALVPGR 2971
Query: 184 KQSKLTNIPKP 194
+P+P
Sbjct: 2972 VAVPRFRVPQP 2982
Score = 38.0 bits (88), Expect = 0.046
Identities = 24/74 (32%), Positives = 33/74 (44%), Gaps = 5/74 (6%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGS--GPPGSPGPSPGQAPGQNPQENLT--ALQR 59
S +P P P PP LP PGS GP P P AP P + A ++
Sbjct: 414 SVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQP-PAPATEPAPDDPDDATRK 472
Query: 60 AIDSMKEQGLEEDP 73
A+D+++E+ E P
Sbjct: 473 ALDALRERRPPEPP 486
Score = 36.5 bits (84), Expect = 0.13
Identities = 35/165 (21%), Positives = 51/165 (30%), Gaps = 15/165 (9%)
Query: 12 PPQQQQPPLNV---GQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQG 68
PPQ +P V G AP S P P +P N +
Sbjct: 2592 PPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPER 2651
Query: 69 LEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKR 128
+DP ++ + R + A S+ Q+ R + P +
Sbjct: 2652 PRDDPAPGRVSRPRRARRLGRAAQASSPPQR-----------PRRRAARPTVGSLTSLAD 2700
Query: 129 MEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQP-MPLQQQPPPQP 172
P P+ P +L P+PP QA P +P PP P
Sbjct: 2701 PPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVP 2745
Score = 33.8 bits (77), Expect = 0.97
Identities = 29/175 (16%), Positives = 41/175 (23%), Gaps = 6/175 (3%)
Query: 7 SPNPP-----PPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
+P PP P ++ V L P P P A
Sbjct: 2768 APAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGP 2827
Query: 62 DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLA 121
P + A ++ + A + A +P +LA
Sbjct: 2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLA 2887
Query: 122 MGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQ-QPPPQPHQQ 175
+ E P P P PP P P PPP+P
Sbjct: 2888 RPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPP 2942
Score = 31.4 bits (71), Expect = 4.2
Identities = 29/170 (17%), Positives = 43/170 (25%), Gaps = 27/170 (15%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDS 63
+P P P P A P G G AP + A
Sbjct: 2571 PRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPP--- 2627
Query: 64 MKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMG 123
P AN + T ++ R+ P +++
Sbjct: 2628 ------PPSPS------PAANEPDPHPPPTVPPPER-----------PRDDPAPGRVSRP 2664
Query: 124 VQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPH 173
+ +R G + PP P + + A P P P P PH
Sbjct: 2665 RRARR-LGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPH 2713
>gnl|CDD|99940 cd05508, Bromo_RACK7, Bromodomain, RACK7_like subfamily. RACK7 (also
called human protein kinase C-binding protein) was
identified as a potential tumor suppressor gene; it
shares domain architecture with BS69/ZMYND11; both have
been implicated in the regulation of cellular
proliferation. Bromodomains are 110 amino acid long
domains that are found in many chromatin-associated
proteins. Bromodomains can interact specifically with
acetylated lysine.
Length = 99
Score = 40.4 bits (95), Expect = 5e-04
Identities = 20/59 (33%), Positives = 31/59 (52%)
Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
+EPF+K ++ PDY + + +PMD+ + + Y S D D K + NA IYN
Sbjct: 20 AEPFLKPVDLEQFPDYAQYVFKPMDLSTLEKNVRKKAYGSTDAFLADAKWILHNAIIYN 78
>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
recombination, and repair].
Length = 908
Score = 44.0 bits (104), Expect = 7e-04
Identities = 45/241 (18%), Positives = 100/241 (41%), Gaps = 24/241 (9%)
Query: 200 LIILQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIA 259
LI L E E + +E ++E+L L +++ + QL+ E+
Sbjct: 517 LIELLELEEALKEELEEKLEKLENLLEELEELKEKLQLQ-------------QLKEELRQ 563
Query: 260 CARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQK-----HQEYIT 314
R L+ + RT+++ L+E R K K++ E E + Q E
Sbjct: 564 LEDRLQELKELLEELRLLRTRKEELEELRERLKELKKKLKELEERLSQLEELLQSLELSE 623
Query: 315 TVLQHCKDFKEYHRNNQARI---MRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAED 371
+ ++ +E + ++ L + + E++ ++ + I +E R E
Sbjct: 624 AENEL-EEAEEELESELEKLNLQAELEELLQAALEELEEKVEELEAEIRRELQRIENEEQ 682
Query: 372 EEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKE-HKMEQKKKQDEESKKRKQSVKQK 430
E + ++Q +++ L L + +E + L ++ + ++E +K + EE KK + +++
Sbjct: 683 LEEKLEELEQLEEE-LEQLREELEELLKKLGEIEQLIEELESRKAELEELKKELEKLEKA 741
Query: 431 L 431
L
Sbjct: 742 L 742
Score = 41.7 bits (98), Expect = 0.003
Identities = 42/268 (15%), Positives = 106/268 (39%), Gaps = 14/268 (5%)
Query: 196 GLDPLIILQERENRVALNIERRIEELNGSLTSTLPEHLR--------VKAEIELRALKVL 247
GL+ L E V + +IEEL G L+ L + +K +L ++
Sbjct: 165 GLEKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELKKLEEIQEE 224
Query: 248 NFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQ 307
+ +L E+ A R LE + + + E+ E L+ +++ E ++
Sbjct: 225 QEEEELEQEIEALEERLAELEEEKERLEELKARLLEI-ESLELEALKIREEELRELERLL 283
Query: 308 KHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQK---KEQERIEKERM 364
+ E L+ + E + L + + + ++ K +E++EK
Sbjct: 284 EELEEKIERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLES 343
Query: 365 RRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEH--KMEQKKKQDEESKK 422
+E+ + +++ K L L + ++ + + +K+ +++ K++ E
Sbjct: 344 ELEELAEEKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSA 403
Query: 423 RKQSVKQKLMDTDGKVTLDQDETSQLTD 450
+ ++++L + + ++ + E +L +
Sbjct: 404 ALEEIQEELEELEKELEELERELEELEE 431
Score = 32.8 bits (75), Expect = 1.8
Identities = 23/156 (14%), Positives = 56/156 (35%), Gaps = 20/156 (12%)
Query: 1070 ETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKE 1129
E L EEE + R + + + L E+ E L++ +E +++ E E
Sbjct: 477 ELYELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLE 536
Query: 1130 EEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKK 1189
+ + E L+ + + ++ +EE + ++ K +
Sbjct: 537 KLE--------------------NLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLE 576
Query: 1190 TEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
+ +E+ K+ +K +L++ L ++
Sbjct: 577 ELRLLRTRKEELEELRERLKELKKKLKELEERLSQL 612
Score = 31.3 bits (71), Expect = 4.4
Identities = 52/326 (15%), Positives = 115/326 (35%), Gaps = 33/326 (10%)
Query: 216 RRIEELNGSLTSTLPEHLRVKAEIELRAL--------KVLNFQRQLRAEVIACARRDTTL 267
RI + + + E L + + R++ L + + R E++ D
Sbjct: 110 ERIADGKKDVNEKIEELLGLDKDTFTRSVYLPQGEFDAFLKSKPKERKEIL-----DELF 164
Query: 268 ETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKR--QKHQEYITTVLQHCKDFKE 325
K + K + E+LE Q E + + +E + + K
Sbjct: 165 GLEKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELK------KL 218
Query: 326 YHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDK 385
+ L + + E ++E+ER+E+ + R L E E I +++ +
Sbjct: 219 EEIQEEQEEEELEQEIEAL-EERLAELEEEKERLEELKARLLEIESLELEALKIREEELR 277
Query: 386 RLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDET 445
L LL + +E I L + E ++E+ +++ E + + +++ L + +L++
Sbjct: 278 ELERLLEELEEKIERLEE--LEREIEELEEELEGLRALLEELEELL---EKLKSLEERLE 332
Query: 446 SQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKE 505
+ E+ K E A L + +++ + + E E ++ +E
Sbjct: 333 KLEEKLEKLESELEELAEEKNELAKLLEERLKELEER---LEELEKELEKALERLKQLEE 389
Query: 506 KTSGENENKEKNKGEDDEYNKNAMEE 531
E KE+ + E
Sbjct: 390 AIQ---ELKEELAELSAALEEIQEEL 412
Score = 30.5 bits (69), Expect = 7.9
Identities = 27/182 (14%), Positives = 75/182 (41%), Gaps = 5/182 (2%)
Query: 1056 DDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAE-----RRKEQGKKSRLIEVSELP 1110
+ E + E + + E + L + + ++ E + + + L E+ E
Sbjct: 231 EQEIEALEERLAELEEEKERLEELKARLLEIESLELEALKIREEELRELERLLEELEEKI 290
Query: 1111 DWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDE 1170
+ L + + EIE+ E + L ++ + +E L+ ++ +E E
Sbjct: 291 ERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAE 350
Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVI 1230
E+ E ++ +R + ++ E+ ++E + +R K+ E+ ++ + +L + + +
Sbjct: 351 EKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELSAALEEIQE 410
Query: 1231 KY 1232
+
Sbjct: 411 EL 412
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 43.6 bits (102), Expect = 0.001
Identities = 62/338 (18%), Positives = 127/338 (37%), Gaps = 29/338 (8%)
Query: 207 ENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAE---VIACARR 263
E+ + I R+ E+ + + E + KAE +A +V + +AE AR+
Sbjct: 1149 EDAKRVEIARKAEDARKAEEARKAEDAK-KAEAARKAEEVRKAEELRKAEDARKAEAARK 1207
Query: 264 DTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDF 323
A + + K+ A A +K E+ +K E KK ++ ++ ++
Sbjct: 1208 AEEERKAEEARKAEDAKK-----AEAVKKAEEAKKDAEEAKKAEE--------ERNNEEI 1254
Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
+++ A R A+ A E KK +E+ + + ++ AE+++ + + +
Sbjct: 1255 RKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKK--AEEKKKADEAKKKAE 1312
Query: 384 DKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQD 443
+ + A + E K+ K E+ KK E +K ++ + + K +
Sbjct: 1313 EAKKADEAKKKAEEAKKKADAAKK-KAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371
Query: 444 ETSQ-------LTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENE 496
+ + ++ K ED A LK+ + EE +
Sbjct: 1372 KKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKK 1431
Query: 497 --DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEA 532
DE +K++E + K+ + + E K EEA
Sbjct: 1432 KADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEA 1469
Score = 42.8 bits (100), Expect = 0.002
Identities = 55/296 (18%), Positives = 121/296 (40%), Gaps = 37/296 (12%)
Query: 231 EHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARAT 290
+ + AE + +A + + +A+ A + A KA ++ K LK+A
Sbjct: 1500 DEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAK--KAEEKKKADELKKAEEL 1557
Query: 291 EKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEK 350
+K E+++K E +K +E L+ ++ K+ +M+L + AE+
Sbjct: 1558 KKAEEKKKAEEAKKA----EEDKNMALRKAEEAKKAEEARIEEVMKLYEE--EKKMKAEE 1611
Query: 351 EQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKM 410
+K E+ +I+ E +++ E +K ++Q K K + ++ K
Sbjct: 1612 AKKAEEAKIKAEELKK-----AEEEKKKVEQLKKK-----------------EAEEKKKA 1649
Query: 411 EQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKVLKGEDAP 470
E+ KK +EE+K + +K + D+ + + ++ + + E+A
Sbjct: 1650 EELKKAEEENKIKAAEEAKKAEE-------DKKKAEEAKKAEEDEKKAAEALKKEAEEAK 1702
Query: 471 LAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNK 526
A LK+ + EE +E++K++ + + E+ K +++E K
Sbjct: 1703 KAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKK 1758
Score = 42.1 bits (98), Expect = 0.003
Identities = 40/157 (25%), Positives = 72/157 (45%), Gaps = 25/157 (15%)
Query: 1077 ARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEE----K 1132
A+ EE + A R+ E+ KK E + + + + +EE + A EAK+ E K
Sbjct: 1569 AKKAEE----DKNMALRKAEEAKK---AEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIK 1621
Query: 1133 ALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTED 1192
A + + ++K+V+ +E KA EE ++ EE + + KK E+
Sbjct: 1622 AEELKKAEEEKKKVEQLKKKEAEEKKKA---------EELKKAEEENKIKAAEEAKKAEE 1672
Query: 1193 DDEEPSTSKK-----RKKEKEKDREKDQAKLKKTLKK 1224
D ++ +KK +K + +E ++AK + LKK
Sbjct: 1673 DKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKK 1709
Score = 40.9 bits (95), Expect = 0.007
Identities = 59/302 (19%), Positives = 123/302 (40%), Gaps = 30/302 (9%)
Query: 254 RAEVIACARRDTTLETAVNVKAYKRTKRQGL-KEARATEKLEKQQKVEAERKKRQKHQEY 312
+AE A E + K+ K + K A K E+ +K E E+KK ++ ++
Sbjct: 1582 KAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKK 1641
Query: 313 ITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE 372
+ ++ K+ N+ + K AE+ +K E++ + + AE+
Sbjct: 1642 EAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEA 1701
Query: 373 EGYRKL-IDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKK-----QDEESKKRKQS 426
+ +L + ++K+ A L + +E + K+ E KKK +DEE KK+
Sbjct: 1702 KKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAH 1761
Query: 427 VKQKLMDTDGKVT----------LDQDETSQLTDMHISVREISSGKVL---KGEDAPLAA 473
+K++ ++ LD+++ + ++ +++I G++ L
Sbjct: 1762 LKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFDNFANIIEGGKEGNLVI 1821
Query: 474 HLKQWIQDHPGWEVVADSDEENEDED----------SEKSKEKTSGENENKEKNKGEDDE 523
+ + ++D EV + + E+ D +E ++ + NKEK+ EDDE
Sbjct: 1822 NDSKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNENGEDGNKEADFNKEKDLKEDDE 1881
Query: 524 YN 525
Sbjct: 1882 EE 1883
Score = 34.7 bits (79), Expect = 0.45
Identities = 55/283 (19%), Positives = 117/283 (41%), Gaps = 19/283 (6%)
Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
E+++ A+ E + EE + + + E+ KK+ E + L K +
Sbjct: 1572 AEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAE--EAKIKAEELKKAE 1629
Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEE 1177
EE ++ K+E + + +K + +E KA +D + ++ ++ EE+E
Sbjct: 1630 EEKKKVEQLKKKEAEEKK--KAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDE 1687
Query: 1178 VRS----KRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
++ K++ + KK E+ ++ + KK+ +E +K E+++ K ++ K+ K
Sbjct: 1688 KKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEED--KKK 1745
Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMD---IKKILGRIEDGKYSSVDELQKDFKT 1290
+ + E K+ K+ + R I++ L ++ + VD+ KD
Sbjct: 1746 AEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFD 1805
Query: 1291 LCRNAQIYNEELSLI------HEDSVVLESVFTKARQRVESGE 1327
N +E +L+ EDS + E +K Q E+
Sbjct: 1806 NFANIIEGGKEGNLVINDSKEMEDSAIKEVADSKNMQLEEADA 1848
Score = 34.3 bits (78), Expect = 0.60
Identities = 46/272 (16%), Positives = 109/272 (40%), Gaps = 34/272 (12%)
Query: 984 QDLQAQDRAHRIGQKNEVRVLRLMTVNSVEE-RILAAARYKLNMDEKVIQAGMFDQKSTG 1042
++L+ + + + + + M + EE + AR + M + M +++
Sbjct: 1555 EELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKK 1614
Query: 1043 SERHQFLQTILHQDDEEDE--EENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKK 1100
+E + L + +EE + E+ + E + + E + + E +K + K
Sbjct: 1615 AEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDK 1674
Query: 1101 SRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKA 1160
+ E K +E+ E++ A + + + + K+ + +E KA
Sbjct: 1675 KKAEE-------AKKAEED---------EKKAAEALKKEAEEAKKAEELKKKEAEEKKKA 1718
Query: 1161 IDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEK------DREKD 1214
EE ++ EE + + +K+ E+D ++ +KK ++EK+K + EK
Sbjct: 1719 ---------EELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKK 1769
Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIK 1246
+++K + ++ + D R+ + IK
Sbjct: 1770 AEEIRKEKEAVIEEELDEEDEKRRMEVDKKIK 1801
Score = 33.6 bits (76), Expect = 1.0
Identities = 33/142 (23%), Positives = 69/142 (48%), Gaps = 7/142 (4%)
Query: 1091 AERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTD 1150
AE K+ + + E ++ D K+ EE ++ A EAK+ +A +++ ++ D
Sbjct: 1466 AEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKAD 1525
Query: 1151 SLTEKEWLKAIDDGVEYDDEEEEEE----EEVR---SKRKGKRRKKTEDDDEEPSTSKKR 1203
+ E K D+ + +++++ +E EE++ K+K + KK E+D +
Sbjct: 1526 EAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEE 1585
Query: 1204 KKEKEKDREKDQAKLKKTLKKI 1225
K+ E+ R ++ KL + KK+
Sbjct: 1586 AKKAEEARIEEVMKLYEEEKKM 1607
Score = 33.6 bits (76), Expect = 1.1
Identities = 28/160 (17%), Positives = 64/160 (40%), Gaps = 13/160 (8%)
Query: 1078 RSEEEFQTYQ---RIDAERRKEQGKKS----------RLIEVSELPDWLIKEDEEIEQWA 1124
R EE + + + +A ++ E+ KK R E + Q A
Sbjct: 1212 RKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAA 1271
Query: 1125 FEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKG 1184
+A+E KA + + ++K + + +K+ +A E +E +++ +K+K
Sbjct: 1272 IKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKA 1331
Query: 1185 KRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
KK ++ ++ + + K + E D + + + +K
Sbjct: 1332 DAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371
Score = 32.4 bits (73), Expect = 1.9
Identities = 34/163 (20%), Positives = 68/163 (41%), Gaps = 16/163 (9%)
Query: 1077 ARSEEEFQTYQ---RIDAERRKEQGKKS----RLIEVSELPDWLIKEDEEIEQWAFEAKE 1129
AR E + + + + R+ E KK+ + E + + K +EE E
Sbjct: 1199 ARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFE 1258
Query: 1130 EEKALHMGRGSRQRK--QVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRR 1187
E + H R K + D L + E K D+ + EE+++ +E + K + ++
Sbjct: 1259 EARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKK--AEEKKKADEAKKKAEEAKK 1316
Query: 1188 -----KKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
KK E+ ++ +KK+ +E +K E +A+ + +
Sbjct: 1317 ADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEA 1359
Score = 31.6 bits (71), Expect = 3.8
Identities = 37/174 (21%), Positives = 76/174 (43%), Gaps = 7/174 (4%)
Query: 1056 DDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIK 1115
++ + E A + E +EE+ + ++ E +K+ + E + D K
Sbjct: 1339 EEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKK 1398
Query: 1116 EDEEIEQWAFEAKEEEKALHMGRGSRQR-KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEE 1174
+ EE ++ A E K+ A ++++ ++ D +K D + EE ++
Sbjct: 1399 KAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKK 1458
Query: 1175 EEEVRSKRKGKRRKKTEDDDEEPSTSKK----RKKEKEKDREKDQAKLKKTLKK 1224
EE K+K + KK ++ ++ +KK +KK +E ++ D+AK KK
Sbjct: 1459 AEEA--KKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKK 1510
Score = 31.3 bits (70), Expect = 4.5
Identities = 55/287 (19%), Positives = 102/287 (35%), Gaps = 17/287 (5%)
Query: 261 ARRDTTLETAVNVKAYKRTKR-QGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQH 319
A++ T + KA + K+ + ++A K E +K E RK + I +
Sbjct: 1103 AKKTETGKAEEARKAEEAKKKAEDARKAEEARKAEDARKAEEARKAEDAKRVEIARKAED 1162
Query: 320 CKDFKEYHRNNQA-RIMRLNKAVMNYHAN----------AEKEQKKEQERIEKERMRRLM 368
+ +E + A + KA A AE +K E+ER +E +
Sbjct: 1163 ARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDARKAEAARKAEEERKAEEARKAED 1222
Query: 369 AEDEEGYRKLIDQKKDKRLAFLLSQ--TDEYISNLTQMVKEHKMEQKKKQDEESKKRKQS 426
A+ E +K + KKD A + +E I + H ++ E ++
Sbjct: 1223 AKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADE 1282
Query: 427 VKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKVLKG---EDAPLAAHLKQWIQDHP 483
+K+ +E + + E K E A K+ ++
Sbjct: 1283 LKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAK 1342
Query: 484 GWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAME 530
A ++ E +++E ++EK + KE+ K + D K A E
Sbjct: 1343 KAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEE 1389
Score = 30.9 bits (69), Expect = 5.8
Identities = 52/303 (17%), Positives = 111/303 (36%), Gaps = 17/303 (5%)
Query: 216 RRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNVKA 275
++ +E + + + KAE +A + + + + A ++ + A
Sbjct: 1290 KKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAK 1349
Query: 276 YKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIM 335
+ + A EK E +K + E KK+ + + + K+ ++ +
Sbjct: 1350 -AEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKAD 1408
Query: 336 RLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTD 395
L KA E ++K E+++ E ++ AE+ + + + ++ + A +
Sbjct: 1409 ELKKAAAAKKKADEAKKKAEEKKKADEAKKK--AEEAKKADEAKKKAEEAKKAEEAKKKA 1466
Query: 396 EYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISV 455
E + K K E+ KK DE KK +++ K+ DE + +
Sbjct: 1467 EEAKKADEAKK--KAEEAKKADEAKKKAEEAKKKA------------DEAKKAAEAKKKA 1512
Query: 456 REISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKE 515
E + K D A + + E +DE + E+ +K++EK E K
Sbjct: 1513 DEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKA 1572
Query: 516 KNK 518
+
Sbjct: 1573 EED 1575
Score = 30.5 bits (68), Expect = 9.3
Identities = 39/162 (24%), Positives = 77/162 (47%), Gaps = 13/162 (8%)
Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAK---EEEK 1132
L ++EE+ + + AE +K+ + + E ++ D K+ EE ++ A AK EE K
Sbjct: 1283 LKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAK 1342
Query: 1133 ALHMGRGSRQRKQVDYTDSLTEKEWL--KAIDDGVEYDDEEEEEEEEVR----SKRKGKR 1186
+ D ++ EK K ++ + D +++ EE + +K+K +
Sbjct: 1343 KAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEE 1402
Query: 1187 RKKTEDDDEEPSTSKKR----KKEKEKDREKDQAKLKKTLKK 1224
KK D+ ++ + +KK+ KK+ E+ ++ D+AK K K
Sbjct: 1403 DKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK 1444
>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger. [Transport and
binding proteins, Cations and iron carrying compounds].
Length = 1096
Score = 43.4 bits (102), Expect = 0.001
Identities = 43/197 (21%), Positives = 78/197 (39%), Gaps = 26/197 (13%)
Query: 1020 ARYKLNMDEKVIQAGMFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARS 1079
A +K + + ++ + + G+E ++T ++ EDE E V R
Sbjct: 704 ADHKGETEAEEVEHEGETE-AEGTEDEGEIETGEEGEEVEDEGEGEAEGKHEVETEGDRK 762
Query: 1080 EEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRG 1139
E E + + + +++G E+ D +K DE E E E
Sbjct: 763 ETEHEGETEAEGKEDEDEG------EIQAGEDGEMKGDEGAEGKVEHEGETEAGEKDEHE 816
Query: 1140 SRQRKQVDYTDSLTE--------------KEWLKAID-----DGVEYDDEEEEEEEEVRS 1180
+ Q D T+ E K+ K +D DG + ++EEEEEEEE
Sbjct: 817 GQSETQADDTEVKDETGEQELNAENQGEAKQDEKGVDGGGGSDGGDSEEEEEEEEEEEEE 876
Query: 1181 KRKGKRRKKTEDDDEEP 1197
+ + + ++ E+++EEP
Sbjct: 877 EEEEEEEEEEEEENEEP 893
Score = 31.5 bits (71), Expect = 3.7
Identities = 47/184 (25%), Positives = 76/184 (41%), Gaps = 15/184 (8%)
Query: 1013 EERILAAARYKLNMDEKVIQA-GMFDQKSTGSERHQFLQTILHQDDEEDEEENAVPDDET 1071
E I + DE +A G + ++ G + + + +EDE+E + E
Sbjct: 729 EGEIETGEEGEEVEDEGEGEAEGKHEVETEGDRKETEHEGETEAEGKEDEDEGEIQAGED 788
Query: 1072 VNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSE-LPDWLIKEDEEIEQWAF----- 1125
+M E +++ E E G+K SE D +DE EQ
Sbjct: 789 G-EMKGDEGAE----GKVEHEGETEAGEKDEHEGQSETQADDTEVKDETGEQELNAENQG 843
Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
EAK++EK + G GS D + E+E + ++ E ++EEEEE EE S +
Sbjct: 844 EAKQDEKGVDGGGGS---DGGDSEEEEEEEEEEEEEEEEEEEEEEEEEENEEPLSLEWPE 900
Query: 1186 RRKK 1189
R+K
Sbjct: 901 TRQK 904
>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
role in the biomineralisation of teeth. They seem to
regulate formation of crystallites during the secretory
stage of tooth enamel development and are thought to
play a major role in the structural organisation and
mineralisation of developing enamel. The extracellular
matrix of the developing enamel comprises two major
classes of protein: the hydrophobic amelogenins and the
acidic enamelins. Circular dichroism studies of porcine
amelogenin have shown that the protein consists of 3
discrete folding units: the N-terminal region appears to
contain beta-strand structures, while the C-terminal
region displays characteristics of a random coil
conformation. Subsequent studies on the bovine protein
have indicated the amelogenin structure to contain a
repetitive beta-turn segment and a "beta-spiral" between
Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
rich region. The beta-spiral offers a probable site for
interactions with Ca2+ ions. Mutations in the human
amelogenin gene (AMGX) cause X-linked hypoplastic
amelogenesis imperfecta, a disease characterised by
defective enamel. A 9bp deletion in exon 2 of AMGX
results in the loss of codons for Ile5, Leu6, Phe7 and
Ala8, and replacement by a new threonine codon,
disrupting the 16-residue (Met1-Ala16) amelogenin signal
peptide.
Length = 165
Score = 40.9 bits (96), Expect = 0.001
Identities = 26/92 (28%), Positives = 32/92 (34%), Gaps = 7/92 (7%)
Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP 169
L QP+ PQ + M VP M P H P P+Q P Q Q P
Sbjct: 61 LPAQQPVVPQQPL------MP-VPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQP 113
Query: 170 PQPHQQQGHISSQIKQSKLTNIPKPEGLDPLI 201
QP I Q L + + L PL+
Sbjct: 114 MQPQPPVHPIPPLPPQPPLPPMFPMQPLPPLL 145
Score = 39.8 bits (93), Expect = 0.003
Identities = 25/93 (26%), Positives = 32/93 (34%), Gaps = 3/93 (3%)
Query: 113 NQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-PQ 171
L P + V + VP P MP H P QP Q P Q QP P
Sbjct: 49 THTLQPHHHIPVLPAQQPVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPP 108
Query: 172 PHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQ 204
QQ + Q + +P L P+ +Q
Sbjct: 109 QPQQP--MQPQPPVHPIPPLPPQPPLPPMFPMQ 139
Score = 32.8 bits (75), Expect = 0.55
Identities = 21/82 (25%), Positives = 27/82 (32%), Gaps = 8/82 (9%)
Query: 90 HAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPS--GPQMPPMSLHGPM 147
H+ T Q Q A QP PQ Q ++ P P
Sbjct: 80 HSMTPTQHHQPNLPQPA------QQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLP 133
Query: 148 PMPPSQPMPNQAQPMPLQQQPP 169
PM P QP+P +PL+ P
Sbjct: 134 PMFPMQPLPPLLPDLPLEAWPA 155
Score = 30.9 bits (70), Expect = 2.8
Identities = 12/44 (27%), Positives = 12/44 (27%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
PN P P QQ Q P P P P P
Sbjct: 90 PNLPQPAQQPFQPQPLQPPQPQQPMQPQPPVHPIPPLPPQPPLP 133
>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit; Provisional.
Length = 482
Score = 41.8 bits (99), Expect = 0.002
Identities = 16/60 (26%), Positives = 35/60 (58%)
Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
+ E++ EEE + K+K K ++++EE KK ++++E++ E ++ K ++ KK
Sbjct: 414 KIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKK 473
Score = 32.6 bits (75), Expect = 1.6
Identities = 19/91 (20%), Positives = 41/91 (45%)
Query: 1134 LHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDD 1193
LH + +R+ + + + + A + EEE E SK+ K+ KK +
Sbjct: 359 LHTSKRKVRREVLPFLSIIFKHNPELAARLAAFLELTEEEIEFLTGSKKATKKIKKIVEK 418
Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
E+ +K++K+K+ K + + ++ K+
Sbjct: 419 AEKKREEEKKEKKKKAFAGKKKEEEEEEEKE 449
Score = 31.0 bits (71), Expect = 4.8
Identities = 22/87 (25%), Positives = 44/87 (50%), Gaps = 6/87 (6%)
Query: 1116 EDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEE 1175
+EEIE + +K+ K + ++K+ + +KE K G + ++EEEEE+
Sbjct: 395 TEEEIE-FLTGSKKATKKIKKIVEKAEKKREE-----EKKEKKKKAFAGKKKEEEEEEEK 448
Query: 1176 EEVRSKRKGKRRKKTEDDDEEPSTSKK 1202
E+ +++ + + E+ +EE KK
Sbjct: 449 EKKEEEKEEEEEEAEEEKEEEEEKKKK 475
Score = 30.3 bits (69), Expect = 8.4
Identities = 19/87 (21%), Positives = 41/87 (47%), Gaps = 1/87 (1%)
Query: 1124 AFEAKEEEKALHMGRGSR-QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR 1182
E EEE G ++ + + ++E K + +++EEEEE ++
Sbjct: 391 FLELTEEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEK 450
Query: 1183 KGKRRKKTEDDDEEPSTSKKRKKEKEK 1209
K + +++ E++ EE ++ KK+K+
Sbjct: 451 KEEEKEEEEEEAEEEKEEEEEKKKKQA 477
>gnl|CDD|99939 cd05507, Bromo_brd8_like, Bromodomain, brd8_like subgroup. In
mammals, brd8 (bromodomain containing 8) interacts with
the thyroid hormone receptor in a ligand-dependent
fashion and enhances thyroid hormone-dependent activation
from thyroid response elements. Brd8 is thought to be a
nuclear receptor coactivator. Bromodomains are 110 amino
acid long domains that are found in many chromatin
associated proteins. Bromodomains can interact
specifically with acetylated lysine.
Length = 104
Score = 38.9 bits (91), Expect = 0.002
Identities = 21/59 (35%), Positives = 32/59 (54%)
Query: 1241 SEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYN 1299
+ F+K + P Y+ V+ RPMD+ I IE+G S E Q+D + +NA +YN
Sbjct: 21 ASVFLKPVTEDIAPGYHSVVYRPMDLSTIKKNIENGTIRSTAEFQRDVLLMFQNAIMYN 79
>gnl|CDD|130689 TIGR01628, PABP-1234, polyadenylate binding protein, human types 1,
2, 3, 4 family. These eukaryotic proteins recognize the
poly-A of mRNA and consist of four tandem RNA
recognition domains at the N-terminus (rrm: pfam00076)
followed by a PABP-specific domain (pfam00658) at the
C-terminus. The protein is involved in the transport of
mRNAs from the nucleus to the cytoplasm. There are four
paralogs in Homo sapiens which are expressed in testis
(GP:11610605_PABP3 ), platelets (SP:Q13310_PABP4 ),
broadly expressed (SP:P11940_PABP1) and of unknown
tissue range (SP:Q15097_PABP2).
Length = 562
Score = 42.1 bits (99), Expect = 0.002
Identities = 35/150 (23%), Positives = 48/150 (32%), Gaps = 23/150 (15%)
Query: 59 RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
RA+ M + L P Y L A R E + A Q QL+ ++ + +
Sbjct: 341 RAVTEMHGRMLGGKPLYVAL----AQRKEQRRAHLQDQFMQLQPRMRQLPMGSPMGGAMG 396
Query: 119 QLAMGVQGKRME------GVPSGPQMP----PMSLHGPM---PMPPSQPMPNQAQPMPLQ 165
Q QG + + G P MP P P PM + AQ
Sbjct: 397 QPPYYGQGPQQQFNGQPLGWPRMSMMPTPMGPGGPLRPNGLAPMNAVRAPSRNAQNAA-- 454
Query: 166 QQPPPQPH----QQQGHISSQIKQSKLTNI 191
Q+PP QP Q SQ +
Sbjct: 455 QKPPMQPVMYPPNYQSLPLSQDLPQPQSTA 484
Score = 32.5 bits (74), Expect = 1.8
Identities = 16/93 (17%), Positives = 22/93 (23%), Gaps = 3/93 (3%)
Query: 95 AQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQP 154
Q L + M+ P P G+ PS P+ P
Sbjct: 409 FNGQPLGWPRMSMMP-TPMGPGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPV-MYPP 466
Query: 155 MPNQAQPMPLQQQPPPQPHQQ-QGHISSQIKQS 186
QP Q Q +Q+ S
Sbjct: 467 NYQSLPLSQDLPQPQSTASQGGQNKKLAQVLAS 499
Score = 30.9 bits (70), Expect = 5.7
Identities = 17/79 (21%), Positives = 21/79 (26%), Gaps = 12/79 (15%)
Query: 1 MSNSSTSPN--PPPPQQQQPPLN----VGQLPMGA----PG--SGPPGSPGPSPGQAPGQ 48
PN P + P N + PM P S P P P Q
Sbjct: 427 GPGGPLRPNGLAPMNAVRAPSRNAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQ 486
Query: 49 NPQENLTALQRAIDSMKEQ 67
Q A A + + Q
Sbjct: 487 GGQNKKLAQVLASATPQMQ 505
Score = 30.5 bits (69), Expect = 7.6
Identities = 25/124 (20%), Positives = 40/124 (32%), Gaps = 9/124 (7%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
S P P P G PM A + + A + P + +
Sbjct: 420 SMMPTPMGPGGPLRPNGLAPMNAVRAPSRNAQ-----NAAQKPPMQPVMYPPNYQSLPLS 474
Query: 67 QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
Q L P+ Q ++ SA Q + Q++ RL + + P LA + G
Sbjct: 475 QDL---PQPQSTASQGGQNKKLAQVLASATPQMQK-QVLGERLFPLVEAIEPALAAKITG 530
Query: 127 KRME 130
+E
Sbjct: 531 MLLE 534
>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family. Emg1 and Nop14 are novel
proteins whose interaction is required for the maturation
of the 18S rRNA and for 40S ribosome production.
Length = 809
Score = 41.5 bits (98), Expect = 0.003
Identities = 34/155 (21%), Positives = 62/155 (40%), Gaps = 39/155 (25%)
Query: 1129 EEEKAL-HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE-EEEEVRSKRKGKR 1186
E+E L H+G+ SL+E + + D ++DD++ + R+ G
Sbjct: 111 EDEFVLTHLGQ------------SLSEIDKDDDVRDDDDFDDDDLGDLASDDRAAHFGGG 158
Query: 1187 RKKTEDDDEEP--------------STSKKRKKEKEKDREKDQA---KLKKTLKKIMRVV 1229
ED++E+P + SK K E++K +E+D+ +L K +M
Sbjct: 159 EDDEEDEEEQPERKKSKKEVMKEVIAKSKFYKAERQKAKEEDEDLREELDDDFKDLM--- 215
Query: 1230 IKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPM 1264
S R + P + +E D Y+ R +
Sbjct: 216 -----SLLRTVKPPPKPPMTPEEKDDEYDQRVREL 245
Score = 30.7 bits (70), Expect = 7.2
Identities = 32/193 (16%), Positives = 77/193 (39%), Gaps = 25/193 (12%)
Query: 350 KEQKKEQERIEK--ERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKE 407
K ++++++ E+ + + LM+ + + DEY + ++ +
Sbjct: 195 KAKEEDEDLREELDDDFKDLMSLLRTVKPPPKPPMTPE------EKDDEYDQRVRELTFD 248
Query: 408 HKM---------EQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREI 458
+ E+ K EE+++ K+ ++L G+ D++E D S ++
Sbjct: 249 RRAQPTDRTKTEEELAK--EEAERLKKLEAERLRRMRGEEEDDEEEE----DSKESADDL 302
Query: 459 SSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNK 518
+D + + + V D DEE++D+D E+ +E +E +++
Sbjct: 303 DDEFEPDDDDNFGLG--QGEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEED 360
Query: 519 GEDDEYNKNAMEE 531
+ D+ + EE
Sbjct: 361 EDSDDEDDEEEEE 373
>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
chromosome partitioning].
Length = 1163
Score = 41.2 bits (97), Expect = 0.004
Identities = 39/250 (15%), Positives = 92/250 (36%), Gaps = 25/250 (10%)
Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
L+E E ++ +E +EEL L E +K+E+E ++ Q +L
Sbjct: 241 LEELEEELS-RLEEELEELQEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEE 299
Query: 263 RDTTLETAVNVKAYKRTKRQGLKEARA---TEKLEKQQKVEAERKKRQKHQEYITTVLQH 319
+ + + + L+E + ++++E ++ ++ + + +
Sbjct: 300 LEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEA 359
Query: 320 CKDFKEYHRNN-----------QARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLM 368
++ +E + + L + E E+ K + +ER+ RL
Sbjct: 360 KEELEEKLSALLEELEELFEALREELAELEAELAEI--RNELEELKREIESLEERLERLS 417
Query: 369 AEDEEGYRKLID--------QKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEES 420
E+ +L + Q + + L L + +E + L +KE + E + Q+E
Sbjct: 418 ERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLKELERELAELQEELQ 477
Query: 421 KKRKQSVKQK 430
+ K+ +
Sbjct: 478 RLEKELSSLE 487
Score = 38.2 bits (89), Expect = 0.039
Identities = 46/221 (20%), Positives = 104/221 (47%), Gaps = 13/221 (5%)
Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDE 1118
E+E E + E + + L EEE ++ + A+ K + E+ E L +E E
Sbjct: 743 EEELEELEEELEELQERLEELEEELESLEEALAKL------KEEIEELEEKRQALQEELE 796
Query: 1119 EIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTE-----KEWLKAIDDGVEYDDEEEE 1173
E+E+ EA+ AL S ++++ + E +E + +D+ E +E E+
Sbjct: 797 ELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEK 856
Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYT 1233
E EE++ + + +K E +DE ++ K+E E++ + +++L + ++I ++ +
Sbjct: 857 ELEELKEELEELEAEKEELEDEL-KELEEEKEELEEELRELESELAELKEEIEKLRERLE 915
Query: 1234 DSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIE 1274
+ + ++ +EL + YE ++++ + R+E
Sbjct: 916 ELEAKLERLEVELPELEEELEEEYE-DTLETELEREIERLE 955
Score = 33.5 bits (77), Expect = 0.95
Identities = 39/211 (18%), Positives = 82/211 (38%), Gaps = 19/211 (9%)
Query: 214 IERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLETAVNV 273
ERR++ L L S R++ EIE ++ + +L + LE
Sbjct: 805 AERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEE 864
Query: 274 KAYKRTKRQGLKEARAT-----EKLEKQ-QKVEAERKK----RQKHQEYITTVLQHCKDF 323
+++ L++ E+LE++ +++E+E + +K +E + + +
Sbjct: 865 LEELEAEKEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERL 924
Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE-----EGYRKL 378
+ + + + + E+E ++ +E IE L A +E E Y +L
Sbjct: 925 EVELPELEEELEEEYEDTL--ETELEREIERLEEEIEALGPVNLRAIEEYEEVEERYEEL 982
Query: 379 IDQKKD--KRLAFLLSQTDEYISNLTQMVKE 407
Q++D + LL +E + KE
Sbjct: 983 KSQREDLEEAKEKLLEVIEELDKEKRERFKE 1013
Score = 32.4 bits (74), Expect = 2.1
Identities = 49/293 (16%), Positives = 111/293 (37%), Gaps = 36/293 (12%)
Query: 1044 ERHQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRL 1103
+ + + E E +E ++++ EE + ++ ++ + KS L
Sbjct: 223 RELELALLLAKLKELRKELEEL---EEELSRLEEELEEL---QEELEEAEKEIEELKSEL 276
Query: 1104 IEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGR----GSRQRKQVDYTDSLTEK-EWL 1158
E+ E + L +E E+++ E E E +L R + + + + L EK E L
Sbjct: 277 EELREELEELQEELLELKE-EIEELEGEISLLRERLEELENELEELEERLEELKEKIEAL 335
Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKL 1218
K + E EE E+ + K + +K EE + +E+ + E + A++
Sbjct: 336 KEELEERETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELAEI 395
Query: 1219 KKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKY 1278
+ L+++ ++E+ E ++R + + L
Sbjct: 396 RNELEEL------------------------KREIESLEERLERLSERLEDLKEELKELE 431
Query: 1279 SSVDELQKDFKTLCRNAQIYNEELSLIHEDSVVLESVFTKARQRVESGEDPDE 1331
+ ++ELQ + + L + E+L + + LE + ++ ++ E
Sbjct: 432 AELEELQTELEELNEELEELEEQLEELRDRLKELERELAELQEELQRLEKELS 484
Score = 31.2 bits (71), Expect = 4.9
Identities = 35/188 (18%), Positives = 80/188 (42%), Gaps = 20/188 (10%)
Query: 350 KEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVK--- 406
+E +K+ E++E++ AE E Y++L + ++ LA LL++ E L ++ +
Sbjct: 196 EELEKQLEKLERQ------AEKAERYQELKAELRELELALLLAKLKELRKELEELEEELS 249
Query: 407 --EHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDM---------HISV 455
E ++E+ +++ EE++K + +K +L + ++ Q+E +L + +
Sbjct: 250 RLEEELEELQEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRE 309
Query: 456 REISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKE 515
R L+ + L ++ E EE E +E + K E +
Sbjct: 310 RLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLSA 369
Query: 516 KNKGEDDE 523
+ ++
Sbjct: 370 LLEELEEL 377
Score = 30.8 bits (70), Expect = 6.2
Identities = 32/152 (21%), Positives = 61/152 (40%), Gaps = 19/152 (12%)
Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALH 1135
L R E+ + YQ + AE R+ + L ++ EL +E+E E +EE L
Sbjct: 205 LERQAEKAERYQELKAELRELELAL-LLAKLKEL-------RKELE----ELEEELSRLE 252
Query: 1136 MGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRK--KTEDD 1193
Q + EKE + + E +E EE +EE+ ++ +
Sbjct: 253 EELEELQEEL-----EEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLL 307
Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
E + +E E+ E+ + K++ +++
Sbjct: 308 RERLEELENELEELEERLEELKEKIEALKEEL 339
Score = 30.5 bits (69), Expect = 8.9
Identities = 30/152 (19%), Positives = 61/152 (40%), Gaps = 10/152 (6%)
Query: 1076 LARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALH 1135
L EEE + + +E KS E+ L D EE+ + E + + + L
Sbjct: 669 LKELEEELAELEAQLEKLEEEL--KSLKNELRSLED----LLEELRRQLEELERQLEELK 722
Query: 1136 MGRGSRQRKQVDYTDSLTEKEWLKAIDDG--VEYDDEEEEEEEEVRSKRKGKRRKKTEDD 1193
+ + + L E E + E + EE EEE+ S + K +++
Sbjct: 723 RELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLE--EALAKLKEE 780
Query: 1194 DEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
EE ++ +E+ ++ E++ + ++ L +
Sbjct: 781 IEELEEKRQALQEELEELEEELEEAERRLDAL 812
>gnl|CDD|206063 pfam13892, DBINO, DNA-binding domain. DBINO is a DNA-binding
domain found on global transcription activator SNF2L1
proteins and chromatin re-modelling proteins.
Length = 140
Score = 38.4 bits (90), Expect = 0.005
Identities = 21/70 (30%), Positives = 42/70 (60%), Gaps = 4/70 (5%)
Query: 328 RNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRL 387
++ Q R RL + ++ + EKE+++ ++R EKE + + E+E R+ Q+ ++L
Sbjct: 61 KDTQLRAKRLMREMLLFWKKNEKEERELRKRAEKEALEQAKKEEEL--REAKRQQ--RKL 116
Query: 388 AFLLSQTDEY 397
FL++QT+ Y
Sbjct: 117 NFLITQTELY 126
>gnl|CDD|148844 pfam07469, DUF1518, Domain of unknown function (DUF1518). This
domain, which is usually found tandemly repeated, is
found in various receptor co-activating proteins.
Length = 56
Score = 36.4 bits (84), Expect = 0.005
Identities = 16/42 (38%), Positives = 17/42 (40%), Gaps = 3/42 (7%)
Query: 12 PPQQQQPPLNVGQLPMGAPGSGPPGSPGP---SPGQAPGQNP 50
PPQQ P N G P P SP SP P Q+P
Sbjct: 15 PPQQFPYPPNYGMGQQPDPAFTSPFSPQSPMMSPRMGPSQSP 56
>gnl|CDD|221040 pfam11235, Med25_SD1, Mediator complex subunit 25 synapsin 1. The
overall function of the full-length Med25 is efficiently
to coordinate the transcriptional activation of RAR/RXR
(retinoic acid receptor/retinoic X receptor) in higher
eukaryotic cells. Human Med25 consists of several
domains with different binding properties, the
N-terminal, VWA, domain, this SD1 - synapsin 1 - domain
from residues 229-381, a PTOV(B) or ACID domain from
395-545, an SD2 domain from residues 564-645 and a
C-terminal NR box-containing domain (646-650) from
646-747. The function of the SD domains is unclear.
Length = 168
Score = 38.7 bits (89), Expect = 0.006
Identities = 40/193 (20%), Positives = 59/193 (30%), Gaps = 51/193 (26%)
Query: 5 STSPNPPPPQQQ--QPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLT------A 56
+ P P +Q PP + P +P P P P +N++ A
Sbjct: 6 GSVPGPLQSKQPVSLPPAA----VLPPQSLPAPQNPLP-PVTPPQMQVPQNVSLHAAHDA 60
Query: 57 LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPL 116
Q+A+++ K Q R+ + ++ A + F +Q
Sbjct: 61 AQKAVEAAKNQKQGLKNRFSPITPLQ-----------QAPIVGPPF----------SQAP 99
Query: 117 TPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP------- 169
P L G PS P + P P+ Q Q P Q Q P
Sbjct: 100 APVLP---PGPPGAPKPS-PASQLSLVTTVSPGSGLAPVLTQQQVPPQQPQQPSMVPTPA 155
Query: 170 ------PQPHQQQ 176
PQP QQQ
Sbjct: 156 LGGVQPPQPSQQQ 168
Score = 35.2 bits (80), Expect = 0.10
Identities = 15/43 (34%), Positives = 22/43 (51%), Gaps = 1/43 (2%)
Query: 135 GPQMP-PMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
G +P P+ P+ +PP+ +P Q+ P P PP P Q Q
Sbjct: 5 GGSVPGPLQSKQPVSLPPAAVLPPQSLPAPQNPLPPVTPPQMQ 47
Score = 30.2 bits (67), Expect = 4.4
Identities = 13/45 (28%), Positives = 15/45 (33%), Gaps = 1/45 (2%)
Query: 1 MSNSSTSPNPPPPQ-QQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ 44
++ S P QQQ P Q P P G P P Q
Sbjct: 122 VTTVSPGSGLAPVLTQQQVPPQQPQQPSMVPTPALGGVQPPQPSQ 166
>gnl|CDD|218292 pfam04851, ResIII, Type III restriction enzyme, res subunit.
Length = 100
Score = 36.8 bits (86), Expect = 0.009
Identities = 34/164 (20%), Positives = 48/164 (29%), Gaps = 71/164 (43%)
Query: 556 KLKEYQIKGLE-WMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIVP 614
+L+ YQ + +E + G++ G GKT+ ALI L + KK L +VP
Sbjct: 3 ELRPYQEEAIERLL-----EKKRGLIVMATGSGKTLTAAALIARLAKGKK---KVLFVVP 54
Query: 615 LSTLSNWSLEFERWAPSVNVVAYKGSPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLA 674
RK L Q
Sbjct: 55 -----------------------------RKDLLEQ------------------------ 61
Query: 675 KLHWKYMIIDEGHRM--KNHHCKLTHILNTFYVAPHRLLLTGTP 716
+IIDE H K + K + + L LT TP
Sbjct: 62 ---ALVIIIDEAHHSSAKTKYRK----ILEKFKPAFLLGLTATP 98
>gnl|CDD|165468 PHA03201, PHA03201, uracil DNA glycosylase; Provisional.
Length = 318
Score = 39.5 bits (92), Expect = 0.010
Identities = 15/42 (35%), Positives = 20/42 (47%), Gaps = 1/42 (2%)
Query: 7 SPNPPPPQQQQPPLNVG-QLPMGAPGSGPPGSPGPSPGQAPG 47
S +P PP++ PP + P +P PP PGP PG
Sbjct: 6 SRSPSPPRRPSPPRPTPPRSPDASPEETPPSPPGPGAEPPPG 47
>gnl|CDD|222579 pfam14179, YppG, YppG-like protein. The YppG-like protein family
includes the B. subtilis YppG protein, which is
functionally uncharacterized. This family of proteins is
found in bacteria. Proteins in this family are typically
between 115 and 181 amino acids in length. There are two
completely conserved residues (F and G) that may be
functionally important.
Length = 110
Score = 37.0 bits (86), Expect = 0.011
Identities = 20/54 (37%), Positives = 20/54 (37%), Gaps = 2/54 (3%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQS 186
P QMPP P MP Q QP P Q QP Q SQ K S
Sbjct: 24 PYHQQMPPPPYS-PPQQQQGHFMPPQPQPYPKQSPQQQQPPQFSS-FLSQFKNS 75
Score = 36.6 bits (85), Expect = 0.012
Identities = 18/52 (34%), Positives = 18/52 (34%), Gaps = 5/52 (9%)
Query: 136 PQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPP--QPHQQQGHISSQIKQ 185
P P Q P Q QP Q PPP P QQQGH Q
Sbjct: 2 PYQQN---TNQYPPQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQ 50
Score = 35.5 bits (82), Expect = 0.031
Identities = 18/60 (30%), Positives = 18/60 (30%), Gaps = 6/60 (10%)
Query: 133 PSGPQMPP---MSLHGPMPMPPSQPMPNQ---AQPMPLQQQPPPQPHQQQGHISSQIKQS 186
P Q P H MP PP P Q P Q P P QQQ S
Sbjct: 12 PQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQPPQFSSFLSQ 71
Score = 33.9 bits (78), Expect = 0.10
Identities = 16/44 (36%), Positives = 18/44 (40%), Gaps = 4/44 (9%)
Query: 129 MEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
+ +P P PP G P QP P Q QQQ PPQ
Sbjct: 26 HQQMPPPPYSPPQQQQGHFMPPQPQPYPKQ----SPQQQQPPQF 65
Score = 30.1 bits (68), Expect = 2.8
Identities = 12/42 (28%), Positives = 12/42 (28%), Gaps = 1/42 (2%)
Query: 11 PPPQQQQPPLNVGQLPMGAPGSGPP-GSPGPSPGQAPGQNPQ 51
P QQ PP PP P P Q PQ
Sbjct: 23 QPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQPPQ 64
Score = 28.9 bits (65), Expect = 6.3
Identities = 15/49 (30%), Positives = 16/49 (32%), Gaps = 5/49 (10%)
Query: 8 PNPPPPQQQQP-----PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
P QQQP P P G P P P P Q+P Q
Sbjct: 14 NQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQSPQQQQP 62
Score = 28.5 bits (64), Expect = 8.1
Identities = 14/58 (24%), Positives = 19/58 (32%), Gaps = 6/58 (10%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQP---MPLQQQPPPQPHQQQGHISSQIKQSK 187
PQ + P P P + P P PQP+ +Q S Q +Q
Sbjct: 9 QYPPQNQQQQPYQQQPYHQQMPPPPYSPPQQQQGHFMPPQPQPYPKQ---SPQQQQPP 63
Score = 28.5 bits (64), Expect = 9.6
Identities = 11/49 (22%), Positives = 11/49 (22%), Gaps = 4/49 (8%)
Query: 3 NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQ 51
PPP P Q P P P SP Q
Sbjct: 22 QQPYHQQMPPPPYSPP---QQQQGHFMPPQPQP-YPKQSPQQQQPPQFS 66
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 39.3 bits (92), Expect = 0.012
Identities = 30/196 (15%), Positives = 47/196 (23%), Gaps = 19/196 (9%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPG-------SPGPSPGQAPGQNPQENLTA 56
PP Q P + Q P P +P SP P
Sbjct: 48 PWDPSPQAPPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGP 107
Query: 57 LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPL 116
I E G + ++ + + +QQL+ R A
Sbjct: 108 AGPTI--QTEPGQLYPVQVPVMVTQNPANSPLDQPAQQRALQQLQ-----QRYGAPASGQ 160
Query: 117 TPQLAMGVQGKRMEGVPS--GPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ 174
P Q + + PP G + + L+Q+
Sbjct: 161 LPSQQQSAQKNDESQLQQQPNGETPPQQTDGAGDDESEALVRLREADGTLEQRIKG---A 217
Query: 175 QQGHISSQIKQSKLTN 190
+ G +KQ K
Sbjct: 218 EGGGAMKVLKQPKKQA 233
Score = 37.8 bits (88), Expect = 0.034
Identities = 35/227 (15%), Positives = 56/227 (24%), Gaps = 34/227 (14%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPG------SPGPSPGQAPGQNPQENLTALQRA 60
SP PPP Q P P A + P G +P SP P
Sbjct: 52 SPQAPPPVAQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAGPAGPT 111
Query: 61 IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
I E G + + Q A Q L +
Sbjct: 112 I--QTEPGQL----------YPVQVPVMVTQNPANSPLDQPAQQRA------LQQLQQRY 153
Query: 121 AMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
G+ +PS Q Q PN P + +
Sbjct: 154 GAPASGQ----LPSQQQSA-----QKNDESQLQQQPNGETPPQQTDGAGDDESEALVRLR 204
Query: 181 SQIKQSKLTNIPKPEGLDPLIILQERENRVALNIERRIEELNGSLTS 227
+ G + +L++ + + + R I +++G +
Sbjct: 205 EADGTLEQRIKGAEGGGA-MKVLKQPKKQAKSSKRRTIAQIDGIDSD 250
Score = 32.8 bits (75), Expect = 1.2
Identities = 14/53 (26%), Positives = 19/53 (35%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
M + + +P QQ L Q GAP SG S S + Q+
Sbjct: 127 MVTQNPANSPLDQPAQQRALQQLQQRYGAPASGQLPSQQQSAQKNDESQLQQQ 179
>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein.
Length = 529
Score = 39.0 bits (91), Expect = 0.017
Identities = 21/54 (38%), Positives = 32/54 (59%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
D+EEEEE+EE + + + K+ E D+EE KK+K +K K+ + L KT
Sbjct: 37 DEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLNKT 90
Score = 30.5 bits (69), Expect = 6.3
Identities = 17/67 (25%), Positives = 31/67 (46%), Gaps = 12/67 (17%)
Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
E + E +EEEE E ++++ K KE+E D E+++ + KK KK+
Sbjct: 30 EVEKEVPDEEEE------------EEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKV 77
Query: 1226 MRVVIKY 1232
++
Sbjct: 78 KETTTEW 84
>gnl|CDD|222095 pfam13388, DUF4106, Protein of unknown function (DUF4106). This
family of proteins is found in large numbers in the
Trichomonas vaginalis proteome. The function of this
protein is unknown.
Length = 422
Score = 38.9 bits (90), Expect = 0.017
Identities = 15/52 (28%), Positives = 17/52 (32%)
Query: 136 PQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSK 187
Q P P Q P Q P QQ P P QQ K+S+
Sbjct: 209 VQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRSR 260
Score = 36.6 bits (84), Expect = 0.081
Identities = 14/52 (26%), Positives = 18/52 (34%), Gaps = 1/52 (1%)
Query: 132 VPSGPQMPPMSLHGPMPMP-PSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQ 182
V + Q P + P P Q AQ Q P +QGH S+
Sbjct: 209 VQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRSR 260
Score = 33.5 bits (76), Expect = 0.86
Identities = 18/65 (27%), Positives = 23/65 (35%), Gaps = 4/65 (6%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKE 66
P P QQ N Q P P P Q P Q P + +R+ +E
Sbjct: 206 QPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQPPQTEQGHKRS----RE 261
Query: 67 QGLEE 71
QG +E
Sbjct: 262 QGNQE 266
Score = 32.7 bits (74), Expect = 1.5
Identities = 15/52 (28%), Positives = 16/52 (30%)
Query: 144 HGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPE 195
H P P QP P Q P QP QQ Q Q P +
Sbjct: 197 HRHAPKPTQQPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTPQNPAQQ 248
Score = 31.2 bits (70), Expect = 3.8
Identities = 16/62 (25%), Positives = 19/62 (30%)
Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ 174
P P+ G R P Q P + P + Q QP QP QP
Sbjct: 183 PGLPKTFTSSHGHRHRHAPKPTQQPTVQNPAQQPTVQNPAQQPQQQPQQQPVQPAQQPTP 242
Query: 175 QQ 176
Q
Sbjct: 243 QN 244
>gnl|CDD|220441 pfam09849, DUF2076, Uncharacterized protein conserved in bacteria
(DUF2076). This domain, found in various hypothetical
prokaryotic proteins, has no known function. The domain,
however, is found in various periplasmic ligand-binding
sensor proteins.
Length = 234
Score = 37.7 bits (88), Expect = 0.024
Identities = 31/144 (21%), Positives = 44/144 (30%), Gaps = 27/144 (18%)
Query: 49 NPQENLTALQRAIDSMKEQ--GLEEDPRYQKLIEMKANRTEIKHAFTSAQ------VQQL 100
PQE ++ ID + + E PR + +A I A VQ +
Sbjct: 2 TPQE-----RQLIDGLFSRLKQAEGAPR-----DAEAEA-LIAEALRRQPDAPYYLVQTI 50
Query: 101 RFQIMAY-RLLARNQPLTPQLAMGVQGKR------MEGVPSGPQMPPMSLHGPMPMPPSQ 153
Q A + AR + L Q M G P+ PP + P PP++
Sbjct: 51 LVQEAALKQANARIEELEAQAQHPQSQSSGGFLSGMFG-GGAPRPPPAAPAVQPPAPPAR 109
Query: 154 PMPNQAQPMPLQQQPPPQPHQQQG 177
P P P Q
Sbjct: 110 PGWGSGGPSQQGAGQQPGYAQPGP 133
Score = 35.4 bits (82), Expect = 0.12
Identities = 14/50 (28%), Positives = 15/50 (30%), Gaps = 6/50 (12%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
+ P P P Q P P PG G G GQ PG
Sbjct: 87 FGGGAPRPPPAAPAVQPPA------PPARPGWGSGGPSQQGAGQQPGYAQ 130
Score = 32.7 bits (75), Expect = 0.85
Identities = 10/46 (21%), Positives = 11/46 (23%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPG 47
P P Q PP G G G PG +
Sbjct: 90 GAPRPPPAAPAVQPPAPPARPGWGSGGPSQQGAGQQPGYAQPGPGS 135
>gnl|CDD|215038 PLN00040, PLN00040, Protein MAK16 homolog; Provisional.
Length = 233
Score = 37.4 bits (87), Expect = 0.026
Identities = 24/122 (19%), Positives = 46/122 (37%), Gaps = 6/122 (4%)
Query: 1086 YQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQ 1145
Q + R+ + +++ P L+K + E A +A + EK++ R +
Sbjct: 116 TQYLIRMRKLALKTREKIVTT---PRKLLKRERRRESKAQKAAQLEKSIEKELLERLKSG 172
Query: 1146 VDYTD--SLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKR 1203
Y D + K + K ++ + EEE + + K K R E + E+ K
Sbjct: 173 T-YGDIYNFPSKSYNKVLEMEEVEEAEEELPKSDKNPNSKKKSRVHVEIEYEDEIEYKSL 231
Query: 1204 KK 1205
Sbjct: 232 MS 233
>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein. This protein is found to be part of
a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 728
Score = 38.5 bits (90), Expect = 0.029
Identities = 31/157 (19%), Positives = 65/157 (41%), Gaps = 31/157 (19%)
Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
EEDE+E++ ++E + + + K +L E + +E+
Sbjct: 325 EEDEDEDSDSEEEDEDDDEDDDD---------GENPWMLRKKLGKLKEGED-----DEEN 370
Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEE 1177
+ F + E R++++ ++ + +E ++E +EEE E
Sbjct: 371 SGLLSMKFMQRAEA---------RKKEE--------NDAEIEELRRELEGEEESDEEENE 413
Query: 1178 VRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
SK+ RRK ++ E+ + SKK KKE + + ++
Sbjct: 414 EPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKEK 450
Score = 35.8 bits (83), Expect = 0.16
Identities = 31/171 (18%), Positives = 58/171 (33%), Gaps = 9/171 (5%)
Query: 1054 HQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWL 1113
Q E ++E + E + + L EE + ++ K G++ E E
Sbjct: 379 MQRAEARKKEENDAEIEELRRELEGEEESDEEENEEPSK--KNVGRRKFGPENGEKEAES 436
Query: 1114 IKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE 1173
K +E + E KE + + + + L + + ++EEEE
Sbjct: 437 KKLKKENKNEFKEKKESD-------EEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEE 489
Query: 1174 EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
+EE + K+ + S + + K K+KK KK
Sbjct: 490 LDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKK 540
Score = 33.9 bits (78), Expect = 0.66
Identities = 37/193 (19%), Positives = 71/193 (36%), Gaps = 25/193 (12%)
Query: 342 MNYHANAEKEQKKEQERIEKERMRRLMAE----DEEGYRKLIDQKKDKRLAFLLSQTDEY 397
M + AE +KKE+ E E +RR + DEE + + +R + E
Sbjct: 376 MKFMQRAE-ARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEA 434
Query: 398 ISNLTQMVKEHKMEQKKK-------QDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTD 450
S + +++ ++KK+ +DEE K ++ + L ++ +++E +
Sbjct: 435 ESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEEN 494
Query: 451 MHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGE 510
S GK K KQ + ++ D + + K K+K E
Sbjct: 495 -PWLKTTSSVGKSAK----------KQDSKKKSSSKL--DKAANKISKAAVKVKKKKKKE 541
Query: 511 NENKEKNKGEDDE 523
+ D+E
Sbjct: 542 KSIDLDDDLIDEE 554
>gnl|CDD|189968 pfam01391, Collagen, Collagen triple helix repeat (20 copies).
Members of this family belong to the collagen
superfamily. Collagens are generally extracellular
structural proteins involved in formation of connective
tissue structure. The alignment contains 20 copies of
the G-X-Y repeat that forms a triple helix. The first
position of the repeat is glycine, the second and third
positions can be any residue but are frequently proline
and hydroxyproline. Collagens are post translationally
modified by proline hydroxylase to form the
hydroxyproline residues. Defective hydroxylation is the
cause of scurvy. Some members of the collagen
superfamily are not involved in connective tissue
structure but share the same triple helical structure.
Length = 60
Score = 34.0 bits (79), Expect = 0.031
Identities = 23/46 (50%), Positives = 23/46 (50%), Gaps = 6/46 (13%)
Query: 8 PNPP-PPQQQQPPLNVGQL-PMGAPGS-GPPGSPGPSPGQA--PGQ 48
P PP PP PP G P G PG GPPG PGP PG PG
Sbjct: 3 PGPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGP-PGPPGPPGP 47
Score = 33.6 bits (78), Expect = 0.051
Identities = 15/25 (60%), Positives = 15/25 (60%), Gaps = 4/25 (16%)
Query: 28 GAPGS-GPPGSPGPSPGQ--APGQN 49
G PG GPPG PGP PG APG
Sbjct: 37 GPPGPPGPPGPPGP-PGAPGAPGPP 60
Score = 32.8 bits (76), Expect = 0.073
Identities = 14/27 (51%), Positives = 14/27 (51%), Gaps = 2/27 (7%)
Query: 26 PMGAPGS-GPPGSPGPSPGQAPGQNPQ 51
P G PG GPPG PGP PG P
Sbjct: 5 PPGPPGPPGPPGPPGP-PGPPGPPGPP 30
>gnl|CDD|223587 COG0513, SrmB, Superfamily II DNA and RNA helicases [DNA
replication, recombination, and repair / Transcription /
Translation, ribosomal structure and biogenesis].
Length = 513
Score = 38.2 bits (89), Expect = 0.032
Identities = 33/116 (28%), Positives = 52/116 (44%), Gaps = 11/116 (9%)
Query: 881 SGKFELLDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDL 940
K ELL ++L RV++F + +L+ L + RGFK L G E+R
Sbjct: 258 EEKLELLLKLLKDEDEG--RVIVFVRTKRLVEELAESLRKRGFKVAALHGDLPQEERDRA 315
Query: 941 LKKFNAPDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIG 996
L+KF + +++T GL++ VI +D +P +D HRIG
Sbjct: 316 LEKFKDGELRV---LVATDVAARGLDIPDVSHVINYDLPLDP------EDYVHRIG 362
>gnl|CDD|234468 TIGR04095, dnd_restrict_1, DNA phosphorothioation system
restriction enzyme. The DNA phosphorothioate
modification system dnd (DNA instability during
electrophoresis) recently has been shown to provide a
modification essential to a restriction system. This
protein family was detected by Partial Phylogenetic
Profiling as linked to dnd, and its members usually are
clustered with the dndABCDE genes.
Length = 451
Score = 38.1 bits (89), Expect = 0.034
Identities = 50/197 (25%), Positives = 86/197 (43%), Gaps = 29/197 (14%)
Query: 556 KLKEYQIKGL-EWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKVNGPFLIIV- 613
+L++YQ + + W F NN GIL G GKT+ +A + L EK G +++V
Sbjct: 8 ELRDYQKEAIRAW----FKNNGRGILKMATGTGKTLTALAAASKLYEK---IGLLVLLVV 60
Query: 614 -PLSTL-SNWSLEFERWAPSVN-VVAYKGSPHLRKTLQAQM-----KASKFNVLLTTYEY 665
P L W+ E E++ +N ++ Y+ + + L + KF ++TT
Sbjct: 61 CPYQHLVDQWAREAEKF--GLNPILCYESVSNWQSELSTGLYNLNSGNQKFLAIITT-NA 117
Query: 666 VIKDKGPLAKLHW---KYMII-DEGHRMKNHHCKLTHILNTFYVAPHRLLLTGTPLQNKL 721
K ++L K ++I DE H + + L RL L+ TP ++
Sbjct: 118 TFIGKNFQSQLRRFPGKTLLIGDEAHNLGAPR--IRESLPDN--IGFRLGLSATPERHFD 173
Query: 722 PE-LWALLNFLLPSIFK 737
E ALLN+ +++
Sbjct: 174 EEGTNALLNYFGKIVYE 190
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family are
designated YL1. These proteins have been shown to be
DNA-binding and may be transcription factors.
Length = 238
Score = 37.4 bits (87), Expect = 0.035
Identities = 24/77 (31%), Positives = 34/77 (44%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
DD E DDEEE E+E R +R K+++ +EP+ KK+K K A K
Sbjct: 63 DDEPESDDEEEGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDPTAAKSPKAAAPRPKK 122
Query: 1222 LKKIMRVVIKYTDSDGR 1238
+ + DS R
Sbjct: 123 KSERISWAPTLLDSPRR 139
>gnl|CDD|215832 pfam00270, DEAD, DEAD/DEAH box helicase. Members of this family
include the DEAD and DEAH box helicases. Helicases are
involved in unwinding nucleic acids. The DEAD box
helicases are involved in various aspects of RNA
metabolism, including nuclear transcription, pre mRNA
splicing, ribosome biogenesis, nucleocytoplasmic
transport, translation, RNA decay and organellar gene
expression.
Length = 169
Score = 36.5 bits (85), Expect = 0.039
Identities = 30/143 (20%), Positives = 57/143 (39%), Gaps = 15/143 (10%)
Query: 585 GLGKTIQTIALITYL--MEKKKVNGPFLIIVPLSTLSNWSLE-FERWAPSVNV---VAYK 638
G GKT+ L+ L + KK L++ P L+ E ++ + + +
Sbjct: 24 GSGKTL--AFLLPILQALLPKKGGPQALVLAPTRELAEQIYEELKKLFKILGLRVALLTG 81
Query: 639 GSPHLRKTLQAQMKASKFNVLLTTYE---YVIKDKGPLAKLHWKYMIIDEGHRM--KNHH 693
G+ K ++K K ++L+ T +++ + K +++DE HR+
Sbjct: 82 GTS--LKEQARKLKKGKADILVGTPGRLLDLLRRGKLKLLKNLKLLVLDEAHRLLDMGFG 139
Query: 694 CKLTHILNTFYVAPHRLLLTGTP 716
L IL+ LLL+ T
Sbjct: 140 DDLEEILSRLPPDRQILLLSATL 162
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 37.2 bits (87), Expect = 0.047
Identities = 24/76 (31%), Positives = 41/76 (53%), Gaps = 10/76 (13%)
Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSK 1201
++VD T E++ LKA EEE +EE + K++ K++++ E + S +
Sbjct: 257 VLRKVDKTREEEEEKILKA---------AEEERQEEAQEKKEEKKKEEREAKLAKLSPEE 307
Query: 1202 KRKKEKEKDREKDQAK 1217
+RK E EK+R+K K
Sbjct: 308 QRKLE-EKERKKQARK 322
Score = 31.8 bits (73), Expect = 2.5
Identities = 14/62 (22%), Positives = 34/62 (54%), Gaps = 6/62 (9%)
Query: 328 RNNQAR---IMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKD 384
+ ++ R ++ KA +E+K+E+++ E+E ++ +E+ RKL ++K+
Sbjct: 260 KVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQ--RKL-EEKER 316
Query: 385 KR 386
K+
Sbjct: 317 KK 318
Score = 29.9 bits (68), Expect = 8.6
Identities = 16/60 (26%), Positives = 27/60 (45%), Gaps = 2/60 (3%)
Query: 251 RQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEA-RATEKLEKQQKVEA-ERKKRQK 308
+ R E + E + K K++ +EA A E+Q+K+E ERKK+ +
Sbjct: 262 DKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQRKLEEKERKKQAR 321
>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381). This
domain is functionally uncharacterized. This domain is
found in eukaryotes. This presumed domain is typically
between 156 to 174 amino acids in length. This domain is
found associated with pfam07780, pfam01728.
Length = 154
Score = 35.7 bits (83), Expect = 0.051
Identities = 18/73 (24%), Positives = 33/73 (45%), Gaps = 11/73 (15%)
Query: 1155 KEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
++ L E ++EEE E EE E++ + K+ K K + R ++
Sbjct: 91 RKLLGLDKKEKEEEEEEEVEVEE-----------LDEEEQIDELLEKELAKLKREKRREN 139
Query: 1215 QAKLKKTLKKIMR 1227
+ K K+ LK+ M+
Sbjct: 140 ERKQKEILKEQMK 152
Score = 32.6 bits (75), Expect = 0.68
Identities = 17/67 (25%), Positives = 32/67 (47%), Gaps = 11/67 (16%)
Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
EKE + + VE DEEE+ +E + + +R+K ++ + K+K+
Sbjct: 98 KKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREK-----------RRENERKQKEI 146
Query: 1212 EKDQAKL 1218
K+Q K+
Sbjct: 147 LKEQMKM 153
>gnl|CDD|221247 pfam11825, Nuc_recep-AF1, Nuclear/hormone receptor activator site
AF-1. Nuclear receptors (NRs) are a family of
ligand-inducible transcription factors, and, like other
transcription factors, they contain a distinct DNA
binding domain that allows for target gene recognition
and several activation domains that possess the ability
to activate transcription. One of these activation
domains is at the N-terminal, although there are two
distinct motifs within this domain, between residues
20-36 and between 74 and the end of this domain, which
are the binding regions. One of the co-activators is
TIF1beta, which appears to bind at the first motif.
Length = 106
Score = 34.8 bits (80), Expect = 0.055
Identities = 15/52 (28%), Positives = 19/52 (36%), Gaps = 2/52 (3%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
S P P +V MG+P P +PG G G +PQ N
Sbjct: 25 PMGPMSTLSSPINGLGSPYSVISSSMGSPSMSLPSTPGLGYG--TGSSPQIN 74
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 36.2 bits (84), Expect = 0.056
Identities = 23/76 (30%), Positives = 42/76 (55%), Gaps = 5/76 (6%)
Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
YT++ +K+ L E + ++E EE+ + K K K+ KK +D D++ KK K +
Sbjct: 57 YTEAKKKKKELAE-----EIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSE 111
Query: 1208 EKDREKDQAKLKKTLK 1223
+KD ++ + KL+ K
Sbjct: 112 KKDEKEAEDKLEDLTK 127
Score = 35.1 bits (81), Expect = 0.13
Identities = 16/60 (26%), Positives = 32/60 (53%)
Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
E +E E+ ++E K+K K +KK ++ KK K+ +K +KD+ + + L+ +
Sbjct: 66 ELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDL 125
Score = 33.9 bits (78), Expect = 0.32
Identities = 18/70 (25%), Positives = 35/70 (50%), Gaps = 3/70 (4%)
Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSK--KRKKEKEKDREKDQ 1215
K + + +E ++E EE++ +K K +KK + D ++ K K +K+ EK+ E
Sbjct: 64 KKELAEEIE-KVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKL 122
Query: 1216 AKLKKTLKKI 1225
L K+ +
Sbjct: 123 EDLTKSYSET 132
Score = 30.4 bits (69), Expect = 3.6
Identities = 10/53 (18%), Positives = 24/53 (45%)
Query: 492 DEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEI 544
++ + +K K+K + ++K + K E + +K +Y T+ E+
Sbjct: 87 KKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLSEL 139
Score = 29.3 bits (66), Expect = 8.6
Identities = 17/101 (16%), Positives = 43/101 (42%), Gaps = 9/101 (8%)
Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
EAK+++K L + +K+ + +++W + D++++++++ K+ K
Sbjct: 59 EAKKKKKELA-EEIEKVKKEYE-----EKQKWKWKKKKSKKKKDKDKDKKDD---KKDDK 109
Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIM 1226
KK E + E+ + + + K L K +
Sbjct: 110 SEKKDEKEAEDKLEDLTKSYSETLSTLSELKPRKYALHKDI 150
Score = 29.3 bits (66), Expect = 9.3
Identities = 17/58 (29%), Positives = 33/58 (56%)
Query: 1167 YDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
YD E E +++ + + + K E ++++ KK+K +K+KD++KD+ KK K
Sbjct: 53 YDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKS 110
>gnl|CDD|219262 pfam07001, BAT2_N, BAT2 N-terminus. This family represents the
N-terminus (approximately 200 residues) of the
proline-rich protein BAT2. BAT2 is similar to other
proteins with large proline-rich domains, such as some
nuclear proteins, collagens, elastin, and synapsin.
Length = 189
Score = 36.1 bits (83), Expect = 0.058
Identities = 19/62 (30%), Positives = 25/62 (40%), Gaps = 1/62 (1%)
Query: 5 STSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGP-SPGQAPGQNPQENLTALQRAIDS 63
+++ +PPPP Q PL G A S PG+ G E +LQ A D
Sbjct: 117 TSASSPPPPPQPATPLVPGGAKSWAVASAKPGAQGDGGRASQLSSFSHEEFPSLQAAGDQ 176
Query: 64 MK 65
K
Sbjct: 177 DK 178
>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein. This family includes proteins
related to Mpp10 (M phase phosphoprotein 10). The U3
small nucleolar ribonucleoprotein (snoRNP) is required
for three cleavage events that generate the mature 18S
rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
depletion of Mpp10, a U3 snoRNP-specific protein, halts
18S rRNA production and impairs cleavage at the three U3
snoRNP-dependent sites.
Length = 613
Score = 37.3 bits (86), Expect = 0.059
Identities = 28/170 (16%), Positives = 62/170 (36%), Gaps = 12/170 (7%)
Query: 1053 LHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDW 1112
++ E D V+ + +E + + +AE G + + +
Sbjct: 171 EEKESVEQATREKKFDKSGVDDKFFKLDEMNEFLEATEAEEEAALGDEDDFEDYFQDDSE 230
Query: 1113 LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEE 1172
K+DE+ E +EE ++Y D KE K D G + + E++
Sbjct: 231 DGKDDEDFGSGEDEEDDEEGN------------IEYEDFFDPKEKDKKKDAGDDAELEDD 278
Query: 1173 EEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTL 1222
E ++E K + ++ +++D+E + ++ E +K +
Sbjct: 279 EPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLE 328
>gnl|CDD|152960 pfam12526, DUF3729, Protein of unknown function (DUF3729). This
family of proteins is found in viruses. Proteins in
this family are typically between 145 and 1707 amino
acids in length. The family is found in association
with pfam01443, pfam01661, pfam05417, pfam01660,
pfam00978. There is a single completely conserved
residue L that may be functionally important.
Length = 115
Score = 34.7 bits (80), Expect = 0.064
Identities = 15/47 (31%), Positives = 19/47 (40%), Gaps = 5/47 (10%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
S+ PPP + PP + PG P SP P AP + P
Sbjct: 58 SAVWVLPPPSEPAAPPPD---PEPPVPGPAGPPSPLAPP--APARKP 99
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 36.6 bits (85), Expect = 0.072
Identities = 29/137 (21%), Positives = 54/137 (39%), Gaps = 21/137 (15%)
Query: 1110 PDWLIKEDEEIEQWAFEAKEEEKALHM--GRGSRQRKQVDYTDSLTEKEWLKAIDDGVEY 1167
D L +E EE + E + A+ R + +++++ + L E + L++ V+
Sbjct: 110 ADKLDEEQEERVEKEREEELAGDAMKKLENRTADSKREMEVLERLEELKELQSRRADVDV 169
Query: 1168 D---------------DEEEEEEEEVRSKRKG----KRRKKTEDDDEEPSTSKKRKKEKE 1208
+ +EEEE+E ++S G + R++ +D+D E
Sbjct: 170 NSMLEALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSP 229
Query: 1209 KDREKDQAKLKKTLKKI 1225
K AK LKK
Sbjct: 230 KSGSSSPAKPTSILKKS 246
>gnl|CDD|220252 pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2 complex
component). RNases H are enzymes that specifically
hydrolyse RNA when annealed to a complementary DNA and
are present in all living organisms. In yeast RNase H2 is
composed of a complex of three proteins (Rnh2Ap, Ydr279p
and Ylr154p), this family represents the homologues of
Ydr279p. It is not known whether non yeast proteins in
this family fulfil the same function.
Length = 287
Score = 36.5 bits (85), Expect = 0.074
Identities = 17/78 (21%), Positives = 35/78 (44%), Gaps = 7/78 (8%)
Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTS---KKRKKE 1206
L + + + E + + K K++++TE+D E + K++ KE
Sbjct: 201 SYLPPDLYKELLK----SLLIPEFKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKE 256
Query: 1207 KEKDREKDQAKLKKTLKK 1224
+ K ++ ++K K LKK
Sbjct: 257 EIKKKKPKESKGVKALKK 274
>gnl|CDD|219358 pfam07271, Cytadhesin_P30, Cytadhesin P30/P32. This family
consists of several Mycoplasma species specific
Cytadhesin P32 and P30 proteins. P30 has been found to
be membrane associated and localised on the tip
organelle. It is thought that it is important in
cytadherence and virulence.
Length = 279
Score = 36.2 bits (83), Expect = 0.093
Identities = 32/125 (25%), Positives = 46/125 (36%), Gaps = 7/125 (5%)
Query: 57 LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFT-SAQVQQLRFQIMAYRLLARNQP 115
LQR + ++Q +E DP + + + A QVQ R+ +
Sbjct: 116 LQRISEQNEQQAIEIDPTEEVNTQEPTQPAGVNVANNPQPQVQPQFGPNPQQRINPQRFG 175
Query: 116 LTPQLAMGVQGKRMEGVPSGPQMPPMSLH---GPMP-MPPSQPMPNQAQPMPLQQQP--P 169
Q MG++ + P P MPP + PMP MPP MP +P
Sbjct: 176 FPMQPNMGMRPGFNQMPPHMPGMPPNQMRPGFNPMPGMPPRPGFNQNPNMMPNMNRPGFR 235
Query: 170 PQPHQ 174
PQP
Sbjct: 236 PQPGG 240
>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
Length = 619
Score = 36.7 bits (86), Expect = 0.095
Identities = 21/88 (23%), Positives = 36/88 (40%), Gaps = 11/88 (12%)
Query: 1147 DYTDSLTEKE-WLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK 1205
++ D L E L+ + DG D EE+ V S+ + + E+++E+ +
Sbjct: 154 EWYDRLENGERRLRELIDG-FVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAAD 212
Query: 1206 EKE---------KDREKDQAKLKKTLKK 1224
E E K K KL+K +K
Sbjct: 213 ESELPEKVLEKFKALAKQYKKLRKAQEK 240
>gnl|CDD|180801 PRK07033, PRK07033, hypothetical protein; Provisional.
Length = 427
Score = 36.6 bits (85), Expect = 0.096
Identities = 17/48 (35%), Positives = 21/48 (43%)
Query: 3 NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
N+S+ P PP + P AP +G P P P G A G NP
Sbjct: 1 NTSSDPFSAGSGGFVPPNPGDRTPAAAPAAGAPFQPRPGRGAASGLNP 48
Score = 31.2 bits (71), Expect = 4.6
Identities = 9/46 (19%), Positives = 11/46 (23%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
+ G +P P P A QP P G
Sbjct: 1 NTSSDPFSAGSGGFVPPNPGDRTPAAAPAAGAPFQPRPGRGAASGL 46
>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
Length = 803
Score = 36.7 bits (85), Expect = 0.11
Identities = 38/155 (24%), Positives = 64/155 (41%), Gaps = 22/155 (14%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
DDEE+ ++ ++++KG+ + D + K+ K +E + + +KKI+
Sbjct: 440 DDEEDAPKKGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQED--NNNIPPEEWVMKKILE 497
Query: 1228 VVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDI-------KKILGRIEDGKYSS 1280
V + DG +K L D+ RPM I K + + +
Sbjct: 498 WVPDL-EEDGTEDPGSILKH-----LADHL----RPMLINSLKERRKALFTENAERRRRL 547
Query: 1281 VDELQKDFKTLCRNAQIYNEELSLIHED---SVVL 1312
+D LQK N Q+Y + L L +D SVVL
Sbjct: 548 LDNLQKKIDESFLNMQLYEKALDLFEDDQSTSVVL 582
>gnl|CDD|217298 pfam02948, Amelogenin, Amelogenin. Amelogenins play a role in
biomineralisation. They seem to regulate the formation
of crystallites during the secretory stage of tooth
enamel development. They are thought to play a major role in the
structural organisation and mineralisation of developing
enamel. They are found in the extracellular matrix.
Mutations in X-chromosomal amelogenin can cause
Amelogenesis imperfecta.
Length = 174
Score = 34.9 bits (80), Expect = 0.11
Identities = 14/73 (19%), Positives = 20/73 (27%), Gaps = 5/73 (6%)
Query: 132 VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-----PQPHQQQGHISSQIKQS 186
+P PQMP + P + P+ P P QQ
Sbjct: 51 IPLSPQMPQQQQSAHPKLTPHHQLLILPPQQPMMPVPGHHPMVPMTGQQPHLQPPAQHPL 110
Query: 187 KLTNIPKPEGLDP 199
+ T P+ P
Sbjct: 111 QPTYGQNPQPQQP 123
Score = 33.0 bits (75), Expect = 0.61
Identities = 20/65 (30%), Positives = 24/65 (36%), Gaps = 3/65 (4%)
Query: 133 PSGPQMP--PMSLHGPM-PMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLT 189
PQ P P+ H PM PM QP PLQ P QQ + Q +
Sbjct: 76 ILPPQQPMMPVPGHHPMVPMTGQQPHLQPPAQHPLQPTYGQNPQPQQPTHTQPPVQPQQP 135
Query: 190 NIPKP 194
P+P
Sbjct: 136 ADPQP 140
Score = 32.2 bits (73), Expect = 1.0
Identities = 21/88 (23%), Positives = 28/88 (31%), Gaps = 7/88 (7%)
Query: 90 HAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKR-MEGVPSGPQMPPMSLHGPMP 148
H QQ Q A PL P Q ++ P P P
Sbjct: 90 HPMVPMTGQQPHLQPPA------QHPLQPTYGQNPQPQQPTHTQPPVQPQQPADPQPGQP 143
Query: 149 MPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
M P QP+P +PL+ P +Q+
Sbjct: 144 MFPMQPLPPLVPDLPLEPWPAADKTKQE 171
>gnl|CDD|218538 pfam05285, SDA1, SDA1. This family consists of several SDA1 protein
homologues. SDA1 is a Saccharomyces cerevisiae protein
which is involved in the control of the actin
cytoskeleton. The protein is essential for cell viability
and is localised in the nucleus.
Length = 317
Score = 35.8 bits (83), Expect = 0.13
Identities = 12/50 (24%), Positives = 29/50 (58%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
D +E D E+EEE++ +K+ + + +++E +++ + E EK++
Sbjct: 124 DKEIESSDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEK 173
>gnl|CDD|184281 PRK13729, PRK13729, conjugal transfer pilus assembly protein TraB;
Provisional.
Length = 475
Score = 36.0 bits (83), Expect = 0.14
Identities = 34/198 (17%), Positives = 59/198 (29%), Gaps = 34/198 (17%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
+S+ S N +Q+P ++ + +++ T +
Sbjct: 33 LSDVDMSGNGEAVAEQEPVPDMTGVV----------------DTTFDDKVRQHATTEMQV 76
Query: 61 IDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQL 120
+ ++ EE R ++ + + + L Q+ A L N P
Sbjct: 77 TAAQMQKQYEEIRRELDVLNKQRGDDQRRIEKLGQDNAALAEQVKA---LGAN----PVT 129
Query: 121 AMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHIS 180
A G E VP P PP P P P Q PPP G+
Sbjct: 130 ATG------EPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAV----PPPTAF-YPGNGV 178
Query: 181 SQIKQSKLTNIPKPEGLD 198
+ Q ++P P +
Sbjct: 179 TPPPQVTYQSVPVPNRIQ 196
>gnl|CDD|223065 PHA03378, PHA03378, EBNA-3B; Provisional.
Length = 991
Score = 36.2 bits (83), Expect = 0.15
Identities = 32/122 (26%), Positives = 44/122 (36%), Gaps = 12/122 (9%)
Query: 106 AYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG-PMPMPPSQPMPNQAQPMPL 164
A A P A + + P G PP + G P P PP QA P P
Sbjct: 734 ARPPAAAPGRARPPAAAPGRARPPAAAP-GRARPPAAAPGAPTPQPPP-----QAPPAPQ 787
Query: 165 QQ---QPPPQPHQQQGHISSQI--KQSKLTNIPKPEGLDPLIILQERENRVALNIERRIE 219
Q+ P PQP Q G S Q+ + + P + L L+ + R +L +E
Sbjct: 788 QRPRGAPTPQPPPQAGPTSMQLMPRAAPGQQGPTKQILRQLLTGGVKRGRPSLKKPAALE 847
Query: 220 EL 221
Sbjct: 848 RQ 849
Score = 36.2 bits (83), Expect = 0.16
Identities = 45/233 (19%), Positives = 70/233 (30%), Gaps = 52/233 (22%)
Query: 5 STSPNPP-----PPQQQQP---PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTA 56
S +P PP P+ P P+ + +PM P P + P Q PQ +T
Sbjct: 603 SQTPEPPTTQSHIPETSAPRQWPMPLRPIPM-RPLRMQPITFNVLVFPTPHQPPQVEITP 661
Query: 57 LQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLL---ARN 113
+ P + ++ + + L Q +
Sbjct: 662 YK--------------PTWTQIGHIPYQPSPTGAN------TMLPIQWAPGTMQPPPRAP 701
Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG--------PMPMPPSQPMPNQAQP---M 162
P+ P A + +R G PP + G P P P +A+P
Sbjct: 702 TPMRPPAAPPGRAQRPAAAT-GRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAA 760
Query: 163 PLQQQPP--------PQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQERE 207
P + +PP PQP Q Q + T P P+ + L R
Sbjct: 761 PGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMQLMPRA 813
Score = 32.7 bits (74), Expect = 1.6
Identities = 16/60 (26%), Positives = 20/60 (33%), Gaps = 5/60 (8%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-APGQNPQENLTALQR 59
+ +P P P + P APG P P +PG P PQ QR
Sbjct: 734 ARPPAAAPGRARPPAAAP--GRARPPAAAPGRARP--PAAAPGAPTPQPPPQAPPAPQQR 789
Score = 30.4 bits (68), Expect = 7.4
Identities = 20/54 (37%), Positives = 22/54 (40%), Gaps = 6/54 (11%)
Query: 1 MSNSSTSPNPPPPQQQ-QPPLNVGQLPMGAPGSGPPGSPGP-----SPGQAPGQ 48
+ +P P PQ Q P Q P GAP PP GP P APGQ
Sbjct: 764 ARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMQLMPRAAPGQ 817
>gnl|CDD|221404 pfam12067, Sox_C_TAD, Sox C-terminal transactivation domain. This
domain is found at the C-terminus of the Sox family of
transcription factors. It is found associated with
pfam00505. It binds to the Armadillo repeats (pfam00514)
in Catenin beta-1 (CTNNB1), which is involved in
transcriptional regulation. It functions as a
transactivating domain (TAD).
Length = 197
Score = 35.1 bits (81), Expect = 0.15
Identities = 20/89 (22%), Positives = 28/89 (31%), Gaps = 20/89 (22%)
Query: 113 NQPLTPQLAM-GVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPM-------PNQAQPMPL 164
N PQ A ++M + PQ P M P M P A+ P+
Sbjct: 56 NSSYAPQNAHAPALLRQMAVTENIPQGSPA--PSIMGCPTPPQMYYGQMYVPECAKHHPV 113
Query: 165 ---QQQPPPQ-------PHQQQGHISSQI 183
Q PPP+ QQ + +
Sbjct: 114 QLGQLSPPPESQHLDTLDQLQQAELLGDV 142
Score = 30.1 bits (68), Expect = 6.5
Identities = 12/57 (21%), Positives = 20/57 (35%), Gaps = 2/57 (3%)
Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP-PQPHQQQGHISSQIKQS 186
G+P+ P+M P+ P S P Q MP + + Q+ +
Sbjct: 21 GLPT-PEMSPLDALESEPAFFSPPCQEDCQMMPYGYNSSYAPQNAHAPALLRQMAVT 76
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 36.2 bits (84), Expect = 0.15
Identities = 19/71 (26%), Positives = 30/71 (42%), Gaps = 2/71 (2%)
Query: 1157 WLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQA 1216
WL+ +D E +E+EE EE+ K +R K K KK+++K ++
Sbjct: 1130 WLEDLDKFEEALEEQEEVEEK--EIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSAD 1187
Query: 1217 KLKKTLKKIMR 1227
K KK
Sbjct: 1188 KSKKASVVGNS 1198
Score = 31.9 bits (73), Expect = 3.3
Identities = 20/87 (22%), Positives = 39/87 (44%), Gaps = 5/87 (5%)
Query: 1155 KEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
++ +A+++ E +++E +E+ ++SK KGK K +P KK KK+K+ +K
Sbjct: 1135 DKFEEALEEQEEVEEKEIAKEQRLKSKTKGKASKL-----RKPKLKKKEKKKKKSSADKS 1189
Query: 1215 QAKLKKTLKKIMRVVIKYTDSDGRVLS 1241
+ K + K D
Sbjct: 1190 KKASVVGNSKRVDSDEKRKLDDKPDNK 1216
>gnl|CDD|219753 pfam08226, DUF1720, Domain of unknown function (DUF1720). This
domain is found in different combinations with cortical
patch components EF hand, SH3 and ENTH and is therefore
likely to be involved in cytoskeletal processes. This
family contains many hypothetical proteins.
Length = 73
Score = 32.1 bits (73), Expect = 0.21
Identities = 14/68 (20%), Positives = 18/68 (26%), Gaps = 10/68 (14%)
Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMP---PSQPMPNQAQP------MPLQQQP 168
PQ ++ + GP + P P P Q Q Q Q
Sbjct: 3 PQQTGYQPPQQQQPQQQGP-LQPQPTGFMQPQPTGFGQQQQGLQPQQTGFQPQAGQQMPT 61
Query: 169 PPQPHQQQ 176
P Q Q
Sbjct: 62 GTGPLQPQ 69
Score = 29.4 bits (66), Expect = 2.2
Identities = 15/59 (25%), Positives = 18/59 (30%), Gaps = 12/59 (20%)
Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMP----MPPSQPMPNQAQPMPLQQQP 168
PL PQ +Q + P+G L P P PLQ QP
Sbjct: 20 GPLQPQPTGFMQPQ-----PTGFGQQQQGL---QPQQTGFQPQAGQQMPTGTGPLQPQP 70
>gnl|CDD|220708 pfam10349, WWbp, WW-domain ligand protein. The WWbp domain is
characterized by several short PY and PT-like motifs of
the PPPPY form. These appear to bind directly to the WW
domains of WWP1 and WWP2 and other such diverse proteins
as dystrophin and YAP (Yes-associated protein). This is
the WW-domain binding protein WWbp via PY and PY_like
motifs. The presence of a phosphotyrosine residue in the
pWBP-1 peptide abolishes WW domain binding which
suggests a potential regulatory role for tyrosine
phosphorylation in modulating WW domain-ligand
interactions. Given the likelihood that WWP1 and WWP2
function as E3 ubiquitin-protein ligases, it is possible
that initial substrate-specific recognition occurs via
WW domain-substrate protein interaction followed by
ubiquitin transfer and subsequent proteolysis. This
domain lies just downstream of the GRAM (pfam02893) in
many members.
Length = 111
Score = 33.1 bits (76), Expect = 0.24
Identities = 15/66 (22%), Positives = 17/66 (25%), Gaps = 5/66 (7%)
Query: 110 LARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPP 169
R QP+ G V P P PP P P P P
Sbjct: 34 AQRAQPV--SRESGYYPPPGAYVHLEPL-PAY--GQYAAPPPYGPPPPYYPAPPGVYPTP 88
Query: 170 PQPHQQ 175
P P+
Sbjct: 89 PPPNSG 94
Score = 32.8 bits (75), Expect = 0.29
Identities = 11/47 (23%), Positives = 14/47 (29%), Gaps = 2/47 (4%)
Query: 134 SGPQMPPMSLHGPMPMPP--SQPMPNQAQPMPLQQQPPPQPHQQQGH 178
SG PP + P+P P P P PP +
Sbjct: 44 SGYYPPPGAYVHLEPLPAYGQYAAPPPYGPPPPYYPAPPGVYPTPPP 90
Score = 32.0 bits (73), Expect = 0.49
Identities = 20/82 (24%), Positives = 26/82 (31%), Gaps = 5/82 (6%)
Query: 95 AQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQP 154
+ Q + + Y L P A G P PP P + P+ P
Sbjct: 35 QRAQPVSRESGYYPPPGAYVHLEPLPAYG-----QYAAPPPYGPPPPYYPAPPGVYPTPP 89
Query: 155 MPNQAQPMPLQQQPPPQPHQQQ 176
PN Q+ PPP P Q
Sbjct: 90 PPNSGYMADPQEPPPPYPGPPQ 111
Score = 31.2 bits (71), Expect = 0.94
Identities = 14/57 (24%), Positives = 16/57 (28%), Gaps = 1/57 (1%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSG-PPGSPGPSPGQAPGQNPQENLTA 56
+S S PP PL P G PP PG P P +
Sbjct: 40 VSRESGYYPPPGAYVHLEPLPAYGQYAAPPPYGPPPPYYPAPPGVYPTPPPPNSGYM 96
Score = 30.1 bits (68), Expect = 2.7
Identities = 11/36 (30%), Positives = 11/36 (30%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSP 42
P PP P M P PP PGP
Sbjct: 76 PYYPAPPGVYPTPPPPNSGYMADPQEPPPPYPGPPQ 111
>gnl|CDD|203444 pfam06424, PRP1_N, PRP1 splicing factor, N-terminal. This domain is
specific to the N-terminal part of the prp1 splicing
factor, which is involved in mRNA splicing (and possibly
also poly(A)+ RNA nuclear export and cell cycle
progression). This domain is specific to the N terminus
of the RNA splicing factor encoded by prp1. It is
involved in mRNA splicing and possibly also poly(A)+
RNA nuclear export and cell cycle progression.
Length = 131
Score = 33.4 bits (77), Expect = 0.24
Identities = 18/64 (28%), Positives = 34/64 (53%), Gaps = 8/64 (12%)
Query: 1166 EYDDEEEEEE---EEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQ-AKLKKT 1221
+YDDE+EE + E + + +R+K+ E ++E + K +E + + Q A LK+
Sbjct: 55 KYDDEDEEADRIYESIDERMDERRKKRREQKEKE----EIEKYREENPKIQQQFADLKRN 110
Query: 1222 LKKI 1225
L +
Sbjct: 111 LATV 114
>gnl|CDD|217502 pfam03343, SART-1, SART-1 family. SART-1 is a protein involved in
cell cycle arrest and pre-mRNA splicing. It has been
shown to be a component of U4/U6 x U5 tri-snRNP complex
in human, Schizosaccharomyces pombe and Saccharomyces
cerevisiae. SART-1 is a known tumour antigen in a range
of cancers recognised by T cells.
Length = 603
Score = 35.5 bits (82), Expect = 0.24
Identities = 39/192 (20%), Positives = 72/192 (37%), Gaps = 22/192 (11%)
Query: 1057 DEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKE 1116
DE E ++ + + + E + + + +E +E+S + + KE
Sbjct: 394 DETSEFVRSLQKEPLEEKPENKDESVEEISDAEEDDEDEEDEDGDGDVEMSAVDNDEEKE 453
Query: 1117 DEEIEQWAFEAKEEEKALHMGRGS-----RQRKQVDYTDSLTE-KEWLK-AIDDGVEYDD 1169
+E+ E EEE + G + + R + E +E+LK + +
Sbjct: 454 EEDKEAIPSTILEEEPTVGGGLAAALKLLKSRGILKKNQLERERREFLKEKERLKLLAEI 513
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVV 1229
E E E R+ K R E EE + + +++KE+ + D V
Sbjct: 514 RERIERERDRNDGKYSRMSARER--EEYARPENDQRDKEEAYKPD-------------VK 558
Query: 1230 IKYTDSDGRVLS 1241
+KY D GR L+
Sbjct: 559 LKYVDEFGRELT 570
Score = 32.4 bits (74), Expect = 1.6
Identities = 25/98 (25%), Positives = 45/98 (45%), Gaps = 13/98 (13%)
Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKK-------TEDDDEEPSTS-- 1200
D+ + W K ++ EE E+ +++ K +R K EDDD++ T
Sbjct: 35 DAAAYENWKKRQEEAEAKRKREELREKIAKAREKRERNSKLGGIKTLGEDDDDDDDTKAW 94
Query: 1201 --KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSD 1236
K +K++K+K+ E+ +A L +K +YT D
Sbjct: 95 LKKSKKRQKKKEAERKKALLLDEKEK--ERAAEYTSED 130
>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
Validated.
Length = 824
Score = 35.3 bits (82), Expect = 0.26
Identities = 31/193 (16%), Positives = 41/193 (21%), Gaps = 44/193 (22%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
PP P PP + P A + P P+P A + +
Sbjct: 598 EGPPAPASSGPPEEAAR-P--AAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHP 654
Query: 68 GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGK 127
P S P P
Sbjct: 655 KHVAVPDA------------------SDGGDG------WPAKAGGAAPAAP--------- 681
Query: 128 RMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQ-AQPMPLQQQPPPQPHQQQGHISSQIKQS 186
P P + P P+QP P A P Q P Q +S +
Sbjct: 682 -------PPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPA 734
Query: 187 KLTNIPKPEGLDP 199
+P P D
Sbjct: 735 ADDPVPLPPEPDD 747
Score = 32.7 bits (75), Expect = 1.8
Identities = 11/43 (25%), Positives = 13/43 (30%)
Query: 10 PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
P P G P P + P P P+P AP
Sbjct: 438 APAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAP 480
Score = 31.9 bits (73), Expect = 2.9
Identities = 12/65 (18%), Positives = 14/65 (21%)
Query: 108 RLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ 167
RL R A P + P+ Q P P
Sbjct: 380 RLERRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAPAPAP 439
Query: 168 PPPQP 172
PP P
Sbjct: 440 APPSP 444
Score = 31.1 bits (71), Expect = 5.0
Identities = 9/46 (19%), Positives = 11/46 (23%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
PP P P A +P P+ P P
Sbjct: 437 PAPAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAP 482
Score = 30.7 bits (70), Expect = 6.7
Identities = 14/68 (20%), Positives = 18/68 (26%), Gaps = 1/68 (1%)
Query: 10 PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP-GQNPQENLTALQRAIDSMKEQG 68
PPP P P P +P P Q PQ A + +
Sbjct: 681 PPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVP 740
Query: 69 LEEDPRYQ 76
L +P
Sbjct: 741 LPPEPDDP 748
Score = 30.3 bits (69), Expect = 8.1
Identities = 13/51 (25%), Positives = 15/51 (29%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQE 52
N+ P PP P P AP +P P AP P
Sbjct: 446 GNAPAGGAPSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAA 496
>gnl|CDD|99926 cd05494, Bromodomain_1, Bromodomain; uncharacterized subfamily.
Bromodomains are found in many chromatin-associated
proteins and in nuclear histone acetyltransferases. They
interact specifically with acetylated lysine.
Length = 114
Score = 32.8 bits (75), Expect = 0.28
Identities = 15/49 (30%), Positives = 25/49 (51%), Gaps = 2/49 (4%)
Query: 1241 SEPFIKL--PSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKD 1287
+ PF++ P R+ PDY +VI RPM + I + +++LQ
Sbjct: 21 AWPFLEPVNPPRRGAPDYRDVIKRPMSFGTKVNNIVETGARDLEDLQIV 69
>gnl|CDD|227512 COG5185, HEC1, Protein involved in chromosome segregation, interacts
with SMC proteins [Cell division and chromosome
partitioning].
Length = 622
Score = 35.0 bits (80), Expect = 0.30
Identities = 36/166 (21%), Positives = 64/166 (38%), Gaps = 19/166 (11%)
Query: 1067 PDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFE 1126
+ E Q L E+F D K Q L E I+E +I Q
Sbjct: 249 DNYEPSEQELKLGFEKFVHIINTDIANLKTQND--NLYE-------KIQEAMKISQKIKT 299
Query: 1127 AKEEEKALHMGRGSRQRKQVDYTDSLTEK--EWLKAIDDGVEYDDEEEEEEEEVRSKRKG 1184
+E+ +AL S K +Y +++ +K EW ++ + +EEE + ++S
Sbjct: 300 LREKWRALK----SDSNKYENYVNAMKQKSQEWPGKLEKLKSEIELKEEEIKALQSNIDE 355
Query: 1185 KRRKKTEDDDEEPSTSKKRKKEKEK-DREKDQAKLKKTLKKIMRVV 1229
K+ + +E+EK RE D+ ++ K+ + V
Sbjct: 356 -LHKQLRKQGISTEQFELMNQEREKLTRELDKINIQSD--KLTKSV 398
>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein. The YqfQ-like protein family
includes the B. subtilis YqfQ protein, also known as
VrrA, which is functionally uncharacterized. This family
of proteins is found in bacteria. Proteins in this family
are typically between 146 and 237 amino acids in length.
There are two conserved sequence motifs: QYGP and PKLY.
Length = 155
Score = 33.6 bits (77), Expect = 0.32
Identities = 16/62 (25%), Positives = 28/62 (45%)
Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAK 1217
L + DD E +EE +E E + K K + E P +++K K + ++ +K
Sbjct: 91 LSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSK 150
Query: 1218 LK 1219
K
Sbjct: 151 PK 152
Score = 31.3 bits (71), Expect = 1.9
Identities = 18/56 (32%), Positives = 30/56 (53%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLK 1223
DDEEEE EEE + + + +T+ + +E + K + EK++ K + K K K
Sbjct: 95 DDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSK 150
>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin. Trichoplein
or mitostatin, was first defined as a meiosis-specific
nuclear structural protein. It has since been linked
with mitochondrial movement. It is associated with the
mitochondrial outer membrane, and over-expression leads
to reduction in mitochondrial motility whereas lack of
it enhances mitochondrial movement. The activity appears
to be mediated through binding the mitochondria to the
actin intermediate filaments (IFs).
Length = 349
Score = 34.5 bits (80), Expect = 0.33
Identities = 25/112 (22%), Positives = 54/112 (48%), Gaps = 1/112 (0%)
Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR 336
+R ++Q L+ AR + EK+++++ ER + + +E + Q + E + R+ R
Sbjct: 228 RRRQKQELQRAREEQIEEKEERLQEERAEEEAERERMLEK-QAEDEELEQENAEKRRMKR 286
Query: 337 LNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLA 388
L EKE+++ ER E+ + E+E + I++++ + L
Sbjct: 287 LEHRRELEQQIEEKEERRAAEREEELEEGERLREEEAERQARIEEERQRLLK 338
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 35.4 bits (81), Expect = 0.33
Identities = 51/255 (20%), Positives = 91/255 (35%), Gaps = 51/255 (20%)
Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEE----EFQTYQRID-------------AERRKEQ 1097
+ +E+ EEN ++E+ + EE E Q ID AE +E
Sbjct: 4061 KMNEDGFEENVQENEESTEDGVKSDEELEQGEVPEDQAIDNHPKMDAKSTFASAEADEEN 4120
Query: 1098 GKKSRLIEVSEL-----------PDWLIKEDEEIEQWAFEAKEEEKALHMGRGS-----R 1141
K + E EL D ++ +E EA E + G +
Sbjct: 4121 TDKGIVGENEELGEEDGVRGNGTADGEFEQVQEDTSTPKEAMSEADRQYQSLGDHLREWQ 4180
Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRK-KTEDDDEEPSTS 1200
Q ++ + LTE + D E+ +E+EEE++++ ++ + K+ D DE + +
Sbjct: 4181 QANRIHEWEDLTESQ--SQAFDDSEFMHVKEDEEEDLQALGNAEKDQIKSIDRDESANQN 4238
Query: 1201 KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVI 1260
++ K L DG+ +S+ IK LP + I
Sbjct: 4239 PDSMNSTNIAEDEADEVGDKQL------------QDGQDISD--IKQTGEDTLPTEFGSI 4284
Query: 1261 DRPMDIKKILGRIED 1275
++ + L ED
Sbjct: 4285 NQSEKV-FELSEDED 4298
Score = 35.0 bits (80), Expect = 0.38
Identities = 32/145 (22%), Positives = 55/145 (37%), Gaps = 18/145 (12%)
Query: 395 DEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETS----QLTD 450
DE ++ EQ +E K+ + L D D + D++E S +
Sbjct: 3909 DEPNEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSDDVGIDDE 3968
Query: 451 MHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGE 510
+ ++E +S + ED L LK D E + +DS+
Sbjct: 3969 IQPDIQENNSQPPPENEDLDLPEDLK------------LDEKEGDVSKDSDLEDMDMEAA 4016
Query: 511 NENKEKNKGEDDE--YNKNAMEEAT 533
+ENKE+ E DE +++ +EE
Sbjct: 4017 DENKEEADAEKDEPMQDEDPLEENN 4041
>gnl|CDD|222449 pfam13908, Shisa, Wnt and FGF inhibitory regulator. Shisa is a
transcription factor-type molecule that physically
interacts with immature forms of the Wnt receptor
Frizzled and the FGF receptor within the endoplasmic
reticulum to inhibit their post-translational maturation
and trafficking to the cell surface.
Length = 177
Score = 33.6 bits (77), Expect = 0.33
Identities = 23/71 (32%), Positives = 28/71 (39%), Gaps = 11/71 (15%)
Query: 112 RNQPLTPQ-LAMGVQGKRM----EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQ 166
+P+ + + VQ + PS P H PMPP MP A P L Q
Sbjct: 109 PQRPVMTRATSTTVQTTPLPQPPSTAPSYPGPQYQGYH---PMPPQPGMP--APPYSL-Q 162
Query: 167 QPPPQPHQQQG 177
PPP Q QG
Sbjct: 163 YPPPGLLQPQG 173
Score = 31.3 bits (71), Expect = 2.1
Identities = 11/47 (23%), Positives = 14/47 (29%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENL 54
P + V P+ P S P PGP PQ +
Sbjct: 108 RPQRPVMTRATSTTVQTTPLPQPPSTAPSYPGPQYQGYHPMPPQPGM 154
>gnl|CDD|111993 pfam03157, Glutenin_hmw, High molecular weight glutenin subunit.
Members of this family include high molecular weight
subunits of glutenin. This group of gluten proteins is
thought to be largely responsible for the elastic
properties of gluten, and hence, doughs. Indeed,
glutenin high molecular weight subunits are classified
as elastomeric proteins, because the glutenin network
can withstand significant deformations without breaking,
and return to the original conformation when the stress
is removed. Elastomeric proteins differ considerably in
amino acid sequence, but they are all polymers whose
subunits consist of elastomeric domains, composed of
repeated motifs, and non-elastic domains that mediate
cross-linking between the subunits. The elastomeric
domain motifs are all rich in glycine residues in
addition to other hydrophobic residues. High molecular
weight glutenin subunits have an extensive central
elastomeric domain, flanked by two terminal non-elastic
domains that form disulphide cross-links. The central
elastomeric domain is characterized by the following
three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It
possesses overlapping beta-turns within and between the
repeated motifs, and assumes a regular helical secondary
structure with a diameter of approx. 1.9 nm and a pitch
of approx. 1.5 nm.
Length = 779
Score = 35.1 bits (79), Expect = 0.34
Identities = 51/200 (25%), Positives = 67/200 (33%), Gaps = 28/200 (14%)
Query: 12 PPQQQQP--------PLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP--QENLTALQRAI 61
P Q QQP P + Q G PG P S P+ Q PGQ Q+ Q I
Sbjct: 288 PAQGQQPGQGQPGHYPASPQQPGQGQPGHYPASSQQPTQSQEPGQGQQGQQVGQGQQAQI 347
Query: 62 DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTS----AQVQQLRFQIMAYRLLARNQPLT 117
+ +Q + P + ++ + H TS Q QQ+ + QP
Sbjct: 348 PAQGQQPGQGQPGHYPASPLQQGPGQPGHYLTSLQQLGQGQQIGQLQQSAPGQKGQQPGQ 407
Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQP--------- 168
Q Q + G Q P G P Q P Q Q QQP
Sbjct: 408 GQQPGQGQQGQQPGQGEQEQQPGQGQPGYYPTSLQQ--PGQGQQPGQWQQPGQGQPGYYP 465
Query: 169 --PPQPHQ-QQGHISSQIKQ 185
QP Q Q GH + ++Q
Sbjct: 466 TSLLQPGQGQPGHDPASLQQ 485
Score = 33.9 bits (76), Expect = 0.77
Identities = 46/202 (22%), Positives = 62/202 (30%), Gaps = 15/202 (7%)
Query: 6 TSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-------------APGQNPQE 52
TSP P Q QQP +G G G PGQ GQ Q+
Sbjct: 147 TSPQHQPGQLQQPAQGQQGQQIGQGQQGQQPEQGQQPGQGQQGQQPGQGQQPGQGQQGQQ 206
Query: 53 NLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLAR 112
Q +Q + P + + + + H S Q Q + A+
Sbjct: 207 LGQGQQGYYPGQLQQSGQGQPGHYPTSLQQLGQGQQGHYLASPQQPGQGQQPGQLQQPAQ 266
Query: 113 NQPLTPQLAMGVQGKRMEG-VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
Q + +G P+ Q P G P P QP Q P Q P Q
Sbjct: 267 GQQPEQGQQGQQPAQGQQGHQPAQGQQPGQGQPGHYPASPQQPGQGQPGHYPASSQQPTQ 326
Query: 172 PHQQ-QGHISSQIKQSKLTNIP 192
+ QG Q+ Q + IP
Sbjct: 327 SQEPGQGQQGQQVGQGQQAQIP 348
Score = 31.6 bits (70), Expect = 4.0
Identities = 48/177 (27%), Positives = 62/177 (35%), Gaps = 5/177 (2%)
Query: 12 PPQQQQPPLNVGQLPMGAPGSGP-PGSPGPSPGQA-PGQNPQENLTALQRAIDSMKEQGL 69
P Q+ Q P Q G G P G PGQ PG P Q +Q
Sbjct: 398 PGQKGQQPGQGQQPGQGQQGQQPGQGEQEQQPGQGQPGYYPTSLQQPGQGQQPGQWQQPG 457
Query: 70 EEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRM 129
+ P Y ++ + + H S Q Q + A+ QP QLA G QG++
Sbjct: 458 QGQPGYYPTSLLQPGQGQPGHDPASLQQPGQGQQPGQLQQPAQGQP-GQQLAQGQQGQQP 516
Query: 130 EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQ-QQGHISSQIKQ 185
V G Q + P Q Q P Q + QP Q QQG Q +Q
Sbjct: 517 AQVQQGQQPAQGQQGQQLGQGQQGQQPGQGQ-HPAQGEQGQQPGQGQQGQQPGQGQQ 572
>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
Length = 880
Score = 34.7 bits (80), Expect = 0.41
Identities = 47/206 (22%), Positives = 87/206 (42%), Gaps = 18/206 (8%)
Query: 1089 IDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRG-SRQRKQVD 1147
+ E RKE ++ E+ + L + +E+ + E +E EK L + ++ +
Sbjct: 445 LTEEHRKELLEEYTA-ELKRIEKELKEIEEKERKLRKELRELEKVLKKESELIKLKELAE 503
Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKG--KRRKKTEDDDEEPSTSKKRKK 1205
L EK LK + +E +++ EE E+++ K K + + E+ KK+
Sbjct: 504 QLKELEEK--LKKYN--LEELEKKAEEYEKLKEKLIKLKGEIKSLKKELEKLEELKKKLA 559
Query: 1206 EKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMD 1265
E EK ++ + +L + LK++ + + + L KEL +Y D
Sbjct: 560 ELEKKLDELEEELAELLKELEELGFESVEELEERL----------KELEPFYNEYLELKD 609
Query: 1266 IKKILGRIEDGKYSSVDELQKDFKTL 1291
+K L R E +EL K F+ L
Sbjct: 610 AEKELEREEKELKKLEEELDKAFEEL 635
Score = 34.3 bits (79), Expect = 0.59
Identities = 39/229 (17%), Positives = 90/229 (39%), Gaps = 21/229 (9%)
Query: 213 NIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLE-TAV 271
NIE I+E L L E + +E+ +LR E+ + LE
Sbjct: 190 NIEELIKEKEKELEEVLREINEISSEL-----------PELREELEKLEKEVKELEELKE 238
Query: 272 NVKAYKRTKRQGLKEARATE--------KLEKQQKVEAERKKRQKHQEYITTVLQHCKDF 323
++ ++ R E ++E+ +K E +++ K + + +
Sbjct: 239 EIEELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKL 298
Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKE-RMRRLMAEDEEGYRKLIDQK 382
E++ + + K + +++ +E EKE R+ L + +E ++L + +
Sbjct: 299 SEFYEEYLDELREIEKRLSRLEEEINGIEERIKELEEKEERLEELKKKLKELEKRLEELE 358
Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKL 431
+ L E + L + + E+ +K+ EE +K K+ +++++
Sbjct: 359 ERHELYEEAKAKKEELERLKKRLTGLTPEKLEKELEELEKAKEEIEEEI 407
Score = 32.3 bits (74), Expect = 2.1
Identities = 33/155 (21%), Positives = 65/155 (41%), Gaps = 11/155 (7%)
Query: 1075 MLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKAL 1134
+ ++E + + E ++ + K I++SE + + E EIE+ +EE +
Sbjct: 267 RIEELKKEIEELEEKVKELKELKEKAEEYIKLSEFYEEYLDELREIEKRLSRLEEEINGI 326
Query: 1135 HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEE----EEVRSKRKGKRRKKT 1190
+ + + + E LK +E EE EE EE ++K++ R K
Sbjct: 327 -------EERIKELEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKK 379
Query: 1191 EDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
P +K +E EK +E+ + ++ K +I
Sbjct: 380 RLTGLTPEKLEKELEELEKAKEEIEEEISKITARI 414
Score = 32.3 bits (74), Expect = 2.3
Identities = 35/175 (20%), Positives = 85/175 (48%), Gaps = 32/175 (18%)
Query: 291 EKLEKQQKVEAERKKRQKHQEYIT-------TVLQHCKDFKEYHRNNQARIMRLNKAVMN 343
E E+ +K+E E K+ ++ +E I ++ + +E R + RI
Sbjct: 218 ELREELEKLEKEVKELEELKEEIEELEKELESLEGSKRKLEEKIRELEERI--------- 268
Query: 344 YHANAEKEQKKEQERIEKERMRRL--MAEDEEGYRKLID-----QKKDKRLAFLLSQTDE 396
E+ K++ +E+++ L + E E Y KL + + + + LS+ +E
Sbjct: 269 -------EELKKEIEELEEKVKELKELKEKAEEYIKLSEFYEEYLDELREIEKRLSRLEE 321
Query: 397 YISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDM 451
I+ + + +KE +E+K+++ EE KK+ + ++++L + + + L ++ ++ ++
Sbjct: 322 EINGIEERIKE--LEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEEL 374
>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis). This nucleolar
family of proteins are involved in 60S ribosomal
biogenesis. They are specifically involved in the
processing beyond the 27S stage of 25S rRNA maturation.
This family contains sequences that bear similarity to
the glioma tumour suppressor candidate region gene 2
protein (p60). This protein has been found to interact
with herpes simplex type 1 regulatory proteins.
Length = 387
Score = 34.3 bits (79), Expect = 0.44
Identities = 23/105 (21%), Positives = 45/105 (42%), Gaps = 12/105 (11%)
Query: 1126 EAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
E K E+K + R ++ + E L + +G+ + +++ EEE
Sbjct: 206 EVKAEKKRQELERVEEKKLEK----MAPEASRLDEMSEGLLEESDDDGEEESDDESAWEG 261
Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREK------DQAKLKKTLKK 1224
++E + KRK + ++++EK +AK +K LKK
Sbjct: 262 --FESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKK 304
>gnl|CDD|217503 pfam03344, Daxx, Daxx Family. The Daxx protein (also known as the
Fas-binding protein) is thought to play a role in
apoptosis, but precise role played by Daxx remains to be
determined. Daxx forms a complex with Axin.
Length = 715
Score = 34.5 bits (79), Expect = 0.45
Identities = 27/136 (19%), Positives = 59/136 (43%), Gaps = 26/136 (19%)
Query: 398 ISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVRE 457
+S L +++ ++ M+Q ++EE +KR++ +Q T S
Sbjct: 381 VSRLEEVISKYAMKQDDTEEEERRKRQERERQG------------------TSSRSSDPS 422
Query: 458 ISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKN 517
+S ++P A Q+ E V + +EE E+E+ E+ + + + +E+
Sbjct: 423 KASST---SGESPSMAS-----QESEEEESVEEEEEEEEEEEEEEQESEEEEGEDEEEEE 474
Query: 518 KGEDDEYNKNAMEEAT 533
+ E D ++ ME ++
Sbjct: 475 EVEADNGSEEEMEGSS 490
>gnl|CDD|218188 pfam04641, Rtf2, Replication termination factor 2. It is vital for
effective cell-replication that replication is not
stalled at any point by, for instance, damaged bases.
Rtf2 stabilizes the replication fork stalled at the
site-specific replication barrier RTS1 by preventing
replication restart until completion of DNA synthesis by
a converging replication fork initiated at a flanking
origin. The RTS1 element terminates replication forks
that are moving in the cen2-distal direction while
allowing forks moving in the cen2-proximal direction to
pass through the region. Rtf2 contains a C2HC2 motif
related to the C3HC4 RING-finger motif, and would appear
to fold up, creating a RING finger-like structure but
forming only one functional Zn2+ ion-binding site.
Length = 254
Score = 33.9 bits (78), Expect = 0.46
Identities = 15/65 (23%), Positives = 22/65 (33%)
Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIK 1231
EEE + + K+K K+ KK + E Q K LKK +
Sbjct: 177 EEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPD 236
Query: 1232 YTDSD 1236
S+
Sbjct: 237 NEKSE 241
>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3 subunit.
This is a family of proteins which are subunits of the
eukaryotic translation initiation factor 3 (eIF3). In
yeast it is called Hcr1. The Saccharomyces cerevisiae
protein eIF3j (HCR1) has been shown to be required for
processing of 20S pre-rRNA and binds to 18S rRNA and eIF3
subunits Rpg1p and Prt1p.
Length = 242
Score = 33.9 bits (78), Expect = 0.46
Identities = 11/39 (28%), Positives = 21/39 (53%)
Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKD 1210
EE+E+ R K + R+ ED E+ K R ++ +++
Sbjct: 67 EEKEKAKREKEEKGLRELEEDTPEDELAEKLRLRKLQEE 105
Score = 30.8 bits (70), Expect = 4.2
Identities = 17/58 (29%), Positives = 31/58 (53%), Gaps = 8/58 (13%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
D+EE+EE+EE ++K K + K K + +EKEK + + + K + L++
Sbjct: 38 DEEEDEEKEEEKAKVAAKAKAKKA--------LKAKIEEKEKAKREKEEKGLRELEED 87
Score = 30.4 bits (69), Expect = 5.0
Identities = 19/70 (27%), Positives = 33/70 (47%), Gaps = 4/70 (5%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRK----GKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAK 1217
DD V+ +EEE+EE+ K K K +K + EE +K+ K+EK ++
Sbjct: 30 DDDVKDSWDEEEDEEKEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTP 89
Query: 1218 LKKTLKKIMR 1227
+ +K+
Sbjct: 90 EDELAEKLRL 99
>gnl|CDD|234090 TIGR03021, pilP_fam, type IV pilus biogenesis protein PilP.
Members of this protein family are found in type IV
pilus biogenesis loci and include proteins designated
PilP [Cell envelope, Surface structures].
Length = 119
Score = 32.4 bits (74), Expect = 0.46
Identities = 14/67 (20%), Positives = 22/67 (32%), Gaps = 6/67 (8%)
Query: 93 TSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPS 152
T Q++ L+ + + ++ G GP MP S G PM +
Sbjct: 3 TVGQLEALQSETALLEAQLARA----KAQNELEEAERGGQVGGPGMPFTS--GVPPMALT 56
Query: 153 QPMPNQA 159
P A
Sbjct: 57 GANPTSA 63
>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
subunit (TFIIF-alpha). Transcription initiation factor
IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
II-associating protein 74 (RAP74) is the large subunit of
transcription factor IIF (TFIIF), which is essential for
accurate initiation and stimulates elongation by RNA
polymerase II.
Length = 528
Score = 34.2 bits (78), Expect = 0.48
Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 1/85 (1%)
Query: 1140 SRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPST 1199
R++K + L + K +DEE E E+ + K + K E DDE+
Sbjct: 169 KRRKKTANGF-QLMMMKAAKNGPAAFGDEDEETEGEKGGGGRGKDLKIKDLEGDDEDDGD 227
Query: 1200 SKKRKKEKEKDREKDQAKLKKTLKK 1224
+ E + + + K K K
Sbjct: 228 ESDKGGEDGDEEKSKKKKKKLAKNK 252
Score = 33.0 bits (75), Expect = 1.1
Identities = 32/165 (19%), Positives = 64/165 (38%), Gaps = 22/165 (13%)
Query: 1090 DAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEE---EKALHMGRGSR----- 1141
+ E K G + + +++ +L + +E ++ + EE +K + + +
Sbjct: 199 ETEGEKGGGGRGKDLKIKDLEGDDEDDGDESDKGGEDGDEEKSKKKKKKLAKNKKKLDDD 258
Query: 1142 QRKQVDYTDSLTEKEWLKAIDDGVEYD--------DEEEEEEEEVRSKRKGKRR--KKTE 1191
++ + D E + D+G E D + EE E+ S + ++ E
Sbjct: 259 KKGKRGGDDDADEYDSDDGDDEGREEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDE 318
Query: 1192 DDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSD 1236
D +E ++ K E+E K KLKK K + +DS
Sbjct: 319 DSEES----EEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDSG 359
>gnl|CDD|218108 pfam04487, CITED, CITED. CITED, CBP/p300-interacting
transactivator with ED-rich tail, are characterized by a
conserved 32-amino acid sequence at the C-terminus.
CITED proteins do not bind DNA directly and are thought
to function as transcriptional co-activators.
Length = 206
Score = 33.3 bits (76), Expect = 0.50
Identities = 20/76 (26%), Positives = 22/76 (28%), Gaps = 13/76 (17%)
Query: 1 MSNSSTSPNPPPPQQQQP--------PLNV---GQLPM--GAPGSGPPGSPGPSPGQAPG 47
M N S+ P P LN G G PG G P P GQ PG
Sbjct: 81 MFNPSSKPQPFMLVPGPQLMASMQLQKLNTQYQGHAGAPAGHPGGGGPQQFRPGAGQPPG 140
Query: 48 QNPQENLTALQRAIDS 63
ID+
Sbjct: 141 MQHMPAPALPPNVIDT 156
>gnl|CDD|217453 pfam03251, Tymo_45kd_70kd, Tymovirus 45/70Kd protein. Tymoviruses
are single stranded RNA viruses. This family includes a
protein of unknown function that has been named based on
its molecular weight. Tymoviruses such as the ononis
yellow mosaic tymovirus encode only three proteins. Of
these two are overlapping this protein overlaps a larger
ORF that is thought to be the polymerase.
Length = 458
Score = 34.3 bits (79), Expect = 0.51
Identities = 11/52 (21%), Positives = 15/52 (28%), Gaps = 4/52 (7%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGP----PGSPGPSPGQAPGQNPQ 51
+T P+PP P + P + P P S G P
Sbjct: 249 HTTRPSPPRPAFSRSPSSPLSPLPRPSTRRGLLPNPRLPRASRGHLPPPTSS 300
>gnl|CDD|185594 PTZ00395, PTZ00395, Sec24-related protein; Provisional.
Length = 1560
Score = 34.3 bits (78), Expect = 0.52
Identities = 15/42 (35%), Positives = 24/42 (57%)
Query: 480 QDHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGED 521
+DHP E++++E E S + S ENEN+ +KGE+
Sbjct: 550 EDHPEGGTNRQKYEQSDEESVESSSSENSSENENEVTDKGEE 591
>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain. This domain is
found at the N terminus of SMC proteins. The SMC
(structural maintenance of chromosomes) superfamily
proteins have ATP-binding domains at the N- and
C-termini, and two extended coiled-coil domains separated
by a hinge in the middle. The eukaryotic SMC proteins
form two kind of heterodimers: the SMC1/SMC3 and the
SMC2/SMC4 types. These heterodimers constitute an
essential part of higher order complexes, which are
involved in chromatin and DNA dynamics. This family also
includes the RecF and RecN proteins that are involved in
DNA metabolism and recombination.
Length = 1162
Score = 34.2 bits (78), Expect = 0.56
Identities = 32/142 (22%), Positives = 60/142 (42%), Gaps = 3/142 (2%)
Query: 1087 QRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQW--AFEAKEEEKALHMGRGSRQRK 1144
Q A + +K L E + L +K +EE E+E+ + + +
Sbjct: 206 QAKKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEE 265
Query: 1145 QVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR-KGKRRKKTEDDDEEPSTSKKR 1203
++ KE K E +EEEE++S+ K +RRK +++ + S + +
Sbjct: 266 EILAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELK 325
Query: 1204 KKEKEKDREKDQAKLKKTLKKI 1225
K EKE +EK++ + + K
Sbjct: 326 KLEKELKKEKEEIEELEKELKE 347
Score = 33.8 bits (77), Expect = 0.80
Identities = 40/237 (16%), Positives = 91/237 (38%), Gaps = 14/237 (5%)
Query: 990 DRAHRIGQKNEVRVLRLMTVNSVEERILAAARYKLNMDEKVIQAGMFDQKSTGSERHQFL 1049
++A +++ R + + E K + + + SE L
Sbjct: 610 NKATLEADEDDKRAKVVEGILKDTELTKLLESAKAKESGLRKGVSLEEGLAEKSELKASL 669
Query: 1050 QTILHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAE-RRKEQGKKSRLIEVSE 1108
+ + E E + + N++L R EE + QRI E ++ + K+ L + +
Sbjct: 670 SELTKELLAEQELQEKAESELAKNEILRRQEEIKKKEQRIKEELKKLKLEKEELLADKVQ 729
Query: 1109 LPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYD 1168
I E+ ++ + + KEEE+ R ++ ++ + ++ +++ +
Sbjct: 730 EAQDKINEELKLLEQKIKEKEEEEEKS--RLKKEEEEEEKSELSLKEK-----------E 776
Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
EEEE+ E + K K ++E + ++ K+E E E+ ++ K
Sbjct: 777 LAEEEEKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLEEEQLLIEQEEKIKE 833
Score = 33.4 bits (76), Expect = 0.95
Identities = 44/229 (19%), Positives = 83/229 (36%), Gaps = 12/229 (5%)
Query: 1053 LHQDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRL-IEVSELPD 1111
L +EE+ + + + + +E + ++++ E +KE+ + L E+ EL
Sbjct: 291 LLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEI 350
Query: 1112 WLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEE 1171
E+EE EQ K +EK Q ++ E E L + E + E
Sbjct: 351 KREAEEEEEEQ---LEKLQEKLE-------QLEEELLAKKKLESERLSSAAKLKEEELEL 400
Query: 1172 EEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIK 1231
+ EEE+ ++ + EE K +E E+ E Q KL + K+ +
Sbjct: 401 KNEEEKEAKLLLELSEQEEDLLKEEKKEELKIVEELEESLETKQGKLTE-EKEELEKQAL 459
Query: 1232 YTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSS 1280
D L + L K + ++ + K ++ K
Sbjct: 460 KLLKDKLELKKSEDLLKETKLVKLLEQLELLLLRQKLEEASQKESKARE 508
Score = 33.4 bits (76), Expect = 1.1
Identities = 41/292 (14%), Positives = 89/292 (30%), Gaps = 39/292 (13%)
Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
K+ R + N A + K Q+ + + K+ + +++ + +
Sbjct: 171 KKKERLKKLIEETENLAELIIDLEELKLQELKLKEQAKKALEYYQLKEKL-ELEEENLLY 229
Query: 384 DKRLAFLLSQTDEYISNLT-------QMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDG 436
L + D L +E + E++ + +++ ++KL + +
Sbjct: 230 LDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEEEL 289
Query: 437 KVTLDQDETSQLTDMHISVREISSGKVLKGEDAPLAAHLKQWIQDHPGWEVVADSDEENE 496
K+ ++E + + + R++ + LK S++E +
Sbjct: 290 KLLAKEEEELKSELLKLERRKVDDEEKLKE------------------------SEKELK 325
Query: 497 DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSIAHTVHEIVTEQASILVNGK 556
+ E KEK E KE + E + EE E+ +
Sbjct: 326 KLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLE---KLQEKLEQLEEELLAKKKL 382
Query: 557 LKEY---QIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLMEKKKV 605
E K E + L N + L + + + E K V
Sbjct: 383 ESERLSSAAKLKEEELELKNEEEK-EAKLLLELSEQEEDLLKEEKKEELKIV 433
>gnl|CDD|100796 PRK01156, PRK01156, chromosome segregation protein; Provisional.
Length = 895
Score = 34.1 bits (78), Expect = 0.59
Identities = 40/259 (15%), Positives = 97/259 (37%), Gaps = 28/259 (10%)
Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
L+ + NI+++I + S + TL E R+ E N + L +
Sbjct: 192 LKSSNLELE-NIKKQIADDEKSHSITLKEIERLSIEYNNAMDDYNNLKSALNE---LSSL 247
Query: 263 RDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKD 322
D +K + L++ ++LE++ + K++ YI ++ D
Sbjct: 248 EDMKNRYESEIKTAESDLSMELEKNNYYKELEERHM-KIINDPVYKNRNYINDYFKYKND 306
Query: 323 FKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQK 382
+ + + ++ + YHA +K +++ + + + +
Sbjct: 307 IENKKQ----ILSNIDAEINKYHAIIKKLSVLQKDYNDYIKKKSRYDD------------ 350
Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQ 442
L + + + Y + +K +E KK+ EE K + + + + +D
Sbjct: 351 ----LNNQILELEGYEMDYNSYLKS--IESLKKKIEEYSKNIERMSAFISEILKIQEIDP 404
Query: 443 DE-TSQLTDMHISVREISS 460
D +L ++++ +++ISS
Sbjct: 405 DAIKKELNEINVKLQDISS 423
>gnl|CDD|215590 PLN03123, PLN03123, poly [ADP-ribose] polymerase; Provisional.
Length = 981
Score = 34.0 bits (78), Expect = 0.60
Identities = 17/79 (21%), Positives = 37/79 (46%), Gaps = 4/79 (5%)
Query: 1150 DSLT--EKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
D+L+ ++E + + + +EE+ EE + +KG +RKK D++ +K +
Sbjct: 169 DTLSDSDQEAVLPLVKKSPSEAKEEKAEERKQESKKGAKRKKDASGDDKSKKAKTDRDVS 228
Query: 1208 EKD--REKDQAKLKKTLKK 1224
+K + L+ L+
Sbjct: 229 TSTAASQKKSSDLESKLEA 247
>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1. All
proteins in this family for which functions are known
are cyclin dependent protein kinases that are components
of TFIIH, a complex that is involved in nucleotide
excision repair and transcription initiation. Also known
as MAT1 (menage a trois 1). This family is based on the
phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
Stanford University) [DNA metabolism, DNA replication,
recombination, and repair].
Length = 309
Score = 33.6 bits (77), Expect = 0.61
Identities = 25/104 (24%), Positives = 50/104 (48%), Gaps = 7/104 (6%)
Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLID 380
K + Y + N+ I + NK E E+ E E+ E+E+ RRL+ + EE +++
Sbjct: 120 KKIETYQKENKDVIQK-NKEKST-REQEELEEALEFEKEEEEQ-RRLLLQKEEEEQQMNK 176
Query: 381 QKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRK 424
+K + L L + + +++ +HK + K + + +K K
Sbjct: 177 RKNKQALLDELETSTLPAA---ELIAQHK-KNSVKLEMQVEKPK 216
>gnl|CDD|148139 pfam06346, Drf_FH1, Formin Homology Region 1. This region is found
in some of the Diaphanous related formins (Drfs). It
consists of low complexity repeats of around 12
residues.
Length = 160
Score = 32.6 bits (74), Expect = 0.63
Identities = 16/47 (34%), Positives = 17/47 (36%), Gaps = 7/47 (14%)
Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMPNQA---QPMPLQQQPPPQPHQ 174
VP P +P GP PP P P P P PPP P
Sbjct: 108 AVPPPPPLPG----GPGVPPPPPPFPGAPGIPPPPPGMGSPPPPPFG 150
>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
Length = 509
Score = 33.8 bits (78), Expect = 0.65
Identities = 22/135 (16%), Positives = 49/135 (36%), Gaps = 1/135 (0%)
Query: 1088 RIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVD 1147
I ++ K K + ++ + D + A +++ L+ + Q D
Sbjct: 73 DIPKKKTKTAAKAAAAKAPAKK-KLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQAD 131
Query: 1148 YTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
D + + L D + DDE+++E+++ KK + E+ S ++
Sbjct: 132 DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVWDE 191
Query: 1208 EKDREKDQAKLKKTL 1222
+ QA+ L
Sbjct: 192 DDSEALRQARKDAKL 206
>gnl|CDD|218621 pfam05518, Totivirus_coat, Totivirus coat protein.
Length = 753
Score = 34.0 bits (78), Expect = 0.67
Identities = 12/46 (26%), Positives = 15/46 (32%), Gaps = 1/46 (2%)
Query: 8 PNPPPPQQQQPPLNVGQLPMG-APGSGPPGSPGPSPGQAPGQNPQE 52
P P+ PP G LP + +P S A P E
Sbjct: 693 RAPQAPRPGGPPGGGGGLPPPPDLPAAAGPAPCGSSLIASPTAPPE 738
>gnl|CDD|165431 PHA03160, PHA03160, hypothetical protein; Provisional.
Length = 499
Score = 33.9 bits (77), Expect = 0.67
Identities = 36/164 (21%), Positives = 56/164 (34%), Gaps = 23/164 (14%)
Query: 38 PGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSA-- 95
PG S N +N++ LQ + +K+ + + R I H F++
Sbjct: 342 PGESSLYKDVLNLTKNISQLQDDLKDLKQAAINQPNRI------------IPHHFSNPYS 389
Query: 96 --QVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ 153
F+ Y + L P LA Q + P Q PM P P
Sbjct: 390 FDPGHAPFFRYAPYGAPKNDHHLLPPLACSQQ---LPMQPLHVQQAPMQAPHVAPPPMQP 446
Query: 154 PMPNQAQPMPLQQQP---PPQPHQQQG-HISSQIKQSKLTNIPK 193
P Q + +P P+P Q+ HI + Q ++ I K
Sbjct: 447 PHVQQPRVLPSTDGASNEAPKPSAQEPVHIDASFAQDPVSKIQK 490
Score = 31.2 bits (70), Expect = 4.6
Identities = 22/73 (30%), Positives = 31/73 (42%), Gaps = 7/73 (9%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPM--PLQQQPPPQ-PHQQQGHISSQIKQSKLT 189
+PP++ +PM QP+ Q PM P PP Q PH QQ + +
Sbjct: 408 NDHHLLPPLACSQQLPM---QPLHVQQAPMQAPHVAPPPMQPPHVQQPRVLPSTDGAS-N 463
Query: 190 NIPKPEGLDPLII 202
PKP +P+ I
Sbjct: 464 EAPKPSAQEPVHI 476
>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
Length = 327
Score = 33.5 bits (77), Expect = 0.68
Identities = 29/176 (16%), Positives = 41/176 (23%), Gaps = 38/176 (21%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPG--SGPPGSPGPSPGQAPGQNPQENLTALQ 58
+S+ + S G P S PP SP P+ P Q
Sbjct: 87 LSSGNQSSPSAANNTSDGHDASGVKNTAPPQDISAPPISPTPTQAAPPQTP-----NGQQ 141
Query: 59 RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQ--PL 116
R IE+ N I A + QQ + + P
Sbjct: 142 R-------------------IELPGN---ISDALSQ---QQGQVNAASQNAQGNTSTLPT 176
Query: 117 TPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
P +G ++ PP P A P +P
Sbjct: 177 APATVAPSKGAKVPATAETHPTPPQKPATKKPAVNHHKTATVAVP----PATSGKP 228
>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
biogenesis [Translation, ribosomal structure and
biogenesis].
Length = 1077
Score = 33.9 bits (77), Expect = 0.69
Identities = 26/156 (16%), Positives = 55/156 (35%), Gaps = 9/156 (5%)
Query: 1058 EEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKED 1117
EE E+ + D++ +E+ T ++ E + EV+ D E
Sbjct: 420 EETSREDELSFDDSDVSTSDENEDVDFTGKKGAINNEDESDNE----EVAFDSDSQFDES 475
Query: 1118 EEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKE----WLKAIDDGVEYDDEEEE 1173
E +W + G+ R +++ Y +SL+ +E + E D ++
Sbjct: 476 EGNLRWKEGLASKLAYSQSGKRGRNIQKIFYDESLSPEECIEEYKGESAKSSESDLVVQD 535
Query: 1174 EEEE-VRSKRKGKRRKKTEDDDEEPSTSKKRKKEKE 1208
E E+ + + + S ++ KK+
Sbjct: 536 EPEDFFDVSKVANESISSNHEKLMESEFEELKKKWS 571
>gnl|CDD|213398 cd12191, gal11_coact, gall11 coactivator domain. Gall11/MED15 acts
in the general regulation of GAL structural genes and is
required for full expression for several genes in this
pathway, including GALs 1,7, and 10 in Saccharomyces
cerevisiae. GAL11 function is dependent on GCN4
functionality and binds GCN4 in a degenerate manner with
multiple orientations found at the GCN4-Gal11 interface.
Length = 90
Score = 31.2 bits (71), Expect = 0.69
Identities = 13/39 (33%), Positives = 15/39 (38%), Gaps = 1/39 (2%)
Query: 155 MPNQAQPMPLQQQPPPQPHQQQGHI-SSQIKQSKLTNIP 192
Q QP QQQ PQ Q + + I L IP
Sbjct: 1 PQQQQQPQQQQQQQMPQNPQLVNMMDNMPIPPQLLAKIP 39
>gnl|CDD|221042 pfam11244, Med25_NR-box, Mediator complex subunit 25 C-terminal
NR box-containing. The overall function of the
full-length Med25 is efficiently to coordinate the
transcriptional activation of RAR/RXR (retinoic acid
receptor/retinoic X receptor) in higher eukaryotic
cells. Human Med25 consists of several domains with
different binding properties, the N-terminal, VWA,
domain, an SD1 - synapsin 1 - domain from residues
229-381, a PTOV(B) or ACID domain from 395-545, an SD2
domain from residues 564-645 and this C-terminal NR
box-containing domain (646-650) from C69-747. The NR
box of MED25 is critical for its recruitment to the
promoter, probably through an interaction with pre
bound RAR.
Length = 89
Score = 31.2 bits (70), Expect = 0.72
Identities = 15/51 (29%), Positives = 16/51 (31%), Gaps = 8/51 (15%)
Query: 9 NPPPPQQQQPPLNVGQLPMGAPGSGPPGSP----GPSPGQ----APGQNPQ 51
Q QLPM P P GP GQ A G+ PQ
Sbjct: 11 AAMQQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGGGQMSMQAGGRAPQ 61
Score = 28.9 bits (64), Expect = 5.0
Identities = 13/45 (28%), Positives = 14/45 (31%)
Query: 14 QQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQ 58
QQ QP L Q P P P Q P Q Q+
Sbjct: 4 QQGQPGLAAMQQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGG 48
Score = 28.1 bits (62), Expect = 8.6
Identities = 18/59 (30%), Positives = 21/59 (35%)
Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
Q AMG Q + + GP P+ M P M QA QQ QP Q
Sbjct: 14 QQQAMGQQQQGHQLPMPGPAQFPLQQLQQMRGPGGGQMSMQAGGRAPQQMHALQPLLGQ 72
>gnl|CDD|233048 TIGR00605, rad4, DNA repair protein rad4. All proteins in this
family for which functions are known are involved in
targeting nucleotide excision repair to specific regions
of the genome.This family is based on the phylogenomic
analysis of JA Eisen (1999, Ph.D. Thesis, Stanford
University) [DNA metabolism, DNA replication,
recombination, and repair].
Length = 713
Score = 33.7 bits (77), Expect = 0.74
Identities = 34/138 (24%), Positives = 56/138 (40%), Gaps = 19/138 (13%)
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVV 1229
E E+E E+ R+ K R++ E + ++RKK + + ++ +
Sbjct: 17 ENEKEAEKQPKSRRRKVRRENE------PSLRRRKKRFKTGLNELPHEVV------LMCN 64
Query: 1230 IKYTDSDGRVLSEPFI-----KLPSRKELPDYYEVID--RPMDIKKILGRIEDGKYSSVD 1282
+ T SD RV+S P ++PSR+E D E D + + + K SS
Sbjct: 65 LDSTHSDDRVVSVPDSLSVSEEIPSREEDYDSREFEDVYLSNLVAEFETISVEIKPSSKA 124
Query: 1283 ELQKDFKTLCRNAQIYNE 1300
E D +TL RN
Sbjct: 125 ESDDDAETLSRNVCSNEA 142
>gnl|CDD|224124 COG1203, COG1203, CRISPR-associated helicase Cas3 [Defense
mechanisms].
Length = 733
Score = 34.0 bits (78), Expect = 0.74
Identities = 26/147 (17%), Positives = 51/147 (34%), Gaps = 9/147 (6%)
Query: 481 DHPGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYYSI--- 537
+++ + E++ D E + + D + +N E A +
Sbjct: 116 HLARYQLSSLISEKSFLADWEGLSDSLFRFFFRLLEKM--DIKDTRNFTELAKQEARLLK 173
Query: 538 ---AHTVHEIVTEQASILVNGKLKEYQIKGLEWMVSLFNNNLNGILADEMGLGKTIQTIA 594
+ + + E Q K LE ++ L +L +L G GKT ++
Sbjct: 174 PLLLLLSAIARINKFKSFIEHEGYELQEKALELILRLEKRSLLVVLEAPTGYGKTEASLI 233
Query: 595 LITYLMEKKKVNGPFLIIV-PLSTLSN 620
L L+++K +I V P T+
Sbjct: 234 LALALLDEKIKLKSRVIYVLPFRTIIE 260
>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
Length = 880
Score = 33.9 bits (78), Expect = 0.77
Identities = 44/172 (25%), Positives = 84/172 (48%), Gaps = 12/172 (6%)
Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRID--AERRKEQG-----KKSRLIEVSELPD 1111
E E E+ + E V + L R+E+ + RI+ ERR++ ++ + E E +
Sbjct: 481 EAELEDLEEEVEEVEERLERAEDLVEAEDRIERLEERREDLEELIAERRETIEEKRERAE 540
Query: 1112 WLIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDS-LTE-KEWLKAIDDGVEYDD 1169
L + E+E A E K E A R++V +S L E KE +++++
Sbjct: 541 ELRERAAELEAEA-EEKREAAAEAEEEAEEAREEVAELNSKLAELKERIESLERIRTLLA 599
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPST-SKKRKKEKEKDREKDQAKLKK 1220
+ E+E+ R+ KR E +DE ++KR++++E + E D+A++++
Sbjct: 600 AIADAEDEIERLRE-KREALAELNDERRERLAEKRERKRELEAEFDEARIEE 650
>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
subunit [Translation, ribosomal structure and
biogenesis].
Length = 591
Score = 33.5 bits (76), Expect = 0.77
Identities = 15/65 (23%), Positives = 33/65 (50%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKT 1221
+E+++++EE++++++ + + E K K K K R+ D+ + +K
Sbjct: 484 RYEHVAGEEDDDDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEKK 543
Query: 1222 LKKIM 1226
LK IM
Sbjct: 544 LKMIM 548
>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y. Members of this family are
RNase Y, an endoribonuclease. The member from Bacillus
subtilis, YmdA, has been shown to be involved in
turnover of yitJ riboswitch [Transcription, Degradation
of RNA].
Length = 514
Score = 33.7 bits (78), Expect = 0.78
Identities = 36/168 (21%), Positives = 74/168 (44%), Gaps = 34/168 (20%)
Query: 277 KRTKRQGLKEA-RATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIM 335
+ ++ ++EA + E L+K+ +EA+ + + E + + + + R R
Sbjct: 31 EELAKRIIEEAKKEAETLKKEALLEAKEEVHKLRAELERELKERRNELQRLERRLLQREE 90
Query: 336 RLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTD 395
L++ + + E +KKE+E KE+ + E EE +LI +++ + L
Sbjct: 91 TLDRKMESLDKKEENLEKKEKELSNKEKN---LDEKEEELEELIAEQR-EEL-------- 138
Query: 396 EYISNLTQ---------------------MVKEHKMEQKKKQDEESKK 422
E IS LTQ ++KE + E K++ D+++K+
Sbjct: 139 ERISGLTQEEAKEILLEEVEEEARHEAAKLIKEIEEEAKEEADKKAKE 186
>gnl|CDD|99933 cd05501, Bromo_SP100C_like, Bromodomain, SP100C_like subfamily. The
SP100C protein is a splice variant of SP100, a major
component of PML-SP100 nuclear bodies (NBs), which are
poorly understood. It is covalently modified by SUMO-1
and may play a role in processes at the chromatin level.
Bromodomains are 110 amino acid long domains, that are
found in many chromatin associated proteins. Bromodomains
can interact specifically with acetylated lysine.
Length = 102
Score = 31.2 bits (71), Expect = 0.79
Identities = 14/58 (24%), Positives = 29/58 (50%), Gaps = 2/58 (3%)
Query: 1244 FIKLPSRKELPDYYEVIDRPMDIKKILGRIEDGKYSSVDELQKDFKTLCRNAQIYNEE 1301
FI P DY + I PM + K+ R+ + Y +V+ +D + + N +++ ++
Sbjct: 23 FISKPYYIR--DYCQGIKEPMWLNKVKERLNERVYHTVEGFVRDMRLIFHNHKLFYKD 78
>gnl|CDD|236498 PRK09401, PRK09401, reverse gyrase; Reviewed.
Length = 1176
Score = 33.8 bits (78), Expect = 0.80
Identities = 27/108 (25%), Positives = 48/108 (44%), Gaps = 12/108 (11%)
Query: 585 GLGKTIQTIALITYLMEK-KKVNGPFLIIVPLSTLSNWSLE-FERWAPSVNVVAYKGSPH 642
G+GKT + + YL +K KK II P L +E E++ V H
Sbjct: 105 GVGKTTFGLVMSLYLAKKGKKS----YIIFPTRLLVEQVVEKLEKFGEKVGCGVKILYYH 160
Query: 643 --LRKTLQAQMKAS----KFNVLLTTYEYVIKDKGPLAKLHWKYMIID 684
L+K + + F++L+TT +++ K+ L K + ++ +D
Sbjct: 161 SSLKKKEKEEFLERLKEGDFDILVTTSQFLSKNFDELPKKKFDFVFVD 208
>gnl|CDD|219321 pfam07174, FAP, Fibronectin-attachment protein (FAP). This
family contains bacterial fibronectin-attachment
proteins (FAP). Family members are rich in alanine and
proline, are approximately 300 long, and seem to be
restricted to mycobacteria. These proteins contain a
fibronectin-binding motif that allows mycobacteria to
bind to fibronectin in the extracellular matrix.
Length = 297
Score = 33.3 bits (76), Expect = 0.81
Identities = 11/44 (25%), Positives = 12/44 (27%)
Query: 7 SPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNP 50
S P PP AP P + P P P P
Sbjct: 49 STAAAAPAPAAPPPPPPPAAPPAPQPDDPNAAPPPPPADPNAPP 92
Score = 32.6 bits (74), Expect = 1.3
Identities = 13/43 (30%), Positives = 16/43 (37%), Gaps = 5/43 (11%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP 46
ST+ P P PP P AP + P P +P P
Sbjct: 48 PSTAAAAPAPAAPPPPP-----PPAAPPAPQPDDPNAAPPPPP 85
Score = 31.4 bits (71), Expect = 3.0
Identities = 11/41 (26%), Positives = 11/41 (26%), Gaps = 1/41 (2%)
Query: 132 VPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
P P P P QP A P P P P
Sbjct: 53 AAPAPA-APPPPPPPAAPPAPQPDDPNAAPPPPPADPNAPP 92
>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein. This family
consists of several Borrelia P83/P100 antigen proteins.
Length = 489
Score = 33.4 bits (76), Expect = 0.82
Identities = 31/142 (21%), Positives = 58/142 (40%), Gaps = 14/142 (9%)
Query: 284 LKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR-LNKAVM 342
LKE + E ++ Q+++ E K+Q + DF + + + Q +R +
Sbjct: 203 LKERESQEDAKRAQQLKEELDKKQIDADKAQQKA----DFAQDNADKQRDEVRQKQQEAK 258
Query: 343 NYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLT 402
N A+ KE +++ + + R + E +K D+ L D +L
Sbjct: 259 NLPKPADTSSPKEDKQVAENQKREIEKAQIEI------KKNDEEA---LKAKDHKAFDLK 309
Query: 403 QMVKEHKMEQKKKQDEESKKRK 424
Q K + E + K+ E KKR+
Sbjct: 310 QESKASEKEAEDKELEAQKKRE 331
>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
RPA34.5. This is a family of proteins conserved from
yeasts to human. Subunit A34.5 of RNA polymerase I is a
non-essential subunit which is thought to help Pol I
overcome topological constraints imposed on ribosomal DNA
during the process of transcription.
Length = 193
Score = 32.8 bits (75), Expect = 0.83
Identities = 17/62 (27%), Positives = 27/62 (43%), Gaps = 1/62 (1%)
Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKL 1218
G E E S+++ + + E + EE +K+KK KE +EK + K
Sbjct: 114 FPTGYGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKK-KEVKKEKKEKKD 172
Query: 1219 KK 1220
KK
Sbjct: 173 KK 174
Score = 32.4 bits (74), Expect = 0.91
Identities = 16/53 (30%), Positives = 26/53 (49%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKK 1220
+ E+E EV + K +++KK E E+ K++K E K + K KK
Sbjct: 139 TTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKK 191
Score = 32.4 bits (74), Expect = 0.99
Identities = 19/61 (31%), Positives = 28/61 (45%), Gaps = 7/61 (11%)
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPS------TSKKRKKEKEKDREKDQAKLKKTLK 1223
E E E + K K+ E ++EE KK KKEK KD+++ + K + K
Sbjct: 127 ELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEK-KDKKEKMVEPKGSKK 185
Query: 1224 K 1224
K
Sbjct: 186 K 186
Score = 29.7 bits (67), Expect = 6.8
Identities = 17/66 (25%), Positives = 29/66 (43%), Gaps = 5/66 (7%)
Query: 1151 SLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR-----KGKRRKKTEDDDEEPSTSKKRKK 1205
+E E + + E EEEE+ K+ K K+ KK + + K+KK
Sbjct: 128 LGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKKKK 187
Query: 1206 EKEKDR 1211
+K+K +
Sbjct: 188 KKKKKK 193
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 33.7 bits (77), Expect = 0.83
Identities = 24/96 (25%), Positives = 41/96 (42%), Gaps = 4/96 (4%)
Query: 1188 KKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKL 1247
+ E +++KK++EK +EK+ KLK K+ + SDG + + K
Sbjct: 6 SEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKK 65
Query: 1248 PSRK----ELPDYYEVIDRPMDIKKILGRIEDGKYS 1279
++ E P+ + D P KK L +YS
Sbjct: 66 SRKRDVEDENPEDFIDPDTPFGQKKRLSSQMAKQYS 101
>gnl|CDD|151322 pfam10873, DUF2668, Protein of unknown function (DUF2668). Members
in this family of proteins are annotated as Cysteine and
tyrosine-rich protein 1, however currently no function
is known.
Length = 154
Score = 32.1 bits (73), Expect = 0.85
Identities = 14/48 (29%), Positives = 16/48 (33%), Gaps = 9/48 (18%)
Query: 3 NSSTSPNPPPPQQQQ---------PPLNVGQLPMGAPGSGPPGSPGPS 41
N+ + P PPP PP A S PP PG S
Sbjct: 105 NAISYPMAPPPYTYDHEMEYPTDLPPPYSPAPQASAQRSPPPPYPGNS 152
>gnl|CDD|233366 TIGR01348, PDHac_trf_long, pyruvate dehydrogenase complex
dihydrolipoamide acetyltransferase, long form. This
model describes a subset of pyruvate dehydrogenase
complex dihydrolipoamide acetyltransferase specifically
close by both phylogenetic and per cent identity (UPGMA)
trees. Members of this set include two or three copies
of the lipoyl-binding domain. E. coli AceF is a member
of this model, while mitochondrial and some other
bacterial forms belong to a separate model [Energy
metabolism, Pyruvate dehydrogenase].
Length = 546
Score = 33.7 bits (77), Expect = 0.85
Identities = 15/49 (30%), Positives = 18/49 (36%), Gaps = 2/49 (4%)
Query: 5 STSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSP--GPSPGQAPGQNPQ 51
ST P P QP P + P + P+P QA QNP
Sbjct: 194 STPATAPAPASAQPAAQSPAATQPEPAAAPAAAKAQAPAPQQAGTQNPA 242
>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
(DUF2058). This domain, found in various prokaryotic
proteins, has no known function.
Length = 177
Score = 32.6 bits (75), Expect = 0.86
Identities = 16/49 (32%), Positives = 28/49 (57%), Gaps = 1/49 (2%)
Query: 1177 EVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK-EKDREKDQAKLKKTLKK 1224
E R +RK R+ + DDE +++ K EK E+DRE ++ + + +K
Sbjct: 23 EKRKQRKQARKGADDGDDELKQAAEEAKAEKAERDRELNRQRQAEAEQK 71
>gnl|CDD|235585 PRK05733, PRK05733, single-stranded DNA-binding protein;
Provisional.
Length = 172
Score = 32.2 bits (73), Expect = 0.93
Identities = 14/47 (29%), Positives = 14/47 (29%), Gaps = 5/47 (10%)
Query: 131 GVPSGPQMPPMSLHGPMPMPPSQPMP-----NQAQPMPLQQQPPPQP 172
G P G P Q Q Q P QQP PQP
Sbjct: 113 GRPQGDDQGGQGGGNYNQSAPRQQAQRPQQAAQQQSRPAPQQPAPQP 159
>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator. This protein is
found in a wide range of eukaryotes. It is a nuclear
protein and is suggested to be DNA binding. In plants,
this family is essential for correct circadian clock
functioning by acting as a light-quality regulator
coordinating the activities of blue and red light
signalling pathways during plant growth - inhibiting
growth in red light but promoting growth in blue light.
Length = 233
Score = 32.7 bits (75), Expect = 0.96
Identities = 17/66 (25%), Positives = 35/66 (53%), Gaps = 10/66 (15%)
Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK----------EKDREKDQAKL 1218
+++EEE+E + + K K++ + DE K+K K +K RE+ +A+L
Sbjct: 14 GDDDEEEDEDEGEDEKKVPKESSEPDEANVNPNKKKIGKNPSVDTSFLPDKAREEKEAEL 73
Query: 1219 KKTLKK 1224
++ L++
Sbjct: 74 REELRE 79
>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
Length = 413
Score = 33.2 bits (76), Expect = 1.00
Identities = 16/55 (29%), Positives = 28/55 (50%)
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
EEE + S +K K+ K + ++ KK+KKEK++ + + + KL K
Sbjct: 43 SEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKTPK 97
Score = 32.8 bits (75), Expect = 1.2
Identities = 16/56 (28%), Positives = 30/56 (53%)
Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
EEE +V + K +K+ ++++ + + KK+KK+KEK K + + K K
Sbjct: 40 STFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKT 95
>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
primarily archaeal type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. It is found
in a single copy and is homodimeric in prokaryotes, but
six paralogs (excluded from this family) are found in
eukaryotes, where SMC proteins are heterodimeric. This
family represents the SMC protein of archaea and a few
bacteria (Aquifex, Synechocystis, etc); the SMC of other
bacteria is described by TIGR02168. The N- and
C-terminal domains of this protein are well conserved,
but the central hinge region is skewed in composition
and highly divergent [Cellular processes, Cell division,
DNA metabolism, Chromosome-associated proteins].
Length = 1164
Score = 33.5 bits (77), Expect = 1.1
Identities = 38/221 (17%), Positives = 79/221 (35%), Gaps = 10/221 (4%)
Query: 204 QERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRA--EVIACA 261
+E+ +E + L + + E ++A IE + + L ++ +
Sbjct: 732 EEKLKERLEELEEDLSSLEQEIENVKSELKELEARIEELEEDLHKLEEALNDLEARLSHS 791
Query: 262 RRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCK 321
R ++ L+E + E K+ Q+ QE + + K
Sbjct: 792 RIPEIQAELSKLEEEVSRIEARLREIEQKLN-RLTLEKEYLEKEIQELQEQRIDLKEQIK 850
Query: 322 DFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKER--MRRLMAEDEEGYRKLI 379
++ N + L + + A ++ + ++KER + + E E +L
Sbjct: 851 SIEKEIENLNGKKEELEEELEE-LEAALRDLESRLGDLKKERDELEAQLRELERKIEELE 909
Query: 380 DQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEES 420
Q + KR LS+ + L + + E +E K +DEE
Sbjct: 910 AQIEKKRK--RLSELKAKLEALEEELSE--IEDPKGEDEEI 946
Score = 32.7 bits (75), Expect = 1.8
Identities = 49/291 (16%), Positives = 125/291 (42%), Gaps = 17/291 (5%)
Query: 249 FQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQK 308
F R AE+ R L+ ++ + + + + + E + +K+ K+ ++
Sbjct: 668 FSRSEPAELQRLRERLEGLKRELSSLQSELRRIENRLDELSQELSDASRKIGEIEKEIEQ 727
Query: 309 HQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLM 368
++ + + ++ +E + + I N A E+ +E +E + L
Sbjct: 728 LEQEEEKLKERLEELEEDLSSLEQEI--ENVKSELKELEARIEELEEDLHKLEEALNDLE 785
Query: 369 AEDEEGYRKLIDQKKDK------RLAFLLSQTDEYISNLTQMVK--EHKMEQKKKQDEES 420
A I + K R+ L + ++ ++ LT + E ++++ ++Q +
Sbjct: 786 ARLSHSRIPEIQAELSKLEEEVSRIEARLREIEQKLNRLTLEKEYLEKEIQELQEQRIDL 845
Query: 421 KKRKQSVKQKLMDTDGKVTLDQDETSQLTDMHISVREISSGKV-LKGEDAPLAAHLKQWI 479
K++ +S+++++ + +GK ++ +L ++ ++R++ S LK E L A L++ +
Sbjct: 846 KEQIKSIEKEIENLNGKK---EELEEELEELEAALRDLESRLGDLKKERDELEAQLRE-L 901
Query: 480 QDHPGWEVVADSDEENE-DEDSEKSKEKTSGENENKEKNKGEDDEYNKNAM 529
+ E+ A +++ + + + E E E KGED+E + +
Sbjct: 902 ERKIE-ELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPEEEL 951
Score = 32.3 bits (74), Expect = 2.2
Identities = 76/328 (23%), Positives = 127/328 (38%), Gaps = 25/328 (7%)
Query: 1012 VEERILAAARYKLNMDEKVIQAGMFDQKSTGSERHQFLQTILHQDDEE----DEEENAVP 1067
VEE I R L +DEK Q ++ +ER+Q L ++ E E+E
Sbjct: 182 VEENI---ERLDLIIDEKRQQLERLRREREKAERYQALLKEK-REYEGYELLKEKEALER 237
Query: 1068 DDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDEEIEQWAFEA 1127
E + + LA EEE + +E K + +L+E IK+ E EQ +
Sbjct: 238 QKEAIERQLASLEEELEKLTEEISELEKRLEEIEQLLEELNKK---IKDLGEEEQLRVKE 294
Query: 1128 KEEEKALHMGRGSRQRKQVDYTDSL--TEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
K E L S +R + L E+ K + + E EE E E+ +RK +
Sbjct: 295 KIGE--LEAEIASLERSIAEKERELEDAEERLAKLEAEIDKLLAEIEELEREIEEERKRR 352
Query: 1186 RRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFI 1245
+ E + + R + +E D K+ A+ + LK + K + E
Sbjct: 353 DKLTEEYAELKEELEDLRAELEEVD--KEFAETRDELKDYREKLEKLKREINELKRELDR 410
Query: 1246 KLPSRKELPDYYEVIDRPMDIKKILGRI---EDGKYSSVDELQKDFKTLCRNAQI---YN 1299
+ L + E+ D I I +I E+ K E++K L + A Y
Sbjct: 411 LQEELQRLSE--ELADLNAAIAGIEAKINELEEEKEDKALEIKKQEWKLEQLAADLSKYE 468
Query: 1300 EELSLIHEDSVVLESVFTKARQRVESGE 1327
+EL + E+ +E +K ++ + E
Sbjct: 469 QELYDLKEEYDRVEKELSKLQRELAEAE 496
>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
Length = 434
Score = 32.8 bits (75), Expect = 1.1
Identities = 14/71 (19%), Positives = 35/71 (49%), Gaps = 2/71 (2%)
Query: 1144 KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKR 1203
K+ + +K L+ + +E + EE +K K ++ K+ E++ ++ + +
Sbjct: 366 KRQELLKEYNKK--LQDYTKKLGEVKDETDASEEAEAKAKEEKLKQEENEKKQKEQADED 423
Query: 1204 KKEKEKDREKD 1214
K++++KD K
Sbjct: 424 KEKRQKDERKK 434
>gnl|CDD|218146 pfam04554, Extensin_2, Extensin-like region.
Length = 57
Score = 29.7 bits (67), Expect = 1.1
Identities = 11/41 (26%), Positives = 18/41 (43%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPH 173
P PP + P PP + ++ P P+ + PPP +
Sbjct: 9 PVKQYSPPPPYYYKSPPPPVKSPVYKSPPPPVYKSPPPPKY 49
>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168). This
family consists of several hypothetical eukaryotic
proteins of unknown function.
Length = 142
Score = 31.6 bits (72), Expect = 1.1
Identities = 22/58 (37%), Positives = 36/58 (62%), Gaps = 5/58 (8%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
++E E+EE + KR+ K+RK DEE + K+ K++K+K ++K + K KK KK
Sbjct: 58 KWKKETEDEEFQQKREEKKRK-----DEEKTAKKRAKRQKKKQKKKKKKKAKKGNKKE 110
>gnl|CDD|221440 pfam12144, Med12-PQL, Eukaryotic Mediator 12 catenin-binding
domain. This domain is found in eukaryotes, and is
typically between 325 and 354 amino acids in length.
Both development and carcinogenesis are driven by signal
transduction within the canonical Wnt/beta-catenin
pathway through both programmed and unprogrammed changes
in gene transcription. Beta-catenin physically and
functionally targets this PQL (proline-, glutamine-,
leucine-rich) region of the Med12 subunit of Mediator to
activate transcription. The beta-catenin transactivation
domain binds directly to isolated Med12 and intact
Mediator both in vitro and in vivo, and Mediator is
recruited to Wnt-responsive genes in a
beta-catenin-dependent manner.
Length = 204
Score = 32.3 bits (73), Expect = 1.2
Identities = 48/210 (22%), Positives = 65/210 (30%), Gaps = 45/210 (21%)
Query: 10 PPPPQQQQPP--LNVGQLPMGAPGSG---PPGSPGPSPGQAPGQNPQE------NLTALQ 58
PP Q P L GQ M PPG PG P P +NP N T +
Sbjct: 1 PPELMQNAPYGRLPYGQQAMNMYTQNQPLPPGGPGLEPPYRPARNPMNKMPVRPNYTGMM 60
Query: 59 RAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTP 118
+ + + +Y K Q Q LR Q+
Sbjct: 61 PGMQGNMPTVMGLEKQYS---------MGFKPQPNMPQGQILRQQLQV------------ 99
Query: 119 QLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGH 178
+ + G+++ QM P + M SQ + M +QQ P
Sbjct: 100 KQNQSMIGQQIR------QMTPNQPYT--SMQASQGYTSYGSHMGMQQHPSQTGGMVPSS 151
Query: 179 ISSQIKQSK--LTNIPKPEGLDPLIILQER 206
SQ Q TN P +DP LQ+R
Sbjct: 152 YGSQNFQGTHPATN---PTVVDPHRQLQQR 178
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This family
consists of several bovine specific leukaemia virus
receptors which are thought to function as transmembrane
proteins, although their exact function is unknown.
Length = 561
Score = 33.1 bits (75), Expect = 1.2
Identities = 25/103 (24%), Positives = 39/103 (37%), Gaps = 6/103 (5%)
Query: 1172 EEEEEEVRSKRKGKRRKKTEDDD------EEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
+ E+ V+ R + K E D + KK KKEKEK+R+KD+ K + K +
Sbjct: 169 DSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKKEKKEKEKERDKDKKKEVEGFKSL 228
Query: 1226 MRVVIKYTDSDGRVLSEPFIKLPSRKELPDYYEVIDRPMDIKK 1268
+ + S V L + D P D +
Sbjct: 229 LLALDDSPASAASVAEADEASLANTVSGTAPDSEPDEPKDAEA 271
>gnl|CDD|223079 PHA03419, PHA03419, E4 protein; Provisional.
Length = 200
Score = 32.2 bits (73), Expect = 1.2
Identities = 16/52 (30%), Positives = 17/52 (32%), Gaps = 8/52 (15%)
Query: 11 PPPQQQQPPLNVGQLPMGAPGSGPPGSP--------GPSPGQAPGQNPQENL 54
P + G P P P P GPSPG PG QE L
Sbjct: 112 PDQGPEAKGEGEGHEPEDPPPEDTPPPPGGEGEVEGGPSPGPGPGPLDQEGL 163
>gnl|CDD|221827 pfam12881, NUT_N, NUT protein N terminus. This family includes the
NUT protein. The gene encoding NUT protein (Nuclear
Testis protein) is found fused to the BRD3 or BRD4 genes in
some aggressive types of carcinoma, due to chromosomal
translocations. Proteins of the BRD family contain two
bromodomains that bind transcriptionally active
chromatin through associations with acetylated histones
H3 and H4. Such proteins are crucial for the regulation
of cell cycle progression. On the other hand, little is
known about NUT protein. NUT is known to have a Nuclear
Export Sequence (NES) as well as a Nuclear Localization
Signal (NLS), both located towards the C-terminal end of
the protein. A fused NUT-GFP protein showed either
cytoplasmic or nuclear localization, suggesting that it
is subject to nuclear/cytoplasmic shuttling. Consistent
with this possibility, treatment with leptomycin B, an
inhibitor of CRM1-dependent nuclear export, resulted in
re-distribution of NUT-GFP to the nucleus. Inspection of
NUT revealed a C-terminal sequence similar to known
nuclear export sequences (NES) which are often regulated
by phosphorylation.
Length = 328
Score = 32.6 bits (74), Expect = 1.3
Identities = 33/131 (25%), Positives = 49/131 (37%), Gaps = 26/131 (19%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
P PPPP Q P+ + + GP G+ G A + + S K +
Sbjct: 156 PPPPPPVAQLVPI----VSLENAWPGPQGATGEGGPAAIQKPSPGD--------YSSKPK 203
Query: 68 GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLR-FQIMAYRLLARNQPLTPQLAMGVQG 126
+ E+ R + + A R H S V+ L F I R LAR +P M ++
Sbjct: 204 SVYENFRRWQHYKTLARR----HLPQSPDVEALSCFLIPVLRSLARRKP-----TMTLE- 253
Query: 127 KRMEGVPSGPQ 137
EG+ Q
Sbjct: 254 ---EGLWRALQ 261
>gnl|CDD|240578 cd12951, RRP7_Rrp7A, RRP7 domain ribosomal RNA-processing protein 7
homolog A (Rrp7A) and similar proteins. The family
corresponds to the RRP7 domain of Rrp7A, also termed
gastric cancer antigen Zg14, and similar proteins which
are yeast ribosomal RNA-processing protein 7 (Rrp7p)
homologs mainly found in Metazoans. The cellular function
of Rrp7A remains unclear currently. Rrp7A harbors an
N-terminal RNA recognition motif (RRM), also termed RBD
(RNA binding domain) or RNP (ribonucleoprotein domain),
and a C-terminal RRP7 domain.
Length = 129
Score = 31.1 bits (71), Expect = 1.4
Identities = 21/64 (32%), Positives = 31/64 (48%), Gaps = 13/64 (20%)
Query: 1159 KAIDDGVE-YDDEEEEEEEEVRS------------KRKGKRRKKTEDDDEEPSTSKKRKK 1205
ID+ +E YD EEEEE+EE +KG+R K + ++K KK
Sbjct: 21 SEIDEYMEEYDKEEEEEKEEKEKEAEPDEDGWVTVTKKGRRPKTARKESVAAKAAEKEKK 80
Query: 1206 EKEK 1209
+K+K
Sbjct: 81 KKKK 84
>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region. The myc family belongs
to the basic helix-loop-helix leucine zipper class of
transcription factors, see pfam00010. Myc forms a
heterodimer with Max, and this complex regulates cell
growth through direct activation of genes involved in
cell replication. Mutations in the C-terminal 20 residues
of this domain cause unique changes in the induction of
apoptosis, transformation, and G2 arrest.
Length = 329
Score = 32.6 bits (74), Expect = 1.4
Identities = 14/44 (31%), Positives = 20/44 (45%), Gaps = 2/44 (4%)
Query: 1162 DDGVEYDDEEEEEEEE--VRSKRKGKRRKKTEDDDEEPSTSKKR 1203
+ E ++EEEEEEEE V + K + + E T R
Sbjct: 228 SEEDEEEEEEEEEEEEIDVVTVEKRRSSSNRKASTSESITVPSR 271
>gnl|CDD|178748 PLN03209, PLN03209, translocon at the inner envelope of chloroplast
subunit 62; Provisional.
Length = 576
Score = 32.6 bits (74), Expect = 1.5
Identities = 19/62 (30%), Positives = 28/62 (45%), Gaps = 8/62 (12%)
Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMP-PSQPMPNQAQPMPLQQQPPPQPH 173
PLTP + + +PS P S P P P++P+ +A P P ++ PPQP
Sbjct: 312 PLTPM------EELLAKIPSQRVPPKESDAADGPKPVPTKPVTPEA-PSPPIEEEPPQPK 364
Query: 174 QQ 175
Sbjct: 365 AV 366
>gnl|CDD|221654 pfam12589, WBS_methylT, Methyltransferase involved in Williams-Beuren
syndrome. This domain family is found in eukaryotes, and
is typically between 72 and 83 amino acids in length. The
family is found in association with pfam08241. This
family is made up of S-adenosylmethionine-dependent
methyltransferases. The proteins are deleted in
Williams-Beuren syndrome (WBS), a complex developmental
disorder with multisystemic manifestations including
supravalvular aortic stenosis (SVAS) and a specific
cognitive phenotype.
Length = 85
Score = 30.0 bits (68), Expect = 1.6
Identities = 18/67 (26%), Positives = 30/67 (44%), Gaps = 16/67 (23%)
Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE----- 1212
L +D+E+ + +VR + RRKK KK+KK K+K +E
Sbjct: 12 LPNGLGEEGEEDDEQIDASKVRRISQRNRRKK-----------KKKKKLKKKSKEWILRK 60
Query: 1213 KDQAKLK 1219
K+Q + +
Sbjct: 61 KEQMRRR 67
>gnl|CDD|218734 pfam05758, Ycf1, Ycf1. The chloroplast genomes of most higher plants
contain two giant open reading frames designated ycf1 and
ycf2. Although the function of Ycf1 is unknown, it is
known to be an essential gene.
Length = 832
Score = 32.7 bits (75), Expect = 1.7
Identities = 15/55 (27%), Positives = 31/55 (56%)
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
EE + E E S+ KG ++++ +E+PS + K++ +K + D+ ++ K K
Sbjct: 237 EETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDLDKLEILKEKKD 291
Score = 30.7 bits (70), Expect = 7.3
Identities = 27/105 (25%), Positives = 43/105 (40%), Gaps = 12/105 (11%)
Query: 1079 SEEEFQTYQRIDAERRKEQGKKSRLIEVS-ELPDWLIKEDEEIEQWAFEAKEEEKALHMG 1137
+E F D +K K + E+S ++P W K +E+EQ + +E H
Sbjct: 488 YQEFFNII-TTDPNDQKINKKSIGIEEISKKVPRWSYKLIDELEQDEGDNEENPPEDHDI 546
Query: 1138 RGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKR 1182
R SR+ K+V T+ + E+E+E EV R
Sbjct: 547 R-SRKAKRVVI---FTDN------KKNTDNTTNEDEKEREVALIR 581
>gnl|CDD|221143 pfam11593, Med3, Mediator complex subunit 3 fungal. Mediator is a
large complex of up to 33 proteins that is conserved
from plants to fungi to humans - the number and
representation of individual subunits varying with
species. It is arranged into four different sections, a
core, a head, a tail and a kinase-activity part, and the
number of subunits within each of these is what varies
with species. Overall, Mediator regulates the
transcriptional activity of RNA polymerase II but it
would appear that each of the four different sections
has a slightly different function. Mediator subunit
Hrs1/Med3 is a physical target for Cyc8-Tup1, a yeast
transcriptional co-repressor.
Length = 381
Score = 32.3 bits (73), Expect = 1.7
Identities = 27/163 (16%), Positives = 48/163 (29%), Gaps = 8/163 (4%)
Query: 26 PMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANR 85
P A + P+ G + A Q PR K A
Sbjct: 150 PAAAKVLKANAASAPNTTTGVGSAATTAAISATTATTPTTTQKKPRKPRQTKKTGPAAAA 209
Query: 86 TEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHG 145
A AQ Q + M + +N + Q+ M+ + + P +
Sbjct: 210 K--AQASAQAQAQASAYNQMGSLGVPQNTSMLAQIPNPTP--LMQLL---NGVSPNNAMA 262
Query: 146 PMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKL 188
P+ PM N Q P G++++Q +++ +
Sbjct: 263 S-PLNNMSPMRNLNQMGNQNNGGQMTPSANNGNMNNQSRENSM 304
>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon.
Length = 431
Score = 32.3 bits (73), Expect = 1.7
Identities = 31/177 (17%), Positives = 63/177 (35%), Gaps = 13/177 (7%)
Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLI 1114
+ +E+ V VN + +EE +T +A + + + L
Sbjct: 21 RQKQEEGSLGQVTTQVEVNSQNSVPDEESKTSTDDEAALLERLA-RREERRDERFSEALE 79
Query: 1115 KEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEE- 1173
++ E ++ E SR+ ++ ++ T +E K + EE E
Sbjct: 80 RQKEFKPTSTDQSLSE--------PSRRMQEDSGAENETVEEEEKEESREEREEVEETEG 131
Query: 1174 ---EEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
E++ + + +K+ ++ + E KR +E + E KLK T R
Sbjct: 132 VTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSR 188
Score = 30.8 bits (69), Expect = 5.1
Identities = 28/174 (16%), Positives = 70/174 (40%), Gaps = 8/174 (4%)
Query: 263 RDTTLETAVNVKAYKRTKRQ--GLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHC 320
R+ ET K+ ++ + + E ++++ + + E++T L+H
Sbjct: 123 REEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHT 182
Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE------EG 374
++ A++ + E + E+ + ++E R+++ E+E E
Sbjct: 183 ENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEA 242
Query: 375 YRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVK 428
RK ++++ +RL + + + Q V E + + KK + + S+K
Sbjct: 243 DRKSREEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSEDKKPFKCFTPKGSSLK 296
>gnl|CDD|227400 COG5068, ARG80, Regulator of arginine metabolism and related MADS
box-containing transcription factors [Transcription].
Length = 412
Score = 32.3 bits (73), Expect = 1.8
Identities = 24/192 (12%), Positives = 42/192 (21%), Gaps = 29/192 (15%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQ-------APGQNPQENL 54
+T + + + AP S P S P + Q N
Sbjct: 135 HTFTTPKLESVVKSLEGKSLIQSPCSNAP-SDSSEEPSSSASFSVDPNDNNPMGSFQHNG 193
Query: 55 TALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQ 114
+ I Q + + K T + A ++
Sbjct: 194 SPQTNFIPLQNPQTQQYQQH-----------SSRKDHPTVPHSNTNNGRPPAKFMIPELH 242
Query: 115 PLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ-PH 173
L + + P Q P L PP + H
Sbjct: 243 SSHSTL---------DLPSDFISDSGFPNQSSTSIFPLDSAIIQITPPHLPNNPPQENRH 293
Query: 174 QQQGHISSQIKQ 185
+ + SS + +
Sbjct: 294 ELYSNDSSMVSE 305
>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
Length = 1355
Score = 32.7 bits (74), Expect = 1.8
Identities = 37/183 (20%), Positives = 56/183 (30%), Gaps = 13/183 (7%)
Query: 10 PPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGL 69
P P Q PP+ +P P PGP G+ P P Q Q
Sbjct: 336 PVEPVTQTPPVASVDVPPAQPTVAWQPVPGPQTGE-PVIAPAPEGYPQQ-------SQYA 387
Query: 70 EEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRM 129
+ +Y + ++ + +A + Q Q + A A+ P V G
Sbjct: 388 QPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAW 447
Query: 130 EGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLT 189
+ P S + Q A PL QQP P Q ++++K
Sbjct: 448 QAEEQQSTFAPQSTY-----QTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEETKPA 502
Query: 190 NIP 192
P
Sbjct: 503 RPP 505
Score = 32.4 bits (73), Expect = 2.4
Identities = 24/107 (22%), Positives = 34/107 (31%), Gaps = 9/107 (8%)
Query: 102 FQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQP 161
F+ + L + P P V+ + P PQ P+ P P Q
Sbjct: 727 FEFSPMKALLDDGPHEPLFTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVA 786
Query: 162 MPLQQQPPPQPH---------QQQGHISSQIKQSKLTNIPKPEGLDP 199
Q Q P QP QQ Q +Q + P+P+ P
Sbjct: 787 PQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQP 833
Score = 30.4 bits (68), Expect = 8.6
Identities = 23/89 (25%), Positives = 30/89 (33%), Gaps = 7/89 (7%)
Query: 114 QPLTPQLAMGVQGKRMEGVPSG-----PQMP-PMSLHGPMPMPPSQPMPNQAQPM-PLQQ 166
QP P + + V PQ P P P P P QP P+
Sbjct: 754 QPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP 813
Query: 167 QPPPQPHQQQGHISSQIKQSKLTNIPKPE 195
QP Q QQ Q +Q + P+P+
Sbjct: 814 QPQYQQPQQPVAPQPQYQQPQQPVAPQPQ 842
>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
Length = 1068
Score = 32.7 bits (75), Expect = 1.8
Identities = 25/114 (21%), Positives = 46/114 (40%), Gaps = 21/114 (18%)
Query: 1055 QDDEEDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLI 1114
Q E E + A ++ AR+++E Q R + +RR+ K+ E
Sbjct: 649 QTAETRESQQAEVTEK------ARTQDEQQQAPRRERQRRRNDEKRQAQQEA-------- 694
Query: 1115 KEDEEIEQWAFEAKEEEKALH-MGRGSRQRKQ----VDYTDSLTEKEWLKAIDD 1163
K EQ E ++EE+ R R+++Q V S+ E+ +++
Sbjct: 695 KALNVEEQSVQETEQEERVQQVQPR--RKQRQLNQKVRIEQSVAEEAVAPVVEE 746
>gnl|CDD|177614 PHA03377, PHA03377, EBNA-3C; Provisional.
Length = 1000
Score = 32.7 bits (74), Expect = 1.8
Identities = 25/125 (20%), Positives = 46/125 (36%), Gaps = 14/125 (11%)
Query: 61 IDSMKEQGLEEDP-RYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARN------ 113
D+ E +ED R +E++++ E+ + + + Q R + R+ R
Sbjct: 362 GDATSETSSDEDTGRQGSDVELESSDDELPYIDPNMEPVQQRPVMFVSRVPWRKPRTLPW 421
Query: 114 -----QPLTPQLAM-GVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ 167
P+ L + E S P+ P S +P+ P+ P + + Q
Sbjct: 422 PTPKTHPVKRTLVKTSGRSDEAEQAQSTPERPGPSDQPSVPVEPAHLTPVE-HTTVILHQ 480
Query: 168 PPPQP 172
PP P
Sbjct: 481 PPQSP 485
>gnl|CDD|227268 COG4932, COG4932, Predicted outer membrane protein [Cell envelope
biogenesis, outer membrane].
Length = 1531
Score = 32.5 bits (74), Expect = 1.9
Identities = 22/90 (24%), Positives = 30/90 (33%), Gaps = 13/90 (14%)
Query: 1144 KQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRR-------KKTEDDDEE 1196
K+ T SL +G+ D + EV+ K G T EE
Sbjct: 61 KETGKTISLNIPS------EGLTTTDSLLVGDYEVKEKSAGLGTTLDEATYNVTLALKEE 114
Query: 1197 PSTSKKRKKEKEKDREKDQAKLKKTLKKIM 1226
TS K ++EK KK LK +
Sbjct: 115 VITSTSTKTQEEKTEIVTPEPSKKKLKAEI 144
>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM). This
family consists of several Plasmodium falciparum SPAM
(secreted polymorphic antigen associated with merozoites)
proteins. Variation among SPAM alleles is the result of
deletions and amino acid substitutions in non-repetitive
sequences within and flanking the alanine heptad-repeat
domain. Heptad repeats in which the a and d position
contain hydrophobic residues generate amphipathic
alpha-helices which give rise to helical bundles or
coiled-coil structures in proteins. SPAM is an example of
a P. falciparum antigen in which a repetitive sequence
has features characteristic of a well-defined structural
element.
Length = 164
Score = 31.0 bits (70), Expect = 2.1
Identities = 35/139 (25%), Positives = 59/139 (42%), Gaps = 20/139 (14%)
Query: 1096 EQGKKSRLIEVSELPDW----LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDS 1151
E+ K L+E ++ W +IKE+E+++ E EEE+ + +
Sbjct: 15 EEKKDENLLEHVKITSWDKEDIIKENEDVKDEKQEDDEEEEE-------------EDEEE 61
Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
+ E E I+D E ++EEEEEE+ K +K +D ST +
Sbjct: 62 IEEPE---DIEDEEEIVEDEEEEEEDEEDNVDLKDIEKKNINDIFNSTQDDNAQNLISKN 118
Query: 1212 EKDQAKLKKTLKKIMRVVI 1230
K K KKT + I++ +
Sbjct: 119 YKKNEKSKKTAEDIVKTLF 137
>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
protein; Reviewed.
Length = 782
Score = 32.1 bits (74), Expect = 2.2
Identities = 43/254 (16%), Positives = 86/254 (33%), Gaps = 51/254 (20%)
Query: 196 GLDPLIILQEREN--RVALNIERRIEELNGSLTSTLPEHLRVKAEIELRAL-KVLNFQRQ 252
GL II + ++ + I L E L + E + +L +
Sbjct: 498 GLPENIIEEAKKLIGEDKEKLNELIASL---------EELERELEQKAEEAEALLKEAEK 548
Query: 253 LRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEY 312
L+ E+ E ++ + + ++ A + +++ +K E K + +
Sbjct: 549 LKEEL---------EEKKEKLQEEEDKLLEEAEK-EAQQAIKEAKKEADEIIKELRQLQK 598
Query: 313 ITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDE 372
+ E + RLNKA +K+++K++E + ++
Sbjct: 599 GGYASVKAHELIEARK-------RLNKANEKKEKKKKKQKEKQEELKVGDEVK------- 644
Query: 373 EGYRKLIDQKKDKRLAFLLSQTDEYISNLT-QM----VKEHKMEQKKKQDEESKKRKQSV 427
Y L QK +LS D+ Q +K + +K Q + KK+K+
Sbjct: 645 --YLSL-GQK-----GEVLSIPDD--KEAIVQAGIMKMKVPLSDLEKIQKPKKKKKKKPK 694
Query: 428 KQKLMDTDGKVTLD 441
K + LD
Sbjct: 695 TVKPKPRTVSLELD 708
>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
bacterial type. SMC (structural maintenance of
chromosomes) proteins bind DNA and act in organizing and
segregating chromosomes for partition. SMC proteins are
found in bacteria, archaea, and eukaryotes. This family
represents the SMC protein of most bacteria. The smc
gene is often associated with scpB (TIGR00281) and scpA
genes, where scp stands for segregation and condensation
protein. SMC was shown (in Caulobacter crescentus) to be
induced early in S phase but present and bound to DNA
throughout the cell cycle [Cellular processes, Cell
division, DNA metabolism, Chromosome-associated
proteins].
Length = 1179
Score = 32.3 bits (74), Expect = 2.3
Identities = 35/246 (14%), Positives = 82/246 (33%), Gaps = 38/246 (15%)
Query: 203 LQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACAR 262
+ ++ +E IEEL L E +AEIE ++ + +L
Sbjct: 748 RIAQLSKELTELEAEIEELEERLEEAEEELAEAEAEIEELEAQIEQLKEEL--------- 798
Query: 263 RDTTLETAVNV--KAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHC 320
L A++ A E+ + +E ++ E
Sbjct: 799 --KALREALDELRAELTLLNE------EAANLRERLESLERRIAATERRLE--------- 841
Query: 321 KDFKEYHRNNQARIMRLNKAVMNYHANAEKEQKK-EQERIEKERMRRLMAEDEEGYRKLI 379
D +E I L + E+ + + E E+ + +A +L
Sbjct: 842 -DLEEQIEELSEDIESLAAEIEELEELIEELESELEALLNERASLEEALALLRSELEELS 900
Query: 380 DQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVT 439
++ ++ S+ + L + + + ++ E + R +++++L + + +T
Sbjct: 901 EELRELESK--RSELRRELEELREKLAQLELRL-----EGLEVRIDNLQERLSE-EYSLT 952
Query: 440 LDQDET 445
L++ E
Sbjct: 953 LEEAEA 958
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 31.2 bits (71), Expect = 2.3
Identities = 25/122 (20%), Positives = 50/122 (40%), Gaps = 14/122 (11%)
Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMR 336
K ++ K + ++++ E ER++R+K +E + ++ +E + R
Sbjct: 2 KIGAKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEER 61
Query: 337 LNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDE 396
+ E++ +KEQE E E+++ +EEG D+ LL
Sbjct: 62 KER---------EEQARKEQE--EYEKLKSSFVVEEEG---TDKLSADEESNELLEDFIN 107
Query: 397 YI 398
YI
Sbjct: 108 YI 109
Score = 30.1 bits (68), Expect = 6.0
Identities = 19/57 (33%), Positives = 35/57 (61%), Gaps = 1/57 (1%)
Query: 1170 EEEEEEEEVRSKRKGKRRKKTEDDDEEP-STSKKRKKEKEKDREKDQAKLKKTLKKI 1225
E EEEE E R K + KR + ++++E KK+++E+ K+RE+ K ++ +K+
Sbjct: 22 EAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKEREEQARKEQEEYEKL 78
>gnl|CDD|172884 PRK14408, PRK14408, membrane protein; Provisional.
Length = 257
Score = 31.5 bits (71), Expect = 2.5
Identities = 19/70 (27%), Positives = 37/70 (52%), Gaps = 9/70 (12%)
Query: 549 ASILVNGKLKEYQIK-------GLEWMVSLFNNNLNGILADEMGLGKTIQTIALITYLME 601
ASI+++ K+ ++ G+ M+ + L GIL + + K I T++L TY++
Sbjct: 30 ASIILSLVFKKQDVRLFASKNAGMTNMIRVHGKKL-GILTLFLDIIKPITTVSL-TYIIY 87
Query: 602 KKKVNGPFLI 611
K ++ PF +
Sbjct: 88 KYALDAPFDL 97
>gnl|CDD|240273 PTZ00110, PTZ00110, helicase; Provisional.
Length = 545
Score = 32.1 bits (73), Expect = 2.5
Identities = 29/111 (26%), Positives = 51/111 (45%), Gaps = 9/111 (8%)
Query: 887 LDRILPKLKSTGHRVLLFCQMTQLMNILEDYFSYRGFKYMRLDGTTKAEDRGDLLKKFNA 946
L +L ++ G ++L+F + + + L G+ + + G K E+R +L +F
Sbjct: 366 LKMLLQRIMRDGDKILIFVETKKGADFLTKELRLDGWPALCIHGDKKQEERTWVLNEFKT 425
Query: 947 PDSEYFIFVLSTRAGGLGLNLQTADTVIIFDSDWNPHQDLQAQDRAHRIGQ 997
S I +T GL+++ VI FD P+ Q +D HRIG+
Sbjct: 426 GKSPIMI---ATDVASRGLDVKDVKYVINFDF---PN---QIEDYVHRIGR 467
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 32.1 bits (73), Expect = 2.5
Identities = 11/45 (24%), Positives = 17/45 (37%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAP 46
S+SSTS + + +P PP + SP + P
Sbjct: 325 SSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRP 369
>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
selection and in elongation by RNA polymerase II
[Transcription].
Length = 521
Score = 31.9 bits (72), Expect = 2.6
Identities = 38/206 (18%), Positives = 80/206 (38%), Gaps = 15/206 (7%)
Query: 1027 DEKVIQAGMFDQKSTGSER--HQFLQTILHQDDEEDEEENAVPDDETVNQMLARSEEE-F 1083
DE + AG+ D + + H L +L +ED EN +E+ + +SEEE F
Sbjct: 6 DELLALAGIDDSDVASNRKRAHDDLDDVLSSSSDEDNNENVDYAEESGGEGNEKSEEEKF 65
Query: 1084 QTYQRIDAERRKEQGKKSRLIEVSELP-DWLIKEDEEIEQWAFEAKEEEKALHMGRGSRQ 1142
+ R++ + K++ +++++ ++E+ + ++ E EE E +E + S
Sbjct: 66 KNPYRLEG-KFKDEADRAKIMAMTEIERESILFEREEEISKLMERRELAIRMEQQHRSSG 124
Query: 1143 RKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTED---------- 1192
+ + ++ E EE + ++ +D
Sbjct: 125 CTDTRRSTRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEESEQG 184
Query: 1193 DDEEPSTSKKRKKEKEKDREKDQAKL 1218
EE + S + ++ R D A+L
Sbjct: 185 LQEEYTPSYAEEAVEDISRTDDFAEL 210
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 32.0 bits (72), Expect = 2.9
Identities = 16/46 (34%), Positives = 25/46 (54%), Gaps = 1/46 (2%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
DDG E DDE+ EE + K + +RR+ + + SK +K +K
Sbjct: 877 DDGTEADDEDTHPPEE-KHKSEVRRRRPPKKPSKPKKPSKPKKPKK 921
>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
Length = 333
Score = 31.5 bits (72), Expect = 3.0
Identities = 12/44 (27%), Positives = 15/44 (34%), Gaps = 1/44 (2%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
P+ P P+ P QP P Q P+ Q P P
Sbjct: 116 QHAPR-PAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAPAPQPVH 158
Score = 30.0 bits (68), Expect = 8.2
Identities = 8/44 (18%), Positives = 9/44 (20%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQ 176
P P + P P P Q P P
Sbjct: 109 PEAQVPPQHAPRPAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAP 152
>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
and chromosome partitioning].
Length = 420
Score = 31.6 bits (72), Expect = 3.0
Identities = 20/104 (19%), Positives = 37/104 (35%), Gaps = 5/104 (4%)
Query: 266 TLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKE 325
TL+ V+A ++ L +E+ +Q K+ ++R+K + + L + E
Sbjct: 169 TLKQLAAVRAEIAAEQAELTTLL-SEQRAQQAKLAQLLEERKKTLAQLNSELSADQKKLE 227
Query: 326 YHRNNQAR----IMRLNKAVMNYHANAEKEQKKEQERIEKERMR 365
R N++R I A A + E R
Sbjct: 228 ELRANESRLKNEIASAEAAAAKAREAAAAAEAAAARARAAEAKR 271
>gnl|CDD|237756 PRK14559, PRK14559, putative protein serine/threonine phosphatase;
Provisional.
Length = 645
Score = 32.0 bits (73), Expect = 3.0
Identities = 28/140 (20%), Positives = 39/140 (27%), Gaps = 25/140 (17%)
Query: 33 GPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQGLEEDPRYQKLIEMKANRTEIKHAF 92
P S G+A Q+ + S L+ RYQ L + T H
Sbjct: 62 ASPNSEVLESGEATQQSESSLTPSSSPLYGSY----LDPGQRYQLLASSEEIPTAAAHTE 117
Query: 93 TSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPS 152
+V QPL P + + P +P
Sbjct: 118 LQGRVLDC-------------QPLQPSPL-----EALLEQLEDLLNPLADPTEVLPTLLW 159
Query: 153 QPM--PNQAQP-MPLQQQPP 169
Q + P A P + LQ Q P
Sbjct: 160 QQLGIPALAIPYLALQDQFP 179
>gnl|CDD|132720 cd02584, RNAP_II_Rpb1_C, Largest subunit (Rpb1) of Eukaryotic RNA
polymerase II (RNAP II), C-terminal domain. RNA
polymerase II (RNAP II) is a large multi-subunit complex
responsible for the synthesis of mRNA. RNAP II consists
of a 10-subunit core enzyme and a peripheral heterodimer
of two subunits. The largest core subunit (Rpb1) of
yeast RNAP II is the best characterized member of this
family. Structure studies suggest that RNAP complexes
from different organisms share a crab-claw-shape
structure. In yeast, Rpb1 and Rpb2, the largest and the
second largest subunits, each makes up one clamp, one
jaw, and part of the cleft. Rpb1 interacts with Rpb2 to
form the DNA entry and RNA exit channels in addition to
the catalytic center of RNA synthesis. The C-terminal
domain of Rpb1 makes up part of the foot and jaw
structures.
Length = 410
Score = 31.8 bits (73), Expect = 3.0
Identities = 19/83 (22%), Positives = 39/83 (46%), Gaps = 4/83 (4%)
Query: 356 QERIEKERMR-RLMAEDEEGYRKLIDQKKDKRL-AFLLSQTDEYISNLTQMVKEHKMEQK 413
+ EK +R R++ +DEE D K++ + +LS + V + +K
Sbjct: 194 DDNAEKLVIRIRIINDDEEKEEDSEDDVFLKKIESNMLSDMTLKGIEGIRKVFIREENKK 253
Query: 414 KKQDEESKKRKQSVKQKLMDTDG 436
K E + +K+ ++ +++TDG
Sbjct: 254 KVDIETGEFKKR--EEWVLETDG 274
>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin. Nucleoplasmins are also
known as chromatin decondensation proteins. They bind to
core histones and transfer DNA to them in a reaction that
requires ATP. This is thought to play a role in the
assembly of regular nucleosomal arrays.
Length = 146
Score = 30.4 bits (69), Expect = 3.1
Identities = 11/44 (25%), Positives = 22/44 (50%), Gaps = 13/44 (29%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK 1205
DD + ++E++EE+++ ED+ EE + K+ K
Sbjct: 116 DDEEDEEEEDDEEDDD-------------EDESEEEESPVKKVK 146
Score = 29.2 bits (66), Expect = 8.5
Identities = 16/50 (32%), Positives = 24/50 (48%), Gaps = 9/50 (18%)
Query: 1158 LKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEK 1207
L A ++ DDEE+EEEE+ ++ +D+DE KK K
Sbjct: 106 LVASEEDESDDDEEDEEEED---------DEEDDDEDESEEEESPVKKVK 146
>gnl|CDD|129705 TIGR00618, sbcc, exonuclease SbcC. All proteins in this family for
which functions are known are part of an exonuclease
complex with sbcD homologs. This complex is involved in
the initiation of recombination to regulate the levels
of palindromic sequences in DNA. This family is based on
the phylogenomic analysis of JA Eisen (1999, Ph.D.
Thesis, Stanford University) [DNA metabolism, DNA
replication, recombination, and repair].
Length = 1042
Score = 31.9 bits (72), Expect = 3.2
Identities = 34/228 (14%), Positives = 79/228 (34%), Gaps = 16/228 (7%)
Query: 209 RVALNIERRIEELNGSLTSTLPEHLRVKAEIELRALKVLNFQRQLRAEVIACARRDTTLE 268
+ L + + + LP+ L ++ L+ ++ + + QL A+ T L
Sbjct: 650 ALQLTLTQERVREHALSIRVLPKELLASRQLALQKMQ--SEKEQLTYWKEMLAQCQTLLR 707
Query: 269 TAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHR 328
+ ++ A ++ + + +A + ++ TVL+ + +
Sbjct: 708 ELETHIEEYDREFNEIENASSSLGSDLAAREDALNQSLKELMHQARTVLKARTE--AHFN 765
Query: 329 NNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAE------DEEGYRKLIDQK 382
NN+ L H AE + + ++ L AE +E L +
Sbjct: 766 NNEEVTAALQTGAELSHLAAEIQFFNRLREEDTHLLKTLEAEIGQEIPSDEDILNLQCET 825
Query: 383 KDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESKKRKQSVKQK 430
+ LS+ +E + L ++ + + EE K+ + Q+
Sbjct: 826 LVQEEEQFLSRLEEKSATL------GEITHQLLKYEECSKQLAQLTQE 867
>gnl|CDD|220093 pfam09030, Creb_binding, Creb binding. The Creb binding domain
assumes a structure comprising three alpha-helices
which pack in a bundle, exposing a hydrophobic groove
between alpha-1 and alpha-3 within which complementary
domains found in the protein 'activator for thyroid
hormone and retinoid receptors' (ACTR) can dock. Docking
of these domains is required for the recruitment of RNA
polymerase II and the basal transcription machinery.
Length = 104
Score = 29.7 bits (66), Expect = 3.2
Identities = 16/62 (25%), Positives = 22/62 (35%), Gaps = 2/62 (3%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIP 192
P GP + MP P Q + A P + QP +G +S Q L +
Sbjct: 7 PQGPLPQQQQMQPGMPRPVMQMVAQHAVAGP--RPGLVQPGISRGIVSPNALQDLLRTLK 64
Query: 193 KP 194
P
Sbjct: 65 SP 66
>gnl|CDD|221868 pfam12938, M_domain, M domain of GW182.
Length = 238
Score = 31.0 bits (70), Expect = 3.2
Identities = 21/122 (17%), Positives = 33/122 (27%), Gaps = 31/122 (25%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
N ++ + + G P G G+ GP P G
Sbjct: 53 PNLASLSSLTSQGLGKIL--SGLQPPPLGNGGGSGAGGPGPVGGGGGPGVAPN------- 103
Query: 62 DSMKEQGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMA----YRLLARNQPLT 117
I+ A + VQQ++ + ++L NQPL
Sbjct: 104 ----------------NIQPNAQAQQPSTQQLRMLVQQIQMAVQKGYLNPQIL--NQPLA 145
Query: 118 PQ 119
PQ
Sbjct: 146 PQ 147
>gnl|CDD|237791 PRK14701, PRK14701, reverse gyrase; Provisional.
Length = 1638
Score = 31.8 bits (72), Expect = 3.3
Identities = 21/108 (19%), Positives = 50/108 (46%), Gaps = 10/108 (9%)
Query: 585 GLGKTIQTIALITYLMEKKKVNGPFLIIVPLSTLSNWSLE-----FERWAPSVNVVAYKG 639
G+GK+ + +L K K II+P + L ++E E+ V +V Y
Sbjct: 104 GMGKSTFGAFIALFLALKGKKC---YIILPTTLLVKQTVEKIESFCEKANLDVRLVYYHS 160
Query: 640 --SPHLRKTLQAQMKASKFNVLLTTYEYVIKDKGPLAKLHWKYMIIDE 685
++ +++ F++L+TT +++ ++ + L + ++ +D+
Sbjct: 161 NLRKKEKEEFLERIENGDFDILVTTAQFLARNFPEMKHLKFDFIFVDD 208
>gnl|CDD|180481 PRK06231, PRK06231, F0F1 ATP synthase subunit B; Validated.
Length = 205
Score = 31.0 bits (70), Expect = 3.3
Identities = 18/64 (28%), Positives = 33/64 (51%), Gaps = 6/64 (9%)
Query: 338 NKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEY 397
N + EKE+++ +E+++KE + M EE +K +D++ D +L DE+
Sbjct: 143 NLIIFQARQEIEKERRELKEQLQKESVELAMLAAEELIKKKVDREDDDKL------VDEF 196
Query: 398 ISNL 401
I L
Sbjct: 197 IREL 200
>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
Provisional.
Length = 2849
Score = 31.6 bits (71), Expect = 3.6
Identities = 19/53 (35%), Positives = 28/53 (52%), Gaps = 1/53 (1%)
Query: 483 PGWEVVADSDEENEDEDSEKSKEKTSGENENKEKNKGEDDEYNKNAMEEATYY 535
P V D D+E+EDED + ++ E E +E+ KG DDE ++ E Y
Sbjct: 147 PRDNFVIDDDDEDEDEDDDDEEDDEEEEEE-EEEIKGFDDEDEEDEGGEDFTY 198
>gnl|CDD|118696 pfam10168, Nup88, Nuclear pore component. Nup88 can be divided
into two structural domains; the N-terminal two-thirds
of the protein has no obvious structural motifs but is
the region for binding to Nup98, one of the components
of the nuclear pore. The C-terminal end is a predicted
coiled-coil domain. Nup88 is overexpressed in tumour
cells.
Length = 717
Score = 31.4 bits (71), Expect = 3.7
Identities = 21/129 (16%), Positives = 57/129 (44%), Gaps = 14/129 (10%)
Query: 267 LETAVNVKAYKRTKRQGLKEARATEK-----LEKQQKVEAERKKRQKHQEYITTVLQHCK 321
L A V + + L + L+K++++E + R++ ++ ++ +
Sbjct: 541 LSRATQVFREQYLLKHDLAREEFQRRVKLLQLQKEKQLEDIQDCREE-RKSLSERAEKLA 599
Query: 322 DFKEYHRNNQARIMRLNKAVM-NYHAN------AEKEQKKEQERIEKERMRRLMAEDEEG 374
+ E + NQ ++ K ++ + ++ +E++ KE +RI K+ ++ L ++
Sbjct: 600 EKFEEAKYNQELLVNRCKRLLQSANSQLPVLSDSERDMSKELQRINKQ-LQHLANGIKQV 658
Query: 375 YRKLIDQKK 383
+K Q+
Sbjct: 659 KKKKNYQRY 667
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This protein,
which interacts with both microtubules and TRAF3 (tumour
necrosis factor receptor-associated factor 3), is
conserved from worms to humans. The N-terminal region is
the microtubule binding domain and is well-conserved; the
C-terminal 100 residues, also well-conserved, constitute
the coiled-coil region which binds to TRAF3. The central
region of the protein is rich in lysine and glutamic acid
and carries KKE motifs which may also be necessary for
tubulin-binding, but this region is the least
well-conserved.
Length = 506
Score = 31.4 bits (71), Expect = 3.8
Identities = 11/53 (20%), Positives = 29/53 (54%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKK 1220
+E+ +EE K K + ++K ++E KK ++ ++++ EK + +++
Sbjct: 119 KKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRA 171
Score = 30.6 bits (69), Expect = 5.8
Identities = 24/93 (25%), Positives = 43/93 (46%), Gaps = 6/93 (6%)
Query: 1169 DEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR---EKDQAKLKKTLKKI 1225
E +EEE ++ + +KK ++ +E +K K+E ++ R EK++ K KK +
Sbjct: 99 KNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPR 158
Query: 1226 MRVVIKYTDSD---GRVLSEPFIKLPSRKELPD 1255
R K + R P K P++K+ P
Sbjct: 159 DREEEKKRERVRAKSRPKKPPKKKPPNKKKEPP 191
>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein. CDC45 is an essential gene
required for initiation of DNA replication in S.
cerevisiae, forming a complex with MCM5/CDC46. Homologues
of CDC45 have been identified in human, mouse and smut
fungus among others.
Length = 583
Score = 31.5 bits (72), Expect = 3.8
Identities = 15/52 (28%), Positives = 29/52 (55%)
Query: 1162 DDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
+ + DDEE +EE+E SK + +DDD++ +T ++ + + + RE
Sbjct: 123 LEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDDDIATRERSLERRRRRREW 174
Score = 30.3 bits (69), Expect = 7.5
Identities = 17/60 (28%), Positives = 27/60 (45%), Gaps = 9/60 (15%)
Query: 1161 IDDGVEYDDEEEEEEEEVRSKRKGKRRKK-----TEDDDEEPSTSKKRKKEKEKDREKDQ 1215
DDG D EEE ++E R + ++ E D+E+ +SK E + D + D
Sbjct: 101 FDDG----DIEEELQDEPRYDDAYRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDD 156
>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
envelope biogenesis, outer membrane].
Length = 387
Score = 31.1 bits (70), Expect = 3.8
Identities = 32/166 (19%), Positives = 67/166 (40%), Gaps = 10/166 (6%)
Query: 1059 EDEEENAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKKSRLIEVSELPDWLIKEDE 1118
+ ++ +A ++ + + EE + Q + ER K+ +K RL E+
Sbjct: 68 QSQQSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQL-EKERL---KAQEQQKQAEEA 123
Query: 1119 EIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEV 1178
E + + ++EE+A +++ + + E LKA + + +E + EE
Sbjct: 124 EKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEA 183
Query: 1179 RSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKK 1224
++K + KK + K EK K + +AK +K +
Sbjct: 184 KAKAEAAAAKK------KAEAEAKAAAEKAKAEAEAKAKAEKKAEA 223
Score = 29.9 bits (67), Expect = 9.9
Identities = 30/142 (21%), Positives = 57/142 (40%), Gaps = 8/142 (5%)
Query: 289 ATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYHANA 348
++ + Q ++ KK + Q+ Q ++ K Q R+ +L K +
Sbjct: 60 VVQQYGRIQSQQSSAKKGE--QQRKKKEEQVAEELKPKQAAEQERLKQLEKERL-----K 112
Query: 349 EKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEH 408
+EQ+K+ E EK+ ++E+ + +QKK K A E +
Sbjct: 113 AQEQQKQAEEAEKQAQLEQKQQEEQARKAAAEQKK-KAEAAKAKAAAEAAKLKAAAEAKK 171
Query: 409 KMEQKKKQDEESKKRKQSVKQK 430
K E+ K EE+K + ++ K
Sbjct: 172 KAEEAAKAAEEAKAKAEAAAAK 193
>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal. This
domain is found to the N-terminus of bacterial signal
peptidases of the S49 family (pfam01343).
Length = 154
Score = 30.2 bits (69), Expect = 4.0
Identities = 18/60 (30%), Positives = 29/60 (48%), Gaps = 10/60 (16%)
Query: 1166 EYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKI 1225
EY D +E E + K++ K +K E KK +K K K EK +AK ++ ++
Sbjct: 50 EYKDLKESLEAALLDKKELKAWEKAE---------KKAEKAKAKA-EKKKAKKEEPKPRL 99
>gnl|CDD|219897 pfam08549, SWI-SNF_Ssr4, Fungal domain of unknown function
(DUF1750). This is a fungal domain of unknown function.
Length = 669
Score = 31.5 bits (71), Expect = 4.1
Identities = 19/60 (31%), Positives = 24/60 (40%), Gaps = 13/60 (21%)
Query: 132 VPSGPQMPPMSLH---GPMP--MPPSQPMPNQAQPMPLQQQPPP----QPHQQQGHISSQ 182
+P PQM S++ GP P M QP P P PP +GH +SQ
Sbjct: 202 IPLPPQMAGQSMYQPPGPYPNAMVGRQPFY----PQPGAVAGPPKRRGGHKAPRGHRASQ 257
>gnl|CDD|222374 pfam13779, DUF4175, Domain of unknown function (DUF4175).
Length = 820
Score = 31.4 bits (72), Expect = 4.2
Identities = 34/172 (19%), Positives = 47/172 (27%), Gaps = 35/172 (20%)
Query: 8 PNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAIDSMKEQ 67
Q QQ GQ G G G GQ GQ Q +L Q+A+ +
Sbjct: 617 AQRGEQQGQQGQGGQGQGQPGQQGQQGQGQQQGQQGQG-GQGGQGSLAERQQALRDELGR 675
Query: 68 GLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIM--AYRLLARNQPLTPQLAMGVQ 125
P A A A + M A L + A+ Q
Sbjct: 676 QRGGLPGMGGEAGEAARD-----ALGRAG------RAMGGAEEALGQGD---LAEAVDRQ 721
Query: 126 GKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQG 177
G+ +E + G ++ + Q Q QQQG
Sbjct: 722 GRALEALREG----------------ARALGEAMAQQ--QGQQQGGQGQQQG 755
>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
[Transcription].
Length = 392
Score = 31.2 bits (70), Expect = 4.2
Identities = 22/104 (21%), Positives = 42/104 (40%), Gaps = 4/104 (3%)
Query: 1113 LIKEDEEIEQWAFEAKEEEKALHMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEE 1172
L KE + E+ E EE L +G + K+V D ++E ++ + E +
Sbjct: 257 LKKEKQGAEEEGEEGMSEED-LDVGAAEIENKEVSEGDKEQQQEEVENAEAHKEEVQSDR 315
Query: 1173 EEEEEVRSK---RKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
+E + + + TE +E + +K +EK + E
Sbjct: 316 PDEIGEEKEEDDENEENERHTELLADELNELEKGIEEKRRQMES 359
>gnl|CDD|233158 TIGR00865, bcl-2, apoptosis regulator. The Bcl-2 (Bcl-2) Family
(TC 1.A.21) The Bcl-2 family consists of the apoptosis
regulator, Bcl-X, and its homologues. Bcl-X is a
dominant regulator of programmed cell death in mammalian
cells. The long form (Bcl-X(L)) displays cell death
repressor activity, but the short isoform (Bcl-X(S)) and
the b-isoform (Bcl-Xb) promote cell death. Bcl-X(L),
Bcl-X(S) and Bcl-Xb are three isoforms derived by
alternative RNA splicing. Bcl-X(S) forms heterodimers
with Bcl-2. Homologues of Bcl-X include the Bax (rat;
192 aas; spQ63690) and Bak (mouse; 208 aas; spO08734)
proteins which also influence apoptosis. Using isolated
mitochondria, recombinant Bax and Bak have been shown to
induce Δψ loss, swelling and cytochrome c release. All
of these changes are dependent on Ca2+ and are prevented
by cyclosporin A and bongkrekic acid, both of which are
known to close permeability transition pores
(megachannels). Coimmunoprecipitation studies revealed
that Bax and Bak interact with VDAC to form permeability
transition pores. Thus, even though they can form
channels in artificial membranes at acidic pH,
proapoptotic Bcl-2 family proteins (including Bax and
Bak) probably induce the mitochondrial permeability
transition and cytochrome c release by interacting with
permeability transition pores, the most important
component for pore formation of which is VDAC [Regulatory
functions, Other].
Length = 213
Score = 30.6 bits (69), Expect = 4.4
Identities = 8/16 (50%), Positives = 11/16 (68%)
Query: 471 LAAHLKQWIQDHPGWE 486
L HL WIQ++ GW+
Sbjct: 155 LNEHLHPWIQENGGWD 170
>gnl|CDD|225087 COG2176, PolC, DNA polymerase III, alpha subunit (gram-positive type)
[DNA replication, recombination, and repair].
Length = 1444
Score = 31.5 bits (72), Expect = 4.4
Identities = 32/127 (25%), Positives = 52/127 (40%), Gaps = 12/127 (9%)
Query: 1049 LQTILHQDDEEDEEE--------NAVPDDETVNQMLARSEEEFQTYQRIDAERRKEQGKK 1100
L +D +E+E N + + A + + ++ + + G+K
Sbjct: 161 LLIEFEVNDISEEQEFEKFEEAINEEVEKAAQEALEAEKKLKAESPKVEKPKP-LFDGQK 219
Query: 1101 SRLIEVSELPDWLIKEDEEIEQWAFEA---KEEEKALHMGRGSRQRKQVDYTDSLTEKEW 1157
R I+ +E LIK +EE + E K E K L GR K DYT SL K++
Sbjct: 220 GRKIKSTEEIKPLIKINEEETRVKVEGYIFKIEIKELKSGRTLLNIKVTDYTSSLILKKF 279
Query: 1158 LKAIDDG 1164
L+ +D
Sbjct: 280 LRDEEDE 286
>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing. This is a family of
proteins that are involved in rRNA processing. In a
localisation study they were found to localise to the
nucleus and nucleolus. The family also includes other
metazoa members from plants to mammals where the protein
has been named BR22 and is associated with TTF-1, thyroid
transcription factor 1. In the lungs, the family binds
TTF-1 to form a complex which influences the expression
of the key lung surfactant protein-B (SP-B) and -C
(SP-C), the small hydrophobic surfactant proteins that
maintain surface tension in alveoli.
Length = 150
Score = 29.9 bits (67), Expect = 4.5
Identities = 27/86 (31%), Positives = 51/86 (59%), Gaps = 3/86 (3%)
Query: 1153 TEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE 1212
+KE+LK ++ E+E E++V+S ++ ++ +K + DE+ +K+RK+E+ RE
Sbjct: 34 LKKEYLKLLEKEGYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQ---RE 90
Query: 1213 KDQAKLKKTLKKIMRVVIKYTDSDGR 1238
K+ AK +K L+KI K + + R
Sbjct: 91 KELAKRQKELEKIELSKKKQKERERR 116
>gnl|CDD|227504 COG5177, COG5177, Uncharacterized conserved protein [Function
unknown].
Length = 769
Score = 31.2 bits (70), Expect = 4.6
Identities = 42/205 (20%), Positives = 83/205 (40%), Gaps = 24/205 (11%)
Query: 1028 EKVIQAGMFDQKSTG------SERHQFLQTILHQDDEEDEEENAVPD-DETVNQMLARSE 1080
K+I G ++Q ++ LQT+ + D + P+ ++ + E
Sbjct: 295 NKIIVNGQYEQTIREIFADRATKLELDLQTVFESNMNRDTLDEYAPEGEDLRSDYDEDFE 354
Query: 1081 EEFQTYQRIDA----ERRKEQGKKSRLIEVSEL--PDWLIKEDEEIEQWAFEAKEEEKAL 1134
+ T RID R++ KK+ + + + W E+EE Q +E++
Sbjct: 355 YDGLTTVRIDDHGFLPGREQTSKKAAVPKGTSFYQAKWAEDEEEEDGQCN-----DEEST 409
Query: 1135 HMGRGSRQRKQVDYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDD 1194
K+ D + ++E AIDD +++ EEEE ++ + R ++D
Sbjct: 410 MSAIDDDDPKENDNEEVAGDEES--AIDDNEGFEELSPEEEE----RQLREFRDMEKEDR 463
Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLK 1219
E P ++ + E +R K+ L+
Sbjct: 464 EFPDEAELQPSESAIERYKEYRGLR 488
>gnl|CDD|214661 smart00435, TOPEUc, DNA Topoisomerase I (eukaryota). DNA
Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia
virus topoisomerase, Variola virus topoisomerase, Shope
fibroma virus topoisomerase.
Length = 391
Score = 30.8 bits (70), Expect = 4.9
Identities = 15/67 (22%), Positives = 37/67 (55%), Gaps = 6/67 (8%)
Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDRE-KDQAKLKKTLKKIMRVV 1229
E +++ K K K + D E ++ ++K+KEK +E K + ++++ ++I ++
Sbjct: 302 LFEMISDLKRKLKSKFER-----DNEKLDAEVKEKKKEKKKEEKKKKQIERLEERIEKLE 356
Query: 1230 IKYTDSD 1236
++ TD +
Sbjct: 357 VQATDKE 363
>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
component YidC; Validated.
Length = 429
Score = 31.0 bits (70), Expect = 4.9
Identities = 13/38 (34%), Positives = 21/38 (55%), Gaps = 1/38 (2%)
Query: 277 KRTKRQGLKEARATEKLEKQQKVEAERK-KRQKHQEYI 313
K+T+ EA+A +K Q++ AER+ R+ QE
Sbjct: 333 KKTRTAEKNEAKARKKEIAQKRRAAEREINREARQERA 370
>gnl|CDD|221818 pfam12868, DUF3824, Domain of unknown function (DUF3824). This
is a repeating domain found in fungal proteins. It is
proline-rich, and the function is not known.
Length = 135
Score = 29.5 bits (66), Expect = 5.1
Identities = 20/53 (37%), Positives = 25/53 (47%), Gaps = 2/53 (3%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQ-LPMGAPGSGPPGS-PGPSPGQAPGQNPQE 52
S+S P P P PP++ + P PPGS P P PG PG NP +
Sbjct: 47 SDSYEEPYDPTPYPPSPPVSDPRYYPNSNYFPPPPGSTPVPPPGPQPGYNPAD 99
>gnl|CDD|149453 pfam08397, IMD, IRSp53/MIM homology domain. The N-terminal
predicted helical stretch of the insulin receptor
tyrosine kinase substrate p53 (IRSp53) is an
evolutionary conserved F-actin bundling domain involved
in filopodium formation. The domain has been named IMD
after the IRSp53 and missing in metastasis (MIM)
proteins in which it occurs. Filopodium-inducing IMD
activity is regulated by Cdc42 and Rac1 and is
SH3-independent.
Length = 218
Score = 30.5 bits (69), Expect = 5.1
Identities = 18/70 (25%), Positives = 35/70 (50%), Gaps = 6/70 (8%)
Query: 362 ERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVKEHKMEQKKKQDEESK 421
R + ++ EE ++ D+ + L +T+ + K+++ E KKK+DE
Sbjct: 65 MVHRSINSKLEEFFKAFHDE----LINPLEKKTELDKKYANALDKDYQTEYKKKRDE--L 118
Query: 422 KRKQSVKQKL 431
++KQS +KL
Sbjct: 119 EKKQSDLKKL 128
>gnl|CDD|224272 COG1353, COG1353, Predicted CRISPR-associated polymerase [Defense
mechanisms].
Length = 799
Score = 30.9 bits (70), Expect = 5.2
Identities = 31/171 (18%), Positives = 57/171 (33%), Gaps = 22/171 (12%)
Query: 1147 DYTDSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGK------RRKKTEDDDEEPSTS 1200
D+ ++ K KA+ Y + E+ P+T
Sbjct: 254 DFIYEVSSKGASKALRGRSFYIELLTEDIVNRIISELNLTRANILFEGGGHFYLLLPNTE 313
Query: 1201 KKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLS-EPFIKLPSRKELPDYYEV 1259
+ RK +E ++E L + L + V Y + L+ + F + RK L E
Sbjct: 314 EVRKILEEIEKE-----LNEWLIEQNFKVDLYLELAWVELTLKDFYRRWFRKHLEKVSE- 367
Query: 1260 IDRPMDIKKILGR--IEDGKYSSVDELQKDFK---TLCRNAQIYNEELSLI 1305
+ +K L R +E G EL + ++C N + + +
Sbjct: 368 ----LPSRKKLRRFEVELGILFPRYELDGPGERTCSVCGNKRAKGDSEKEM 414
>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
Length = 1465
Score = 31.0 bits (70), Expect = 5.3
Identities = 27/146 (18%), Positives = 59/146 (40%), Gaps = 10/146 (6%)
Query: 1142 QRKQVDYTDSL---TEKE-WLK---AIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDD 1194
+ K + L T K WLK A++ ++ D+E+ + EE R K + +
Sbjct: 1135 RDKLNIEVEDLKKTTPKSLWLKDLDALEKELDKLDKEDAKAEEAREKLQRAAARGESGAA 1194
Query: 1195 EEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSEPFIKLPSRKELP 1254
++ S +K +K +K +T ++ T++ V+ +P + ++K+ P
Sbjct: 1195 KKVSRQAPKKPAPKKTTKKASE--SETTEETYGSSAMETENVAEVV-KPKGRAGAKKKAP 1251
Query: 1255 DYYEVIDRPMDIKKILGRIEDGKYSS 1280
+ + +I + R+ S
Sbjct: 1252 AAAKEKEEEDEILDLKDRLAAYNLDS 1277
>gnl|CDD|218191 pfam04652, DUF605, Vta1 like. Vta1 (VPS20-associated protein 1) is
a positive regulator of Vps4. Vps4 is an ATPase that is
required in the multivesicular body (MVB) sorting
pathway to dissociate the endosomal sorting complex
required for transport (ESCRT). Vta1 promotes correct
assembly of Vps4 and stimulates its ATPase activity
through its conserved Vta1/SBP1/LIP5 region.
Length = 315
Score = 30.8 bits (70), Expect = 5.4
Identities = 15/53 (28%), Positives = 17/53 (32%), Gaps = 3/53 (5%)
Query: 1 MSNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQEN 53
S+SS P P Q PP + S PPG P P P
Sbjct: 207 PSDSSLPPAPSSFQSDTPPPSPESPT---NPSPPPGPAAPPPPPVQQVPPLST 256
>gnl|CDD|221012 pfam11169, DUF2956, Protein of unknown function (DUF2956). This
family of proteins with unknown function appears to be
restricted to Gammaproteobacteria.
Length = 103
Score = 28.8 bits (65), Expect = 5.4
Identities = 14/39 (35%), Positives = 18/39 (46%)
Query: 407 EHKMEQKKKQDEESKKRKQSVKQKLMDTDGKVTLDQDET 445
E+K +QK K E K RKQ +K K D E+
Sbjct: 38 EYKKQQKAKAREADKARKQQLKAKQRQAANDDEEDTIES 76
>gnl|CDD|216368 pfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal.
Length = 313
Score = 30.6 bits (69), Expect = 5.4
Identities = 16/65 (24%), Positives = 23/65 (35%), Gaps = 5/65 (7%)
Query: 3 NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQ-NPQENLTA-LQRA 60
+SS PPPP PP + G + N E +T+ L++
Sbjct: 226 SSSAPSAPPPPPPPPPPSVP---TISNSVESASSDSKGGRGAVFAELNKGEGITSGLKKV 282
Query: 61 IDSMK 65
D MK
Sbjct: 283 TDDMK 287
>gnl|CDD|132062 TIGR03017, EpsF, chain length determinant protein EpsF. Sequences
in this family of proteins are members of the chain
length determinant family (pfam02706) which includes the
wzc protein from E. coli. This family of proteins is
homologous to the EpsF protein of the methanolan
biosynthesis operon of Methylobacillus species strain
12S. The distribution of this protein appears to be
restricted to a subset of exopolysaccharide operons
containing a syntenic grouping of genes including a
variant of the EpsH exosortase protein. Exosortase has
been proposed to be involved in the targeting and
processing of proteins containing the PEP-CTERM domain
to the exopolysaccharide layer.
Length = 444
Score = 30.5 bits (69), Expect = 6.0
Identities = 34/137 (24%), Positives = 54/137 (39%), Gaps = 26/137 (18%)
Query: 186 SKLTNIPKPEGLDPLI---ILQERENRVALNIERRIEELNGSLTSTLPEHLRVKAEIELR 242
SK + L +I I+Q + +A E ++ EL+ L P++ R +AEI
Sbjct: 236 SKEGGSSGKDALPEVIANPIIQNLKTDIA-RAESKLAELSQRLGPNHPQYKRAQAEIN-- 292
Query: 243 ALKVLNFQRQLRAEVIACARRDTTLETAVNVKAYKRTKRQGLKEARATEKLEKQQKVEAE 302
+ + QL AE+ +V R +Q +EA E LE Q+ E
Sbjct: 293 -----SLKSQLNAEIKKVTS---------SVGTNSRILKQ--REAELREALENQKAKVLE 336
Query: 303 RKKRQKHQEYITTVLQH 319
+ Q +VLQ
Sbjct: 337 LNR----QRDEMSVLQR 349
>gnl|CDD|130324 TIGR01257, rim_protein, retinal-specific rim ABC transporter. This
model describes the photoreceptor protein (rim protein)
in eukaryotes. It is the member of ABC transporter
superfamily. Rim protein is a membrane glycoprotein which
is localized in the photoreceptor outer segment discs.
Mutation/s in its genetic loci is implicated in the
recessive Stargardt's disease [Transport and binding
proteins, Other].
Length = 2272
Score = 31.1 bits (70), Expect = 6.0
Identities = 17/55 (30%), Positives = 19/55 (34%), Gaps = 1/55 (1%)
Query: 118 PQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
A G Q KR P P G P P Q P + QPPP+P
Sbjct: 1284 SLFAGGAQQKRENANLRHPCSGPTEKAGQTPQASHTCSPGQPAAHP-EGQPPPEP 1337
>gnl|CDD|220600 pfam10147, CR6_interact, Growth arrest and DNA-damage-inducible
proteins-interacting protein 1. Members of this family
of proteins act as negative regulators of G1 to S cell
cycle phase progression by inhibiting cyclin-dependent
kinases. Inhibitory effects are additive with GADD45
proteins but occur also in the absence of GADD45
proteins. Furthermore, they act as a repressor of the
orphan nuclear receptor NR4A1 by inhibiting AB
domain-mediated transcriptional activity.
Length = 217
Score = 30.2 bits (68), Expect = 6.2
Identities = 19/79 (24%), Positives = 39/79 (49%), Gaps = 15/79 (18%)
Query: 347 NAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMVK 406
A+K +++++ R KER RL+AE E + +D + + +M++
Sbjct: 141 RAQKRKREQKARAAKERKERLVAEAREHFGYWVDPRDPR---------------FQEMLQ 185
Query: 407 EHKMEQKKKQDEESKKRKQ 425
+ + E+KKK E ++ K+
Sbjct: 186 QKEKEEKKKVKEAKRREKE 204
>gnl|CDD|177464 PHA02682, PHA02682, ORF080 virion core protein; Provisional.
Length = 280
Score = 30.2 bits (67), Expect = 6.2
Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 7/40 (17%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQ-PPPQ 171
PS Q PP P+P +P P A+P+ L Q PPP
Sbjct: 142 PSTRQCPPAP-----PLPTPKPAP-AAKPIFLHNQLPPPD 175
>gnl|CDD|115072 pfam06391, MAT1, CDK-activating kinase assembly factor MAT1. MAT1
is an assembly/targeting factor for cyclin-dependent
kinase-activating kinase (CAK), which interacts with the
transcription factor TFIIH. The domain found to the
N-terminal side of this domain is a C3HC4 RING finger.
Length = 200
Score = 30.1 bits (68), Expect = 6.4
Identities = 24/109 (22%), Positives = 50/109 (45%), Gaps = 5/109 (4%)
Query: 324 KEYHRNNQARIMRLNKAVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKK 383
+Y + N+ IMR NK + E EQ E+E+ KE R + ++E+ + ++ K
Sbjct: 71 DQYEKENKDSIMR-NKRRLTREQ-EELEQALEEEKEMKEEKRLHLQKEEQEQKMAKEKDK 128
Query: 384 DKRLAFLLSQTDEYISNLTQMVKEH--KMEQKKKQDEESKKRKQSVKQK 430
+ + L ++ + + K+ ++E + ++ E K+ S K
Sbjct: 129 -QEIIDELETSNLPANVIIAQHKKQSKQLESQVEKLERKKRVTFSTGIK 176
>gnl|CDD|235206 PRK04031, PRK04031, DNA primase; Provisional.
Length = 408
Score = 30.6 bits (70), Expect = 6.4
Identities = 16/94 (17%), Positives = 42/94 (44%), Gaps = 5/94 (5%)
Query: 1152 LTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDR 1211
LT+KE KA+ + V + EE ++ + + + ++ + + E + +KE
Sbjct: 251 LTKKEIAKALRNKVPVEQYLEELGKKAQKAAEKVKEEEEKPEKEPAEQPEPEEKEPAPVP 310
Query: 1212 EKDQAKLKKTLKKIM---RVVIKYTDSDGRVLSE 1242
+ + +++ +K++ + D + V+ E
Sbjct: 311 AEKEETVREHIKELKGTLEARL--LDENWNVIKE 342
>gnl|CDD|237862 PRK14948, PRK14948, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 620
Score = 30.7 bits (70), Expect = 6.4
Identities = 18/72 (25%), Positives = 27/72 (37%), Gaps = 2/72 (2%)
Query: 2 SNSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAI 61
++S + PPP Q+ PP P+ P + P P P P + Q
Sbjct: 515 GSASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPP--PPTATQASSNAPAQIPA 572
Query: 62 DSMKEQGLEEDP 73
DS + E+P
Sbjct: 573 DSSPPPPIPEEP 584
>gnl|CDD|215521 PLN02967, PLN02967, kinase.
Length = 581
Score = 30.8 bits (69), Expect = 6.5
Identities = 14/59 (23%), Positives = 25/59 (42%), Gaps = 11/59 (18%)
Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKT-------EDDDEEPSTSKKRKKEKEKDREKDQA 1216
V D ++E + K + R+K E++ E K+RK +K + +DQ
Sbjct: 100 VNEDAALDKESK----KTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMDEDVEDQG 154
Score = 30.4 bits (68), Expect = 8.1
Identities = 16/67 (23%), Positives = 34/67 (50%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKDQAKLKKTLKKIMR 1227
D EEE+ E++VR +RK K+ + +D S ++ + +++++ + L+K
Sbjct: 127 DVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSDVEESEFVTSLENESEEELDLEKDDG 186
Query: 1228 VVIKYTD 1234
I +T
Sbjct: 187 EDISHTY 193
>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
Provisional.
Length = 1123
Score = 30.7 bits (70), Expect = 6.7
Identities = 24/102 (23%), Positives = 37/102 (36%), Gaps = 22/102 (21%)
Query: 286 EARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKDFKEYHRNNQARIMRLNKAVMNYH 345
E +A EK + Q EA++++ + E Q +L +
Sbjct: 159 ELQAREKAQSQALAEAQQQELVALEGLA----------AELEEKQQELEAQLEQL----- 203
Query: 346 ANAEKEQKKEQERIEKERMRRLMAE-----DEEGYRKLIDQK 382
EK + QER +K + A EE R LIDQ+
Sbjct: 204 --QEKAAETSQERKQKRKEITDQAAKRLELSEEETRILIDQQ 243
Score = 30.3 bits (69), Expect = 8.2
Identities = 22/106 (20%), Positives = 50/106 (47%), Gaps = 6/106 (5%)
Query: 346 ANAEKEQKKEQERIEKERMRRLMAEDEEGYRKLIDQKKDKRLAFLLSQTDEYISNLTQMV 405
N ++E ++++ L A ++ + L + ++ + +A L E ++
Sbjct: 141 ENLLHALQQEVLTLKQQL--ELQAREKAQSQALAEAQQQELVA-LEGLAAELEEKQQEL- 196
Query: 406 KEHKMEQKKKQDEE-SKKRKQSVKQKLMDTDGKVTLDQDETSQLTD 450
E ++EQ +++ E S++RKQ K+ ++ L ++ET L D
Sbjct: 197 -EAQLEQLQEKAAETSQERKQKRKEITDQAAKRLELSEEETRILID 241
>gnl|CDD|165245 PHA02934, PHA02934, Hypothetical protein; Provisional.
Length = 253
Score = 30.0 bits (67), Expect = 7.2
Identities = 14/27 (51%), Positives = 17/27 (62%)
Query: 573 NNNLNGILADEMGLGKTIQTIALITYL 599
NN +N IL D GLG + TI+ IT L
Sbjct: 158 NNEVNTILMDNKGLGVRLATISFITEL 184
>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510). This
family consists of several hypothetical bacterial
proteins of around 200 residues in length. The function
of this family is unknown.
Length = 214
Score = 29.7 bits (67), Expect = 7.2
Identities = 14/56 (25%), Positives = 29/56 (51%), Gaps = 1/56 (1%)
Query: 1159 KAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREKD 1214
K DD + EE +EEE+ + + K K + ++E S + ++E E+ +++
Sbjct: 52 KKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEEN-EEEDEESSDEN 106
>gnl|CDD|213341 cd05392, RasGAP_Neurofibromin_like, Ras-GTPase Activating Domain of
proteins similar to neurofibromin. Neurofibromin-like
proteins include the Saccharomyces cerevisiae RasGAP
proteins Ira1 and Ira2, the closest homolog of
neurofibromin, which is responsible for the human
autosomal dominant disease neurofibromatosis type I
(NF1). The RasGAP Ira1/2 proteins are negative regulators
of the Ras-cAMP signaling pathway and conserved from
yeast to human. In yeast Ras proteins are activated by
GEFs, and inhibited by two GAPs, Ira1 and Ira2. Ras
proteins activate the cAMP/protein kinase A (PKA)
pathway, which controls metabolism, stress resistance,
growth, and meiosis. Recent studies showed that the kelch
proteins Gpb1 and Gpb2 inhibit Ras activity via
association with Ira1 and Ira2. Gpb1/2 bind to a
conserved C-terminal domain of Ira1/2, and loss of Gpb1/2
results in a destabilization of Ira1 and Ira2, leading to
elevated levels of Ras2-GTP and uninhibited cAMP-PKA
signaling. Since the Gpb1/2 binding domain on Ira1/2 is
conserved in the human neurofibromin protein, the studies
suggest that an analogous signaling mechanism may
contribute to the neoplastic development of NF1.
Length = 317
Score = 30.3 bits (69), Expect = 7.2
Identities = 21/89 (23%), Positives = 32/89 (35%), Gaps = 20/89 (22%)
Query: 1227 RVVIKYTDSDG-----RVLSEPFIKLPSRKELPDYYEVIDRPMDI----------KKILG 1271
R++ Y S G +VL ++ + DY+EV D K
Sbjct: 79 RLLTLYAKSVGNKYLRKVLRPLLTEI---VDNKDYFEVEKIKPDDENLEENADLLMKYAQ 135
Query: 1272 RIEDGKYSSVDELQKDFKTLCRNAQIYNE 1300
+ D SVD+L F+ +C IY
Sbjct: 136 MLLDSITDSVDQLPPSFRYIC--NTIYES 162
>gnl|CDD|226193 COG3667, PcoB, Uncharacterized protein involved in copper
resistance [Inorganic ion transport and metabolism].
Length = 321
Score = 30.2 bits (68), Expect = 7.3
Identities = 8/45 (17%), Positives = 10/45 (22%), Gaps = 1/45 (2%)
Query: 133 PSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQPHQQQG 177
P + PM MP + P P P
Sbjct: 32 PHAHEHAPMD-APHPAMPGMDHHAHSKMPGPEMAAPQMDHGAMPH 75
>gnl|CDD|240653 cd12176, PGDH_3, Phosphoglycerate dehydrogenases, NAD-binding and
catalytic domains. Phosphoglycerate dehydrogenases
(PGDHs) catalyze the initial step in the biosynthesis of
L-serine from D-3-phosphoglycerate. PGDHs come in 3
distinct structural forms, with this first group being
related to 2-hydroxy acid dehydrogenases, sharing
structural similarity to formate and glycerate
dehydrogenases. PGDH in E. coli and Mycobacterium
tuberculosis form tetramers, with subunits containing a
Rossmann-fold NAD binding domain. Formate/glycerate and
related dehydrogenases of the D-specific 2-hydroxyacid
dehydrogenase superfamily include groups such as formate
dehydrogenase, glycerate dehydrogenase, L-alanine
dehydrogenase, and S-adenosylhomocysteine hydrolase.
Despite often low sequence identity, these proteins
typically have a characteristic arrangement of 2 similar
subdomains of the alpha/beta Rossmann fold NAD+ binding
form. The NAD+ binding domain is inserted within the
linear sequence of the mostly N-terminal catalytic
domain, which has a similar domain structure to the
internal NAD binding domain. Structurally, these domains
are connected by extended alpha helices and create a
cleft in which NAD is bound, primarily to the C-terminal
portion of the 2nd (internal) domain. Some related
proteins have similar structural subdomains but with a
tandem arrangement of the catalytic and NAD-binding
subdomains in the linear sequence.
Length = 304
Score = 30.2 bits (69), Expect = 7.3
Identities = 13/26 (50%), Positives = 16/26 (61%)
Query: 746 FNAPFATTGEKVELNEEETILIIRRL 771
FNAPF+ T EL E I++ RRL
Sbjct: 91 FNAPFSNTRSVAELVIGEIIMLARRL 116
>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms
the C subunit of DNA polymerase delta. It carries the
essential residues for binding to the Pol1 subunit of
polymerase alpha, from residues 293-332, which are
characterized by the motif D--G--VT, referred to as the
DPIM motif. The first 160 residues of the protein form
the minimal domain for binding to the B subunit, Cdc1, of
polymerase delta, the final 10 C-terminal residues,
362-372, being the DNA sliding clamp, PCNA, binding
motif.
Length = 427
Score = 30.2 bits (68), Expect = 7.5
Identities = 27/100 (27%), Positives = 42/100 (42%), Gaps = 26/100 (26%)
Query: 1165 VEYDDEEEEEEEEVRSKRKGKRRKKTEDDD-------------------EEPSTSKKRKK 1205
E D EEE EE+ + KR KR KK +D+ EEP KK
Sbjct: 278 GERSDSEEETEEKEKEKR--KRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPKK 335
Query: 1206 EKEKDREKDQAKLKKTLKKIMRVVIK---YTDSDGRVLSE 1242
E+EK+ + + R V+K + D +G ++++
Sbjct: 336 EEEKEEVTVSPDGGRRRGR--RRVMKKKTFKDEEGYLVTK 373
>gnl|CDD|225689 COG3147, DedD, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 226
Score = 29.9 bits (67), Expect = 7.9
Identities = 20/134 (14%), Positives = 37/134 (27%), Gaps = 5/134 (3%)
Query: 94 SAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQ 153
Q + +++ + P P + + + G + + P + P
Sbjct: 46 KPQGDRDEPRVLPAVVQVVALPTQPPEGVAQEI-QDAGDAAAASVDPQPVAQPPVESTPA 104
Query: 154 PMPNQAQPMPLQQQPPPQPHQQQGHISSQIKQSKLTNIPKPEGLDPLIILQERENRVALN 213
+P AQ + P P + + K P ++Q AL
Sbjct: 105 GVPVAAQTPKPVKPPKQPPAGAVPAKPTPKPEPKPVAEPAAAPTGQAFVVQ----LGALK 160
Query: 214 IERRIEELNGSLTS 227
R EL L
Sbjct: 161 NADRANELVAKLRG 174
>gnl|CDD|225657 COG3115, ZipA, Cell division protein [Cell division and chromosome
partitioning].
Length = 324
Score = 30.2 bits (68), Expect = 7.9
Identities = 18/127 (14%), Positives = 27/127 (21%), Gaps = 14/127 (11%)
Query: 67 QGLEEDPRYQKLIEMKANRTEIKHAFTSAQVQQLRFQIMAYRLLARNQPLTPQLAMGVQG 126
E + Q A+ QP P L
Sbjct: 75 FTQEHEAARQSPQHQYQPEYASAQIKIPVPQPPQISDPPAH-----PQPTQPALDQ---- 125
Query: 127 KRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPM--PLQQQPPPQPHQQQGHISSQIK 184
E P + P + P P P P A P + QP + +
Sbjct: 126 ---EQPPEEARQPVLPQEAPAPQPVHSAAPQPAVQTVQPAVPEQQVQPEEVVEPAPEVKR 182
Query: 185 QSKLTNI 191
+ +
Sbjct: 183 PPRKDTV 189
>gnl|CDD|148208 pfam06465, DUF1087, Domain of Unknown Function (DUF1087). Members of
this family are found in various chromatin remodelling
factors and transposases. Their exact function is, as
yet, unknown.
Length = 66
Score = 27.5 bits (61), Expect = 8.1
Identities = 10/32 (31%), Positives = 16/32 (50%)
Query: 1125 FEAKEEEKALHMGRGSRQRKQVDYTDSLTEKE 1156
+E E+ +G+G R RKQV+Y +
Sbjct: 35 YEQLRAEEEKALGKGKRSRKQVNYAEEDDIDG 66
>gnl|CDD|225603 COG3061, OapA, Cell envelope opacity-associated protein A [Cell
envelope biogenesis, outer membrane].
Length = 242
Score = 29.9 bits (67), Expect = 8.3
Identities = 19/82 (23%), Positives = 29/82 (35%), Gaps = 7/82 (8%)
Query: 112 RNQPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQ 171
N P Q+A+ Q + P MP P P+ P P+ MP
Sbjct: 9 DNPPAQNQMAV-EQMIEEQDAPQAETMPGNFEAKP-PLAEVWPAPDNNVFMPPL-----P 61
Query: 172 PHQQQGHISSQIKQSKLTNIPK 193
P ++G I + I ++P
Sbjct: 62 PMHRRGIIVAPIMLVAQAHLPS 83
>gnl|CDD|236138 PRK07994, PRK07994, DNA polymerase III subunits gamma and tau;
Validated.
Length = 647
Score = 30.2 bits (69), Expect = 8.4
Identities = 22/92 (23%), Positives = 30/92 (32%), Gaps = 16/92 (17%)
Query: 94 SAQVQQLRFQIMAYRLLARNQ-PLTPQLAMGVQGK--RM----------EGVPSGPQMPP 140
+ QL +Q + L+ R PL P MGV+ RM E P
Sbjct: 321 PPEDVQLYYQTL---LIGRKDLPLAPDRRMGVEMTLLRMLAFHPAAPLPEPEVPPQSAAP 377
Query: 141 MSLHGPMPMPPSQPMPNQAQPMPLQQQPPPQP 172
+ P + P QA +P PQ
Sbjct: 378 AASAQATAAPTAAVAPPQAPAVPPPPASAPQQ 409
>gnl|CDD|223526 COG0449, GlmS, Glucosamine 6-phosphate synthetase, contains
amidotransferase and phosphosugar isomerase domains [Cell
envelope biogenesis, outer membrane].
Length = 597
Score = 30.2 bits (69), Expect = 8.5
Identities = 11/56 (19%), Positives = 25/56 (44%), Gaps = 8/56 (14%)
Query: 1087 QRIDAERRKEQGKKSRLIEVSELPDWL---IKEDEEIEQWAFEAKEEEKALHMGRG 1139
I E + + E+ +LP+ + + +E+I++ A + + +GRG
Sbjct: 414 GTISEEEERSL-----IKELQKLPNHIPKVLAAEEKIKELAKRLADAKDFFFLGRG 464
>gnl|CDD|220603 pfam10152, DUF2360, Predicted coiled-coil domain-containing protein
(DUF2360). This is the conserved 140 amino acid region
of a family of proteins conserved from nematodes to
humans. One C. elegans member is annotated as a
Daf-16-dependent longevity protein 1 but this could not
be confirmed. The function is unknown.
Length = 147
Score = 28.9 bits (65), Expect = 8.6
Identities = 19/79 (24%), Positives = 29/79 (36%), Gaps = 23/79 (29%)
Query: 3 NSSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRAID 62
++ P PPPP + + A PP +P P + + P EN
Sbjct: 66 ITNGGPPPPPPARAE-----------AASPPPPEAPAEPPAEPEPEAPAENTVT------ 108
Query: 63 SMKEQGLEEDPRYQKLIEM 81
+ +DPRY K +M
Sbjct: 109 ------VAKDPRYAKYFKM 121
>gnl|CDD|235319 PRK04914, PRK04914, ATP-dependent helicase HepA; Validated.
Length = 956
Score = 30.2 bits (69), Expect = 8.6
Identities = 10/11 (90%), Positives = 11/11 (100%)
Query: 580 LADEMGLGKTI 590
LADE+GLGKTI
Sbjct: 174 LADEVGLGKTI 184
>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
unknown].
Length = 652
Score = 30.4 bits (69), Expect = 8.8
Identities = 17/107 (15%), Positives = 45/107 (42%), Gaps = 12/107 (11%)
Query: 287 ARATEKLEKQQKVEAERKKRQKHQEYITTVLQHCKD-------FKEYHRNNQARIMRLNK 339
A A K++++++ + ++ + IT + K +E + + + L +
Sbjct: 391 AEALSKVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKR 450
Query: 340 AVMNYHANAEKEQKKEQERIEKERMRRLMAEDEEGY---RKLIDQKK 383
+ +E E+ + + R + + R + A D ++L ++KK
Sbjct: 451 EIEKLE--SELERFRREVRDKVRKDREIRARDRRIERLEKELEEKKK 495
>gnl|CDD|233045 TIGR00601, rad23, UV excision repair protein Rad23. All proteins
in this family for which functions are known are
components of a multiprotein complex used for targeting
nucleotide excision repair to specific parts of the
genome. In humans, Rad23 complexes with the XPC protein.
This family is based on the phylogenomic analysis of JA
Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
metabolism, DNA replication, recombination, and repair].
Length = 378
Score = 30.2 bits (68), Expect = 8.8
Identities = 12/57 (21%), Positives = 16/57 (28%), Gaps = 9/57 (15%)
Query: 4 SSTSPNPPPPQQQQPPLNVGQLPMGAPGSGPPGSPGPSPGQAPGQNPQENLTALQRA 60
+ PP P P PP SP AP +E + + A
Sbjct: 80 GTGKVAPPAATPTSAP---------TPTPSPPASPASGMSAAPASAVEEKSPSEESA 127
>gnl|CDD|233467 TIGR01554, major_cap_HK97, phage major capsid protein, HK97 family.
This model family represents the major capsid protein
component of the heads (capsids) of bacteriophage HK97,
phi-105, P27, and related phage. This model represents
one of several analogous families lacking detectable
sequence similarity. The gene encoding this component is
typically located in an operon encoding the small and
large terminase subunits, the portal protein and the
prohead or maturation protease [Mobile and
extrachromosomal element functions, Prophage functions].
Length = 384
Score = 30.0 bits (68), Expect = 9.0
Identities = 15/93 (16%), Positives = 35/93 (37%), Gaps = 3/93 (3%)
Query: 1150 DSLTEKEWLKAIDDGVEYDDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEK 1209
LTE E L ++ D +EE +++ ++ R + D+ E + + +
Sbjct: 16 RKLTEDEKLAEAEEEKAEYDALKEEIDKLDAEID---RLEELLDELEAKPAASGEGGGGE 72
Query: 1210 DREKDQAKLKKTLKKIMRVVIKYTDSDGRVLSE 1242
+ E++ + +R + + LS
Sbjct: 73 EEEEEAKAEAAEFRAYLRGGDDALAEERKALST 105
>gnl|CDD|240576 cd12932, RRP7_like, RRP7 domain ribosomal RNA-processing protein 7
(Rrp7p), ribosomal RNA-processing protein 7 homolog A
(Rrp7A), and similar proteins. This CD corresponds to
the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by
YCL031C gene from Saccharomyces cerevisiae. It is an
essential yeast protein involved in pre-rRNA processing
and ribosome assembly, and is speculated to be required
for correct assembly of rpS27 into the pre-ribosomal
particle. Rrp7A, also termed gastric cancer antigen Zg14,
is the Rrp7p homolog mainly found in Metazoans. The
cellular function of Rrp7A currently remains unclear.
Both Rrp7p and Rrp7A harbor an N-terminal RNA recognition
motif (RRM), also termed RBD (RNA binding domain) or RNP
(ribonucleoprotein domain), and a C-terminal RRP7 domain.
Length = 118
Score = 28.4 bits (64), Expect = 9.1
Identities = 29/103 (28%), Positives = 48/103 (46%), Gaps = 25/103 (24%)
Query: 1147 DYTDSLTEKEWLKA-IDDGVE-YDDEEEEEEEE----------------VRSKRKGKRRK 1188
+Y S + L++ +D+ +E +D EEEE+EE R RKGK +
Sbjct: 8 EYKRSRPDPAELQSEVDEYMEEFDKREEEEKEEAKEARNEPDEDGFVTVTRGGRKGKTAR 67
Query: 1189 KTEDDDEEPSTSKKRKKEKEKD-------REKDQAKLKKTLKK 1224
+ + + KK+KK+KE + REK + +L + KK
Sbjct: 68 EEAVEAKAKEKEKKKKKKKELEDFYRFQIREKKKEELAELRKK 110
>gnl|CDD|128795 smart00521, CBF, CCAAT-Binding transcription Factor.
Length = 62
Score = 27.4 bits (61), Expect = 9.5
Identities = 19/44 (43%), Positives = 23/44 (52%), Gaps = 9/44 (20%)
Query: 271 VNVKAYKRTKRQGLKEARATEKLEKQQKVEAERKK-----RQKH 309
VN K Y R R+ ++ARA KLE Q K+ ERK R H
Sbjct: 8 VNAKQYHRILRR--RQARA--KLEAQGKLPKERKPYLHESRHLH 47
>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62.
Length = 217
Score = 29.4 bits (66), Expect = 9.6
Identities = 14/62 (22%), Positives = 23/62 (37%), Gaps = 7/62 (11%)
Query: 1171 EEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKK-------EKEKDREKDQAKLKKTLK 1223
E E+ + + K + K D+ K K +K +EK + K K K
Sbjct: 17 ESEKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAERVKKLHSQEKKEEKKKPKKK 76
Query: 1224 KI 1225
K+
Sbjct: 77 KV 78
>gnl|CDD|150091 pfam09310, PD-C2-AF1, POU domain, class 2, associating factor 1.
Members of this family are transcriptional coactivators
that specifically associate with either OCT1 or OCT2,
through recognition of their POU domains. They are
essential for the response of B-cells to antigens and
required for the formation of germinal centres.
Length = 264
Score = 29.8 bits (66), Expect = 9.8
Identities = 12/50 (24%), Positives = 20/50 (40%), Gaps = 6/50 (12%)
Query: 114 QPLTPQLAMGVQGKRMEGVPSGPQMPPMSLHGPMPMPPSQPMPNQAQPMP 163
PL+ A +Q + PQ P+ P+P +P P + + P
Sbjct: 187 PPLSACPANTLQYQPASSTLPAPQFLPL------PIPIPEPAPQEEEDAP 230
>gnl|CDD|217829 pfam03985, Paf1, Paf1. Members of this family are components of the
RNA polymerase II associated Paf1 complex. The Paf1
complex functions during the elongation phase of
transcription in conjunction with Spt4-Spt5 and
Spt16-Pob3.
Length = 431
Score = 30.1 bits (68), Expect = 9.8
Identities = 12/46 (26%), Positives = 24/46 (52%), Gaps = 3/46 (6%)
Query: 1168 DDEEEEEEEEVRSKRKGKRRKKTEDDDEEPSTSKKRKKEKEKDREK 1213
D++E+EEEE+ + + ++ ED +EE S S++ +
Sbjct: 371 DEDEDEEEEQRSDEHEE---EEGEDSEEEGSQSREDGSSESSSDVG 413
>gnl|CDD|217752 pfam03833, PolC_DP2, DNA polymerase II large subunit DP2.
Length = 852
Score = 30.1 bits (68), Expect = 10.0
Identities = 16/45 (35%), Positives = 23/45 (51%), Gaps = 4/45 (8%)
Query: 1143 RKQVDYTDSLTEK--EWLKAIDDGVEYDDEEEEEEEEVRSKRKGK 1185
K + YTD L + +WLK + E D+E+ EE+ SK K
Sbjct: 248 PKILKYTDKLGIEGWDWLKDLSKKKE--DKEDTEEKVAVSKPSDK 290
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.313 0.131 0.370
Gapped
Lambda K H
0.267 0.0724 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 68,798,036
Number of extensions: 7088560
Number of successful extensions: 16542
Number of sequences better than 10.0: 1
Number of HSP's gapped: 13221
Number of HSP's successfully gapped: 1051
Length of query: 1331
Length of database: 10,937,602
Length adjustment: 109
Effective length of query: 1222
Effective length of database: 6,103,016
Effective search space: 7457885552
Effective search space used: 7457885552
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 65 (28.8 bits)
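
Note on the statistics block above: the effective lengths and the effective search space appear to follow from the reported length adjustment. The sketch below reproduces that arithmetic, assuming the adjustment is subtracted once from the query and once per database sequence. The function name and constants are illustrative and are not part of the RPS-BLAST output.

```python
# Values copied from the statistics block of this report.
QUERY_LENGTH = 1331          # "Length of query"
DB_LETTERS = 10_937_602      # "Length of database"
DB_SEQUENCES = 44_354        # "Number of Sequences"
LENGTH_ADJUSTMENT = 109      # "Length adjustment"


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective database length, search space).

    Assumption: the length adjustment is removed once from the query and
    once from every database sequence before the two lengths are multiplied.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


eff_query, eff_db, space = effective_search_space(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
print(eff_query, eff_db, space)
```

Under that assumption the three results agree with the "Effective length of query", "Effective length of database", and "Effective search space" lines of the report.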