RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy13778
(295 letters)
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 46.6 bits (110), Expect = 1e-05
Identities = 30/97 (30%), Positives = 48/97 (49%), Gaps = 6/97 (6%)
Query: 67 PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQG 126
PTL + P + P P+ P P+ P + P Q PT + P LP+ +P+ P P+
Sbjct: 573 PTLSKKP---EFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629
Query: 127 PTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGP 163
P P+ P Q P+ + P +GP +++ P P+ P
Sbjct: 630 PKSPKRPPPPQRPSSPERP---EGPKIIKSPKPPKSP 663
Score = 45.8 bits (108), Expect = 2e-05
Identities = 38/135 (28%), Positives = 55/135 (40%), Gaps = 19/135 (14%)
Query: 79 PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQG 138
PTL + P P+ P +DP P+ P + P Q PT P+ P LP+ +P+ P +
Sbjct: 573 PTLSKKPEFPKDPKHPKDPEEPKKP---KRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629
Query: 139 PTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPT 198
P + P Q P+ P P+GP K P+ P P+ P P+
Sbjct: 630 PKSPKRPPPPQRPS---SPERPEGP-------------KIIKSPKPPKSPKPPFDPKFKE 673
Query: 199 LLQDPTLLQGPRYKE 213
D L + KE
Sbjct: 674 KFYDDYLDAAAKSKE 688
Score = 45.8 bits (108), Expect = 2e-05
Identities = 32/111 (28%), Positives = 50/111 (45%), Gaps = 12/111 (10%)
Query: 55 PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
PTL + P P+ P + P + P P+ P Q PT + P LP+ + + P P+
Sbjct: 573 PTLSKKPEFPKDP---KHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629
Query: 115 PTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTL 165
P P+ P PQ P+ P+ P +GP +++ P + P P P
Sbjct: 630 PKSPKRPPPPQRPSSPERP---------EGPKIIKSPKPPKSPKPPFDPKF 671
Score = 44.3 bits (104), Expect = 6e-05
Identities = 24/81 (29%), Positives = 38/81 (46%)
Query: 53 QGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLP 112
+ P + P P+ P + P Q PT P+ P LP+ + + P P+ P + P P
Sbjct: 580 EFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPESPKSPKRPPPP 639
Query: 113 QGPTLPQGPTLPQGPTLPQGP 133
Q P+ P+ P P+ P+ P
Sbjct: 640 QRPSSPERPEGPKIIKSPKPP 660
>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1. Members
of this family are necessary for accurate chromosome
transmission during cell division.
Length = 804
Score = 40.1 bits (94), Expect = 0.001
Identities = 42/160 (26%), Positives = 46/160 (28%), Gaps = 7/160 (4%)
Query: 52 LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQG-PTLPQGPTLLQDPTLPQGPTLLQGPT 110
LQ P L QG Q QG P P G P +Q
Sbjct: 163 LQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPP----QGHPEQVQPQQ 218
Query: 111 LPQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPT 170
P+ P P LPQ P LQ P + P Q P P
Sbjct: 219 FLPAPS-QAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPP 277
Query: 171 LLQGPRKGPTLPQGPTLPQGP-TLPQDPTLLQDPTLLQGP 209
P P LPQG P P PQ L+Q P Q
Sbjct: 278 PQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRG 317
Score = 35.9 bits (83), Expect = 0.025
Identities = 29/173 (16%), Positives = 37/173 (21%), Gaps = 11/173 (6%)
Query: 53 QGPTLLQGPTLPQG---------PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGP 103
LPQG P Q L P+
Sbjct: 170 PQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQ 229
Query: 104 TLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGP 163
L Q P L Q +P P Q P Q P P
Sbjct: 230 PPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLP 289
Query: 164 TLLQGPT-LLQGPRKGPTLPQGPTLPQGPTLPQDPT-LLQDPTLLQGPRYKEK 214
P Q P+ P + Q +GP + L Q ++
Sbjct: 290 QGQNAPLPPPQQPQLLPLVQQPQGQQRGPQFREQLVQLSQQQREALSQEEAKR 342
Score = 35.5 bits (82), Expect = 0.033
Identities = 42/156 (26%), Positives = 46/156 (29%), Gaps = 7/156 (4%)
Query: 59 QGPTLPQGPTLLQGPTLLQG-PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
+ PQ T Q L+ Q LPQG Q QGP
Sbjct: 139 APESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGP-- 196
Query: 118 PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGP-R 176
P+ P P P PQG P P LPQ P LQ P +
Sbjct: 197 PEQP--PGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQ 254
Query: 177 KGPTLPQGPTLPQGPTLPQD-PTLLQDPTLLQGPRY 211
P PQ P Q P PQ P PT G
Sbjct: 255 MPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQ 290
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 39.2 bits (91), Expect = 0.002
Identities = 29/134 (21%), Positives = 37/134 (27%)
Query: 62 TLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGP 121
P P PT P P P P+L ++ G + + P P P
Sbjct: 2819 PPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAP 2878
Query: 122 TLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTL 181
P L + + P + P Q P PQ P Q P P
Sbjct: 2879 ARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPR 2938
Query: 182 PQGPTLPQGPTLPQ 195
PQ P P
Sbjct: 2939 PQPPLAPTTDPAGA 2952
Score = 38.4 bits (89), Expect = 0.004
Identities = 27/140 (19%), Positives = 34/140 (24%), Gaps = 1/140 (0%)
Query: 60 GPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQ 119
P P P + GP P GP P P P P P +
Sbjct: 2729 RQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAV 2788
Query: 120 GPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQ-GPTLLQGPRKG 178
+LP P + P P LP + P GP
Sbjct: 2789 ASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPP 2848
Query: 179 PTLPQGPTLPQGPTLPQDPT 198
G P G + P+
Sbjct: 2849 SLPLGGSVAPGGDVRRRPPS 2868
Score = 36.8 bits (85), Expect = 0.014
Identities = 35/150 (23%), Positives = 40/150 (26%)
Query: 59 QGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLP 118
PT P L T L P P P +P GP GP P P
Sbjct: 2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTT 2763
Query: 119 QGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKG 178
GP P P P + +L P + P P
Sbjct: 2764 AGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAS 2823
Query: 179 PTLPQGPTLPQGPTLPQDPTLLQDPTLLQG 208
P P P PT P P P+L G
Sbjct: 2824 PAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853
Score = 36.1 bits (83), Expect = 0.024
Identities = 31/147 (21%), Positives = 35/147 (23%)
Query: 53 QGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLP 112
PT P T L P P P P + P P GP P
Sbjct: 2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTT 2763
Query: 113 QGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLL 172
GP P P P + +L P P P
Sbjct: 2764 AGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAS 2823
Query: 173 QGPRKGPTLPQGPTLPQGPTLPQDPTL 199
P PT P P P P+L
Sbjct: 2824 PAGPLPPPTSAQPTAPPPPPGPPPPSL 2850
Score = 34.1 bits (78), Expect = 0.10
Identities = 34/162 (20%), Positives = 42/162 (25%), Gaps = 14/162 (8%)
Query: 61 PTLPQGPTLLQGPTLLQGPTLPQGP---TLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
P +P GP GP P GP P P L + + P
Sbjct: 2742 PAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLP-S 2800
Query: 118 PQGPTLPQGPTLPQGPTL--LQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQG- 174
P P P L L P P PT P P P+L G ++ G
Sbjct: 2801 PWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGG 2860
Query: 175 -------PRKGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGP 209
R P P P L + + P
Sbjct: 2861 DVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPP 2902
Score = 34.1 bits (78), Expect = 0.11
Identities = 35/143 (24%), Positives = 42/143 (29%), Gaps = 5/143 (3%)
Query: 58 LQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
Q P + T L P P P L+ LP GP + + P P
Sbjct: 2680 PQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQAS-PALPAA 2738
Query: 118 PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL-LQGPR 176
P P +P GP P GP P GP P P L + L R
Sbjct: 2739 PAPPAVPAGPATPGGPARPARPPTTAGPP---APAPPAAPAAGPPRRLTRPAVASLSESR 2795
Query: 177 KGPTLPQGPTLPQGPTLPQDPTL 199
+ P P P L L
Sbjct: 2796 ESLPSPWDPADPPAAVLAPAAAL 2818
Score = 32.6 bits (74), Expect = 0.26
Identities = 35/207 (16%), Positives = 53/207 (25%), Gaps = 13/207 (6%)
Query: 52 LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTL 111
L G P G + P+ P P P L + + P
Sbjct: 2846 PPPSLPLGGSVAPGGDVRRRPPSR-SPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQ 2904
Query: 112 PQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL 171
P+ P PQ P PQ P P Q P + + P L
Sbjct: 2905 PERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWL 2964
Query: 172 LQGPRKGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGPRY----KEKELVERSAYGETE- 226
+P+ P+ + T R L E +
Sbjct: 2965 GALVPGRVAVPRFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPPPVSL 3024
Query: 227 -------DEEEESDTEDIRDDDDDDME 246
D+ E+SD + + D D + +
Sbjct: 3025 KQTLWPPDDTEDSDADSLFDSDSERSD 3051
Score = 32.2 bits (73), Expect = 0.39
Identities = 29/128 (22%), Positives = 37/128 (28%), Gaps = 3/128 (2%)
Query: 79 PTLPQGPTLPQGPTL--LQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLL 136
P P L L P P P PT P P P P+LP G ++ G +
Sbjct: 2804 PADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVR 2863
Query: 137 QGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQD 196
+ P P L + P + + P P P P P P
Sbjct: 2864 RRPPSRSPAAKPAAPARPPVRRLAR-PAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQ 2922
Query: 197 PTLLQDPT 204
P P
Sbjct: 2923 PPPPPQPQ 2930
Score = 29.5 bits (66), Expect = 2.4
Identities = 29/139 (20%), Positives = 38/139 (27%), Gaps = 2/139 (1%)
Query: 73 PTLLQGPTLPQGPTLP--QGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLP 130
P L LP P P PT P P P+LP G ++ G +
Sbjct: 2804 PADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVR 2863
Query: 131 QGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQG 190
+ P P L + + P + P + P PQ
Sbjct: 2864 RRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQP 2923
Query: 191 PTLPQDPTLLQDPTLLQGP 209
P PQ P Q P
Sbjct: 2924 PPPPQPQPPPPPPPRPQPP 2942
>gnl|CDD|218556 pfam05327, RRN3, RNA polymerase I specific transcription initiation
factor RRN3. This family consists of several eukaryotic
proteins which are homologous to the yeast RRN3 protein.
RRN3 is one of the RRN genes specifically required for
the transcription of rDNA by RNA polymerase I (Pol I) in
Saccharomyces cerevisiae.
Length = 554
Score = 38.8 bits (91), Expect = 0.003
Identities = 19/60 (31%), Positives = 29/60 (48%), Gaps = 5/60 (8%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGTRELLTMKLDSCLH 283
+ +DEEEE D DDD+DDM + D E+ +R +E+ KLD+ +
Sbjct: 224 DIDDEEEERVLADEDDDDEDDMFDM--DDDDEEESDPE--VERTSTIKEVSE-KLDAIMD 278
>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM). This
family consists of several Plasmodium falciparum SPAM
(secreted polymorphic antigen associated with
merozoites) proteins. Variation among SPAM alleles is
the result of deletions and amino acid substitutions in
non-repetitive sequences within and flanking the alanine
heptad-repeat domain. Heptad repeats in which the a and
d position contain hydrophobic residues generate
amphipathic alpha-helices which give rise to helical
bundles or coiled-coil structures in proteins. SPAM is
an example of a P. falciparum antigen in which a
repetitive sequence has features characteristic of a
well-defined structural element.
Length = 164
Score = 37.5 bits (87), Expect = 0.003
Identities = 19/48 (39%), Positives = 28/48 (58%), Gaps = 3/48 (6%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIR--DDDDDDMEIVVCTSQDSED 257
KE E V+ E ++EEEE D E+I +D +D+ EIV ++ ED
Sbjct: 38 KENEDVKDEK-QEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEED 84
>gnl|CDD|214818 smart00784, SPT2, SPT2 chromatin protein. This entry includes the
Saccharomyces cerevisiae protein SPT2 which is a
chromatin protein involved in transcriptional
regulation.
Length = 106
Score = 36.2 bits (84), Expect = 0.003
Identities = 17/52 (32%), Positives = 25/52 (48%), Gaps = 15/52 (28%)
Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIR---------------DDDDDDME 246
Y E+E + + E +DEE++ D ++I DDDDDDME
Sbjct: 14 DYDEEEDEDMDDFIEDDDEEDDYDRDEIWAMFNKGRKRYAYRDDDDDDDDME 65
Score = 33.1 bits (76), Expect = 0.044
Identities = 10/37 (27%), Positives = 17/37 (45%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIV 248
L + DEEE+ D +D +DDD++ +
Sbjct: 1 TSPRLERSRRSRDDYDEEEDEDMDDFIEDDDEEDDYD 37
>gnl|CDD|218303 pfam04874, Mak16, Mak16 protein C-terminal region. The precise
function of this eukaryotic protein family is unknown.
The yeast orthologues have been implicated in cell cycle
progression and biogenesis of 60S ribosomal subunits.
The Schistosoma mansoni Mak16 has been shown to target
protein transport to the nucleolus.
Length = 97
Score = 34.8 bits (80), Expect = 0.009
Identities = 21/76 (27%), Positives = 30/76 (39%), Gaps = 36/76 (47%)
Query: 213 EKELVER---SAYG----------------------------ETEDEEEESDTEDIRDDD 241
EKEL+ER YG E E+EE+E + E + DD+
Sbjct: 27 EKELLERLKQGTYGDEPYNISQSAFKKALEAEESEENDEEEEEEEEEEDEGEIEYVSDDE 86
Query: 242 DDDMEIVVCTSQDSED 257
+ + EI +D ED
Sbjct: 87 ELEEEI-----EDLED 97
>gnl|CDD|165146 PHA02781, PHA02781, hypothetical protein; Provisional.
Length = 78
Score = 33.9 bits (77), Expect = 0.011
Identities = 17/40 (42%), Positives = 24/40 (60%), Gaps = 3/40 (7%)
Query: 210 RYKEKE--LVERSAYGETED-EEEESDTEDIRDDDDDDME 246
+ KEK+ L + A E +D + EE + DI DDDD D+E
Sbjct: 37 KKKEKDVLLAQSVAVEEAKDVKVEEKNIIDIEDDDDMDVE 76
>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
Length = 80
Score = 33.2 bits (76), Expect = 0.022
Identities = 15/32 (46%), Positives = 19/32 (59%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDD 244
EK + RS E E+ E++ D ED DDDD D
Sbjct: 35 EKVEIARSVMIEGEEPEDDDDDEDDDDDDDKD 66
Score = 32.5 bits (74), Expect = 0.035
Identities = 11/23 (47%), Positives = 17/23 (73%)
Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
+ ED++++ D +D DDDDDD E
Sbjct: 55 DDEDDDDDDDKDDKDDDDDDDDE 77
Score = 31.7 bits (72), Expect = 0.068
Identities = 14/38 (36%), Positives = 20/38 (52%), Gaps = 9/38 (23%)
Query: 223 GET---EDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
GE +D++E+ D +D +DD DDD D ED
Sbjct: 47 GEEPEDDDDDEDDDDDDDKDDKDDD------DDDDDED 78
>gnl|CDD|218889 pfam06088, TLP-20, Nucleopolyhedrovirus telokin-like protein-20
(TLP20). This family consists of several
Nucleopolyhedrovirus telokin-like protein-20 (TLP20)
sequences. The function of this family is unknown but
TLP20 is known to share some antigenic similarities to
the smooth muscle protein telokin although the amino
acid sequence shows no homologies to telokin.
Length = 162
Score = 34.2 bits (79), Expect = 0.030
Identities = 10/47 (21%), Positives = 17/47 (36%), Gaps = 7/47 (14%)
Query: 220 SAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
SA E+E+D ++ + D +E D P K+
Sbjct: 114 SAPVPHHSSEDENDEDEEDNADRAGIE-------SGIDDSAPPSPKK 153
>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
domain. The centromere protein B (CENP-B) dimerisation
domain is composed of two alpha-helices, which are
folded into an antiparallel configuration. Dimerisation
of CENP-B is mediated by this domain, in which monomers
dimerise to form a symmetrical, antiparallel, four-helix
bundle structure with a large hydrophobic patch in which
23 residues of one monomer form van der Waals contacts
with the other monomer. This CENP-B dimer configuration
may be suitable for capturing two distant CENP-B boxes
during centromeric heterochromatin formation.
Length = 101
Score = 33.2 bits (75), Expect = 0.035
Identities = 18/41 (43%), Positives = 24/41 (58%), Gaps = 7/41 (17%)
Query: 227 DEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRM 267
DEEE+ D ED DDD+DD E D +++P PS + M
Sbjct: 17 DEEEDDDDEDEEDDDEDDDE-------DDDEVPVPSFGEAM 50
Score = 28.6 bits (63), Expect = 1.3
Identities = 12/26 (46%), Positives = 19/26 (73%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVV 249
E ED+++E + +D DDD+DD E+ V
Sbjct: 18 EEEDDDDEDEEDDDEDDDEDDDEVPV 43
>gnl|CDD|235033 PRK02363, PRK02363, DNA-directed RNA polymerase subunit delta;
Reviewed.
Length = 129
Score = 32.7 bits (75), Expect = 0.077
Identities = 9/45 (20%), Positives = 22/45 (48%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
+E ++ + +++ D + + DDD D+ ++ +D ED
Sbjct: 83 LEEKFDKKKKKFMDGDDDIIDDDILPDDDFDEEDLDEEDDEDEED 127
Score = 29.6 bits (67), Expect = 0.89
Identities = 7/23 (30%), Positives = 15/23 (65%)
Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
+D+ +E D ++ D+D++D E
Sbjct: 107 LPDDDFDEEDLDEEDDEDEEDEE 129
>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
Members of this family are bacterial proteins with a
conserved motif [KR]FYDLN, sometimes flanked by a pair
of CXXC motifs, followed by a long region of low
complexity sequence in which roughly half the residues
are Asp and Glu, including multiple runs of five or more
acidic residues. The function of members of this family
is unknown.
Length = 104
Score = 31.1 bits (71), Expect = 0.16
Identities = 11/36 (30%), Positives = 22/36 (61%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEI 247
K+ E E +D++++ D +D+ D DDDD+++
Sbjct: 55 KKDEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDVDL 90
Score = 27.7 bits (62), Expect = 3.3
Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 5/48 (10%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
K + + +DE+EE + + + DDDDDD + D DL
Sbjct: 42 KSRAPAADAEDAAKKDEDEEDEDDVVLDDDDDDDD-----DDDLPDLD 84
Score = 27.3 bits (61), Expect = 3.5
Identities = 10/35 (28%), Positives = 18/35 (51%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
E E E + +D++++ D DDDD D++
Sbjct: 57 DEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDVDLD 91
Score = 27.3 bits (61), Expect = 4.1
Identities = 7/41 (17%), Positives = 15/41 (36%)
Query: 209 PRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVV 249
++ + E + ++ D +D DD D + V
Sbjct: 48 ADAEDAAKKDEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDV 88
>gnl|CDD|218538 pfam05285, SDA1, SDA1. This family consists of several SDA1
protein homologues. SDA1 is a Saccharomyces cerevisiae
protein which is involved in the control of the actin
cytoskeleton. The protein is essential for cell
viability and is localised in the nucleus.
Length = 317
Score = 33.1 bits (76), Expect = 0.16
Identities = 19/66 (28%), Positives = 29/66 (43%), Gaps = 15/66 (22%)
Query: 205 LLQGPRYKEKELVERSA-YGETED---------EEEESDTEDIRD---DDDDDMEIVVCT 251
LL+ ++KE+E ++ A G D E EE + D D + D EI
Sbjct: 74 LLE--KWKEEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGEWIDVESDKEIESSD 131
Query: 252 SQDSED 257
S+D E+
Sbjct: 132 SEDEEE 137
>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
subunit 1; Provisional.
Length = 319
Score = 32.7 bits (75), Expect = 0.18
Identities = 16/57 (28%), Positives = 28/57 (49%), Gaps = 5/57 (8%)
Query: 201 QDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
+P ++ G +EL+E + E+EEEE D + D+D++D + D D
Sbjct: 267 GEPEVVGGDEEDLEELLE-----KAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEGD 318
Score = 32.7 bits (75), Expect = 0.19
Identities = 14/35 (40%), Positives = 22/35 (62%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
K +E E Y E+EDE+EE + E+ +DDD+ +
Sbjct: 285 KAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEGDK 319
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 32.8 bits (75), Expect = 0.21
Identities = 37/203 (18%), Positives = 57/203 (28%), Gaps = 27/203 (13%)
Query: 61 PTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG----PT 116
P P GPT+ P L +P T + L P + LQ P+
Sbjct: 104 PAGPAGPTIQTEPGQLYPVQVPVMVTQNPANSPLDQPAQQRALQQLQQRYGAPASGQLPS 163
Query: 117 LPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPR 176
Q L Q P P G + TL Q +G
Sbjct: 164 QQQSAQKNDESQLQQQPNGETPPQQTDGA--GDDESEALVRLREADGTLEQRIKGAEGGG 221
Query: 177 KGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGPRYKEKELVER-SAYGETEDEEEESDTE 235
L Q + + + ++ + E + +++ D +
Sbjct: 222 ------------------AMKVLKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDDED 263
Query: 236 DIRDDDDDDMEIVVCTSQDSEDL 258
I D DD + V +D EDL
Sbjct: 264 AIESDLDDSDDDVS--DEDGEDL 284
Score = 31.3 bits (71), Expect = 0.50
Identities = 41/240 (17%), Positives = 59/240 (24%), Gaps = 45/240 (18%)
Query: 52 LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTL 111
Q P L P Q L Q T P T P GPT+ P
Sbjct: 60 AQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAG-PAGPTIQTEPGQ 118
Query: 112 PQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL 171
+P T + P + LQ Q P+ Q L
Sbjct: 119 LYPVQVPVMVTQNPANSPLDQPAQQRALQQLQQRYGAPA--SGQLPSQQQSAQKNDESQL 176
Query: 172 LQGPRKGPTLPQGPTLPQGPTLP------------------------QDPTLLQDPTLLQ 207
Q P P T G L Q +
Sbjct: 177 QQQP--NGETPPQQTDGAGDDESEALVRLREADGTLEQRIKGAEGGGAMKVLKQPKKQAK 234
Query: 208 GPRYKEKELVERSAYGETEDEEEESDTED-----IRDDDDD----------DME-IVVCT 251
+ + ++ + D ++ D ED + D DDD D + +++C
Sbjct: 235 SSKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESDLDDSDDDVSDEDGEDLFDTDNVMLCQ 294
>gnl|CDD|217783 pfam03896, TRAP_alpha, Translocon-associated protein (TRAP), alpha
subunit. The alpha-subunit of the TRAP complex (TRAP
alpha) is a single-spanning membrane protein of the
endoplasmic reticulum (ER) which is found in proximity
of nascent polypeptide chains translocating across the
membrane.
Length = 281
Score = 32.1 bits (73), Expect = 0.29
Identities = 15/49 (30%), Positives = 21/49 (42%), Gaps = 4/49 (8%)
Query: 218 ERSAYGETEDEEEESDTEDIRDDDD----DDMEIVVCTSQDSEDLPGPS 262
SA TEDEE E D D ++D+ +D + +D E S
Sbjct: 28 FASAQDLTEDEEAEDDVVDEDEEDEAVVEEDENELTEEEEDEEGEVKAS 76
Score = 28.6 bits (64), Expect = 3.5
Identities = 10/44 (22%), Positives = 22/44 (50%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSE 256
++E + + EDE + E+ ++++D E V S D++
Sbjct: 37 DEEAEDDVVDEDEEDEAVVEEDENELTEEEEDEEGEVKASPDAD 80
>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family. Emg1 and Nop14 are novel
proteins whose interaction is required for the
maturation of the 18S rRNA and for 40S ribosome
production.
Length = 809
Score = 32.3 bits (74), Expect = 0.29
Identities = 12/35 (34%), Positives = 16/35 (45%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDL 258
EDEEEE D D D++DDD ++
Sbjct: 319 GEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSD 353
Score = 30.4 bits (69), Expect = 1.5
Identities = 13/24 (54%), Positives = 18/24 (75%)
Query: 223 GETEDEEEESDTEDIRDDDDDDME 246
GE ED+EEE D+++ DD DD+ E
Sbjct: 284 GEEEDDEEEEDSKESADDLDDEFE 307
Score = 29.6 bits (67), Expect = 2.4
Identities = 13/34 (38%), Positives = 17/34 (50%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
+ E+EE+ D ED DDDDD E E+
Sbjct: 322 DEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEE 355
Score = 28.4 bits (64), Expect = 5.6
Identities = 11/47 (23%), Positives = 22/47 (46%), Gaps = 4/47 (8%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGT 270
E E++ + SD E+ +D+D D E ++ E+ K+ +
Sbjct: 344 EEEEDVDLSDEEEDEEDEDSDDE----DDEEEEEEEKEKKKKKSAES 386
Score = 28.0 bits (63), Expect = 6.9
Identities = 11/41 (26%), Positives = 21/41 (51%)
Query: 222 YGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPS 262
E ED ++ D ED DD +++ E V + ++ ++ S
Sbjct: 323 EEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDS 363
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 32.4 bits (73), Expect = 0.31
Identities = 38/137 (27%), Positives = 44/137 (32%), Gaps = 9/137 (6%)
Query: 64 PQGPTLLQGPTLLQGPTLPQGPTL------PQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
PQGP +Q P PT PQG + P +L P+L
Sbjct: 175 PQGPPSIQVPPGAALAPSAPPPTPSAQAVPPQGSPIAAQPAPQPQQPSPL--SLISAPSL 232
Query: 118 PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRK 177
LP Q T Q P+ + GP P L QGP LQ P
Sbjct: 233 -HPQRLPSPHPPLQPQTASQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSS 291
Query: 178 GPTLPQGPTLPQGPTLP 194
P P G Q P LP
Sbjct: 292 NPPQPFGLAQSQVPPLP 308
Score = 29.7 bits (66), Expect = 2.1
Identities = 35/142 (24%), Positives = 43/142 (30%), Gaps = 11/142 (7%)
Query: 61 PTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQG 120
P L L LP Q T Q P P+ PQ G
Sbjct: 217 QPQQPSPLSLISAPSLHPQRLPSPHPPLQPQTASQQSPQPPAPS----SRHPQSSHHGPG 272
Query: 121 PTLPQGPTLPQGPTLLQGPTLL----QGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPR 176
P +P L QGP LQ P+ G Q P L P+ Q + P+
Sbjct: 273 PPMPH--ALQQGPVFLQHPSSNPPQPFGLAQSQVP-PLPLPSQAQPHSHTPPSQSALQPQ 329
Query: 177 KGPTLPQGPTLPQGPTLPQDPT 198
+ P P P P + PT
Sbjct: 330 QPPREQPLPPAPSMPHIKPPPT 351
Score = 29.3 bits (65), Expect = 2.8
Identities = 27/94 (28%), Positives = 34/94 (36%), Gaps = 4/94 (4%)
Query: 63 LPQGPTLLQGPTLL----QGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLP 118
L QGP LQ P+ G Q P LP P + LQ P+ LP
Sbjct: 279 LQQGPVFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQPQQPPREQPLP 338
Query: 119 QGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPT 152
P++P P P + P LQGP+
Sbjct: 339 PAPSMPHIKPPPTTPIPQLPNQSHKHPPHLQGPS 372
>gnl|CDD|221490 pfam12253, CAF1A, Chromatin assembly factor 1 subunit A. The CAF-1
or chromatin assembly factor-1 consists of three
subunits, and this is the first, or A. The A domain is
uniquely required for the progression of S phase in
mouse cells, independent of its ability to promote
histone deposition but dependent on its ability to
interact with HP1 - heterochromatin protein 1-rich
heterochromatin domains next to centromeres that are
crucial for chromosome segregation during mitosis. This
HP1-CAF-1 interaction module functions as a built-in
replication control for heterochromatin, which, like a
control barrier, has an impact on S-phase progression in
addition to DNA-based checkpoints.
Length = 76
Score = 29.9 bits (68), Expect = 0.32
Identities = 9/22 (40%), Positives = 15/22 (68%), Gaps = 1/22 (4%)
Query: 226 EDE-EEESDTEDIRDDDDDDME 246
+ E EEE + ED+ +D++D E
Sbjct: 45 DAEWEEEEEGEDLESEDEEDEE 66
Score = 26.8 bits (60), Expect = 3.3
Identities = 9/23 (39%), Positives = 17/23 (73%)
Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
E E+E E+ ++ED D+++DD +
Sbjct: 49 EEEEEGEDLESEDEEDEEEDDDD 71
Score = 26.8 bits (60), Expect = 3.6
Identities = 9/18 (50%), Positives = 11/18 (61%)
Query: 226 EDEEEESDTEDIRDDDDD 243
E E+EE + ED DD D
Sbjct: 58 ESEDEEDEEEDDDDDMDG 75
Score = 26.0 bits (58), Expect = 6.2
Identities = 10/22 (45%), Positives = 13/22 (59%)
Query: 223 GETEDEEEESDTEDIRDDDDDD 244
GE + E+E D E+ DDD D
Sbjct: 54 GEDLESEDEEDEEEDDDDDMDG 75
Score = 26.0 bits (58), Expect = 6.3
Identities = 8/23 (34%), Positives = 15/23 (65%)
Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
E E+EEE D E ++D+++ +
Sbjct: 47 EWEEEEEGEDLESEDEEDEEEDD 69
>gnl|CDD|220102 pfam09073, BUD22, BUD22. BUD22 has been shown in yeast to be a
nuclear protein involved in bud-site selection. It plays
a role in positioning the proximal bud pole signal. More
recently it has been shown to be involved in ribosome
biogenesis.
Length = 424
Score = 32.1 bits (73), Expect = 0.37
Identities = 14/55 (25%), Positives = 23/55 (41%), Gaps = 2/55 (3%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRM 267
E E + + D+EEE D++ D M +V +S + E PS +
Sbjct: 178 EDESKSEESAEDDSDDEEEEDSDSEDYSQYDGM--LVDSSDEEEGEEAPSINYNE 230
>gnl|CDD|240226 PTZ00007, PTZ00007, (NAP-L) nucleosome assembly protein -L;
Provisional.
Length = 337
Score = 31.7 bits (72), Expect = 0.39
Identities = 9/53 (16%), Positives = 17/53 (32%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGTRELLTM 276
+ +E++ D + D + + ED G S + LT
Sbjct: 285 DYSSDEDDDDYDSYDSSDSASSDSNSDVDTNEEDDRGEKESNGAKSNELHLTS 337
>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin. Nucleoplasmins are also
known as chromatin decondensation proteins. They bind to
core histones and transfer DNA to them in a reaction
that requires ATP. This is thought to play a role in the
assembly of regular nucleosomal arrays.
Length = 146
Score = 30.8 bits (70), Expect = 0.46
Identities = 10/24 (41%), Positives = 17/24 (70%)
Query: 223 GETEDEEEESDTEDIRDDDDDDME 246
E++D+EE+ + ED +DDD+D
Sbjct: 112 DESDDDEEDEEEEDDEEDDDEDES 135
Score = 30.0 bits (68), Expect = 0.79
Identities = 13/25 (52%), Positives = 16/25 (64%), Gaps = 2/25 (8%)
Query: 222 YGETEDEEEESDTEDIRDDDDDDME 246
+ EDEEEE D ED DD+D+ E
Sbjct: 115 DDDEEDEEEEDDEED--DDEDESEE 137
Score = 27.3 bits (61), Expect = 6.7
Identities = 7/21 (33%), Positives = 15/21 (71%)
Query: 226 EDEEEESDTEDIRDDDDDDME 246
E++E + D ED ++DD++ +
Sbjct: 110 EEDESDDDEEDEEEEDDEEDD 130
>gnl|CDD|148139 pfam06346, Drf_FH1, Formin Homology Region 1. This region is found
in some of the Diaphanous related formins (Drfs). It
consists of low complexity repeats of around 12
residues.
Length = 160
Score = 30.7 bits (69), Expect = 0.48
Identities = 44/144 (30%), Positives = 54/144 (37%), Gaps = 1/144 (0%)
Query: 55 PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
P L G +P P L G + P LP G +P P L +P P L +P
Sbjct: 4 PPLPGGVGIPPPPPLPGGVCIPPPPPLPGGTGIPPPPPLPGGAAIPPPPPLPGVAGIPPP 63
Query: 115 PTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPT-LPQGPTLLQGPTLLQ 173
P LP +P P LP + P L G + P L G +P P L GP +
Sbjct: 64 PPLPGATAIPPPPPLPGAAGIPPPPPLPGGAGIPPPPPPLPGGAAVPPPPPLPGGPGVPP 123
Query: 174 GPRKGPTLPQGPTLPQGPTLPQDP 197
P P P P P G P P
Sbjct: 124 PPPPFPGAPGIPPPPPGMGSPPPP 147
>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
Rpc31. RNA polymerase III contains seventeen subunits
in yeasts and in human cells. Twelve of these are akin
to RNA polymerase I or II and the other five are RNA pol
III-specific, and form the functionally distinct groups
(i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
Rpc34 and Rpc82 form a cluster of enzyme-specific
subunits that contribute to transcription initiation in
S.cerevisiae and H.sapiens. There is evidence that these
subunits are anchored at or near the N-terminal Zn-fold
of Rpc1, itself prolonged by a highly conserved but RNA
polymerase III-specific domain.
Length = 221
Score = 30.9 bits (70), Expect = 0.53
Identities = 15/30 (50%), Positives = 19/30 (63%)
Query: 215 ELVERSAYGETEDEEEESDTEDIRDDDDDD 244
++ E E E+EEEE + ED DDDDDD
Sbjct: 167 DVDEEDEKDEEEEEEEEEEDEDFDDDDDDD 196
Score = 30.9 bits (70), Expect = 0.58
Identities = 11/21 (52%), Positives = 16/21 (76%)
Query: 226 EDEEEESDTEDIRDDDDDDME 246
E+EEEE + ++ DDDDDD +
Sbjct: 177 EEEEEEEEEDEDFDDDDDDDD 197
Score = 30.1 bits (68), Expect = 1.0
Identities = 16/37 (43%), Positives = 22/37 (59%), Gaps = 2/37 (5%)
Query: 212 KEKELV-ERSAYGETEDEEEESDTEDIRDD-DDDDME 246
K KEL E + +DEEEE + E+ +D DDDD +
Sbjct: 159 KLKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDDDD 195
Score = 29.3 bits (66), Expect = 2.0
Identities = 13/22 (59%), Positives = 16/22 (72%)
Query: 223 GETEDEEEESDTEDIRDDDDDD 244
E E+EEE+ D +D DDDDDD
Sbjct: 178 EEEEEEEEDEDFDDDDDDDDDD 199
>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain. This
family represents the C-terminus (approximately 300
residues) of proteins that are involved as binding
partners for Prp19 as part of the nuclear pore complex.
The family in Drosophila is necessary for pre-mRNA
splicing, and the human protein has been found in
purifications of the spliceosome. In the past this
family was thought, erroneously, to be associated with
microfibrillin.
Length = 277
Score = 31.0 bits (70), Expect = 0.59
Identities = 14/34 (41%), Positives = 20/34 (58%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
E E++E E+ +EEEE E+ D +DDME
Sbjct: 1 ETEVLELEEEDESGEEEEEESEEEEETDSEDDME 34
>gnl|CDD|219922 pfam08595, RXT2_N, RXT2-like, N-terminal. The family represents
the N-terminal region of RXT2-like proteins. In S.
cerevisiae, RXT2 has been demonstrated to be involved in
conjugation with cellular fusion (mating) and invasive
growth. A high throughput localisation study has
localised RXT2 to the nucleus.
Length = 141
Score = 30.1 bits (68), Expect = 0.61
Identities = 16/64 (25%), Positives = 28/64 (43%), Gaps = 12/64 (18%)
Query: 209 PRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDM-------EIVVCTSQDSEDLPGP 261
PR E + +DE++E D E +DDD++ EI+ + S+ P
Sbjct: 45 PRIDEDGGDI-----DDDDEDDEDDEEADAEDDDENPYKLIRLEEILAPLTHPSDLPTHP 99
Query: 262 SHSK 265
+ S+
Sbjct: 100 AISR 103
>gnl|CDD|225880 COG3343, RpoE, DNA-directed RNA polymerase, delta subunit
[Transcription].
Length = 175
Score = 30.5 bits (69), Expect = 0.62
Identities = 12/29 (41%), Positives = 17/29 (58%), Gaps = 1/29 (3%)
Query: 220 SAYGETEDEEEESDTEDIRDDDDDDMEIV 248
E +DE + D E+ D+D+DD EIV
Sbjct: 123 DKEEEEDDEVDSLDDEN-DDEDEDDDEIV 150
>gnl|CDD|218333 pfam04931, DNA_pol_phi, DNA polymerase phi. This family includes
the fifth essential DNA polymerase in yeast EC:2.7.7.7.
Pol5p is localised exclusively to the nucleolus and
binds near or at the enhancer region of rRNA-encoding
DNA repeating units.
Length = 784
Score = 31.4 bits (71), Expect = 0.63
Identities = 15/46 (32%), Positives = 22/46 (47%), Gaps = 1/46 (2%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
K E R E EEE+ D + DDD+D+ E + + +SE
Sbjct: 633 KADENKSRHQ-QLFEGEEEDEDDLEETDDDEDECEAIEDSESESES 677
Score = 29.1 bits (65), Expect = 3.6
Identities = 13/64 (20%), Positives = 22/64 (34%), Gaps = 4/64 (6%)
Query: 213 EKELVERSAYGETEDEEEES---DTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRG 269
E E + ET+D+E+E + + + D + D+E G
Sbjct: 646 EGEEEDEDDLEETDDDEDECEAIEDSESESESDGEDGEEDEQEDDAEANEGVVPI-DKAV 704
Query: 270 TREL 273
R L
Sbjct: 705 RRAL 708
>gnl|CDD|130706 TIGR01645, half-pint, poly-U binding splicing factor, half-pint
family. The proteins represented by this model contain
three RNA recognition motifs (rrm: pfam00076) and have
been characterized as poly-pyrimidine tract binding
proteins associated with RNA splicing factors. PUF60
(GP|6176532), in complex with p54 and in the presence
of U2AF, facilitates association of U2 snRNP with
pre-mRNA.
Length = 612
Score = 31.2 bits (70), Expect = 0.74
Identities = 26/107 (24%), Positives = 38/107 (35%), Gaps = 11/107 (10%)
Query: 125 QGPTLPQG--PTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLP 182
Q P P PT + ++ P LPQ + P ++ P P P
Sbjct: 329 QSPATPSSSLPTDIGNKAVVSSAKKEAEEV----PPLPQAAPAVVKPGPMEIPT--PVPP 382
Query: 183 QGPTLPQGPTLPQ--DPTLLQDPTLLQGPRYKEKELVERSAYGETED 227
G +P P PT + +P+ L PR K K +G +D
Sbjct: 383 PGLAIPSLVAPPGLVAPTEI-NPSFLASPRKKMKREKLPVTFGALDD 428
Score = 30.0 bits (67), Expect = 1.4
Identities = 23/108 (21%), Positives = 33/108 (30%), Gaps = 14/108 (12%)
Query: 59 QGPTLPQG--PTLLQGPTLLQGP-TLPQG-PTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
Q P P PT + ++ + P LPQ + P + PT P P G
Sbjct: 329 QSPATPSSSLPTDIGNKAVVSSAKKEAEEVPPLPQAAPAVVKPGPMEIPT----PVPPPG 384
Query: 115 PTLPQGPTLPQ--GPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLP 160
+P P PT P+ L P + + G
Sbjct: 385 LAIPSLVAPPGLVAPTEIN-PSFLASPRK---KMKREKLPVTFGALDD 428
Score = 28.1 bits (62), Expect = 7.5
Identities = 17/75 (22%), Positives = 22/75 (29%), Gaps = 3/75 (4%)
Query: 55 PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
P L Q P ++ PT + P L + P L PT P+ L P
Sbjct: 359 PPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVA--PPGLVAPTEIN-PSFLASPRKKMK 415
Query: 115 PTLPQGPTLPQGPTL 129
TL
Sbjct: 416 REKLPVTFGALDDTL 430
>gnl|CDD|218598 pfam05470, eIF-3c_N, Eukaryotic translation initiation factor 3
subunit 8 N-terminus. The largest of the mammalian
translation initiation factors, eIF3, consists of at
least eight subunits ranging in mass from 35 to 170 kDa.
eIF3 binds to the 40 S ribosome in an early step of
translation initiation and promotes the binding of
methionyl-tRNAi and mRNA.
Length = 593
Score = 30.9 bits (70), Expect = 0.77
Identities = 13/35 (37%), Positives = 19/35 (54%), Gaps = 2/35 (5%)
Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIRDDDDDD 244
RY+E E E EDE+++ D D D+D+D
Sbjct: 130 RYREDP--ESEDEEEEEDEDDDDDGSDDEDEDEDG 162
Score = 29.4 bits (66), Expect = 2.3
Identities = 18/45 (40%), Positives = 23/45 (51%), Gaps = 1/45 (2%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMR 268
E+EDEEEE D ED DD DD + +E++ S S R
Sbjct: 136 ESEDEEEEED-EDDDDDGSDDEDEDEDGVGATEEVAASSESGVDR 179
>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa). Members of this
family of proteins are part of the yeast nuclear pore
complex-associated pre-60S ribosomal subunit. The family
functions as a highly conserved exonuclease that is
required for the 5'-end maturation of 5.8S and 25S
rRNAs, demonstrating that 5'-end processing also has a
redundant pathway. Nop25 binds late pre-60S ribosomes,
accompanying them from the nucleolus to the nuclear
periphery; and there is evidence for both physical and
functional links between late 60S subunit processing and
export.
Length = 134
Score = 29.6 bits (67), Expect = 0.79
Identities = 8/33 (24%), Positives = 18/33 (54%), Gaps = 2/33 (6%)
Query: 214 KELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
KE ++ ++E+ E++ + D +DD+ E
Sbjct: 73 KEALKLLEEENDDEEDAETEDTE--DVEDDEWE 103
>gnl|CDD|219912 pfam08574, DUF1762, Protein of unknown function (DUF1762). This is
a family of proteins of unknown function. Yeast IWR1 is
known to interact with RNA polymerase II and deletion of
this protein results in hypersensitivity to the K1
killer toxin.
Length = 77
Score = 28.5 bits (64), Expect = 0.88
Identities = 11/31 (35%), Positives = 19/31 (61%)
Query: 227 DEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
+E+E ED+ +D+DDD + V+ +DS
Sbjct: 35 IDEDEEYHEDLANDEDDDADQVLSDDEDSNA 65
>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
chain; Provisional.
Length = 1033
Score = 30.9 bits (70), Expect = 0.90
Identities = 13/34 (38%), Positives = 23/34 (67%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
E E V RSA +++D+E ++ ED ++DD++ E
Sbjct: 17 ELEAVARSAGSDSDDDEVPAEDEDEDEEDDEEAE 50
>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein.
Length = 529
Score = 30.5 bits (69), Expect = 1.1
Identities = 10/42 (23%), Positives = 21/42 (50%)
Query: 216 LVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
VE+ E DEEEE + E+ +++++ + ++ E
Sbjct: 26 WVEKEVEKEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEK 67
>gnl|CDD|237047 PRK12298, obgE, GTPase CgtA; Reviewed.
Length = 390
Score = 30.2 bits (69), Expect = 1.2
Identities = 9/31 (29%), Positives = 19/31 (61%)
Query: 218 ERSAYGETEDEEEESDTEDIRDDDDDDMEIV 248
R E E+E+++ +D +DDD+ +E++
Sbjct: 357 HREQLEEVEEEDDDDWDDDWDEDDDEGVEVI 387
>gnl|CDD|203043 pfam04546, Sigma70_ner, Sigma-70, non-essential region. The domain
is found in the primary vegetative sigma factor. The
function of this domain is unclear, and it can be removed
without loss of function.
Length = 211
Score = 29.8 bits (68), Expect = 1.3
Identities = 10/21 (47%), Positives = 13/21 (61%)
Query: 224 ETEDEEEESDTEDIRDDDDDD 244
E E +E D ED DDD+D+
Sbjct: 43 AIESELDEEDLEDDDDDDEDE 63
Score = 28.7 bits (65), Expect = 3.0
Identities = 11/37 (29%), Positives = 16/37 (43%), Gaps = 4/37 (10%)
Query: 223 GETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
DEE+ D +D +D+D+D E D P
Sbjct: 45 ESELDEEDLEDDDDDDEDEDEDDE----EEADLGPDP 77
Score = 27.9 bits (63), Expect = 4.6
Identities = 7/21 (33%), Positives = 10/21 (47%)
Query: 224 ETEDEEEESDTEDIRDDDDDD 244
E E + + DDDDD+
Sbjct: 41 AAAIESELDEEDLEDDDDDDE 61
>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
Length = 333
Score = 30.0 bits (68), Expect = 1.3
Identities = 14/53 (26%), Positives = 16/53 (30%)
Query: 79 PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQ 131
+ Q P P P P + Q PQ Q P PQ PQ
Sbjct: 103 QPVQQPPEAQVPPQHAPRPAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAPAPQ 155
>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein. CDC45 is an essential gene
required for initiation of DNA replication in S.
cerevisiae, forming a complex with MCM5/CDC46.
Homologues of CDC45 have been identified in human, mouse
and smut fungus among others.
Length = 583
Score = 30.0 bits (68), Expect = 1.5
Identities = 19/62 (30%), Positives = 27/62 (43%), Gaps = 7/62 (11%)
Query: 209 PRYKEK--ELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
PRY + +L E E DEE+E ++ D+DDDD + D S +R
Sbjct: 114 PRYDDAYRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDD-----DDDDIATRERSLERR 168
Query: 267 MR 268
R
Sbjct: 169 RR 170
>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein. The proteins in this family
are designated YL1. These proteins have been shown to be
DNA-binding and may be transcription factors.
Length = 238
Score = 29.7 bits (67), Expect = 1.8
Identities = 10/25 (40%), Positives = 14/25 (56%)
Query: 222 YGETEDEEEESDTEDIRDDDDDDME 246
+ E+EEEE D D +DD+ E
Sbjct: 43 FEIEEEEEEEEVDSDFDDSEDDEPE 67
>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein. This family includes proteins
related to Mpp10 (M phase phosphoprotein 10). The U3
small nucleolar ribonucleoprotein (snoRNP) is required
for three cleavage events that generate the mature 18S
rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
depletion of Mpp10, a U3 snoRNP-specific protein, halts
18S rRNA production and impairs cleavage at the three U3
snoRNP-dependent sites.
Length = 613
Score = 30.0 bits (67), Expect = 1.8
Identities = 15/68 (22%), Positives = 23/68 (33%), Gaps = 9/68 (13%)
Query: 200 LQDPTLLQGPRYKEKELVER----SAYGETEDEEEESDTEDI-----RDDDDDDMEIVVC 250
LQ+ +L K E + + +D E D D D DD+ E
Sbjct: 65 LQNKPILDDLNQKYVEFLINKEHIRVLAKLQDSESHEDGSDGSDMDSEDSADDEEEEEED 124
Query: 251 TSQDSEDL 258
S + E +
Sbjct: 125 ESLEDEMI 132
>gnl|CDD|184885 PRK14891, PRK14891, 50S ribosomal protein L24e/unknown domain
fusion protein; Provisional.
Length = 131
Score = 28.8 bits (64), Expect = 1.9
Identities = 11/53 (20%), Positives = 21/53 (39%), Gaps = 3/53 (5%)
Query: 208 GPRYKEKELVERSAYGETEDEEE---ESDTEDIRDDDDDDMEIVVCTSQDSED 257
GP E + E D +E E+ D D+ D++ E + +++
Sbjct: 61 GPAAAATAAAEAAEEAEAADADEDADEAAEADAADEADEEEETDEAVDETADE 113
>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
Validated.
Length = 824
Score = 30.0 bits (68), Expect = 2.0
Identities = 38/216 (17%), Positives = 49/216 (22%), Gaps = 29/216 (13%)
Query: 60 GPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQ 119
G P P P P P P P P P + P + P+
Sbjct: 596 GGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPK 655
Query: 120 GPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGP 179
+P G G P P P P P G
Sbjct: 656 HVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPA--ATPPAGQ 713
Query: 180 TLPQGPTLPQGPT--------------LPQDPTLLQDPTLLQGPRYKEKELVERSAYGET 225
PQ LP +P DP +A
Sbjct: 714 ADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAA 773
Query: 226 -------------EDEEEESDTEDIRDDDDDDMEIV 248
ED+ D ED RD ++ ME++
Sbjct: 774 PPPSPPSEEEEMAEDDAPSMDDEDRRDAEEVAMELL 809
>gnl|CDD|222843 PHA02030, PHA02030, hypothetical protein.
Length = 336
Score = 29.6 bits (66), Expect = 2.1
Identities = 12/50 (24%), Positives = 15/50 (30%), Gaps = 1/50 (2%)
Query: 84 GPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGP 133
G LP P + D P + P P +P LP P
Sbjct: 263 GSNLPAVPNVAADAGSAAAPAV-PAAAAAVAQAAPSVPQVPNVAVLPDVP 311
>gnl|CDD|217502 pfam03343, SART-1, SART-1 family. SART-1 is a protein involved in
cell cycle arrest and pre-mRNA splicing. It has been
shown to be a component of U4/U6 x U5 tri-snRNP complex
in human, Schizosaccharomyces pombe and Saccharomyces
cerevisiae. SART-1 is a known tumour antigen in a range
of cancers recognised by T cells.
Length = 603
Score = 29.7 bits (67), Expect = 2.3
Identities = 15/58 (25%), Positives = 25/58 (43%), Gaps = 5/58 (8%)
Query: 200 LQDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
LQ L ++ E + S ++ EE++ D ED D D +M V + E+
Sbjct: 403 LQKEPL-----EEKPENKDESVEEISDAEEDDEDEEDEDGDGDVEMSAVDNDEEKEEE 455
>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381). This
domain is functionally uncharacterized. This domain is
found in eukaryotes. This presumed domain is typically
between 156 and 174 amino acids in length. This domain is
found associated with pfam07780, pfam01728.
Length = 154
Score = 28.8 bits (65), Expect = 2.3
Identities = 10/38 (26%), Positives = 22/38 (57%)
Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEI 247
R K ++L+ + E+EEEE + E++ +++ D +
Sbjct: 87 RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELL 124
>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
unknown].
Length = 869
Score = 29.7 bits (66), Expect = 2.4
Identities = 9/29 (31%), Positives = 12/29 (41%)
Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTS 252
E+E E +SD +D D D S
Sbjct: 172 ESEIESSDSDHDDENSDSKLDNLRNYIVS 200
>gnl|CDD|219419 pfam07462, MSP1_C, Merozoite surface protein 1 (MSP1) C-terminus.
This family represents the C-terminal region of
merozoite surface protein 1 (MSP1), which is found in a
number of Plasmodium species. MSP-1 is a 200-kDa protein
expressed on the surface of the P. vivax merozoite.
MSP-1 of Plasmodium species is synthesised as a
high-molecular-weight precursor and then processed into
several fragments. At the time of red cell invasion by
the merozoite, only the 19-kDa C-terminal fragment
(MSP-119), which contains two epidermal growth
factor-like domains, remains on the surface. Antibodies
against MSP-119 inhibit merozoite entry into red cells,
and immunisation with MSP-119 protects monkeys from
challenging infections. Hence, MSP-119 is considered a
promising vaccine candidate.
Length = 574
Score = 29.5 bits (66), Expect = 2.4
Identities = 24/117 (20%), Positives = 34/117 (29%), Gaps = 13/117 (11%)
Query: 141 LLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPTLL 200
L +G T T + P Q PT T L
Sbjct: 258 LPKGTTQEAKVTTVVTPPQADAAPSPLSVRPAGSSGSASGSTQIPTSGSVLGPGAAATEL 317
Query: 201 QDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
Q LQ ++ LV +G DDDD+D++ V +SE+
Sbjct: 318 QQVVQLQNYDEEDDSLVVLPIFGND-------------DDDDEDLDQVATGEAESEE 361
>gnl|CDD|115196 pfam06524, NOA36, NOA36 protein. This family consists of several
NOA36 proteins which contain 29 highly conserved
cysteine residues. The function of this protein is
unknown.
Length = 314
Score = 29.2 bits (65), Expect = 3.0
Identities = 12/28 (42%), Positives = 16/28 (57%), Gaps = 3/28 (10%)
Query: 222 YGETEDEEEES---DTEDIRDDDDDDME 246
YG D++E S D ++ D DDDD E
Sbjct: 269 YGYESDDDEGSSSNDYDEEEDGDDDDNE 296
>gnl|CDD|221323 pfam11931, DUF3449, Domain of unknown function (DUF3449). This
presumed domain is functionally uncharacterized. This
domain is found in eukaryotes. This domain is typically
between 181 and 207 amino acids in length. This domain
has two conserved sequence motifs: PIP and CEICG. The
domain carries a zinc-finger domain of the C2H2-type.
Length = 187
Score = 28.4 bits (64), Expect = 3.4
Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 5/42 (11%)
Query: 218 ERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
ER A + E+ D D DDD++ I + +LP
Sbjct: 35 ERQASADESSEDASEDGSDDDSDDDEEEPIY-----NPLNLP 71
Score = 28.4 bits (64), Expect = 3.7
Identities = 19/58 (32%), Positives = 24/58 (41%), Gaps = 10/58 (17%)
Query: 205 LLQGPRYKEKELVER-SA--YGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
LL+ R E VER A E + +ES + D DDD S D E+ P
Sbjct: 13 LLKKEREDTIENVERKQALTEEERQASADESSEDASEDGSDDD-------SDDDEEEP 63
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 29.2 bits (65), Expect = 3.8
Identities = 23/85 (27%), Positives = 34/85 (40%), Gaps = 14/85 (16%)
Query: 175 PRKGPTLPQGPTLPQGPTLPQDPTL------------LQDPTLLQGPRYKEKELVERSAY 222
P Q P + LP+D L L+D + KE+ E+
Sbjct: 3971 PDIQENNSQPPPENEDLDLPEDLKLDEKEGDVSKDSDLEDMDMEAADENKEEADAEKDEP 4030
Query: 223 GETEDEEEESDT--EDIRDDDDDDM 245
+ ED EE++T EDI+ DD D+
Sbjct: 4031 MQDEDPLEENNTLDEDIQQDDFSDL 4055
>gnl|CDD|217861 pfam04050, Upf2, Up-frameshift suppressor 2. Transcripts
harbouring premature signals for translation termination
are recognised and rapidly degraded by eukaryotic cells
through a pathway known as nonsense-mediated mRNA decay.
In Saccharomyces cerevisiae, three trans-acting factors
(Upf1 to Upf3) are required for nonsense-mediated mRNA
decay.
Length = 171
Score = 28.1 bits (63), Expect = 4.5
Identities = 13/36 (36%), Positives = 21/36 (58%), Gaps = 3/36 (8%)
Query: 223 GETEDEEEESDTEDIRDD--DDDDMEIVVCTSQDSE 256
E+ DEEE +D +D+ D ++ +I V T Q+ E
Sbjct: 23 DESSDEEEVDLPDDEQDEESDSEEEQIFV-TRQEEE 57
>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457). This is
a family of uncharacterized proteins.
Length = 449
Score = 28.4 bits (63), Expect = 4.5
Identities = 14/35 (40%), Positives = 21/35 (60%)
Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
K + E A E +D+EE+ D +D D+DDDD +
Sbjct: 40 KLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDD 74
>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT. This family
consists of several bacterial cobalamin biosynthesis
(CobT) proteins. CobT is involved in the transformation
of precorrin-3 into cobyrinic acid.
Length = 282
Score = 28.2 bits (63), Expect = 4.6
Identities = 10/23 (43%), Positives = 14/23 (60%)
Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
E DE E +D+ED D+DD +
Sbjct: 211 ELGDEPESADSEDNEDEDDPKED 233
>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
Length = 1355
Score = 28.5 bits (63), Expect = 4.9
Identities = 32/162 (19%), Positives = 43/162 (26%), Gaps = 16/162 (9%)
Query: 83 QGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQGP--T 140
Q P + PT+ P + GP + P PQ Q P
Sbjct: 342 QTPPVASVDVPPAQPTVAWQP--VPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQ 399
Query: 141 LLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPTLL 200
+Q P Q P + P P P Q Q T
Sbjct: 400 PVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAE--EQQSTFA 457
Query: 201 QDPT----------LLQGPRYKEKELVERSAYGETEDEEEES 232
T Q P Y++ + VE+ E E EE+
Sbjct: 458 PQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEET 499
>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
Length = 1463
Score = 28.6 bits (63), Expect = 4.9
Identities = 17/52 (32%), Positives = 25/52 (48%), Gaps = 2/52 (3%)
Query: 169 PTLLQGPRKGPTLPQGPTLPQGPTLP--QDPTLLQDPTLLQGPRYKEKELVE 218
P + + R P LP P P+GP P ++P Q+P Q P + E+ E
Sbjct: 830 PAVPETDRDNPLLPPCPITPEGPPCPPREEPQQPQEPQEPQSPSFHISEIGE 881
>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein. This protein is found to be part
of a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 728
Score = 28.5 bits (64), Expect = 5.0
Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 2/42 (4%)
Query: 205 LLQGPRYKEKELV--ERSAYGETEDEEEESDTEDIRDDDDDD 244
L QG + K + + + EE D +D DDDD +
Sbjct: 308 LRQGEELRRKIEGKSVSEEDEDEDSDSEEEDEDDDEDDDDGE 349
Score = 28.5 bits (64), Expect = 5.2
Identities = 9/36 (25%), Positives = 20/36 (55%), Gaps = 3/36 (8%)
Query: 212 KEKELVER-SAYGETEDEEEESDTEDIRDDDDDDME 246
+ +EL + +E++E+E + ++D+DD E
Sbjct: 310 QGEELRRKIEGKSVSEEDEDEDSDSE--EEDEDDDE 343
>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
Provisional.
Length = 2849
Score = 28.5 bits (63), Expect = 5.2
Identities = 11/36 (30%), Positives = 19/36 (52%)
Query: 226 EDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGP 261
+DE+E+ D +D DD++++ E D ED
Sbjct: 156 DDEDEDEDDDDEEDDEEEEEEEEEIKGFDDEDEEDE 191
>gnl|CDD|226907 COG4530, COG4530, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 129
Score = 27.1 bits (60), Expect = 5.6
Identities = 14/41 (34%), Positives = 18/41 (43%), Gaps = 5/41 (12%)
Query: 222 YGETEDEEEESDT-----EDIRDDDDDDMEIVVCTSQDSED 257
G+ ED + + D ED DDDDD I+ D E
Sbjct: 89 LGDDEDVDLDDDDDDTFLEDEEDDDDDVSGIIGVPGDDEEV 129
>gnl|CDD|218003 pfam04281, Tom22, Mitochondrial import receptor subunit Tom22. The
mitochondrial protein translocase family, which is
responsible for movement of nuclear encoded pre-proteins
into mitochondria, is very complex with at least 19
components. These proteins include several chaperone
proteins, four proteins of the outer membrane
translocase (Tom) import receptor, five proteins of the
Tom channel complex, five proteins of the inner membrane
translocase (Tim) and three "motor" proteins. This
family represents the Tom22 proteins. The N terminal
region of Tom22 has been shown to have chaperone-like
activity, and the C terminal region faces the
intermembrane face.
Length = 136
Score = 26.8 bits (60), Expect = 7.4
Identities = 13/39 (33%), Positives = 22/39 (56%), Gaps = 3/39 (7%)
Query: 211 YKEKELVERSAYGETEDEEEESDTE---DIRDDDDDDME 246
++EK ++ E D+++E DT+ DI DD D + E
Sbjct: 12 FQEKPAAPKNLAQEESDDDDEDDTDTDSDISDDSDFENE 50
>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit. This model
describes Pseudomonas denitrificans CobT gene product,
which is a cobalt chelatase subunit that functions in
cobalamin biosynthesis. Cobalamin (vitamin B12) can be
synthesized via several pathways, including an aerobic
pathway (found in Pseudomonas denitrificans) and an
anaerobic pathway (found in P. shermanii and Salmonella
typhimurium). These pathways differ in the point of
cobalt insertion during corrin ring formation. There are
apparently a number of variations on these two pathways,
where the major differences seem to be concerned with
the process of ring contraction. Confusion regarding the
functions of enzymes found in the aerobic vs. anaerobic
pathways has arisen because nonhomologous genes in these
different pathways were given the same gene symbols.
Thus, cobT in the aerobic pathway (P. denitrificans) is
not a homolog of cobT in the anaerobic pathway (S.
typhimurium). It should be noted that E. coli
synthesizes cobalamin only when it is supplied with the
precursor cobinamide, which is a complex intermediate.
Additionally, all E. coli cobalamin synthesis genes
(cobU, cobS and cobT) were named after their Salmonella
typhimurium homologs which function in the anaerobic
cobalamin synthesis pathway. This model describes the
aerobic cobalamin pathway Pseudomonas denitrificans CobT
gene product, which is a cobalt chelatase subunit, with
a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
heterotrimeric, ATP-dependent enzyme that catalyzes
cobalt insertion during cobalamin biosynthesis. The
other two subunits are the P. denitrificans CobS
(TIGR01650) and CobN (pfam02514 CobN/Magnesium
Chelatase) proteins. To avoid potential confusion with
the nonhomologous Salmonella typhimurium/E. coli cobT
gene product, the P. denitrificans gene symbol is not
used in the name of this model [Biosynthesis of
cofactors, prosthetic groups, and carriers, Heme,
porphyrin, and cobalamin].
Length = 600
Score = 28.0 bits (62), Expect = 7.5
Identities = 10/42 (23%), Positives = 17/42 (40%), Gaps = 6/42 (14%)
Query: 225 TEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
T+ E E + E ++ D DD + + +D P R
Sbjct: 247 TDRESESGEEEMVQSDQDDLPD------ESDDDSETPGEGAR 282
>gnl|CDD|143416 cd07098, ALDH_F15-22, Aldehyde dehydrogenase family 15A1 and
22A1-like. Aldehyde dehydrogenase family members
ALDH15A1 (Saccharomyces cerevisiae YHR039C) and ALDH22A1
(Arabidopsis thaliana, EC=1.2.1.3), and similar
sequences, are in this CD. Significant improvement of
stress tolerance in tobacco plants was observed by
overexpressing the ALDH22A1 gene from maize (Zea mays)
and was accompanied by a reduction of malondialdehyde
derived from cellular lipid peroxidation.
Length = 465
Score = 28.0 bits (63), Expect = 7.6
Identities = 15/34 (44%), Positives = 16/34 (47%), Gaps = 2/34 (5%)
Query: 65 QGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPT 98
+G LL G P PQG P PTLL D T
Sbjct: 327 KGARLLAGGKRYPHPEYPQGHYFP--PTLLVDVT 358
>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
Length = 619
Score = 27.8 bits (63), Expect = 7.6
Identities = 12/43 (27%), Positives = 17/43 (39%)
Query: 204 TLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
P +E S E +D+E+E + ED DD E
Sbjct: 171 DGFVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAADE 213
>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2. PPP4R2 (protein phosphatase 4 core
regulatory subunit R2) is the regulatory subunit of the
histone H2A phosphatase complex. It has been shown to
confer resistance to the anticancer drug cisplatin in
yeast, and may confer resistance in higher eukaryotes.
Length = 285
Score = 27.5 bits (61), Expect = 7.9
Identities = 10/30 (33%), Positives = 17/30 (56%)
Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDD 242
+ + VE E E+EEE + E+ D+D+
Sbjct: 256 DGDYVEEKELKEDEEEEETEEEEEEEDEDE 285
>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27. This protein forms
the C subunit of DNA polymerase delta. It carries the
essential residues for binding to the Pol1 subunit of
polymerase alpha, from residues 293-332, which are
characterized by the motif D--G--VT, referred to as the
DPIM motif. The first 160 residues of the protein form
the minimal domain for binding to the B subunit, Cdc1,
of polymerase delta, the final 10 C-terminal residues,
362-372, being the DNA sliding clamp, PCNA, binding
motif.
Length = 427
Score = 27.9 bits (62), Expect = 8.1
Identities = 14/57 (24%), Positives = 27/57 (47%), Gaps = 8/57 (14%)
Query: 217 VERSAYGETEDEEEESDTEDIR----DDDDDDMEIVV----CTSQDSEDLPGPSHSK 265
ERS E +E+E+ + ++ D+D+D+ +V ++SE+ P K
Sbjct: 278 GERSDSEEETEEKEKEKRKRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPK 334
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 27.4 bits (61), Expect = 8.5
Identities = 18/103 (17%), Positives = 34/103 (33%), Gaps = 16/103 (15%)
Query: 204 TLLQGPRYKEKELVERSAYGETEDE--------EEESDTEDIRDDDDDDMEIVVCTSQDS 255
++L+ +EK+ E E E E E D D+D +D E D+
Sbjct: 171 SMLEALFRREKKEEEEEE-EEDEALIKSLSFGPETEEDRRRADDEDSEDDE----EDNDN 225
Query: 256 EDLPGPSHSKRMRGT---RELLTMKLDSCLHFPRQRGSPWTVS 295
P S + T ++ + ++ ++ S
Sbjct: 226 TPSPKSGSSSPAKPTSILKKSAAKRSEAPSSSKAKKNSRGIPK 268
>gnl|CDD|183115 PRK11394, PRK11394, 23S rRNA pseudouridine synthase E; Provisional.
Length = 217
Score = 27.4 bits (60), Expect = 9.3
Identities = 17/48 (35%), Positives = 24/48 (50%), Gaps = 2/48 (4%)
Query: 67 PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPT--LPQGPTLLQGPTLP 112
PT L G TL GPTLP G L+ +P P+ P + + ++P
Sbjct: 117 PTQDALEALRNGVTLNDGPTLPAGAELVDEPAWLWPRNPPIRERKSIP 164
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.314 0.137 0.435
Gapped
Lambda K H
0.267 0.0647 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,461,746
Number of extensions: 1499183
Number of successful extensions: 2767
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2104
Number of HSP's successfully gapped: 245
Length of query: 295
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 199
Effective length of database: 6,679,618
Effective search space: 1329243982
Effective search space used: 1329243982
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 59 (26.7 bits)
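
The summary statistics above can be cross-checked with a short calculation. The sketch below is not part of the RPS-BLAST output; it assumes the standard Karlin-Altschul conversion from raw score to bit score, bits = (Lambda * S - ln K) / ln 2, and assumes that the effective lengths are obtained by subtracting the length adjustment once from the query and once per database sequence. The function and variable names are illustrative only. Bit scores and E-values reported for individual hits may deviate slightly from these formulas, since the printed Lambda and K are rounded and RPS-BLAST may apply profile-specific scaling.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the statistics block of this report.
UNGAPPED = (0.314, 0.137)    # Lambda, K (ungapped)
GAPPED = (0.267, 0.0647)     # Lambda, K (gapped)

query_len = 295
db_letters = 10_937_602
db_sequences = 44_354
length_adjustment = 96

# Assumed: the adjustment is removed once from the query
# and once from every database sequence.
eff_query = query_len - length_adjustment
eff_db = db_letters - db_sequences * length_adjustment
search_space = eff_query * eff_db

s1_bits = bit_score(42, *UNGAPPED)   # S1 cutoff, ungapped parameters
s2_bits = bit_score(59, *GAPPED)     # S2 cutoff, gapped parameters

print(eff_query, eff_db, search_space)
print(round(s1_bits, 1), round(s2_bits, 1))
```

Under these assumptions the computed effective lengths and search space agree with the "Effective length" and "Effective search space" lines, and the S1 and S2 cutoffs convert to the bit values printed beside them.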