RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy14591
(164 letters)
>gnl|CDD|219001 pfam06371, Drf_GBD, Diaphanous GTPase-binding Domain. This domain
is bound by GTP-attached Rho proteins, leading to
activation of the Drf protein.
Length = 187
Score = 50.4 bits (121), Expect = 3e-08
Identities = 17/48 (35%), Positives = 27/48 (56%), Gaps = 1/48 (2%)
Query: 106 ARLNMGDPKDDIHV-CILCLRAIMNNKYGLNMVIKHTEAINSIALSLM 152
+ + D D + CL+A+MNNK+G++ V+ H E I +A SL
Sbjct: 121 RKKSESDEDLDREYEILKCLKALMNNKFGIDHVLGHPEVILLLARSLD 168
>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger. [Transport and
binding proteins, Cations and iron carrying compounds].
Length = 1096
Score = 42.3 bits (99), Expect = 4e-05
Identities = 27/75 (36%), Positives = 37/75 (49%), Gaps = 5/75 (6%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGG-----ERTEEEGGEEEEEEEEEEEEEGLPAST 71
E E + + E K+ ++ +G GG E EEE EEEEEEEEEEEEE
Sbjct: 831 ETGEQELNAENQGEAKQDEKGVDGGGGSDGGDSEEEEEEEEEEEEEEEEEEEEEEEEEEN 890
Query: 72 NPPANVISPQSLGHQ 86
P ++ P++ Q
Sbjct: 891 EEPLSLEWPETRQKQ 905
Score = 36.5 bits (84), Expect = 0.004
Identities = 21/48 (43%), Positives = 29/48 (60%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++E Q AE + + K+ +G GG ++ EEEEEEEEEEEEE
Sbjct: 829 KDETGEQELNAENQGEAKQDEKGVDGGGGSDGGDSEEEEEEEEEEEEE 876
Score = 31.1 bits (70), Expect = 0.25
Identities = 18/63 (28%), Positives = 25/63 (39%)
Query: 4 GVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
EA T ER +E E + E+ + EG + E GE E + E+E
Sbjct: 635 VAEAEHTGERTGEEGERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERKGEQE 694
Query: 64 EEG 66
EG
Sbjct: 695 GEG 697
Score = 31.1 bits (70), Expect = 0.25
Identities = 16/53 (30%), Positives = 24/53 (45%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPA 69
+E + KG+ + E + + + AEG E EE E+E E E EG
Sbjct: 702 KEADHKGETEAEEVEHEGETEAEGTEDEGEIETGEEGEEVEDEGEGEAEGKHE 754
Score = 31.1 bits (70), Expect = 0.27
Identities = 15/58 (25%), Positives = 23/58 (39%), Gaps = 3/58 (5%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGE---RTEEEGGEEEEEEEEEEEEEG 66
E E E+ K E + + + AE +G E E + + + E E EE E
Sbjct: 661 EESGGEAEQEGETETKGENESEGEIPAERKGEQEGEGEIEAKEADHKGETEAEEVEHE 718
Score = 30.3 bits (68), Expect = 0.53
Identities = 21/54 (38%), Positives = 30/54 (55%), Gaps = 2/54 (3%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
ERK ++E E + + K+A+ K + + AE TE EG E+E E E EE E
Sbjct: 688 ERKGEQEGEGEIEAKEADHKGETE--AEEVEHEGETEAEGTEDEGEIETGEEGE 739
Score = 30.0 bits (67), Expect = 0.60
Identities = 22/66 (33%), Positives = 29/66 (43%), Gaps = 3/66 (4%)
Query: 5 VEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGER-TEEEGGEEEEEEEEEEE 63
V AL + E E G+R E ++ + AEG G E E E E E + E E
Sbjct: 624 VMALGDLSKGDVAEAEHTGERTGEEGERPTE--AEGENGEESGGEAEQEGETETKGENES 681
Query: 64 EEGLPA 69
E +PA
Sbjct: 682 EGEIPA 687
Score = 29.2 bits (65), Expect = 1.0
Identities = 17/51 (33%), Positives = 24/51 (47%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
+ E E+ +RK +E + + E GE EE E E E E E+EG
Sbjct: 680 ESEGEIPAERKGEQEGEGEIEAKEADHKGETEAEEVEHEGETEAEGTEDEG 730
Score = 27.3 bits (60), Expect = 5.0
Identities = 26/107 (24%), Positives = 41/107 (38%), Gaps = 8/107 (7%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
E + K E E +G RK+ E + E G+ E+EG + E+ E + +EG
Sbjct: 748 EAEGKHEVETEGDRKETEHE------GETEAEGKEDEDEGEIQAGEDGEMKGDEGAEGKV 801
Query: 72 NPPANVISPQSLGHQRPSLDLASSPSVKKR--SRHAARLNMGDPKDD 116
+ + H+ S A VK + N G+ K D
Sbjct: 802 EHEGETEAGEKDEHEGQSETQADDTEVKDETGEQELNAENQGEAKQD 848
Score = 26.9 bits (59), Expect = 6.5
Identities = 15/49 (30%), Positives = 26/49 (53%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E +G+ + E + + + + E G+R E E E E E +E+E+EG
Sbjct: 733 ETGEEGEEVEDEGEGEAEGKHEVETEGDRKETEHEGETEAEGKEDEDEG 781
Score = 26.5 bits (58), Expect = 8.7
Identities = 18/50 (36%), Positives = 28/50 (56%), Gaps = 1/50 (2%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E+E +G+ + E + + R E GE TE EG E+E+E E + E+G
Sbjct: 741 VEDEGEGEAEGKHEVETEGDRKETEHEGE-TEAEGKEDEDEGEIQAGEDG 789
>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
Provisional.
Length = 482
Score = 39.1 bits (92), Expect = 4e-04
Identities = 25/55 (45%), Positives = 33/55 (60%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
+ + K EEE K ++KKA KKK+ E + E+E EEE EEE+EEEEE
Sbjct: 418 KAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472
Score = 38.4 bits (90), Expect = 7e-04
Identities = 22/55 (40%), Positives = 35/55 (63%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E+K +EE++ K ++ A +KK+++ E E EEE E EEE+EEEEE++
Sbjct: 420 EKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKK 474
Score = 37.2 bits (87), Expect = 0.002
Identities = 22/52 (42%), Positives = 35/52 (67%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
K+K+ E ++++ E+K+KKK+ G+ E EEE ++EEE+EEEEEE
Sbjct: 411 KIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEA 462
Score = 37.2 bits (87), Expect = 0.002
Identities = 17/54 (31%), Positives = 31/54 (57%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+ E++ + +++K ++ K++ E + +EE EEEEEE EEE+EE
Sbjct: 415 IVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEE 468
Score = 36.8 bits (86), Expect = 0.002
Identities = 24/61 (39%), Positives = 35/61 (57%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNP 73
K ++ K ++K+ EEKK+KK++A E EEE E++EEE+EEEEEE
Sbjct: 410 KKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEE 469
Query: 74 P 74
Sbjct: 470 E 470
Score = 36.4 bits (85), Expect = 0.004
Identities = 16/51 (31%), Positives = 35/51 (68%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K +++K +KAE+K++++++ + + ++E EEEE+E++EEE+E
Sbjct: 407 KATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEE 457
Score = 35.7 bits (83), Expect = 0.006
Identities = 22/63 (34%), Positives = 31/63 (49%), Gaps = 8/63 (12%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
K +EEE+ + ++K KKK++ E E+E EEE+EEEEEE EE
Sbjct: 419 AEKKREEEKKEKKKKAFAGKKKEEE--------EEEEKEKKEEEKEEEEEEAEEEKEEEE 470
Query: 72 NPP 74
Sbjct: 471 EKK 473
Score = 34.5 bits (80), Expect = 0.018
Identities = 16/50 (32%), Positives = 29/50 (58%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+E E + G +K ++ KK +AE + E+ E++ +++EEEEEE
Sbjct: 397 EEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEE 446
Score = 33.4 bits (77), Expect = 0.038
Identities = 18/51 (35%), Positives = 29/51 (56%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
L ++ + KK EK +KKR E + ++ +EEEEEEE+E++E
Sbjct: 402 LTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKE 452
Score = 33.4 bits (77), Expect = 0.040
Identities = 15/72 (20%), Positives = 32/72 (44%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPA 75
K+ + + + EKK+++ + E + +++ EEEEE+E++EEE+
Sbjct: 406 KKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEE 465
Query: 76 NVISPQSLGHQR 87
+ Q
Sbjct: 466 KEEEEEKKKKQA 477
Score = 33.0 bits (76), Expect = 0.052
Identities = 15/53 (28%), Positives = 32/53 (60%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
K ++ K K ++++++K+ + + + +EE EEE+E++EEE+EE
Sbjct: 405 SKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEE 457
>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168). This
family consists of several hypothetical eukaryotic
proteins of unknown function.
Length = 142
Score = 36.2 bits (84), Expect = 0.002
Identities = 23/63 (36%), Positives = 37/63 (58%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
E+K K+EE+ +R K ++KK+KK++ + G + EE+ G + EE +EEEEG
Sbjct: 74 EKKRKDEEKTAKKRAKRQKKKQKKKKKKKAKKGNKKEEKEGSKSSEESSDEEEEGEEDKQ 133
Query: 72 NPP 74
P
Sbjct: 134 EEP 136
>gnl|CDD|221775 pfam12794, MscS_TM, Mechanosensitive ion channel inner membrane
domain 1. The small mechanosensitive channel, MscS, is
a part of the turgor-driven solute efflux system that
protects bacteria from lysis in the event of osmotic
shock. The MscS protein alone is sufficient to form a
functional mechanosensitive channel gated directly by
tension in the lipid bilayer. The MscS proteins are
heptamers of three transmembrane subunits with seven
converging M3 domains, and this domain is one of the
inner membrane domains.
Length = 339
Score = 36.0 bits (84), Expect = 0.004
Identities = 22/61 (36%), Positives = 25/61 (40%), Gaps = 10/61 (16%)
Query: 24 QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSL 83
+R+ A E+ K KR E EEE E E EE E L T IS QSL
Sbjct: 255 RRRLAYERAKAKRAEILAQRAE--EEEESSEGAAETIEEPELDL--ET------ISAQSL 304
Query: 84 G 84
Sbjct: 305 R 305
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 35.8 bits (82), Expect = 0.007
Identities = 26/74 (35%), Positives = 37/74 (50%), Gaps = 4/74 (5%)
Query: 22 KGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE-EEGLPASTNPPANVISP 80
K +K+ E +K KR AE + ER E+ E+E E E E E E AS++ + +S
Sbjct: 576 KLAKKREEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSHESRMSE 635
Query: 81 QSL---GHQRPSLD 91
L H RPS +
Sbjct: 636 PQLSGPAHMRPSFE 649
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 34.7 bits (80), Expect = 0.010
Identities = 17/67 (25%), Positives = 39/67 (58%), Gaps = 1/67 (1%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
K+K+E E K + K ++K KKK+ + ++ +++ +++E+E E++ E+ L S +
Sbjct: 72 EKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLED-LTKSYS 130
Query: 73 PPANVIS 79
+ +S
Sbjct: 131 ETLSTLS 137
>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510). This
family consists of several hypothetical bacterial
proteins of around 200 residues in length. The function
of this family is unknown.
Length = 214
Score = 34.3 bits (79), Expect = 0.013
Identities = 23/68 (33%), Positives = 34/68 (50%), Gaps = 2/68 (2%)
Query: 6 EALRTCERKLKEEEEVKGQRKKA--EEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
EA ++ +++ E EEVK + K+A E K+ K AE E E +EE +E E+E
Sbjct: 50 EAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDEESSDENEKE 109
Query: 64 EEGLPAST 71
E S
Sbjct: 110 TEEKTESN 117
Score = 28.9 bits (65), Expect = 0.86
Identities = 14/65 (21%), Positives = 23/65 (35%), Gaps = 4/65 (6%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST---N 72
KE++ + + E++ ++ E E E E E E+E P T
Sbjct: 78 KEDKGDAEKEDEESEEENEEEDEESSDENE-KETEEKTESNVEKEITNPSWKPVGTEQTG 136
Query: 73 PPANV 77
P A
Sbjct: 137 PHAMT 141
>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
Rpc31. RNA polymerase III contains seventeen subunits
in yeasts and in human cells. Twelve of these are akin
to RNA polymerase I or II and the other five are RNA pol
III-specific, and form the functionally distinct groups
(i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
Rpc34 and Rpc82 form a cluster of enzyme-specific
subunits that contribute to transcription initiation in
S.cerevisiae and H.sapiens. There is evidence that these
subunits are anchored at or near the N-terminal Zn-fold
of Rpc1, itself prolonged by a highly conserved but RNA
polymerase III-specific domain.
Length = 221
Score = 34.3 bits (79), Expect = 0.015
Identities = 19/50 (38%), Positives = 29/50 (58%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
EEEE ++ EKK K+ AE + +EE EEEEEE+E+ +++
Sbjct: 143 TEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDD 192
Score = 33.6 bits (77), Expect = 0.030
Identities = 16/49 (32%), Positives = 29/49 (59%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
EEE++ + E+K K+ + E+ EEE EEEEE+E+ ++++
Sbjct: 145 EEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDD 193
>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region. The myc family
belongs to the basic helix-loop-helix leucine zipper
class of transcription factors, see pfam00010. Myc forms
a heterodimer with Max, and this complex regulates cell
growth through direct activation of genes involved in
cell replication. Mutations in the C-terminal 20
residues of this domain cause unique changes in the
induction of apoptosis, transformation, and G2 arrest.
Length = 329
Score = 34.1 bits (78), Expect = 0.021
Identities = 26/64 (40%), Positives = 30/64 (46%), Gaps = 11/64 (17%)
Query: 44 GERTEEEGGEEEEEEEEEEEE------EGLPASTNPPANVISPQSLGHQRPSLDLASSPS 97
G +E E EEEEEEEEEEEE E +S+N A+ S R SP
Sbjct: 223 GSDSESEEDEEEEEEEEEEEEIDVVTVEKRRSSSNRKAST-SESITVPSRRH----HSPL 277
Query: 98 VKKR 101
V KR
Sbjct: 278 VLKR 281
>gnl|CDD|217049 pfam02459, Adeno_terminal, Adenoviral DNA terminal protein. This
protein is covalently attached to the termini of
replicating DNA in vivo.
Length = 548
Score = 33.5 bits (77), Expect = 0.044
Identities = 17/37 (45%), Positives = 20/37 (54%)
Query: 34 KKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
++RR E EEE EEE EEEEEEEE +
Sbjct: 293 RRRRRPPPSPPEPEEEEEEEEEVPEEEEEEEEEEERT 329
Score = 30.8 bits (70), Expect = 0.28
Identities = 15/24 (62%), Positives = 15/24 (62%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPAST 71
EEE EEEE EEEEEEE T
Sbjct: 306 EEEEEEEEEVPEEEEEEEEEEERT 329
Score = 30.0 bits (68), Expect = 0.53
Identities = 14/20 (70%), Positives = 14/20 (70%)
Query: 48 EEEGGEEEEEEEEEEEEEGL 67
EEE EEEEEEEEEEE
Sbjct: 311 EEEEVPEEEEEEEEEEERTF 330
>gnl|CDD|222366 pfam13764, E3_UbLigase_R4, E3 ubiquitin-protein ligase UBR4. This
is a family of E3 ubiquitin ligase enzymes.
Length = 794
Score = 33.1 bits (76), Expect = 0.062
Identities = 19/66 (28%), Positives = 26/66 (39%), Gaps = 13/66 (19%)
Query: 15 LKEEEEVKGQRKK--AEEKKKKKRRAEGR------GGGERTEEEG-----GEEEEEEEEE 61
L E+E V + + E + +K+R A G RT G E E
Sbjct: 400 LAEKEGVAKKIDEVRDETRAEKRRLAMAMREKQLQALGMRTSSGGQIVASPRLLEGIESL 459
Query: 62 EEEEGL 67
EEE+GL
Sbjct: 460 EEEDGL 465
>gnl|CDD|218223 pfam04712, Radial_spoke, Radial spokehead-like protein. This
family includes the radial spoke head proteins RSP4 and
RSP6 from Chlamydomonas reinhardtii, and several
eukaryotic homologues, including mammalian RSHL1, the
protein product of a familial ciliary dyskinesia
candidate gene.
Length = 481
Score = 32.7 bits (75), Expect = 0.073
Identities = 15/27 (55%), Positives = 17/27 (62%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEEGLP 68
E+ +EE EEEEE EE E EEG P
Sbjct: 349 EEEEQEDEEEEEEEEEPEEPEPEEGPP 375
Score = 31.6 bits (72), Expect = 0.16
Identities = 16/30 (53%), Positives = 19/30 (63%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPASTNPP 74
++ EEE E+EEEEEEEEE E PP
Sbjct: 346 QKDEEEEQEDEEEEEEEEEPEEPEPEEGPP 375
>gnl|CDD|224530 COG1614, CdhC, CO dehydrogenase/acetyl-CoA synthase beta subunit
[Energy production and conversion].
Length = 470
Score = 32.5 bits (74), Expect = 0.081
Identities = 16/30 (53%), Positives = 16/30 (53%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPASTNPP 74
ER EE EEEEEEEEE E P P
Sbjct: 399 ERWAEEEEEEEEEEEEEAAEAEAPMEEPVP 428
Score = 29.0 bits (65), Expect = 1.2
Identities = 18/56 (32%), Positives = 24/56 (42%), Gaps = 1/56 (1%)
Query: 14 KLKEEEEVKGQRK-KAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLP 68
K+ EE+ + + K+K E E EEE EEE E E EE +P
Sbjct: 373 KIATEEDATTIDELREFLKEKGHPVVERWAEEEEEEEEEEEEEAAEAEAPMEEPVP 428
>gnl|CDD|223079 PHA03419, PHA03419, E4 protein; Provisional.
Length = 200
Score = 31.8 bits (72), Expect = 0.092
Identities = 17/52 (32%), Positives = 27/52 (51%), Gaps = 3/52 (5%)
Query: 23 GQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPP 74
G +KK ++KK+ ++ A+G ++ E GE E E E+ E P PP
Sbjct: 91 GGKKKEKKKKETEKPAQGGEKPDQGPEAKGEGEGHEPEDPPPEDTP---PPP 139
>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
Length = 330
Score = 32.1 bits (74), Expect = 0.10
Identities = 14/23 (60%), Positives = 14/23 (60%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGL 67
EEE EEEEEEEE EEE
Sbjct: 300 AAEEEEEEEEEEEEEEPSEEEAA 322
Score = 31.4 bits (72), Expect = 0.18
Identities = 15/23 (65%), Positives = 15/23 (65%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPAS 70
EEE EEEEEEEEE EE A
Sbjct: 302 EEEEEEEEEEEEEEPSEEEAAAG 324
Score = 29.1 bits (66), Expect = 1.1
Identities = 15/25 (60%), Positives = 15/25 (60%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPA 69
E EEE EEEEE EEE GL A
Sbjct: 303 EEEEEEEEEEEEEPSEEEAAAGLGA 327
>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain. This is a family of proteins of
approximately 300 residues, found in plants and
vertebrates. They contain a highly conserved DDRGK
motif.
Length = 189
Score = 31.6 bits (72), Expect = 0.11
Identities = 22/53 (41%), Positives = 38/53 (71%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
KL+E++ + QR+ EE+++++++ E + GER EEE EEE E+++EEEE
Sbjct: 9 AKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEER 61
Score = 29.3 bits (66), Expect = 0.60
Identities = 17/48 (35%), Positives = 27/48 (56%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
E EE K +K E ++K++ E ++ EEE E EE+ +E+EE
Sbjct: 27 EREERKKLEEKREGERKEEEELEEEREKKKEEEERKEREEQARKEQEE 74
Score = 28.9 bits (65), Expect = 1.0
Identities = 19/64 (29%), Positives = 35/64 (54%)
Query: 2 RLGVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEE 61
++G + E K ++ + + ++ EE+KK + + EG E EE E+++EEEE
Sbjct: 2 KIGAKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEER 61
Query: 62 EEEE 65
+E E
Sbjct: 62 KERE 65
Score = 26.2 bits (58), Expect = 6.9
Identities = 20/48 (41%), Positives = 29/48 (60%), Gaps = 3/48 (6%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
EEE + +RKK EEK++ +R+ E E E +EEEE +E EE+
Sbjct: 24 EEEEREERKKLEEKREGERKEEEE---LEEEREKKKEEEERKEREEQA 68
>gnl|CDD|235401 PRK05306, infB, translation initiation factor IF-2; Validated.
Length = 746
Score = 32.1 bits (74), Expect = 0.12
Identities = 27/105 (25%), Positives = 42/105 (40%)
Query: 3 LGVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEE 62
LGV + E+ + EVK EE++ +K A+ E E E EEE
Sbjct: 13 LGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAEEEA 72
Query: 63 EEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHAAR 107
+ E A+ A + + RP+ D A+ P+ R A+
Sbjct: 73 KAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAK 117
>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62.
Length = 217
Score = 31.7 bits (72), Expect = 0.13
Identities = 13/73 (17%), Positives = 34/73 (46%), Gaps = 6/73 (8%)
Query: 7 ALRTCERKLKEEEEVKGQRKKA------EEKKKKKRRAEGRGGGERTEEEGGEEEEEEEE 60
+R E + + + KG + ++ +K + ER ++ +E++EE++
Sbjct: 12 VVRALESEKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAERVKKLHSQEKKEEKK 71
Query: 61 EEEEEGLPASTNP 73
+ +++ +P NP
Sbjct: 72 KPKKKKVPLQVNP 84
>gnl|CDD|220102 pfam09073, BUD22, BUD22. BUD22 has been shown in yeast to be a
nuclear protein involved in bud-site selection. It plays
a role in positioning the proximal bud pole signal. More
recently it has been shown to be involved in ribosome
biogenesis.
Length = 424
Score = 31.7 bits (72), Expect = 0.13
Identities = 15/59 (25%), Positives = 30/59 (50%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
++ +++ K ++K+A+E K E E EE E++ ++EEEE+ + S
Sbjct: 147 KKGKAKKKTKKSKKKEAKESSDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSEDYS 205
Score = 26.3 bits (58), Expect = 8.2
Identities = 14/54 (25%), Positives = 24/54 (44%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K+ E + K ++ KK K++ + EEE E+E + EE E+
Sbjct: 137 DKILGIETKAKKGKAKKKTKKSKKKEAKESSDKDDEEESESEDESKSEESAEDD 190
>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
Provisional.
Length = 2849
Score = 31.9 bits (72), Expect = 0.13
Identities = 16/45 (35%), Positives = 23/45 (51%)
Query: 21 VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+ R AEE + E +E+ +EE++EEEEEEEE
Sbjct: 134 RRRARHLAEEDMSPRDNFVIDDDDEDEDEDDDDEEDDEEEEEEEE 178
Score = 30.8 bits (69), Expect = 0.36
Identities = 10/54 (18%), Positives = 30/54 (55%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+ ++ + + + +R++A ++ + +E+ E++++EE++EEEE
Sbjct: 121 KAEIGDLDMIIIKRRRARHLAEEDMSPRDNFVIDDDDEDEDEDDDDEEDDEEEE 174
Score = 29.6 bits (66), Expect = 0.82
Identities = 10/24 (41%), Positives = 17/24 (70%)
Query: 44 GERTEEEGGEEEEEEEEEEEEEGL 67
+ +++ ++EEEEEEEEE +G
Sbjct: 160 EDEDDDDEEDDEEEEEEEEEIKGF 183
Score = 27.7 bits (61), Expect = 3.2
Identities = 9/21 (42%), Positives = 16/21 (76%)
Query: 48 EEEGGEEEEEEEEEEEEEGLP 68
E+E ++++EE++EEEEE
Sbjct: 158 EDEDEDDDDEEDDEEEEEEEE 178
Score = 27.3 bits (60), Expect = 4.5
Identities = 6/18 (33%), Positives = 15/18 (83%)
Query: 49 EEGGEEEEEEEEEEEEEG 66
++ E+E++++EE++EE
Sbjct: 156 DDEDEDEDDDDEEDDEEE 173
Score = 26.9 bits (59), Expect = 6.0
Identities = 5/21 (23%), Positives = 13/21 (61%)
Query: 53 EEEEEEEEEEEEEGLPASTNP 73
E+E+E++++EE++
Sbjct: 158 EDEDEDDDDEEDDEEEEEEEE 178
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 31.8 bits (72), Expect = 0.18
Identities = 15/58 (25%), Positives = 30/58 (51%), Gaps = 4/58 (6%)
Query: 12 ERKLKEEEEVKGQRKK----AEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
ERK K+EE+ K + K A+++ K K +A+ G ++ ++ + + E+E
Sbjct: 19 ERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKKSRKRDVEDENP 76
Score = 30.6 bits (69), Expect = 0.33
Identities = 14/54 (25%), Positives = 33/54 (61%), Gaps = 2/54 (3%)
Query: 12 ERKLKEEEE--VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
E K KE+E +K +K+A+ K + ++ ++G +++E++ + + E+E E+
Sbjct: 25 EEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKKSRKRDVEDENPED 78
>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
subunit 1; Provisional.
Length = 319
Score = 31.2 bits (71), Expect = 0.19
Identities = 21/54 (38%), Positives = 31/54 (57%), Gaps = 1/54 (1%)
Query: 14 KLKEEEEVKG-QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K+K E EV G + EE +K E +E+E E+E+EEEEE+++EG
Sbjct: 264 KVKGEPEVVGGDEEDLEELLEKAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEG 317
Score = 27.7 bits (62), Expect = 3.0
Identities = 18/73 (24%), Positives = 30/73 (41%), Gaps = 2/73 (2%)
Query: 3 LGVEALRTCERKLKEE-EEVKGQRK-KAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEE 60
G+E + +KE ++ G K K E + + E+ EEE E++ E E
Sbjct: 240 KGMEIIGAALEAIKEVIKKKGGDFKVKGEPEVVGGDEEDLEELLEKAEEEEEEDDYSESE 299
Query: 61 EEEEEGLPASTNP 73
+E+EE
Sbjct: 300 DEDEEDEDEEEEE 312
>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
Length = 413
Score = 31.2 bits (71), Expect = 0.19
Identities = 16/77 (20%), Positives = 29/77 (37%)
Query: 4 GVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
V L EEE K ++ KK+ + E + E+ +++ E++E + E E
Sbjct: 30 NVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGET 89
Query: 64 EEGLPASTNPPANVISP 80
+ G P
Sbjct: 90 KLGFKTPKKSKKTKKKP 106
>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein. This is a family of
fungal and plant proteins and contains many hypothetical
proteins. VID27 is a cytoplasmic protein that plays a
potential role in vacuolar protein degradation.
Length = 794
Score = 31.3 bits (71), Expect = 0.20
Identities = 12/24 (50%), Positives = 17/24 (70%)
Query: 43 GGERTEEEGGEEEEEEEEEEEEEG 66
++E E+EEEEEEE+E+EG
Sbjct: 384 ANTERDDEEEEDEEEEEEEDEDEG 407
Score = 31.3 bits (71), Expect = 0.24
Identities = 13/24 (54%), Positives = 17/24 (70%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEE 65
ER +EE +EEEEEEE+E+E
Sbjct: 384 ANTERDDEEEEDEEEEEEEDEDEG 407
Score = 30.1 bits (68), Expect = 0.55
Identities = 13/28 (46%), Positives = 15/28 (53%)
Query: 39 EGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E EEE EEEEEEE+E+E
Sbjct: 382 EDANTERDDEEEEDEEEEEEEDEDEGPS 409
Score = 27.8 bits (62), Expect = 2.9
Identities = 13/46 (28%), Positives = 23/46 (50%)
Query: 20 EVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
E ++K + K+ ++ E+ E ++EEEE+EEEE
Sbjct: 354 ETLNKQKWTKAKETEQDYILDAFSALEIEDANTERDDEEEEDEEEE 399
>gnl|CDD|222792 PHA00435, PHA00435, capsid assembly protein.
Length = 306
Score = 31.0 bits (70), Expect = 0.21
Identities = 18/63 (28%), Positives = 29/63 (46%)
Query: 2 RLGVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEE 61
R G +A+ E + ++ +++ + + R G E EE +EEEE EEE
Sbjct: 39 RDGDDAIELAEPETSDDPYGNPDPFGEDDEGRIEVRISEDGEEEEVEEGEEDEEEEGEEE 98
Query: 62 EEE 64
EE
Sbjct: 99 SEE 101
>gnl|CDD|216652 pfam01698, FLO_LFY, Floricaula / Leafy protein. This family
consists of various plant development proteins which are
homologues of floricaula (FLO) and Leafy (LFY) proteins
which are floral meristem identity proteins. Mutations
in the sequences of these proteins affect flower and
leaf development.
Length = 382
Score = 31.1 bits (71), Expect = 0.21
Identities = 15/60 (25%), Positives = 24/60 (40%)
Query: 4 GVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
G T ++KK +K+++KR E R + E+E + E E EE
Sbjct: 162 GGGGGGTWGLVGVPGHSSDSEKKKQRKKQRRKRSKELREDDDDDEDEDDDGEGGGEGGEE 221
Score = 29.6 bits (67), Expect = 0.65
Identities = 11/46 (23%), Positives = 23/46 (50%)
Query: 21 VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
EKKK++++ + E E++ +E+E+++ E EG
Sbjct: 173 GVPGHSSDSEKKKQRKKQRRKRSKELREDDDDDEDEDDDGEGGGEG 218
Score = 27.7 bits (62), Expect = 3.2
Identities = 11/47 (23%), Positives = 25/47 (53%)
Query: 20 EVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
+ G + + +KKK+R + R + E +++E+E+++ E G
Sbjct: 170 GLVGVPGHSSDSEKKKQRKKQRRKRSKELREDDDDDEDEDDDGEGGG 216
Score = 26.9 bits (60), Expect = 4.9
Identities = 14/48 (29%), Positives = 22/48 (45%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
+ E K QRKK K+ K+ R + + ++ G E EE + E
Sbjct: 178 SSDSEKKKQRKKQRRKRSKELREDDDDDEDEDDDGEGGGEGGEERQRE 225
>gnl|CDD|235795 PRK06402, rpl12p, 50S ribosomal protein L12P; Reviewed.
Length = 106
Score = 29.9 bits (68), Expect = 0.25
Identities = 16/32 (50%), Positives = 18/32 (56%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEEGLPA 69
A E+ EEE EEE+EE EEE GL A
Sbjct: 72 AAAAAAEEKKEEEEEEEEKEESEEEAAAGLGA 103
>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
[Transcription].
Length = 392
Score = 30.8 bits (69), Expect = 0.28
Identities = 19/87 (21%), Positives = 43/87 (49%), Gaps = 3/87 (3%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPAN 76
E E + E+++++ AE +++ EE+EE++E EE T A+
Sbjct: 283 EIENKEVSEGDKEQQQEEVENAEAHKEEVQSDRPDEIGEEKEEDDENEE-NERHTELLAD 341
Query: 77 VISP--QSLGHQRPSLDLASSPSVKKR 101
++ + + +R ++ A++P ++KR
Sbjct: 342 ELNELEKGIEEKRRQMESATNPILQKR 368
>gnl|CDD|237001 PRK11856, PRK11856, branched-chain alpha-keto acid dehydrogenase
subunit E2; Reviewed.
Length = 411
Score = 30.5 bits (70), Expect = 0.33
Identities = 13/63 (20%), Positives = 20/63 (31%), Gaps = 3/63 (4%)
Query: 41 RGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKK 100
GE E E E A+ A + + + AS P+V+K
Sbjct: 77 EEEGEAEAAAAAEAAPEAPAPEPAP--AAAAAAAAAPAAAAAPAAPAAAAAKAS-PAVRK 133
Query: 101 RSR 103
+R
Sbjct: 134 LAR 136
>gnl|CDD|217503 pfam03344, Daxx, Daxx Family. The Daxx protein (also known as the
Fas-binding protein) is thought to play a role in
apoptosis, but the precise role played by Daxx remains to be
determined. Daxx forms a complex with Axin.
Length = 715
Score = 30.7 bits (69), Expect = 0.35
Identities = 24/113 (21%), Positives = 41/113 (36%), Gaps = 1/113 (0%)
Query: 2 RLGVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEE 61
R G + R+ + + ++E ++++ E E EEE E EEEE E+
Sbjct: 411 RQGTSS-RSSDPSKASSTSGESPSMASQESEEEESVEEEEEEEEEEEEEEQESEEEEGED 469
Query: 62 EEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHAARLNMGDPK 114
EEEE + N + S G + +R++ G
Sbjct: 470 EEEEEEVEADNGSEEEMEGSSEGDGDGEEPEEDAERRNSEMAGISRMSEGQQP 522
Score = 27.6 bits (61), Expect = 3.7
Identities = 21/66 (31%), Positives = 34/66 (51%), Gaps = 15/66 (22%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEG---------------RGGGERTEEEGGEEEEEEE 59
+K+++ + +R+K +E++++ + E EEE EEEEEEE
Sbjct: 393 MKQDDTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMASQESEEEESVEEEEEEE 452
Query: 60 EEEEEE 65
EEEEEE
Sbjct: 453 EEEEEE 458
>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
chain; Provisional.
Length = 1033
Score = 30.5 bits (69), Expect = 0.35
Identities = 19/46 (41%), Positives = 25/46 (54%), Gaps = 6/46 (13%)
Query: 27 KAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
++ KK K R GR + TE EEE+EE +EEE+GL S
Sbjct: 117 QSASAKKAKGR--GRHASKLTE----EEEDEEYLKEEEDGLGGSGG 156
Score = 29.8 bits (67), Expect = 0.82
Identities = 18/53 (33%), Positives = 31/53 (58%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPA 69
EEE+V Q + E++++ + A G +E E+E+E+EE++EE PA
Sbjct: 1 EEEQVNTQANEEEDEEELEAVARSAGSDSDDDEVPAEDEDEDEEDDEEAESPA 53
>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32. This family
consists of several mammalian specific proacrosin
binding protein sp32 sequences. sp32 is a sperm specific
protein which is known to bind with 55- and 53-kDa
proacrosins and the 49-kDa acrosin intermediate. The
exact function of sp32 is unclear; it is thought, however,
that the binding of sp32 to proacrosin may be involved
in packaging the acrosin zymogen into the acrosomal
matrix.
Length = 243
Score = 30.0 bits (67), Expect = 0.40
Identities = 21/62 (33%), Positives = 27/62 (43%), Gaps = 4/62 (6%)
Query: 5 VEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
VE L L +VK + K E+ K + E EE +EE+EEEE EE
Sbjct: 176 VEELLQSSLSLGGSVQVKAPKPKQEQLLSKLQEYLQ----EHKTEEKQPQEEQEEEEVEE 231
Query: 65 EG 66
E
Sbjct: 232 EA 233
>gnl|CDD|227496 COG5167, VID27, Protein involved in vacuole import and degradation
[Intracellular trafficking and secretion].
Length = 776
Score = 30.3 bits (68), Expect = 0.42
Identities = 14/54 (25%), Positives = 21/54 (38%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
E+ EE E K + +K+ + + E EE E EEE E+
Sbjct: 339 EKWGNEEAERKDYILDSSSVPLEKQFDDILYFEKMEIENRNPEESEHEEEVEDY 392
>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
Length = 651
Score = 30.4 bits (69), Expect = 0.43
Identities = 17/68 (25%), Positives = 36/68 (52%), Gaps = 13/68 (19%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNP 73
KL ++EE++ ++++ E K++KR + +++EE++++E E L + P
Sbjct: 546 KLDDKEELQREKEEKEALKEQKRLRK-------------LKKQEEKKKKELEKLEKAKIP 592
Query: 74 PANVISPQ 81
PA Q
Sbjct: 593 PAEFFKRQ 600
Score = 26.9 bits (60), Expect = 5.5
Identities = 11/32 (34%), Positives = 18/32 (56%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAEEKKKKKRR 37
E R E K +E+ + ++ K +E+KKKK
Sbjct: 552 ELQREKEEKEALKEQKRLRKLKKQEEKKKKEL 583
>gnl|CDD|222440 pfam13897, GOLD_2, Golgi-dynamics membrane-trafficking.
Sec14-like Golgi-trafficking domain. The GOLD domain is
always found combined with lipid- or
membrane-association domains.
Length = 136
Score = 29.7 bits (67), Expect = 0.44
Identities = 13/33 (39%), Positives = 17/33 (51%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPASTNPPANVISP 80
EEE EEEE E + E G + + P + I P
Sbjct: 58 EEEEEAEEEEAETGDVEAGSKSQSRPLVDEIIP 90
>gnl|CDD|235396 PRK05299, rpsB, 30S ribosomal protein S2; Provisional.
Length = 258
Score = 30.1 bits (69), Expect = 0.44
Identities = 17/28 (60%), Positives = 17/28 (60%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEE 65
EGR G E EEE EEEEEEEEE
Sbjct: 222 LEGRQGRLAEAAEEEEEEAEEEEEEEEE 249
>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family. Emg1 and Nop14 are novel
proteins whose interaction is required for the
maturation of the 18S rRNA and for 40S ribosome
production.
Length = 809
Score = 30.4 bits (69), Expect = 0.48
Identities = 13/55 (23%), Positives = 27/55 (49%), Gaps = 2/55 (3%)
Query: 14 KLKEEEEVKGQRKKAEEK--KKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
+ K EEE+ + + +K ++ RR G + EE+ E ++ ++E E +
Sbjct: 256 RTKTEEELAKEEAERLKKLEAERLRRMRGEEEDDEEEEDSKESADDLDDEFEPDD 310
Score = 28.8 bits (65), Expect = 1.6
Identities = 15/66 (22%), Positives = 30/66 (45%), Gaps = 4/66 (6%)
Query: 8 LRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRG-GGERTEEEGGEEEEEEEEEEEEEG 66
+T E KEE E + KK E ++ ++ R E E +E ++ ++E E ++++
Sbjct: 257 TKTEEELAKEEAE---RLKKLEAERLRRMRGEEEDDEEEEDSKESADDLDDEFEPDDDDN 313
Query: 67 LPASTN 72
Sbjct: 314 FGLGQG 319
>gnl|CDD|235302 PRK04456, PRK04456, acetyl-CoA decarbonylase/synthase complex
subunit beta; Reviewed.
Length = 463
Score = 30.0 bits (68), Expect = 0.49
Identities = 13/32 (40%), Positives = 13/32 (40%)
Query: 52 GEEEEEEEEEEEEEGLPASTNPPANVISPQSL 83
EEEEEEEEEEEE L
Sbjct: 403 AAEEEEEEEEEEEEEEEPVAEVMMMPAPEMQL 434
Score = 29.7 bits (67), Expect = 0.76
Identities = 15/21 (71%), Positives = 15/21 (71%)
Query: 45 ERTEEEGGEEEEEEEEEEEEE 65
ER E EEEEEEEEEEEE
Sbjct: 400 ERWAAEEEEEEEEEEEEEEEP 420
Score = 28.5 bits (64), Expect = 1.6
Identities = 22/60 (36%), Positives = 29/60 (48%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNP 73
K+ EE+VK + + K+K+ R E EEE EEEEEEE E +PA
Sbjct: 374 KIATEEDVKDIEELKKFLKEKEHPVVERWAAEEEEEEEEEEEEEEEPVAEVMMMPAPEMQ 433
Score = 27.3 bits (61), Expect = 4.4
Identities = 14/27 (51%), Positives = 15/27 (55%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPASTNPP 74
EEE EEEEEEEEEE + P
Sbjct: 405 EEEEEEEEEEEEEEEPVAEVMMMPAPE 431
>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
(TAF4) is one of several TAFs that bind TBP and is
involved in forming Transcription Factor IID (TFIID)
complex. The TATA Binding Protein (TBP) Associated
Factor 4 (TAF4) is one of several TAFs that bind TBP and
are involved in forming the Transcription Factor IID
(TFIID) complex. TFIID is one of seven General
Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE,
TFIIF, and TFIIH) that are involved in accurate
initiation of transcription by RNA polymerase II in
eukaryotes. TFIID plays an important role in the
recognition of promoter DNA and assembly of the
pre-initiation complex. TFIID complex is composed of the
TBP and at least 13 TAFs. TAFs from various species were
originally named by their predicted molecular weight or
their electrophoretic mobility in polyacrylamide gels. A
new, unified nomenclature for the pol II TAFs has been
suggested to show the relationship between TAF orthologs
and paralogs. Several hypotheses are proposed for TAFs
functions such as serving as activator-binding sites,
core-promoter recognition or a role in essential
catalytic activity. Each TAF, with the help of a
specific activator, is required only for the expression
of a subset of genes and is not universally involved for
transcription as are GTFs. In yeast and human cells,
TAFs have been found as components of other complexes
besides TFIID. Several TAFs interact via histone-fold
(HFD) motifs; HFD is the interaction motif involved in
heterodimerization of the core histones and their
assembly into nucleosome octamers. The minimal HFD
contains three alpha-helices linked by two loops and is
found in core histones, TAFS and many other
transcription factors. TFIID has a histone octamer-like
substructure. TAF4 domain interacts with TAF12 and makes
a novel histone-like heterodimer that binds DNA and has
a core promoter function of a subset of genes.
Length = 212
Score = 29.6 bits (67), Expect = 0.49
Identities = 18/54 (33%), Positives = 32/54 (59%), Gaps = 4/54 (7%)
Query: 8 LRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEE 61
LR E+ +EEEE ++ EE+++ R A+ R R +++ E ++EE+EE
Sbjct: 116 LRFLEQLEREEEE----KRDEEERERLLRAAKSRSEQSRLKQKAKEMQKEEDEE 165
>gnl|CDD|237744 PRK14521, rpsP, 30S ribosomal protein S16; Provisional.
Length = 186
Score = 29.7 bits (67), Expect = 0.51
Identities = 16/62 (25%), Positives = 23/62 (37%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
++ K ++ K +AE+K + R EE EEEE E PA
Sbjct: 119 DKLSKAKKAAKKAALEAEKKVNEARAEAVAEKKAAEAAAVAAEEAAAAEEEEAEEAPAEE 178
Query: 72 NP 73
P
Sbjct: 179 AP 180
Score = 28.2 bits (63), Expect = 1.4
Identities = 19/61 (31%), Positives = 23/61 (37%), Gaps = 4/61 (6%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
+K E E K +AE +KK EE EEEE EE E PA
Sbjct: 128 AKKAALEAEKKVNEARAEAVAEKKAAEAA----AVAAEEAAAAEEEEAEEAPAEEAPAEE 183
Query: 72 N 72
+
Sbjct: 184 S 184
>gnl|CDD|233224 TIGR00993, 3a0901s04IAP86, chloroplast protein import component
Toc86/159, G and M domains. The long precursor of the
86K protein originally described is proposed to have
three domains. The N-terminal A-domain is acidic,
repetitive, weakly conserved, readily removed by
proteolysis during chloroplast isolation, and not
required for protein translocation. The other domains
are designated G (GTPase) and M (membrane anchor); this
family includes most of the G domain and all of M
[Transport and binding proteins, Amino acids, peptides
and amines].
Length = 763
Score = 30.3 bits (68), Expect = 0.52
Identities = 18/67 (26%), Positives = 32/67 (47%), Gaps = 6/67 (8%)
Query: 24 QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPP-ANVISPQS 82
Q+K+ E+ K+ + + G +E G + EE +EE G PA+ P +++ P S
Sbjct: 444 QKKQWREELKRMKMMKKFG-----KEIGELPDGYSEEVDEENGGPAAVPVPLPDMVLPAS 498
Query: 83 LGHQRPS 89
P+
Sbjct: 499 FDSDNPA 505
>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
Length = 434
Score = 30.1 bits (68), Expect = 0.56
Identities = 14/65 (21%), Positives = 33/65 (50%), Gaps = 6/65 (9%)
Query: 5 VEALRTCERKL----KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEE 60
E L+ +KL K+ EVK + ++ + + E + E E++ E+ +E++E
Sbjct: 368 QELLKEYNKKLQDYTKKLGEVKDE--TDASEEAEAKAKEEKLKQEENEKKQKEQADEDKE 425
Query: 61 EEEEE 65
+ +++
Sbjct: 426 KRQKD 430
Score = 29.0 bits (65), Expect = 1.3
Identities = 13/31 (41%), Positives = 23/31 (74%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRG 42
E KLK+EE K Q+++A+E K+K+++ E +
Sbjct: 404 EEKLKQEENEKKQKEQADEDKEKRQKDERKK 434
>gnl|CDD|220774 pfam10477, EIF4E-T, Nucleocytoplasmic shuttling protein for mRNA
cap-binding EIF4E. EIF4E-T is the transporter protein
for shuttling the mRNA cap-binding protein EIF4E
protein, targeting it for nuclear import. EIF4E-T
contains several key binding domains including two
functional leucine-rich NESs (nuclear export signals)
between residues 438-447 and 613-638 in the human
protein. The other two binding domains are an
EIF4E-binding site, between residues 27-42 in Q9EST3,
and a bipartite NLS (nuclear localisation signals)
between 194-211, and these lie in family EIF4E-T_N.
EIF4E is the eukaryotic translation initiation factor 4E
that is the rate-limiting factor for cap-dependent
translation initiation.
Length = 520
Score = 29.9 bits (66), Expect = 0.62
Identities = 16/97 (16%), Positives = 33/97 (34%)
Query: 8 LRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
L + K+ EE++ ++ +K+ ++ E GG E +E EE
Sbjct: 219 LIGFDDKILEEDDKTSNGDGKQKGRKRTKKRTASVKEGHVECNGGVSLEPHDEVGLEEEN 278
Query: 68 PASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRH 104
A P+N + P+ + ++
Sbjct: 279 AADQEVPSNAVLPEPARSTPTKHMAEFNHMMEDPGEF 315
>gnl|CDD|218734 pfam05758, Ycf1, Ycf1. The chloroplast genomes of most higher
plants contain two giant open reading frames designated
ycf1 and ycf2. Although the function of Ycf1 is unknown,
it is known to be an essential gene.
Length = 832
Score = 29.6 bits (67), Expect = 0.70
Identities = 16/54 (29%), Positives = 24/54 (44%), Gaps = 4/54 (7%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
KLKE E ++ EE+ + G + E+EG EE+ EE+E
Sbjct: 225 KLKETSE----TEEREEETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDP 274
Score = 28.0 bits (63), Expect = 2.7
Identities = 11/41 (26%), Positives = 18/41 (43%)
Query: 25 RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
KK +E + + R E T E ++E+E EE+
Sbjct: 223 TKKLKETSETEEREEETDVEIETTSETKGTKQEQEGSTEED 263
>gnl|CDD|129416 TIGR00316, cdhC, CO dehydrogenase/CO-methylating acetyl-CoA
synthase complex, beta subunit. Nomenclature follows
the description for Methanosarcina thermophila. The
CO-methylating acetyl-CoA synthase is considered the
defining enzyme of the Wood-Ljungdahl pathway, used for
acetate catabolism by sulfate reducing bacteria but for
acetate biosynthesis by acetogenic bacteria such as
Moorella thermoacetica (f. Clostridium thermoaceticum)
[Energy metabolism, Chemoautotrophy].
Length = 458
Score = 29.8 bits (67), Expect = 0.71
Identities = 20/61 (32%), Positives = 26/61 (42%), Gaps = 1/61 (1%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE-EGLPASTN 72
K+ EE+ K + + K+K R E EEE EEEE + EE E EG
Sbjct: 371 KIATEEDAKTTDELRKFLKEKGHPVVKRVVREVDEEEIEEEEEAMQPEEMEMEGFEVPAL 430
Query: 73 P 73
Sbjct: 431 Q 431
>gnl|CDD|130366 TIGR01299, synapt_SV2, synaptic vesicle protein SV2. This model
describes a tightly conserved subfamily of the larger
family of sugar (and other) transporters described by
PFAM model pfam00083. Members of this subfamily include
closely related forms SV2A and SV2B of synaptic vesicle
protein from vertebrates and a more distantly related
homolog (below trusted cutoff) from Drosophila
melanogaster. Members are predicted to have two sets of
six transmembrane helices.
Length = 742
Score = 29.6 bits (66), Expect = 0.74
Identities = 12/34 (35%), Positives = 19/34 (55%), Gaps = 3/34 (8%)
Query: 39 EGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
EG + TE G +E++E E E +G+P + N
Sbjct: 76 EGEASSDATE---GHDEDDEIYEGEYQGIPRAEN 106
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 29.7 bits (66), Expect = 0.76
Identities = 20/60 (33%), Positives = 33/60 (55%), Gaps = 2/60 (3%)
Query: 8 LRTCERKLKEEEEVKG--QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++ E K EE+ K + KKAEE +KK A + E + E +++E EE+++ EE
Sbjct: 1661 IKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEE 1720
Score = 29.0 bits (64), Expect = 1.6
Identities = 20/54 (37%), Positives = 30/54 (55%), Gaps = 2/54 (3%)
Query: 13 RKLKEEEEVKGQ--RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
+K +EE ++K KKAEE KKK A+ E+ E ++E EE ++ EE
Sbjct: 1653 KKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEE 1706
Score = 27.4 bits (60), Expect = 4.1
Identities = 17/59 (28%), Positives = 26/59 (44%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
E + E K EE+ +KAEE KK + EE+ + EE ++ EE +
Sbjct: 1561 EEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAK 1619
Score = 27.4 bits (60), Expect = 4.3
Identities = 24/63 (38%), Positives = 31/63 (49%), Gaps = 3/63 (4%)
Query: 6 EALRTCERKLKEEEEVK---GQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEE 62
EA + E K K EE K +KKAEE KK A+ E E EE+ E E++
Sbjct: 1313 EAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKK 1372
Query: 63 EEE 65
+EE
Sbjct: 1373 KEE 1375
Score = 27.0 bits (59), Expect = 5.5
Identities = 21/56 (37%), Positives = 32/56 (57%), Gaps = 6/56 (10%)
Query: 12 ERKLKEEEEVKG--QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+ K KE EE K + KKAEE+ K K E + + EE+ + EE ++ EE+E+
Sbjct: 1637 QLKKKEAEEKKKAEELKKAEEENKIKAAEEAK----KAEEDKKKAEEAKKAEEDEK 1688
Score = 27.0 bits (59), Expect = 6.0
Identities = 21/59 (35%), Positives = 31/59 (52%), Gaps = 2/59 (3%)
Query: 6 EALRTCERKLKEEEEVKGQ--RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEE 62
EA + E K K EE K +KKAEE KKK A+ ++ +E + EE ++ +E
Sbjct: 1468 EAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADE 1526
>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa). Members of this
family of proteins are part of the yeast nuclear pore
complex-associated pre-60S ribosomal subunit. The family
functions as a highly conserved exonuclease that is
required for the 5'-end maturation of 5.8S and 25S
rRNAs, demonstrating that 5'-end processing also has a
redundant pathway. Nop25 binds late pre-60S ribosomes,
accompanying them from the nucleolus to the nuclear
periphery; and there is evidence for both physical and
functional links between late 60S subunit processing and
export.
Length = 134
Score = 28.8 bits (65), Expect = 0.77
Identities = 14/53 (26%), Positives = 27/53 (50%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++++EE + + +++ E K+ K E E E E E+ E++E E E
Sbjct: 55 KRIREERKQELEKQLKERKEALKLLEEENDDEEDAETEDTEDVEDDEWEGFPE 107
Score = 25.8 bits (57), Expect = 8.6
Identities = 15/55 (27%), Positives = 33/55 (60%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E +++E + ++ +RK+ EK+ K+R+ + E ++E E E+ E+ E++E
Sbjct: 48 EERIEERKRIREERKQELEKQLKERKEALKLLEEENDDEEDAETEDTEDVEDDEW 102
>gnl|CDD|234338 TIGR03742, PRTRC_F, PRTRC system protein F. A novel genetic system
characterized by seven (usually) major proteins,
including a ParB homolog and a ThiF homolog, is commonly
found on plasmids or in bacterial chromosomal regions
near phage, plasmid, or transposon markers. It is most
common among the beta Proteobacteria. We designate the
system PRTRC, or ParB-Related, ThiF-Related Cassette.
This protein family is designated protein F. It is the
most divergent of the families.
Length = 342
Score = 29.3 bits (66), Expect = 0.84
Identities = 26/97 (26%), Positives = 37/97 (38%), Gaps = 9/97 (9%)
Query: 43 GGERTEEEGGEEEEEEEEEEEEEGLPASTNP---PANVISPQSLGHQ-RPSLDLASSPSV 98
GE EEE EE +E++E+ E LP+ + + LG R LD A
Sbjct: 164 EGETDEEEALEELCDEDDEDREAYLPSVVEQALLEDDFLPSHLLGWAARKQLDAAGR--- 220
Query: 99 KKRSRHAARLNMGDPKDDIHVCILCLRAIMNNKYGLN 135
R AR + D+ L L A+ + G
Sbjct: 221 VALVRLRARSDPAV--RDVVTAALELPALFAKRRGAY 255
>gnl|CDD|217476 pfam03286, Pox_Ag35, Pox virus Ag35 surface protein.
Length = 198
Score = 29.0 bits (65), Expect = 0.89
Identities = 15/46 (32%), Positives = 23/46 (50%), Gaps = 6/46 (13%)
Query: 25 RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
RK A KK KK+ E+ EE + E ++++ EE E P +
Sbjct: 60 RKPATTKKSKKKDK------EKLTEEEKKPESDDDKTEENENDPDN 99
>gnl|CDD|218191 pfam04652, DUF605, Vta1 like. Vta1 (VPS20-associated protein 1) is
a positive regulator of Vps4. Vps4 is an ATPase that is
required in the multivesicular body (MVB) sorting
pathway to dissociate the endosomal sorting complex
required for transport (ESCRT). Vta1 promotes correct
assembly of Vps4 and stimulates its ATPase activity
through its conserved Vta1/SBP1/LIP5 region.
Length = 315
Score = 29.3 bits (66), Expect = 0.92
Identities = 18/89 (20%), Positives = 31/89 (34%), Gaps = 4/89 (4%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANV 77
+EEV + K A+ K + +A G G +EE+E+ + ++ P +
Sbjct: 121 DEEVAQKIKYAKWKAARIHKALKEG---EDPNPGPPLDEEDEDADVATTNSDNSFPGEDA 177
Query: 78 ISPQSLGHQRPSLDLASSPSVKKRSRHAA 106
P S P PS +
Sbjct: 178 -DPASASPSDPPSSSPGVPSFPSPPEDPS 205
>gnl|CDD|132187 TIGR03143, AhpF_homolog, putative alkyl hydroperoxide reductase F
subunit. This family of thioredoxin reductase homologs
is found adjacent to alkylhydroperoxide reductase C
subunit predominantly in cases where there is only one C
subunit in the genome and that genome is lacking the F
subunit partner (also a thioredoxin reductase homolog)
that is usually found (TIGR03140).
Length = 555
Score = 29.4 bits (66), Expect = 0.92
Identities = 15/39 (38%), Positives = 18/39 (46%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSL 83
E E+ G EE EEEE +E A+ PA SL
Sbjct: 310 ELKEKLGIAEEYEEEEAKEASEASAAETTPAATTKKGSL 348
>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2. PPP4R2 (protein phosphatase 4 core
regulatory subunit R2) is the regulatory subunit of the
histone H2A phosphatase complex. It has been shown to
confer resistance to the anticancer drug cisplatin in
yeast, and may confer resistance in higher eukaryotes.
Length = 285
Score = 29.0 bits (65), Expect = 0.95
Identities = 16/36 (44%), Positives = 20/36 (55%)
Query: 30 EKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
KK + + G E+E E+EEEEE EEEEE
Sbjct: 244 NKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEE 279
Score = 27.5 bits (61), Expect = 3.4
Identities = 15/41 (36%), Positives = 20/41 (48%)
Query: 31 KKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
KK + G+ EE+ +E+EEEEE EEEE
Sbjct: 244 NKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDED 284
Score = 27.5 bits (61), Expect = 3.5
Identities = 15/42 (35%), Positives = 26/42 (61%)
Query: 24 QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+ KK+++++ + + E E+E EE EEEEEEE+E+
Sbjct: 243 KNKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDED 284
>gnl|CDD|222571 pfam14153, Spore_coat_CotO, Spore coat protein CotO. Bacillus
spores are protected by a protein shell consisting of
over 50 different polypeptides, known as the coat. This
family of proteins has an important morphogenetic role
in coat assembly; it is involved in the assembly of at
least 5 different coat proteins including CotB, CotG,
CotS, CotSA and CotW. It is likely to act at a late
stage of coat assembly.
Length = 185
Score = 29.0 bits (65), Expect = 0.96
Identities = 17/65 (26%), Positives = 30/65 (46%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
+ +K +EE + + EK+K+ E E+ E+E +EEE+EEE E+
Sbjct: 47 DEHVKSKEEEQKIEYEEAEKEKEAGEPEREDIAEQQEKEEIAQEEEKEEEAEDVKQQEVF 106
Query: 72 NPPAN 76
+
Sbjct: 107 SFKRK 111
>gnl|CDD|189041 cd09871, PIN_UPF0110, PIN domain of the VapC-like UPF0110 protein
Mb0640 and homologs. Virulence associated protein C
(VapC)-like PIN (PilT N terminus) domain of the
Mycobacterium bovis UPF0110 protein Mb0640 and other
uncharacterized homologs are included in this subfamily.
They are similar to the PIN domains of the Mycobacterium
tuberculosis VapC and Neisseria gonorrhoeae FitB toxins
of the prokaryotic toxin/antitoxin operons, VapBC and
FitAB, respectively, which are believed to be involved
in growth inhibition by regulating translation. These
toxins are nearly always co-expressed with an antitoxin,
a cognate protein inhibitor, forming an inert protein
complex. Disassociation of the protein complex activates
the ribonuclease activity of the toxin by an as yet
undefined mechanism. VapC-like PIN domains are homologs
of flap endonuclease-1 (FEN1)-like PIN domains, but lack
the extensive arch/clamp region and the H3TH
(helix-3-turn-helix) domain, an atypical
helix-hairpin-helix-2-like region, seen in FEN1-like
PIN domains. PIN domains within this subgroup contain
three highly conserved acidic residues. These putative
active site residues are thought to bind Mg2+ and/or
Mn2+ ions and be essential for single-stranded
ribonuclease activity.
Length = 128
Score = 28.3 bits (64), Expect = 0.96
Identities = 8/12 (66%), Positives = 10/12 (83%)
Query: 101 RSRHAARLNMGD 112
+ RH ARLN+GD
Sbjct: 88 KGRHPARLNLGD 99
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This
family consists of several bovine specific leukaemia
virus receptors which are thought to function as
transmembrane proteins, although their exact function is
unknown.
Length = 561
Score = 29.3 bits (65), Expect = 0.97
Identities = 17/56 (30%), Positives = 30/56 (53%), Gaps = 1/56 (1%)
Query: 3 LGVEALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEE 58
L V L ++ +K EEE + R++ E+ K++K++ E G R G E +E+
Sbjct: 70 LKVPGLPMSDQYVKLEEE-RRHRQRLEKDKREKKKREKEKRGRRRHHSLGTESDED 124
>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II). Bone
sialoprotein (BSP) is a major structural protein of the
bone matrix that is specifically expressed by
fully-differentiated osteoblasts. The expression of bone
sialoprotein (BSP) is normally restricted to mineralised
connective tissues of bones and teeth where it has been
associated with mineral crystal formation. However, it
has been found that ectopic expression of BSP occurs in
various lesions, including oral and extraoral
carcinomas, in which it has been associated with the
formation of microcrystalline deposits and the
metastasis of cancer cells to bone.
Length = 291
Score = 29.3 bits (65), Expect = 1.0
Identities = 16/30 (53%), Positives = 20/30 (66%)
Query: 36 RRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++A G E+E E+EEEEEEEEEEE
Sbjct: 120 KKAGNAGKKATKEDESDEDEEEEEEEEEEE 149
Score = 27.7 bits (61), Expect = 3.2
Identities = 15/32 (46%), Positives = 18/32 (56%)
Query: 35 KRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K+ + E EEE EEEEE E EE E+G
Sbjct: 127 KKATKEDESDEDEEEEEEEEEEEAEVEENEQG 158
>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
represents the eukaryotic large ribosomal protein P1.
Eukaryotic P1 and P2 are functionally equivalent to the
bacterial protein L7/L12, but are not homologous to
L7/L12. P1 is located in the L12 stalk, with proteins
P2, P0, L11, and 28S rRNA. P1 and P2 are the only
proteins in the ribosome to occur as multimers, always
appearing as sets of heterodimers. Recent data indicate
that eukaryotes have four copies (two heterodimers),
while most archaeal species contain six copies of L12p
(three homodimers) and bacteria may have four or six
copies (two or three homodimers), depending on the
species. Experiments using S. cerevisiae P1 and P2
indicate that P1 proteins are positioned more
internally with limited reactivity in the C-terminal
domains, while P2 proteins seem to be more externally
located and are more likely to interact with other
cellular components. In lower eukaryotes, P1 and P2 are
further subdivided into P1A, P1B, P2A, and P2B, which
form P1A/P2B and P1B/P2A heterodimers. Some plant
species have a third P-protein, called P3, which is not
homologous to P1 and P2. In humans, P1 and P2 are
strongly autoimmunogenic. They play a significant role
in the etiology and pathogenesis of systemic lupus
erythematosus (SLE). In addition, the ribosome-inactivating
protein trichosanthin (TCS) interacts with human P0,
P1, and P2, with its primary binding site located in
the C-terminal region of P2. TCS inactivates the
ribosome by depurinating a specific adenine in the
sarcin-ricin loop of 28S rRNA.
Length = 103
Score = 28.1 bits (63), Expect = 1.0
Identities = 10/28 (35%), Positives = 14/28 (50%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEE 65
A E +EE++EEEEEE +
Sbjct: 68 AAAAAAAAAAAAEAKKEEKKEEEEEESD 95
Score = 25.7 bits (57), Expect = 6.3
Identities = 9/24 (37%), Positives = 13/24 (54%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEE 65
E EE++EEEEEE ++
Sbjct: 73 AAAAAAAEAKKEEKKEEEEEESDD 96
>gnl|CDD|237799 PRK14715, PRK14715, DNA polymerase II large subunit; Provisional.
Length = 1627
Score = 29.4 bits (66), Expect = 1.0
Identities = 17/41 (41%), Positives = 23/41 (56%), Gaps = 8/41 (19%)
Query: 26 KKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K+ +EKK++K E EE EE +EE EEEE+G
Sbjct: 277 KELKEKKEEKD--------EEKSEEVKTEEVDEEFEEEEKG 309
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 29.3 bits (65), Expect = 1.1
Identities = 22/102 (21%), Positives = 39/102 (38%), Gaps = 2/102 (1%)
Query: 8 LRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
++ ++KL EE + + + + G++ EEG E+ +E +E +E G
Sbjct: 489 IKKSKKKLAPIEEEDSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHEDSKESDEPKEGGK 548
Query: 68 PASTNPPANVISPQSLGHQRPSL--DLASSPSVKKRSRHAAR 107
P T P +PS L+ P K +H
Sbjct: 549 PGETKEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKD 590
>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
subunit. This is a family of proteins which are
subunits of the eukaryotic translation initiation
factor 3 (eIF3). In yeast it is called Hcr1. The
Saccharomyces cerevisiae protein eIF3j (HCR1) has been
shown to be required for processing of 20S pre-rRNA and
binds to 18S rRNA and eIF3 subunits Rpg1p and Prt1p.
Length = 242
Score = 28.9 bits (65), Expect = 1.1
Identities = 13/49 (26%), Positives = 24/49 (48%), Gaps = 5/49 (10%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
E+EE + ++ K K K K+ + + EE+ + E+EE+ E
Sbjct: 41 EDEEKEEEKAKVAAKAKAKKALK-----AKIEEKEKAKREKEEKGLREL 84
Score = 26.2 bits (58), Expect = 9.0
Identities = 13/53 (24%), Positives = 25/53 (47%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
+ + K + K + KKA + K +++ R E+ E E+ E+E E+
Sbjct: 45 KEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEKL 97
>gnl|CDD|217829 pfam03985, Paf1, Paf1. Members of this family are components of
the RNA polymerase II associated Paf1 complex. The Paf1
complex functions during the elongation phase of
transcription in conjunction with Spt4-Spt5 and
Spt16-Pob3i.
Length = 431
Score = 28.9 bits (65), Expect = 1.1
Identities = 18/40 (45%), Positives = 23/40 (57%), Gaps = 2/40 (5%)
Query: 29 EEKKKKKRRA--EGRGGGERTEEEGGEEEEEEEEEEEEEG 66
E K + KRRA + E E+E EEE+ +E EEEEG
Sbjct: 351 ESKMRDKRRARLDPIDFEEVDEDEDEEEEQRSDEHEEEEG 390
>gnl|CDD|215914 pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein. This
family includes archaebacterial L12, eukaryotic P0, P1
and P2.
Length = 88
Score = 27.6 bits (62), Expect = 1.2
Identities = 12/29 (41%), Positives = 16/29 (55%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
A EE +EEEEEEEE+++ G
Sbjct: 56 AAAAAAAAAAAEEEKKEEEEEEEEDDDMG 84
>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin. Nucleoplasmins are also
known as chromatin decondensation proteins. They bind to
core histones and transfer DNA to them in a reaction
that requires ATP. This is thought to play a role in the
assembly of regular nucleosomal arrays.
Length = 146
Score = 28.4 bits (64), Expect = 1.2
Identities = 12/21 (57%), Positives = 16/21 (76%)
Query: 45 ERTEEEGGEEEEEEEEEEEEE 65
E EEE EE+++E+E EEEE
Sbjct: 119 EDEEEEDDEEDDDEDESEEEE 139
Score = 28.1 bits (63), Expect = 1.5
Identities = 12/25 (48%), Positives = 17/25 (68%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPA 69
E EEE +EE+++E+E EEE P
Sbjct: 118 EEDEEEEDDEEDDDEDESEEEESPV 142
Score = 27.3 bits (61), Expect = 2.4
Identities = 9/18 (50%), Positives = 16/18 (88%)
Query: 48 EEEGGEEEEEEEEEEEEE 65
+EE EEE++EE+++E+E
Sbjct: 117 DEEDEEEEDDEEDDDEDE 134
Score = 27.3 bits (61), Expect = 2.9
Identities = 9/18 (50%), Positives = 15/18 (83%)
Query: 48 EEEGGEEEEEEEEEEEEE 65
E + EE+EEEE++EE++
Sbjct: 113 ESDDDEEDEEEEDDEEDD 130
Score = 26.9 bits (60), Expect = 3.8
Identities = 8/22 (36%), Positives = 16/22 (72%)
Query: 44 GERTEEEGGEEEEEEEEEEEEE 65
E ++ E+EEEE++EE+++
Sbjct: 110 EEDESDDDEEDEEEEDDEEDDD 131
Score = 26.9 bits (60), Expect = 4.2
Identities = 7/23 (30%), Positives = 18/23 (78%)
Query: 43 GGERTEEEGGEEEEEEEEEEEEE 65
+ ++++ +EEEE++EE+++E
Sbjct: 110 EEDESDDDEEDEEEEDDEEDDDE 132
Score = 26.9 bits (60), Expect = 4.2
Identities = 11/36 (30%), Positives = 20/36 (55%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANV 77
E ++E EEEE++EE+++E+ +P V
Sbjct: 110 EEDESDDDEEDEEEEDDEEDDDEDESEEEESPVKKV 145
Score = 26.5 bits (59), Expect = 4.8
Identities = 9/28 (32%), Positives = 17/28 (60%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEE 65
A + EE+ EE++EE+++E+E
Sbjct: 108 ASEEDESDDDEEDEEEEDDEEDDDEDES 135
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 29.1 bits (65), Expect = 1.2
Identities = 16/54 (29%), Positives = 27/54 (50%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++K ++ +E RK EE K+K+ E E+ EE + EEE++ E
Sbjct: 118 KKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRA 171
>gnl|CDD|219461 pfam07543, PGA2, Protein trafficking PGA2. A Saccharomyces
cerevisiae member of this family (PGA2) is an ER
protein which has been implicated in protein
trafficking.
Length = 139
Score = 28.2 bits (63), Expect = 1.2
Identities = 20/55 (36%), Positives = 32/55 (58%), Gaps = 3/55 (5%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
KE E+ + +R++A EKK K RGG GEE+ ++EE+EE+ P++
Sbjct: 44 KEHEKERAEREEAREKKAKISPNALRGG---ATAGHGEEDTDDEEDEEDFATPSA 95
Score = 25.9 bits (57), Expect = 8.6
Identities = 16/63 (25%), Positives = 31/63 (49%), Gaps = 1/63 (1%)
Query: 13 RKLKEEEEVKGQRK-KAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
KL + + K K +AE ++ ++++A+ R G EE+ ++EE+EE +
Sbjct: 35 IKLGAKAQEKEHEKERAEREEAREKKAKISPNALRGGATAGHGEEDTDDEEDEEDFATPS 94
Query: 72 NPP 74
P
Sbjct: 95 AVP 97
>gnl|CDD|220972 pfam11081, DUF2890, Protein of unknown function (DUF2890). This
family is conserved in dsDNA adenoviruses of
vertebrates. The function is not known.
Length = 172
Score = 28.3 bits (63), Expect = 1.3
Identities = 16/56 (28%), Positives = 24/56 (42%), Gaps = 3/56 (5%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSR 103
+ E +EE+EE EE EE AS+ P+ S Q + P+ + R
Sbjct: 40 DWEDSLDEEDEEAEEVEEETAASSKAPS---SSSKSSSQETISIPPTPPARRPSRR 92
>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein. This protein is found to be part
of a large ribonucleoprotein complex containing the U3
snoRNA. Depletion of the Utp proteins impedes production
of the 18S rRNA, indicating that they are part of the
active pre-rRNA processing complex. This large RNP
complex has been termed the small subunit (SSU)
processome.
Length = 728
Score = 28.9 bits (65), Expect = 1.3
Identities = 22/69 (31%), Positives = 32/69 (46%), Gaps = 9/69 (13%)
Query: 7 ALRTCERKLKEEEEVKGQR--------KKAEEKKKKKRRAEGRG-GGERTEEEGGEEEEE 57
LR KLKE E+ + ++AE +KK++ AE E EE +EEE
Sbjct: 353 MLRKKLGKLKEGEDDEENSGLLSMKFMQRAEARKKEENDAEIEELRRELEGEEESDEEEN 412
Query: 58 EEEEEEEEG 66
EE ++ G
Sbjct: 413 EEPSKKNVG 421
>gnl|CDD|149438 pfam08374, Protocadherin, Protocadherin. The structure of
protocadherins is similar to that of classic cadherins
(pfam00028), but particularly on the cytoplasmic domains
they also have some unique features. They are expressed
in a variety of organisms and are found in high
concentrations in the brain where they seem to be
localised mainly at cell-cell contact sites. Their
expression seems to be developmentally regulated.
Length = 223
Score = 28.7 bits (64), Expect = 1.3
Identities = 21/84 (25%), Positives = 38/84 (45%), Gaps = 7/84 (8%)
Query: 25 RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLG 84
++K ++K KKK+ + T EE ++E E++ E LP + QS+G
Sbjct: 90 KQKKKKKDKKKKSPKSLLLNFVTVEESKPDDEVHEQKSETLSLPIE-------LEEQSMG 142
Query: 85 HQRPSLDLASSPSVKKRSRHAARL 108
P+ SP + + + A+ L
Sbjct: 143 RYLPTTFKPGSPDLARHYKSASPL 166
>gnl|CDD|234311 TIGR03685, L12P_arch, 50S ribosomal protein L12P. This model
represents the L12P protein of the large (50S) subunit
of the archaeal ribosome.
Length = 105
Score = 27.7 bits (62), Expect = 1.3
Identities = 16/25 (64%), Positives = 16/25 (64%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPA 69
E EEE EEEEEE EEE GL A
Sbjct: 78 EEEEEEEEEEEEEESEEEAMAGLGA 102
Score = 27.3 bits (61), Expect = 2.0
Identities = 16/30 (53%), Positives = 17/30 (56%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
A EEE EEEEEEEEE EEE +
Sbjct: 68 AAAAAAAAEEEEEEEEEEEEEEEESEEEAM 97
>gnl|CDD|227466 COG5137, COG5137, Histone chaperone involved in gene silencing
[Transcription / Chromatin structure and dynamics].
Length = 279
Score = 28.8 bits (64), Expect = 1.3
Identities = 21/71 (29%), Positives = 29/71 (40%), Gaps = 6/71 (8%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEE------EEEEEEGLPAS 70
E + ++ E + GER +++ GEEEE EE E E EE P+
Sbjct: 203 EGNRELNEEEEEEAEGSDDGEDVVDYEGERIDKKQGEEEEMEEEVINLFEIEWEEESPSE 262
Query: 71 TNPPANVISPQ 81
P N SP
Sbjct: 263 EVPRNNEESPA 273
>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
complex aNOP56 subunit; Provisional.
Length = 414
Score = 28.8 bits (65), Expect = 1.3
Identities = 11/30 (36%), Positives = 20/30 (66%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGG 43
K K EE+ +RKK +++KKK ++ + +G
Sbjct: 384 KKKREEKKPQKRKKKKKRKKKGKKRKKKGR 413
Score = 28.4 bits (64), Expect = 1.7
Identities = 11/30 (36%), Positives = 20/30 (66%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGE 45
K+ EE K Q++K ++K+KKK + + G +
Sbjct: 385 KKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414
Score = 28.0 bits (63), Expect = 2.5
Identities = 15/52 (28%), Positives = 27/52 (51%), Gaps = 10/52 (19%)
Query: 14 KLKEE-----EEVKGQ-----RKKAEEKKKKKRRAEGRGGGERTEEEGGEEE 55
+LKEE EE+K + +KK EEKK +KR+ + + + + + +
Sbjct: 363 ELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414
Score = 26.5 bits (59), Expect = 7.3
Identities = 12/29 (41%), Positives = 21/29 (72%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGR 41
+K +E++ K ++KK +KK KKR+ +GR
Sbjct: 385 KKREEKKPQKRKKKKKRKKKGKKRKKKGR 413
>gnl|CDD|240415 PTZ00429, PTZ00429, beta-adaptin; Provisional.
Length = 746
Score = 28.7 bits (64), Expect = 1.4
Identities = 13/64 (20%), Positives = 24/64 (37%), Gaps = 11/64 (17%)
Query: 49 EEGGEEEEEEEEEEE---------EEGLPASTNPPA--NVISPQSLGHQRPSLDLASSPS 97
E EE+ E+++ E ++G PA + PA ++ G P + S
Sbjct: 606 VELDEEDTEDDDAVELPSTPSMGTQDGSPAPSAAPAGYDIFEFAGDGTGAPHPVASGSNG 665
Query: 98 VKKR 101
+
Sbjct: 666 AQHA 669
>gnl|CDD|184724 PRK14520, rpsP, 30S ribosomal protein S16; Provisional.
Length = 155
Score = 28.1 bits (63), Expect = 1.4
Identities = 11/55 (20%), Positives = 13/55 (23%), Gaps = 6/55 (10%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
L E + +KKK A E E EEE
Sbjct: 107 NAALAEADGGPTAEATTPKKKKAAAEAAA------AEAAAPAAEAAAAAAAEEEA 155
>gnl|CDD|218581 pfam05416, Peptidase_C37, Southampton virus-type processing
peptidase. Corresponds to Merops family C37.
Norwalk-like viruses (NLVs), including the Southampton
virus, cause acute non-bacterial gastroenteritis in
humans. The NLV genome encodes three open reading frames
(ORFs). ORF1 encodes a polyprotein, which is processed
by the viral protease into six proteins.
Length = 535
Score = 28.7 bits (64), Expect = 1.5
Identities = 28/119 (23%), Positives = 40/119 (33%), Gaps = 25/119 (21%)
Query: 10 TCERKLKEEEEVKGQRKKA---------------EEKKKKKRRAEGRGGG---------- 44
E + E KG+ KK EE + K+ E RGG
Sbjct: 222 YIEPQDATPEGKKGKNKKGRGKKHNAFSRRGLSDEEYDEYKKIREERGGKYSIQEYLEDR 281
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSR 103
ER EEE E + E + EEE + + +R L L + ++KR
Sbjct: 282 ERYEEELAERQATEADFCEEEEAKIRQRIFGLRKTRKQRKEERAKLGLVTGSDIRKRKP 340
>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
Transcription initiation factor IIA (TFIIA) is a
heterotrimer, the three subunits being known as alpha,
beta, and gamma, in order of molecular weight. The N and
C-terminal domains of the gamma subunit are represented
in pfam02268 and pfam02751, respectively. This family
represents the precursor that yields both the alpha and
beta subunits. The TFIIA heterotrimer is an essential
general transcription initiation factor for the
expression of genes transcribed by RNA polymerase II.
Together with TFIID, TFIIA binds to the promoter region;
this is the first step in the formation of a
pre-initiation complex (PIC). Binding of the rest of the
transcription machinery follows this step. After
initiation, the PIC does not completely dissociate from
the promoter. Some components, including TFIIA, remain
attached and re-initiate a subsequent round of
transcription.
Length = 332
Score = 28.6 bits (64), Expect = 1.7
Identities = 10/42 (23%), Positives = 22/42 (52%)
Query: 23 GQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
Q KK + K++ A+ G E +G +++++E+ E +
Sbjct: 227 KQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESD 268
Score = 27.4 bits (61), Expect = 3.8
Identities = 9/45 (20%), Positives = 22/45 (48%)
Query: 21 VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++ K + K K+R G + +E G +++++E+ E +
Sbjct: 224 KVLKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESD 268
Score = 26.6 bits (59), Expect = 5.8
Identities = 10/44 (22%), Positives = 24/44 (54%)
Query: 24 QRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
++ ++ K KRR + G +++EG +++++E+ E L
Sbjct: 226 LKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESDL 269
>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
subunit G; Reviewed.
Length = 197
Score = 28.1 bits (62), Expect = 1.7
Identities = 15/51 (29%), Positives = 33/51 (64%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+K+++ + ++++ E+ ++++ E R ER EE+ E E++EE+E E E
Sbjct: 115 IKKKKSLIIRQEQIEKARQEREELEERMEWERREEKIDEREDQEEQERERE 165
Score = 26.2 bits (57), Expect = 7.0
Identities = 15/47 (31%), Positives = 28/47 (59%)
Query: 19 EEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
E+ + +R++ EE+ + +RR E E EE+ E EE+ EE+ ++
Sbjct: 129 EKARQEREELEERMEWERREEKIDEREDQEEQEREREEQTIEEQSDD 175
>gnl|CDD|223044 PHA03325, PHA03325, nuclear-egress-membrane-like protein;
Provisional.
Length = 418
Score = 28.7 bits (64), Expect = 1.7
Identities = 15/74 (20%), Positives = 26/74 (35%), Gaps = 1/74 (1%)
Query: 34 KKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNP-PANVISPQSLGHQRPSLDL 92
K+R E +++ E + E LPAS P P + + D
Sbjct: 274 KRRSRRAGAMRAAAGETADLADDDGSEHSDPEPLPASLPPPPVRRPRVKHPEAGKEEPDG 333
Query: 93 ASSPSVKKRSRHAA 106
A + K+ ++ A
Sbjct: 334 ARNAEAKEPAQPAT 347
>gnl|CDD|224969 COG2058, RPP1A, Ribosomal protein L12E/L44/L45/RPP1/RPP2
[Translation, ribosomal structure and biogenesis].
Length = 109
Score = 27.4 bits (61), Expect = 1.7
Identities = 14/28 (50%), Positives = 15/28 (53%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEE 65
A G E E EEEE+EEE EEE
Sbjct: 71 AAAAAGAEAAAEADEAEEEEKEEEAEEE 98
Score = 26.6 bits (59), Expect = 3.2
Identities = 12/30 (40%), Positives = 16/30 (53%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
A E E E E+EEE EEE +++ L
Sbjct: 75 AGAEAAAEADEAEEEEKEEEAEEESDDDML 104
Score = 26.6 bits (59), Expect = 3.5
Identities = 13/28 (46%), Positives = 14/28 (50%)
Query: 38 AEGRGGGERTEEEGGEEEEEEEEEEEEE 65
A G E E EEEE+EEE EE
Sbjct: 70 AAAAAAGAEAAAEADEAEEEEKEEEAEE 97
>gnl|CDD|222927 PHA02774, PHA02774, E1; Provisional.
Length = 613
Score = 28.3 bits (64), Expect = 1.8
Identities = 18/70 (25%), Positives = 24/70 (34%), Gaps = 6/70 (8%)
Query: 30 EKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPS 89
KK K+R E + G G EEE + EE S+ G
Sbjct: 104 RKKAKRRLFEEQDSGL------GNSLEEESTDVVEEEGVESSGGGEGGSETGQGGGNGLV 157
Query: 90 LDLASSPSVK 99
LDL S + +
Sbjct: 158 LDLLRSSNRR 167
>gnl|CDD|221490 pfam12253, CAF1A, Chromatin assembly factor 1 subunit A. The
CAF-1 or chromatin assembly factor-1 consists of three
subunits, and this is the first, or A. The A domain is
uniquely required for the progression of S phase in
mouse cells, independent of its ability to promote
histone deposition but dependent on its ability to
interact with HP1 - heterochromatin protein 1-rich
heterochromatin domains next to centromeres that are
crucial for chromosome segregation during mitosis. This
HP1-CAF-1 interaction module functions as a built-in
replication control for heterochromatin, which, like a
control barrier, has an impact on S-phase progression
in addition to DNA-based checkpoints.
Length = 76
Score = 26.8 bits (60), Expect = 1.9
Identities = 10/22 (45%), Positives = 17/22 (77%), Gaps = 4/22 (18%)
Query: 48 EEEGG----EEEEEEEEEEEEE 65
EEEG E+EE+EEE+++++
Sbjct: 51 EEEGEDLESEDEEDEEEDDDDD 72
Score = 26.4 bits (59), Expect = 2.7
Identities = 11/22 (50%), Positives = 17/22 (77%), Gaps = 3/22 (13%)
Query: 48 EEEGGEE---EEEEEEEEEEEG 66
EEE GE+ E+EE+EEE+++
Sbjct: 50 EEEEGEDLESEDEEDEEEDDDD 71
>gnl|CDD|179615 PRK03634, PRK03634, rhamnulose-1-phosphate aldolase; Provisional.
Length = 274
Score = 28.0 bits (63), Expect = 1.9
Identities = 15/51 (29%), Positives = 19/51 (37%), Gaps = 20/51 (39%)
Query: 104 HAARLNMGDPKDDIHVCILCLRAIMNNKYGLNMVIKHTEAINSIALSLMHK 154
H ARL + KD R IM H A N IAL+ + +
Sbjct: 125 HIARLKATNGKD---------RVIM-----------HCHATNLIALTYVLE 155
>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis). This nucleolar
family of proteins are involved in 60S ribosomal
biogenesis. They are specifically involved in the
processing beyond the 27S stage of 25S rRNA maturation.
This family contains sequences that bear similarity to
the glioma tumour suppressor candidate region gene 2
protein (p60). This protein has been found to interact
with herpes simplex type 1 regulatory proteins.
Length = 387
Score = 28.5 bits (64), Expect = 1.9
Identities = 14/65 (21%), Positives = 25/65 (38%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
+ K+ +E++ +K EK + EE + EEE ++E EG +
Sbjct: 207 VKAEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEESDDESAWEGFESEY 266
Query: 72 NPPAN 76
P
Sbjct: 267 EPINK 271
>gnl|CDD|114603 pfam05887, Trypan_PARP, Procyclic acidic repetitive protein (PARP).
This family consists of several Trypanosoma brucei
procyclic acidic repetitive protein (PARP) like
sequences. The procyclic acidic repetitive protein
(parp) genes of Trypanosoma brucei encode a small family
of abundant surface proteins whose expression is
restricted to the procyclic form of the parasite. They
are found at two unlinked loci, parpA and parpB;
transcription of both loci is developmentally regulated.
Length = 145
Score = 27.6 bits (60), Expect = 2.0
Identities = 23/75 (30%), Positives = 28/75 (37%)
Query: 22 KGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQ 81
KG + A++ E E EE GEEE E EEE EEE P T P+
Sbjct: 46 KGTKVGADDTNGTDPDDEPEEEEEPEPEEEGEEEPEPEEEGEEEPEPEETGEEEPEPEPE 105
Query: 82 SLGHQRPSLDLASSP 96
P + P
Sbjct: 106 PEPEPEPEPEPEPEP 120
Score = 26.5 bits (57), Expect = 5.4
Identities = 21/80 (26%), Positives = 30/80 (37%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPAN 76
+E KG K + K +K + ++ EEEEE E EEE E P
Sbjct: 29 DEPADKGITKGGKGKGEKGTKVGADDTNGTDPDDEPEEEEEPEPEEEGEEEPEPEEEGEE 88
Query: 77 VISPQSLGHQRPSLDLASSP 96
P+ G + P + P
Sbjct: 89 EPEPEETGEEEPEPEPEPEP 108
>gnl|CDD|240327 PTZ00243, PTZ00243, ABC transporter; Provisional.
Length = 1560
Score = 28.6 bits (64), Expect = 2.1
Identities = 7/34 (20%), Positives = 12/34 (35%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPA 75
E + + G+ + E E + G PP
Sbjct: 880 ELKENKDSKEGDADAEVAEVDAAPGGAVDHEPPV 913
>gnl|CDD|218538 pfam05285, SDA1, SDA1. This family consists of several SDA1
protein homologues. SDA1 is a Saccharomyces cerevisiae
protein which is involved in the control of the actin
cytoskeleton. The protein is essential for cell
viability and is localised in the nucleus.
Length = 317
Score = 28.1 bits (63), Expect = 2.1
Identities = 14/41 (34%), Positives = 23/41 (56%)
Query: 26 KKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K EE++KKK +G + +EE E EE+E+ ++E
Sbjct: 77 KWKEEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGE 117
Score = 26.2 bits (58), Expect = 9.1
Identities = 15/37 (40%), Positives = 22/37 (59%)
Query: 29 EEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
EEK + ++A+ E +EE+ E EEEE E E+E
Sbjct: 136 EEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKE 172
>gnl|CDD|235307 PRK04537, PRK04537, ATP-dependent RNA helicase RhlB; Provisional.
Length = 572
Score = 28.4 bits (63), Expect = 2.2
Identities = 19/106 (17%), Positives = 36/106 (33%), Gaps = 14/106 (13%)
Query: 20 EVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEE------------EEEEEEGL 67
E + QR E+++ R G G + GG + E E +
Sbjct: 420 EAREQRAAEEQRRGGGRSGPGGGSRSGSVGGGGRRDGAGADGKPRPRRKPRVEGEADAAA 479
Query: 68 PASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHAARLNMGDP 113
+ P + Q+ G + D +P ++R R+ + +P
Sbjct: 480 AGAETPVVAAAAAQAPG--VVAADGERAPRKRRRRRNGRPVEGAEP 523
>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator. This protein
is found in a wide range of eukaryotes. It is a nuclear
protein and is suggested to be DNA binding. In plants,
this family is essential for correct circadian clock
functioning by acting as a light-quality regulator
coordinating the activities of blue and red light
signalling pathways during plant growth - inhibiting
growth in red light but promoting growth in blue light.
Length = 233
Score = 27.7 bits (62), Expect = 2.4
Identities = 16/46 (34%), Positives = 24/46 (52%)
Query: 31 KKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPAN 76
KKKKK++ G+ EEE +E E+E++ +E P N N
Sbjct: 1 KKKKKKKKSKLSFGDDDEEEDEDEGEDEKKVPKESSEPDEANVNPN 46
>gnl|CDD|223065 PHA03378, PHA03378, EBNA-3B; Provisional.
Length = 991
Score = 28.1 bits (62), Expect = 2.4
Identities = 15/61 (24%), Positives = 28/61 (45%), Gaps = 5/61 (8%)
Query: 49 EEGGEEEEEE---EEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHA 105
E+ E EE E + +E++ G A + P + P ++ + RP + A +K +
Sbjct: 357 EDDDESEEIESECDPDEDKSGAEALASIPQTLPDPPTV-YGRPKV-FARKADLKSTKKCR 414
Query: 106 A 106
A
Sbjct: 415 A 415
>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex. This
entry is characterized by proteins with alternating
conserved and low-complexity regions. Bud13 together
with Snu17p and a newly identified factor,
Pml1p/Ylr016c, form a novel trimeric complex called
the RES complex, the pre-mRNA retention and splicing
complex. Subunits of this complex are not essential for
viability of yeasts but they are required for efficient
splicing in vitro and in vivo. Furthermore,
inactivation of this complex causes pre-mRNA leakage
from the nucleus. Bud13 contains a unique,
phylogenetically conserved C-terminal region of unknown
function.
Length = 141
Score = 27.3 bits (61), Expect = 2.4
Identities = 14/52 (26%), Positives = 30/52 (57%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
R + EE+ + + ++ EEK++K+ + + G G +EE + EE E+ + +
Sbjct: 10 RIIDIEEKREEKEREKEEKERKEEKEKEWGKGLVQKEEREKRLEELEKAKNK 61
>gnl|CDD|221937 pfam13148, DUF3987, Protein of unknown function (DUF3987). A
family of uncharacterized proteins found by clustering
human gut metagenomic sequences.
Length = 379
Score = 28.0 bits (63), Expect = 2.7
Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 7/71 (9%)
Query: 1 MRLGVEALRTCERKLKEE--EEVKGQRK-----KAEEKKKKKRRAEGRGGGERTEEEGGE 53
+L ++ L E +L+EE EE+K +AE+K +K+ + G+ E E
Sbjct: 56 DKLAMKPLEEIEEELREEYEEELKEYEAEKEIWEAEKKGLEKKAKKAIKKGKDEEALAEE 115
Query: 54 EEEEEEEEEEE 64
E E EE E
Sbjct: 116 LLELEAEEPEP 126
>gnl|CDD|218555 pfam05320, Pox_RNA_Pol_19, Poxvirus DNA-directed RNA polymerase
19 kDa subunit. This family contains several
DNA-directed RNA polymerase 19 kDa polypeptides. The
Poxvirus DNA-directed RNA polymerase (EC: 2.7.7.6)
catalyzes DNA-template-directed extension of the 3'-end
of an RNA strand by one nucleotide at a time.
Length = 167
Score = 27.4 bits (61), Expect = 2.8
Identities = 11/28 (39%), Positives = 17/28 (60%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPASTNPPA 75
+++ E EEEEE+EE+ E L +S
Sbjct: 14 DDDSEEYEEEEEDEEDAESLESSDVSSL 41
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 28.0 bits (62), Expect = 2.9
Identities = 13/58 (22%), Positives = 24/58 (41%), Gaps = 1/58 (1%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
L+E + +++ + + R E G T++ E E + EEE + L N
Sbjct: 3837 LEELANEEDTANQSDLDESEARELESDMNGV-TKDSVVSENENSDSEEENQDLDEEVN 3893
Score = 27.7 bits (61), Expect = 4.0
Identities = 15/70 (21%), Positives = 23/70 (32%), Gaps = 14/70 (20%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE-----------EEG 66
E E K + A + E + E+ +E+E+EEE +
Sbjct: 3918 ETEQKSNEQSAANNESDLVSKED---DNKALEDKDRQEKEDEEEMSDDVGIDDEIQPDIQ 3974
Query: 67 LPASTNPPAN 76
S PP N
Sbjct: 3975 ENNSQPPPEN 3984
>gnl|CDD|218258 pfam04774, HABP4_PAI-RBP1, Hyaluronan / mRNA binding family.
This family includes the HABP4 family of
hyaluronan-binding proteins, and the PAI-1 mRNA-binding
protein, PAI-RBP1. HABP4 has been observed to bind
hyaluronan (a glucosaminoglycan), but it is not known
whether this is its primary role in vivo. It has also
been observed to bind RNA, but with a lower affinity
than that for hyaluronan. PAI-1 mRNA-binding protein
specifically binds the mRNA of type-1 plasminogen
activator inhibitor (PAI-1), and is thought to be
involved in regulation of mRNA stability. However, in
both cases, the sequence motifs predicted to be
important for ligand binding are not conserved
throughout the family, so it is not known whether
members of this family share a common function.
Length = 106
Score = 26.6 bits (59), Expect = 2.9
Identities = 16/50 (32%), Positives = 28/50 (56%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
++E+ ++ E+++K E + E +EG EEEE EEEE++E
Sbjct: 29 SVKDEIAELTEEQGEEEEKNEVEEKQAVEEEANKEGVVEEEEVEEEEDKE 78
>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
This is a family of fungal proteins whose function is
unknown.
Length = 130
Score = 26.9 bits (60), Expect = 2.9
Identities = 12/31 (38%), Positives = 17/31 (54%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAEEKKKKKR 36
ALR + KE E + + +K EKK K+R
Sbjct: 94 IALRLRRERTKERAEKEKRTRKNREKKFKRR 124
>gnl|CDD|218941 pfam06217, GAGA_bind, GAGA binding protein-like family. This
family includes gbp a protein from Soybean that binds to
GAGA element dinucleotide repeat DNA. It seems likely
that this domain mediates DNA binding. This putative
domain contains several conserved cysteines and a
histidine suggesting this may be a zinc-binding DNA
interaction domain.
Length = 301
Score = 27.5 bits (61), Expect = 3.0
Identities = 15/65 (23%), Positives = 22/65 (33%)
Query: 52 GEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHAARLNMG 111
G+ E E P ST PP + Q P + A P K+ + ++
Sbjct: 120 GDNPYGTREMHHLEVPPISTAPPEAKEVKKPKKGQSPKVPKAPKPKKPKKKGSVSNRSVK 179
Query: 112 DPKDD 116
P D
Sbjct: 180 MPGID 184
>gnl|CDD|221203 pfam11748, DUF3306, Protein of unknown function (DUF3306). This
family of proteobacterial species proteins has no known
function.
Length = 115
Score = 26.9 bits (60), Expect = 3.2
Identities = 18/70 (25%), Positives = 25/70 (35%), Gaps = 11/70 (15%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
RKL E + + E+ EEEEE E E+EE L
Sbjct: 9 RKLAVRAEEPAEPAETAEE-----------EAAAAAPAPAPEEEEEAELEDEELLEELDL 57
Query: 73 PPANVISPQS 82
P + ++P S
Sbjct: 58 PDPDTLTPGS 67
>gnl|CDD|220098 pfam09057, Smac_DIABLO, Second Mitochondria-derived Activator of
Caspases. Second Mitochondria-derived Activator of
Caspases promotes apoptosis by activating caspases in
the cytochrome c/Apaf-1/caspase-9 pathway, and by
opposing the inhibitory activity of inhibitor of
apoptosis proteins (XIAP-BIR3). The protein assumes an
elongated three-helix bundle structure, and forms a
dimer in solution.
Length = 234
Score = 27.6 bits (61), Expect = 3.3
Identities = 13/46 (28%), Positives = 21/46 (45%), Gaps = 4/46 (8%)
Query: 19 EEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
EEV+ K+AE+K + + E + R E + E E+ E
Sbjct: 187 EEVRQLSKEAEKKLAESKAEEIQ----RMAEYASSIDLSELEDIPE 228
>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457). This
is a family of uncharacterized proteins.
Length = 449
Score = 27.7 bits (61), Expect = 3.4
Identities = 12/57 (21%), Positives = 34/57 (59%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
+K+E ++ K+AEE+ ++ + + +E+ +++++++E++E+E ST
Sbjct: 32 MKKENAIRKLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDDDDEDDEDEDDDDST 88
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 27.6 bits (62), Expect = 3.4
Identities = 10/30 (33%), Positives = 18/30 (60%), Gaps = 5/30 (16%)
Query: 12 ERKLKEEEEVKG-----QRKKAEEKKKKKR 36
E+K +E E +++K EEK++KK+
Sbjct: 290 EKKKEEREAKLAKLSPEEQRKLEEKERKKQ 319
Score = 26.1 bits (58), Expect = 9.8
Identities = 12/27 (44%), Positives = 18/27 (66%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAE 39
K EEE + ++K EEKKK++R A+
Sbjct: 273 LKAAEEERQEEAQEKKEEKKKEEREAK 299
>gnl|CDD|234456 TIGR04074, bacter_Hen1, 3' terminal RNA ribose
2'-O-methyltransferase Hen1. Members of this protein
family are bacterial Hen1, a 3' terminal RNA ribose
2'-O-methyltransferase that acts in bacterial RNA
repair. All members of the seed alignment belong to a
cassette with the RNA repair enzyme polynucleotide
kinase-phosphatase (Pnkp). Chemically similar Hen1 in
eukaryotes acts instead on small regulatory RNAs
[Transcription, RNA processing, Protein synthesis, tRNA
and rRNA base modification].
Length = 462
Score = 27.7 bits (62), Expect = 3.5
Identities = 14/40 (35%), Positives = 16/40 (40%), Gaps = 10/40 (25%)
Query: 48 EEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQR 87
E + E EE E EE +EE P SL QR
Sbjct: 241 EADEAEPEEAETEEAQEEAAEK----------PPSLNRQR 270
>gnl|CDD|240285 PTZ00135, PTZ00135, 60S acidic ribosomal protein P0; Provisional.
Length = 310
Score = 27.3 bits (61), Expect = 3.6
Identities = 7/19 (36%), Positives = 8/19 (42%)
Query: 47 TEEEGGEEEEEEEEEEEEE 65
EEEEEEE+
Sbjct: 284 AAAAAAAAAPAEEEEEEED 302
Score = 27.3 bits (61), Expect = 4.1
Identities = 7/19 (36%), Positives = 9/19 (47%)
Query: 48 EEEGGEEEEEEEEEEEEEG 66
EEEEEEE++
Sbjct: 286 AAAAAAAPAEEEEEEEDDM 304
Score = 27.3 bits (61), Expect = 4.5
Identities = 8/22 (36%), Positives = 10/22 (45%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEG 66
EEEEEEE++ G
Sbjct: 284 AAAAAAAAAPAEEEEEEEDDMG 305
Score = 26.5 bits (59), Expect = 7.4
Identities = 7/20 (35%), Positives = 7/20 (35%)
Query: 47 TEEEGGEEEEEEEEEEEEEG 66
EEEEEEE
Sbjct: 283 AAAAAAAAAAPAEEEEEEED 302
>gnl|CDD|239134 cd02669, Peptidase_C19M, A subfamily of Peptidase C19. Peptidase
C19 contains ubiquitinyl hydrolases. They are
intracellular peptidases that remove ubiquitin molecules
from polyubiquitinated peptides by cleavage of isopeptide
bonds. They hydrolyze bonds involving the carboxyl group
of the C-terminal Gly residue of ubiquitin. The purpose
of the de-ubiquitination is thought to be editing of the
ubiquitin conjugates, which could rescue them from
degradation, as well as recycling of the ubiquitin. The
ubiquitin/proteasome system is responsible for most
protein turnover in the mammalian cell, and with over 50
members, family C19 is one of the largest families of
peptidases in the human genome.
Length = 440
Score = 27.7 bits (62), Expect = 3.6
Identities = 13/40 (32%), Positives = 18/40 (45%), Gaps = 7/40 (17%)
Query: 124 LRAIMNNKY-----GLNMVIKHTEAINSIALSLMH-KSLR 157
R + Y GLN IK+ + N I +L H K +R
Sbjct: 107 SRDLDGKPYLPGFVGLNN-IKNNDYANVIIQALSHVKPIR 145
>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein. TolA couples the inner
membrane complex of itself with TolQ and TolR to the
outer membrane complex of TolB and OprL (also called
Pal). Most of the length of the protein consists of
low-complexity sequence that may differ in both length
and composition from one species to another,
complicating efforts to discriminate TolA (the most
divergent gene in the tol-pal system) from paralogs such
as TonB. Selection of members of the seed alignment and
criteria for setting scoring cutoffs are based largely on
conserved operon structure. The Tol-Pal complex is
required for maintaining outer membrane integrity. Also
involved in transport (uptake) of colicins and
filamentous DNA, and implicated in pathogenesis.
Transport is energized by the proton motive force. TolA
is an inner membrane protein that interacts with
periplasmic TolB and with outer membrane porins ompC,
phoE and lamB [Transport and binding proteins, Other,
Cellular processes, Pathogenesis].
Length = 346
Score = 27.1 bits (60), Expect = 4.1
Identities = 15/57 (26%), Positives = 27/57 (47%), Gaps = 5/57 (8%)
Query: 12 ERKLKEEEEVKGQ-----RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEE 63
E K K E E + + +K+AEE+ K K AE + +++ E + + E +
Sbjct: 131 EAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKA 187
>gnl|CDD|235229 PRK04156, gltX, glutamyl-tRNA synthetase; Provisional.
Length = 567
Score = 27.5 bits (62), Expect = 4.1
Identities = 15/25 (60%), Positives = 18/25 (72%), Gaps = 1/25 (4%)
Query: 45 ERTEEEGGEEEEEEEEEEEEE-GLP 68
ER EE E EEEEE++EE+ GLP
Sbjct: 68 ERLEELAPELLEEEEEKKEEKKGLP 92
>gnl|CDD|147685 pfam05663, DUF809, Protein of unknown function (DUF809). This
family consists of several proteins of unknown function from
Raphanus sativus (Radish) and Brassica napus (Rape).
Length = 138
Score = 26.7 bits (58), Expect = 4.3
Identities = 19/46 (41%), Positives = 28/46 (60%), Gaps = 3/46 (6%)
Query: 21 VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
+KG+ + EEKK+ K EG+ E +E GE E +EE++E E G
Sbjct: 93 IKGEIEGKEEKKEGKGEIEGK---EEKKEGKGEIEGKEEKKEVENG 135
>gnl|CDD|215247 PLN02448, PLN02448, UDP-glycosyltransferase family protein.
Length = 459
Score = 27.3 bits (61), Expect = 4.3
Identities = 12/31 (38%), Positives = 18/31 (58%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERT 47
E EE K R++A+E ++ R A +GG T
Sbjct: 416 ESEEGKEMRRRAKELQEICRGAIAKGGSSDT 446
>gnl|CDD|223130 COG0052, RpsB, Ribosomal protein S2 [Translation, ribosomal
structure and biogenesis].
Length = 252
Score = 26.9 bits (60), Expect = 4.6
Identities = 12/30 (40%), Positives = 15/30 (50%)
Query: 41 RGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
G G +EE EE+EE EE E A+
Sbjct: 222 EGRGGALDEEEAAIEEDEEVEEFEAKEEAA 251
>gnl|CDD|110924 pfam01972, SDH_sah, Serine dehydrogenase proteinase. This family
of archaebacterial proteins, formerly known as DUF114,
has been found to be a serine dehydrogenase proteinase
distantly related to ClpP proteinases that belong to the
serine proteinase superfamily. The family has a
catalytic triad of Ser, Asp, His residues, which shows
an altered residue ordering compared with the ClpP
proteinases but similar to that of the carboxypeptidase
clan.
Length = 286
Score = 27.1 bits (60), Expect = 4.6
Identities = 19/53 (35%), Positives = 26/53 (49%), Gaps = 6/53 (11%)
Query: 60 EEEEEEGLPASTNPPANV-----ISPQSLGHQRPSLDLASSPSVKKRSRHAAR 107
EE +E GL +TN P V + PQ +G QRP ++ P KK A+
Sbjct: 235 EELKELGLEVNTNVPEEVYELMELYPQPMG-QRPPVEYIPVPYKKKEQEKNAK 286
>gnl|CDD|165442 PHA03171, PHA03171, UL37 tegument protein; Provisional.
Length = 499
Score = 27.3 bits (60), Expect = 5.0
Identities = 16/36 (44%), Positives = 18/36 (50%), Gaps = 6/36 (16%)
Query: 44 GERTEEEGGE------EEEEEEEEEEEEGLPASTNP 73
GE EEE + E EEE+EEEE E NP
Sbjct: 83 GEEAEEEDNDRECPDTEAEEEDEEEEIEAPDPEVNP 118
Score = 26.6 bits (58), Expect = 7.1
Identities = 27/94 (28%), Positives = 37/94 (39%), Gaps = 13/94 (13%)
Query: 40 GRGGGERTEE----EGGEEEEEEEEEEEEEGLPASTNPPANVISPQSL-GHQRPSLD--- 91
G E EE E + E EEE+EEEE P +P N + + L G R + D
Sbjct: 80 AEAGEEAEEEDNDRECPDTEAEEEDEEEEIEAP---DPEVNPLDAEGLSGLAREACDALK 136
Query: 92 --LASSPSVKKRSRHAARLNMGDPKDDIHVCILC 123
L + +R R A P+ H + C
Sbjct: 137 KALLRHRFLWQRRRQARCEQHNGPQQSHHAAVFC 170
>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
Length = 413
Score = 27.2 bits (60), Expect = 5.0
Identities = 12/66 (18%), Positives = 19/66 (28%), Gaps = 7/66 (10%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGER-------TEEEGGEEEEEEEEEEEEEGLP 68
E + + + EE+ + G + EE E P
Sbjct: 72 DTETAEESRHGEKEERGQGGPSGSGSESVGSPTPSPSGSAEELASGLSPENTSGSSPESP 131
Query: 69 ASTNPP 74
AS +PP
Sbjct: 132 ASHSPP 137
Score = 26.9 bits (59), Expect = 6.9
Identities = 6/36 (16%), Positives = 10/36 (27%)
Query: 39 EGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPP 74
+ ++ E E+E E E E
Sbjct: 218 TPQQAPSPNTQQAVEHEDEPTEPEREGPPFPGHRSH 253
Score = 26.5 bits (58), Expect = 8.8
Identities = 9/47 (19%), Positives = 16/47 (34%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEE 64
+ E + + +K+ R +G G +E G EE
Sbjct: 68 QTESDTETAEESRHGEKEERGQGGPSGSGSESVGSPTPSPSGSAEEL 114
>gnl|CDD|237063 PRK12329, nusA, transcription elongation factor NusA; Provisional.
Length = 449
Score = 27.1 bits (60), Expect = 5.1
Identities = 17/54 (31%), Positives = 24/54 (44%), Gaps = 5/54 (9%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPA 69
E+ +V + EE++ +R AE ER E E E EE+ E LP
Sbjct: 380 AEDAKVAELISQREEEEALQREAE-----ERLEAEQAERAEEDARLRELYPLPE 428
>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34. This family represents
herpes virus protein U79 and cytomegalovirus early
phosphoprotein P34 (UL112).
Length = 238
Score = 26.8 bits (59), Expect = 5.2
Identities = 12/53 (22%), Positives = 23/53 (43%)
Query: 14 KLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEG 66
K KE+ V+ +K E+++KK+ +R GG + ++E
Sbjct: 165 KQKEKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSSGGQSGLSTKDEP 217
>gnl|CDD|221581 pfam12446, DUF3682, Protein of unknown function (DUF3682). This
domain family is found in eukaryotes, and is typically
between 125 and 136 amino acids in length.
Length = 133
Score = 26.3 bits (58), Expect = 5.2
Identities = 11/19 (57%), Positives = 12/19 (63%)
Query: 47 TEEEGGEEEEEEEEEEEEE 65
T G +EEEEEEEE E
Sbjct: 88 TSGTGHTRQEEEEEEEENE 106
>gnl|CDD|222948 PHA02941, PHA02941, hypothetical protein; Provisional.
Length = 356
Score = 26.9 bits (59), Expect = 5.2
Identities = 18/61 (29%), Positives = 30/61 (49%), Gaps = 5/61 (8%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
E ++ + L E K +++++E+ + E EE E++EEEE EEEEE
Sbjct: 295 EMVKVAAKDLILGGEEKEPKQESQEQLFNPFAID-----EEMLEETQEQQEEEENEEEEE 349
Query: 66 G 66
Sbjct: 350 N 350
>gnl|CDD|152901 pfam12467, CMV_1a, Cucumber mosaic virus 1a protein. This domain
family is found in viruses, and is typically between 156
and 171 amino acids in length. The family is found in
association with pfam01443, pfam01660. 1a protein is the
major virulence factor of the cucumber mosaic virus
(CMV). The Ns strain of CMV causes necrotic lesions to
Nicotiana spp. while other strains cause systemic
mosaic. The determinant of the pathogenesis of these
different strains is the specific amino acid residue at
the 461 residue of the 1a protein.
Length = 175
Score = 26.7 bits (59), Expect = 5.4
Identities = 21/90 (23%), Positives = 35/90 (38%), Gaps = 5/90 (5%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANV 77
V+ + EKKKK A + +E+ EE E+ + E + P N
Sbjct: 69 AWAVEDGKTLRAEKKKKLEEAL----QQPVQEDSVSEEFEDAPDAPSESVRDDVK-PENP 123
Query: 78 ISPQSLGHQRPSLDLASSPSVKKRSRHAAR 107
+ Q+ + P L S+ + +R A R
Sbjct: 124 VVGQTQAPEPPELKSLSTQTRSPDTRLAER 153
>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
domain. The centromere protein B (CENP-B) dimerisation
domain is composed of two alpha-helices, which are
folded into an antiparallel configuration. Dimerisation
of CENP-B is mediated by this domain, in which monomers
dimerise to form a symmetrical, antiparallel,
four-helix bundle structure with a large hydrophobic
patch in which 23 residues of one monomer form van der
Waals contacts with the other monomer. This CENP-B
dimer configuration may be suitable for capturing two
distant CENP-B boxes during centromeric heterochromatin
formation.
Length = 101
Score = 25.9 bits (56), Expect = 5.7
Identities = 8/23 (34%), Positives = 18/23 (78%)
Query: 43 GGERTEEEGGEEEEEEEEEEEEE 65
G E ++ + EEE++++E+EE++
Sbjct: 8 GEEDSDSDSDEEEDDDDEDEEDD 30
>gnl|CDD|236485 PRK09368, PRK09368, gas vesicle synthesis-like protein; Reviewed.
Length = 140
Score = 26.2 bits (58), Expect = 5.7
Identities = 12/48 (25%), Positives = 16/48 (33%)
Query: 28 AEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPA 75
E + K + G E E G+ +EEEE P P
Sbjct: 88 TESGARGKSKGALTGAAETASEALGQGRGSDEEEERRRERPRPRKAPR 135
>gnl|CDD|100110 cd05832, Ribosomal_L12p, Ribosomal protein L12p. This subfamily
includes archaeal L12p, the protein that is functionally
equivalent to L7/L12 in bacteria and the P1 and P2
proteins in eukaryotes. L12p is homologous to P1 and P2
but is not homologous to bacterial L7/L12. It is located
in the L12 stalk, with proteins L10, L11, and 23S rRNA.
L12p is the only protein in the ribosome to occur as
multimers, always appearing as sets of dimers. Recent
data indicate that most archaeal species contain six
copies of L12p (three homodimers), while eukaryotes have
four copies (two heterodimers), and bacteria may have
four or six copies (two or three homodimers), depending
on the species. The organization of proteins within the
stalk has been characterized primarily in bacteria,
where L7/L12 forms either two or three homodimers and
each homodimer binds to the extended C-terminal helix of
L10. L7/L12 is attached to the ribosome through L10 and
is the only ribosomal protein that does not directly
interact with rRNA. Archaeal L12p is believed to
function in a similar fashion. However, hybrid ribosomes
containing the large subunit from E. coli with an
archaeal stalk are able to bind archaeal and eukaryotic
elongation factors but not bacterial elongation factors.
In several mesophilic and thermophilic archaeal species,
the binding of 23S rRNA to protein L11 and to the
L10/L12p pentameric complex was found to be
temperature-dependent and cooperative.
Length = 106
Score = 25.9 bits (57), Expect = 5.8
Identities = 15/26 (57%), Positives = 18/26 (69%)
Query: 44 GERTEEEGGEEEEEEEEEEEEEGLPA 69
E+ EE+ EEE+EEEEEE GL A
Sbjct: 79 EEKEEEKKKEEEKEEEEEEALAGLGA 104
>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
Length = 434
Score = 26.8 bits (60), Expect = 5.9
Identities = 10/26 (38%), Positives = 17/26 (65%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRR 37
E+K KE+E+ K +++ + K KRR
Sbjct: 401 EKKEKEKEKPKVKKRHRDTKNIGKRR 426
>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).
This family consists of several Plasmodium falciparum
SPAM (secreted polymorphic antigen associated with
merozoites) proteins. Variation among SPAM alleles is
the result of deletions and amino acid substitutions in
non-repetitive sequences within and flanking the
alanine heptad-repeat domain. Heptad repeats in which
the a and d position contain hydrophobic residues
generate amphipathic alpha-helices which give rise to
helical bundles or coiled-coil structures in proteins.
SPAM is an example of a P. falciparum antigen in which
a repetitive sequence has features characteristic of a
well-defined structural element.
Length = 164
Score = 26.4 bits (58), Expect = 6.0
Identities = 18/50 (36%), Positives = 34/50 (68%)
Query: 16 KEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
KE E+VK ++++ +E+++++ E + +EE E+EEEEEE+EE+
Sbjct: 38 KENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEEDEED 87
>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
Length = 1465
Score = 27.1 bits (60), Expect = 6.0
Identities = 13/50 (26%), Positives = 22/50 (44%), Gaps = 5/50 (10%)
Query: 15 LKEEEEVKGQ-----RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEE 59
K EE + R ++ KK R+A + ++T ++ E E EE
Sbjct: 1173 AKAEEAREKLQRAAARGESGAAKKVSRQAPKKPAPKKTTKKASESETTEE 1222
>gnl|CDD|215153 PLN02271, PLN02271, serine hydroxymethyltransferase.
Length = 586
Score = 26.7 bits (59), Expect = 6.3
Identities = 20/79 (25%), Positives = 30/79 (37%), Gaps = 23/79 (29%)
Query: 39 EGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQ----RPSLDLAS 94
E + E E G+E+EEE+ E+E + LGH RP +S
Sbjct: 53 EQKEEKEEDAGEEGDEDEEEQGEDEHFSI---------------LGHPMCLKRPRDGDSS 97
Query: 95 SPSVKKRSRHAARLNMGDP 113
S S S +++ D
Sbjct: 98 SSS----SSSSSKRAAVDS 112
>gnl|CDD|222977 PHA03089, PHA03089, late transcription factor VLTF-4; Provisional.
Length = 191
Score = 26.3 bits (58), Expect = 6.4
Identities = 17/58 (29%), Positives = 26/58 (44%)
Query: 25 RKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQS 82
+KK +K KKK++ + EE EE EE +++ + LP N A V
Sbjct: 60 KKKTTKKTKKKKKEKEEVPELAAEELSDSEENEENDKKVDYELPKVQNTAAEVNHEDV 117
Score = 26.3 bits (58), Expect = 7.3
Identities = 12/45 (26%), Positives = 20/45 (44%)
Query: 21 VKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
K ++ +KK K+ + + E E EE + EE EE +
Sbjct: 51 SKKKKTTPRKKKTTKKTKKKKKEKEEVPELAAEELSDSEENEEND 95
>gnl|CDD|222648 pfam14283, DUF4366, Domain of unknown function (DUF4366). This
family of proteins is found in bacteria and eukaryotes.
Proteins in this family are typically between 227 and
387 amino acids in length.
Length = 213
Score = 26.5 bits (59), Expect = 7.0
Identities = 14/85 (16%), Positives = 30/85 (35%), Gaps = 25/85 (29%)
Query: 10 TCERKLKEEEEVKGQRKKAEEKKKKKRRAEGR----------GGG--------------- 44
C + E + + + E++ +K+ G GGG
Sbjct: 129 VCSVNMTECTGPEPEPEPEPEEEPEKKSGMGPLLLVLAVALIGGGAYYYFKFYKPKQQEK 188
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPA 69
+++ E + +E+EEE++ P
Sbjct: 189 GAPDDDLDEYDYGDEDEEEDDEPPW 213
>gnl|CDD|220298 pfam09582, AnfO_nitrog, Iron only nitrogenase protein AnfO
(AnfO_nitrog). Proteins in this entry include Anf1 from
Rhodobacter capsulatus (Rhodopseudomonas capsulata) and
AnfO from Azotobacter vinelandii. They are found
exclusively in species which contain the iron-only
nitrogenase, and are encoded immediately downstream of
the structural genes for the nitrogenase enzyme in these
species.
Length = 201
Score = 26.5 bits (59), Expect = 7.1
Identities = 15/40 (37%), Positives = 19/40 (47%), Gaps = 2/40 (5%)
Query: 53 EEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDL 92
E+EEEE ++E E N + P LG R LDL
Sbjct: 103 AEKEEEEAKKEREAADEPPNFD--IPVPLELGDGRFRLDL 140
>gnl|CDD|185618 PTZ00438, PTZ00438, gamete antigen 27/25-like protein; Provisional.
Length = 374
Score = 26.6 bits (58), Expect = 7.3
Identities = 16/58 (27%), Positives = 31/58 (53%), Gaps = 2/58 (3%)
Query: 15 LKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
+K EEE Q+++ E++ ++ E E EEE ++E+ E+++E+E N
Sbjct: 97 VKNEEERGTQKEEEEDEDVEE--IEEVEEVEVVEEEYDDDEDSEKDDEKESDAEGDEN 152
>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381). This
domain is functionally uncharacterized. This domain is
found in eukaryotes. This presumed domain is typically
between 156 and 174 amino acids in length. This domain is
found associated with pfam07780, pfam01728.
Length = 154
Score = 26.1 bits (58), Expect = 7.6
Identities = 15/34 (44%), Positives = 20/34 (58%)
Query: 32 KKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+KK R+ G E+ EEE E E EE +EEE+
Sbjct: 87 RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQI 120
>gnl|CDD|218435 pfam05104, Rib_recp_KP_reg, Ribosome receptor lysine/proline rich
region. This highly conserved region is found towards
the C-terminus of the transmembrane domain. The
function is unclear.
Length = 151
Score = 26.1 bits (57), Expect = 7.6
Identities = 24/80 (30%), Positives = 36/80 (45%), Gaps = 3/80 (3%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAE---EKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEE 62
EAL ++ + + K +KK E EKK K ++ E + G+ E E +E E E
Sbjct: 8 EALAKQRKESGKTQSQKSDKKKKEKVSEKKGKSKKKEEKPNGKIPEHEPNQEVTEVEVII 67
Query: 63 EEEGLPASTNPPANVISPQS 82
E+E +PA P V
Sbjct: 68 EKEPVPAVAVAPVPVAVVAP 87
>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
region. This family includes the N-terminal regions of
the junctin, junctate and aspartyl beta-hydroxylase
proteins. Junctate is an integral ER/SR membrane calcium
binding protein, which comes from an alternatively
spliced form of the same gene that generates aspartyl
beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
catalyzes the post-translational hydroxylation of
aspartic acid or asparagine residues contained within
epidermal growth factor (EGF) domains of proteins.
Length = 240
Score = 26.4 bits (58), Expect = 7.7
Identities = 17/60 (28%), Positives = 26/60 (43%)
Query: 6 EALRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
L + K K EEEVK Q + EK ++ E G E +E E+ ++ +E
Sbjct: 90 GQLAVRKTKQKVEEEVKEQLQSLLEKIVVSKQEEDGPGKEPQLDEDKFLLAEDSDDRQET 149
>gnl|CDD|236154 PRK08119, PRK08119, flagellar motor switch protein; Validated.
Length = 382
Score = 26.4 bits (59), Expect = 7.7
Identities = 14/50 (28%), Positives = 18/50 (36%)
Query: 47 TEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSP 96
EEE E EEEE + + PA Q QR + + P
Sbjct: 227 EEEEEEEVEEEEAQASPAAEPATAQAAPAPKQEQQQAPPQRQEPEKEAQP 276
Score = 26.4 bits (59), Expect = 9.0
Identities = 15/31 (48%), Positives = 18/31 (58%)
Query: 52 GEEEEEEEEEEEEEGLPASTNPPANVISPQS 82
GEEEEEEEE EEEE + PA + +
Sbjct: 225 GEEEEEEEEVEEEEAQASPAAEPATAQAAPA 255
>gnl|CDD|235366 PRK05218, PRK05218, heat shock protein 90; Provisional.
Length = 613
Score = 26.6 bits (60), Expect = 7.9
Identities = 15/39 (38%), Positives = 23/39 (58%)
Query: 29 EEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
EE K ++ RG + +E+ E+EE+EE EEE + L
Sbjct: 459 EEFDGKPFKSVARGDLDLGKEDEEEKEEKEEAEEEFKPL 497
>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1. This domain
family is found in eukaryotes, and is approximately 40
amino acids in length. The family is found in
association with pfam07719, pfam00515. There is a single
completely conserved residue L that may be functionally
important. NARP1 is the mammalian homologue of a yeast
N-terminal acetyltransferase that regulates entry into
the G(0) phase of the cell cycle.
Length = 516
Score = 26.4 bits (59), Expect = 8.1
Identities = 10/29 (34%), Positives = 14/29 (48%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEG 40
E+K ++EE K KK E KK +
Sbjct: 422 EKKAEKEEAEKAAAKKKAEAAAKKAKGPD 450
Score = 26.4 bits (59), Expect = 9.3
Identities = 11/62 (17%), Positives = 28/62 (45%)
Query: 12 ERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAST 71
ERK +++ K ++K +E+ +K + + + E ++ + + E L +
Sbjct: 410 ERKKLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKGPDGETKKVDPDPLGEKLARTE 469
Query: 72 NP 73
+P
Sbjct: 470 DP 471
>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
component YidC; Validated.
Length = 429
Score = 26.4 bits (58), Expect = 8.2
Identities = 13/93 (13%), Positives = 30/93 (32%), Gaps = 3/93 (3%)
Query: 13 RKLKEEEEVKGQRKK--AEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPAS 70
+ E+ RKK A++++ +R ER + +++GL +
Sbjct: 334 KTRTAEKNEAKARKKEIAQKRRAAEREINREARQERAAAMARARARRAAVKAKKKGLIDA 393
Query: 71 TNPPANVISPQSLGHQRPSLDLASSPSVKKRSR 103
+ + P A++ + R
Sbjct: 394 SPNEDTPSENEESKGSPPQ-VEATTTAEPNREP 425
>gnl|CDD|219922 pfam08595, RXT2_N, RXT2-like, N-terminal. The family represents
the N-terminal region of RXT2-like proteins. In S.
cerevisiae, RXT2 has been demonstrated to be involved in
conjugation with cellular fusion (mating) and invasive
growth. A high throughput localisation study has
localised RXT2 to the nucleus.
Length = 141
Score = 25.8 bits (57), Expect = 8.3
Identities = 18/70 (25%), Positives = 34/70 (48%), Gaps = 6/70 (8%)
Query: 31 KKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSL 90
K R + GG ++E E++EE + E+++E +++P L H PS
Sbjct: 40 KFNNPPRIDEDGGDIDDDDEDDEDDEEADAEDDDENPYKLIR-LEEILAP--LTH--PS- 93
Query: 91 DLASSPSVKK 100
DL + P++ +
Sbjct: 94 DLPTHPAISR 103
>gnl|CDD|178589 PLN03015, PLN03015, UDP-glucosyl transferase.
Length = 470
Score = 26.6 bits (58), Expect = 8.4
Identities = 17/43 (39%), Positives = 24/43 (55%), Gaps = 2/43 (4%)
Query: 3 LGVEALRTCERKLKEEEEVKGQ--RKKAEEKKKKKRRAEGRGG 43
+G E + + RK+ EE+ +GQ R KAEE + RA GG
Sbjct: 411 IGREEVASLVRKIVAEEDEEGQKIRAKAEEVRVSSERAWSHGG 453
>gnl|CDD|183731 PRK12766, PRK12766, 50S ribosomal protein L32e; Provisional.
Length = 232
Score = 26.3 bits (58), Expect = 8.5
Identities = 21/62 (33%), Positives = 29/62 (46%), Gaps = 5/62 (8%)
Query: 42 GGGERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKR 101
GG E +EE E E+E EEEEEE T + P+ L + P L + + +R
Sbjct: 57 GGLEVSEETEAEVEDEGGEEEEEEDADVETE-----LRPRGLTEKTPELSDEEARLLTQR 111
Query: 102 SR 103
R
Sbjct: 112 RR 113
>gnl|CDD|133433 cd05297, GH4_alpha_glucosidase_galactosidase, Glycoside Hydrolases
Family 4; Alpha-glucosidases and alpha-galactosidases.
Glucosidases cleave glycosidic bonds to release glucose
from oligosaccharides. Alpha-glucosidases and
alpha-galactosidases release alpha-D-glucose and
alpha-D-galactose, respectively, via the hydrolysis of
alpha-glycopyranoside bonds. Some bacteria
simultaneously translocate and phosphorylate
disaccharides via the phosphoenolpyruvate-dependent
phosphotransferase system (PEP-PTS). After
translocation, these phospho-disaccharides may be
hydrolyzed by the GH4 glycoside hydrolases such as the
alpha-glucosidases. Other organisms (such as archaea
and Thermotoga maritima) lack the PEP-PTS system, but
have several enzymes normally associated with the
PEP-PTS operon. Alpha-glucosidases and
alpha-galactosidases are part of the NAD(P)-binding
Rossmann fold superfamily, which includes a wide variety
of protein families including the NAD(P)-binding domains
of alcohol dehydrogenases, tyrosine-dependent
oxidoreductases, glyceraldehyde-3-phosphate
dehydrogenases, formate/glycerate dehydrogenases,
siroheme synthases, 6-phosphogluconate dehydrogenases,
amino acid dehydrogenases, repressor rex, and NAD-binding
potassium channel domains, among others.
Length = 423
Score = 26.4 bits (59), Expect = 8.6
Identities = 13/46 (28%), Positives = 16/46 (34%)
Query: 29 EEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTNPP 74
E KK G R EE+G E EE + E +P
Sbjct: 261 ETKKIWYGEFNEDEYGGRDEEQGWEWYEERLKLILAEIDKEELDPV 306
>gnl|CDD|220777 pfam10486, PI3K_1B_p101, Phosphoinositide 3-kinase gamma adapter
protein p101 subunit. Class I PI3Ks are dual-specific
lipid and protein kinases involved in numerous
intracellular signaling pathways. Class IB PI3K,
p110gamma, is mainly activated by seven-transmembrane
G-protein-coupled receptors (GPCRs), through its
regulatory subunit p101 and G-protein beta-gamma
subunits.
Length = 856
Score = 26.5 bits (58), Expect = 8.7
Identities = 18/60 (30%), Positives = 29/60 (48%), Gaps = 4/60 (6%)
Query: 49 EEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASS-PSVKKRSRHAAR 107
E E EEEEEE++E +GL + +++S S+ S S S+ +RH+
Sbjct: 313 SEDEEVEEEEEEDDETDGLSPERD---SLLSNSSVYSNDSSDSKEDSSMSMSNLARHSLT 369
>gnl|CDD|223068 PHA03384, PHA03384, early DNA-binding protein E2A; Provisional.
Length = 445
Score = 26.2 bits (58), Expect = 8.7
Identities = 19/65 (29%), Positives = 25/65 (38%), Gaps = 1/65 (1%)
Query: 45 ERTEEEGGEEEEEEEEEEEEEGLPASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRH 104
++ EEEEEEEE E + PP + + G + S KK S
Sbjct: 29 KKGRRRVSPVEEEEEEEEAEVVAVGFSYPPVRISRGKD-GKRPVRPLKEEKDSEKKASTE 87
Query: 105 AARLN 109
AA N
Sbjct: 88 AAVRN 92
>gnl|CDD|240412 PTZ00420, PTZ00420, coronin; Provisional.
Length = 568
Score = 26.5 bits (58), Expect = 8.9
Identities = 13/56 (23%), Positives = 22/56 (39%), Gaps = 7/56 (12%)
Query: 17 EEEEVKGQRKKAE-------EKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEE 65
+E+E+ ++ A+ E+ + G E EEE + E E E E
Sbjct: 462 KEKELLTEKGGAQFSSANSLERGADEDYLIVNGTNEPYEEEVIKTNENENFPLENE 517
>gnl|CDD|217884 pfam04086, SRP-alpha_N, Signal recognition particle, alpha subunit,
N-terminal. SRP is a complex of six distinct
polypeptides and a 7S RNA that is essential for
transferring nascent polypeptide chains that are
destined for export from the cell to the translocation
apparatus of the endoplasmic reticulum (ER) membrane.
SRP binds hydrophobic signal sequences as they emerge
from the ribosome, and arrests translation.
Length = 272
Score = 26.3 bits (58), Expect = 9.0
Identities = 25/109 (22%), Positives = 47/109 (43%), Gaps = 7/109 (6%)
Query: 8 LRTCERKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGL 67
LR E++ K++ + + EE KK K+ + ER +E G + ++ ++ +E
Sbjct: 94 LRELEKESKKQAKSPKAMRTFEESKKSKKTVDS--MIERKPKEPGLKRKQRKKAQESATS 151
Query: 68 PASTNPPANVISPQSLGHQRPSLDLASSPSVKKRSRHAARLNMGDPKDD 116
P S+ S S H L + +R++ AA+L+ D
Sbjct: 152 PESSPSSTPNSSRPSTPHL-----LKAKEGPSRRAKKAAKLSSTASSGD 195
>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein.
Length = 529
Score = 26.3 bits (58), Expect = 9.1
Identities = 16/45 (35%), Positives = 28/45 (62%), Gaps = 5/45 (11%)
Query: 18 EEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEE 62
E E + ++ EE+K++K+ E E+T ++ E +EEEE+EE
Sbjct: 30 EVEKEVPDEEEEEEKEEKKEEE-----EKTTDKEEEVDEEEEKEE 69
>gnl|CDD|100108 cd04411, Ribosomal_P1_P2_L12p, Ribosomal protein P1, P2, and
L12p. Ribosomal proteins P1 and P2 are the eukaryotic
proteins that are functionally equivalent to bacterial
L7/L12. L12p is the archaeal homolog. Unlike other
ribosomal proteins, the archaeal L12p and eukaryotic P1
and P2 do not share sequence similarity with their
bacterial counterparts. They are part of the ribosomal
stalk (called the L7/L12 stalk in bacteria), along with
28S rRNA and the proteins L11 and P0 in eukaryotes (23S
rRNA, L11, and L10e in archaea). In bacterial
ribosomes, L7/L12 homodimers bind the extended
C-terminal helix of L10 to anchor the L7/L12 molecules
to the ribosome. Eukaryotic P1/P2 heterodimers and
archaeal L12p homodimers are believed to bind the L10
equivalent proteins, eukaryotic P0 and archaeal L10e,
in a similar fashion. P1 and P2 (L12p, L7/L12) are the
only proteins in the ribosome to occur as multimers,
always appearing as sets of dimers. Recent data
indicate that most archaeal species contain six copies
of L12p (three homodimers), while eukaryotes have two
copies each of P1 and P2 (two heterodimers). Bacteria
may have four or six copies (two or three homodimers),
depending on the species. As in bacteria, the stalk is
crucial for binding of initiation, elongation, and
release factors in eukaryotes and archaea.
Length = 105
Score = 25.3 bits (55), Expect = 9.4
Identities = 11/23 (47%), Positives = 13/23 (56%)
Query: 43 GGERTEEEGGEEEEEEEEEEEEE 65
T E + EE +EEEEEEE
Sbjct: 74 TAAATAEPAEKAEEAKEEEEEEE 96
>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
and chromosome partitioning].
Length = 420
Score = 26.2 bits (58), Expect = 9.9
Identities = 9/63 (14%), Positives = 18/63 (28%)
Query: 13 RKLKEEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
++++++ R K + AE R E E + +
Sbjct: 218 ELSADQKKLEELRANESRLKNEIASAEAAAAKAREAAAAAEAAAARARAAEAKRTGETYK 277
Query: 73 PPA 75
P A
Sbjct: 278 PTA 280
>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
subunit (TFIIF-alpha). Transcription initiation factor
IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
II-associating protein 74 (RAP74) is the large subunit
of transcription factor IIF (TFIIF), which is essential
for accurate initiation and stimulates elongation by RNA
polymerase II.
Length = 528
Score = 26.1 bits (57), Expect = 10.0
Identities = 10/56 (17%), Positives = 26/56 (46%)
Query: 17 EEEEVKGQRKKAEEKKKKKRRAEGRGGGERTEEEGGEEEEEEEEEEEEEGLPASTN 72
EEE + +K KK ++ +G+ G ++ ++ ++ + + E+ + T
Sbjct: 323 SEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDSGDDSDDSDIDGEDSVSLVTA 378
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.309 0.128 0.349
Gapped
Lambda K H
0.267 0.0812 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 8,552,812
Number of extensions: 823872
Number of successful extensions: 7523
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4977
Number of HSP's successfully gapped: 1030
Length of query: 164
Length of database: 10,937,602
Length adjustment: 90
Effective length of query: 74
Effective length of database: 6,945,742
Effective search space: 513984908
Effective search space used: 513984908
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.7 bits)
S2: 55 (24.8 bits)