RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy6373
(199 letters)
>gnl|CDD|133004 cd02510, pp-GalNAc-T, pp-GalNAc-T initiates the formation of
mucin-type O-linked glycans. UDP-GalNAc: polypeptide
alpha-N-acetylgalactosaminyltransferases (pp-GalNAc-T)
initiate the formation of mucin-type, O-linked glycans
by catalyzing the transfer of
alpha-N-acetylgalactosamine (GalNAc) from UDP-GalNAc to
hydroxyl groups of Ser or Thr residues of core proteins
to form the Tn antigen (GalNAc-a-1-O-Ser/Thr). These
enzymes are type II membrane proteins with a GT-A type
catalytic domain and a lectin domain located on the
lumen side of the Golgi apparatus. In humans, there are
15 isozymes of pp-GalNAc-Ts, representing the largest of
all glycosyltransferase families. Each isozyme has
unique but partially redundant substrate specificity for
glycosylation sites on acceptor proteins.
Length = 299
Score = 203 bits (520), Expect = 8e-66
Identities = 72/109 (66%), Positives = 87/109 (79%), Gaps = 2/109 (1%)
Query: 3 RTPMIAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVF 62
R+P +AGGLF+IDR++F +LG YD M +WGGENLE+SF+VWQCGGS+EIVPCSRVGH+F
Sbjct: 167 RSPTMAGGLFAIDREWFLELGGYDEGMDIWGGENLELSFKVWQCGGSIEIVPCSRVGHIF 226
Query: 63 R-KRHPYTFPGGSGNVFARNTRRAAEVWMDNYKHYYYAEVPLAKTIPFG 110
R KR PYTFPGGSG V RN +R AEVWMD YK Y+Y P + I +G
Sbjct: 227 RRKRKPYTFPGGSGTV-LRNYKRVAEVWMDEYKEYFYKARPELRNIDYG 274
>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger. [Transport and
binding proteins, Cations and iron carrying compounds].
Length = 1096
Score = 39.6 bits (92), Expect = 7e-04
Identities = 17/24 (70%), Positives = 23/24 (95%)
Query: 117 NGGSSEEEEEEKEKKKEEEEEEEQ 140
+GG SEEEEEE+E+++EEEEEEE+
Sbjct: 859 DGGDSEEEEEEEEEEEEEEEEEEE 882
Score = 36.5 bits (84), Expect = 0.007
Identities = 16/25 (64%), Positives = 22/25 (88%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQ 140
+ G S EEEEEE+E+++EEEEEEE+
Sbjct: 859 DGGDSEEEEEEEEEEEEEEEEEEEE 883
Score = 33.8 bits (77), Expect = 0.060
Identities = 15/31 (48%), Positives = 22/31 (70%)
Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
G++ E EEEEEE+E+++EEEEEE +
Sbjct: 861 GDSEEEEEEEEEEEEEEEEEEEEEEEEEENE 891
Score = 33.0 bits (75), Expect = 0.095
Identities = 13/19 (68%), Positives = 18/19 (94%)
Query: 122 EEEEEEKEKKKEEEEEEEQ 140
EEEEEE+E+++EEEEE E+
Sbjct: 874 EEEEEEEEEEEEEEEENEE 892
Score = 30.0 bits (67), Expect = 0.84
Identities = 13/20 (65%), Positives = 17/20 (85%)
Query: 122 EEEEEEKEKKKEEEEEEEQS 141
EEEEEE+E+++EEE EE S
Sbjct: 876 EEEEEEEEEEEEEENEEPLS 895
Score = 26.9 bits (59), Expect = 9.4
Identities = 24/90 (26%), Positives = 32/90 (35%), Gaps = 3/90 (3%)
Query: 104 AKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEE---QSRGGNRALRPSTYTRHRHQT 160
+ E N E G E+E E E K E E E E + +G H+
Sbjct: 649 GERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERKGEQEGEGEIEAKEADHKG 708
Query: 161 SFIDQEVAYMKMTSKKYRMGEDEEEKEEEG 190
+EV + T + E E E EEG
Sbjct: 709 ETEAEEVEHEGETEAEGTEDEGEIETGEEG 738
>gnl|CDD|217196 pfam02709, Glyco_transf_7C, N-terminal domain of
galactosyltransferase. This is the N-terminal domain
of a family of galactosyltransferases from a wide range
of Metazoa with three related galactosyltransferase
activities, all three of which are possessed by one
sequence in some cases. EC:2.4.1.90,
N-acetyllactosamine synthase; EC:2.4.1.38,
Beta-N-acetylglucosaminyl-glycopeptide beta-1,4-
galactosyltransferase; and EC:2.4.1.22 Lactose
synthase. Note that N-acetyllactosamine synthase is a
component of Lactose synthase along with
alpha-lactalbumin; in the absence of alpha-lactalbumin,
EC:2.4.1.90 is the catalyzed reaction.
Length = 78
Score = 34.5 bits (80), Expect = 0.004
Identities = 11/50 (22%), Positives = 23/50 (46%)
Query: 7 IAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCS 56
GG+ + ++ F K+ + WGGE+ ++ R+ G +E +
Sbjct: 19 YFGGVLAFSKEDFLKVNGFSNNFWGWGGEDDDLYARLLLAGLKIERPKFA 68
>gnl|CDD|133029 cd04186, GT_2_like_c, Subfamily of Glycosyltransferase Family GT2
of unknown function. GT-2 includes diverse families of
glycosyltransferases with a common GT-A type structural
fold, which has two tightly associated beta/alpha/beta
domains that tend to form a continuous central sheet of
at least eight beta-strands. These are enzymes that
catalyze the transfer of sugar moieties from activated
donor molecules to specific acceptor molecules, forming
glycosidic bonds. Glycosyltransferases have been
classified into more than 90 distinct sequence based
families.
Length = 166
Score = 34.8 bits (81), Expect = 0.012
Identities = 13/57 (22%), Positives = 29/57 (50%), Gaps = 1/57 (1%)
Query: 4 TPMIAGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGH 60
P ++G + R+ FE++G +D ++ E++++ R G + VP + + H
Sbjct: 109 GPKVSGAFLLVRREVFEEVGGFDEDFFLYY-EDVDLCLRARLAGYRVLYVPQAVIYH 164
>gnl|CDD|235795 PRK06402, rpl12p, 50S ribosomal protein L12P; Reviewed.
Length = 106
Score = 33.4 bits (77), Expect = 0.016
Identities = 13/27 (48%), Positives = 22/27 (81%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
EE++EE+E+++E+EE EE++ G AL
Sbjct: 78 EEKKEEEEEEEEKEESEEEAAAGLGAL 104
Score = 30.7 bits (70), Expect = 0.15
Identities = 9/20 (45%), Positives = 17/20 (85%)
Query: 122 EEEEEEKEKKKEEEEEEEQS 141
EE++E+E+++EE+EE E+
Sbjct: 77 AEEKKEEEEEEEEKEESEEE 96
Score = 30.3 bits (69), Expect = 0.25
Identities = 10/18 (55%), Positives = 15/18 (83%)
Query: 122 EEEEEEKEKKKEEEEEEE 139
EE+KE+++EEEE+EE
Sbjct: 75 AAAEEKKEEEEEEEEKEE 92
Score = 30.3 bits (69), Expect = 0.25
Identities = 10/20 (50%), Positives = 16/20 (80%)
Query: 122 EEEEEEKEKKKEEEEEEEQS 141
EEK++++EEEEE+E+S
Sbjct: 74 AAAAEEKKEEEEEEEEKEES 93
Score = 26.8 bits (60), Expect = 3.5
Identities = 8/18 (44%), Positives = 13/18 (72%)
Query: 122 EEEEEEKEKKKEEEEEEE 139
E++K++EEEEEE+
Sbjct: 73 AAAAAEEKKEEEEEEEEK 90
Score = 26.1 bits (58), Expect = 7.2
Identities = 10/18 (55%), Positives = 12/18 (66%)
Query: 122 EEEEEEKEKKKEEEEEEE 139
+EKK+EEEEEEE
Sbjct: 72 AAAAAAEEKKEEEEEEEE 89
>gnl|CDD|215914 pfam00428, Ribosomal_60s, 60s Acidic ribosomal protein. This
family includes archaebacterial L12, eukaryotic P0, P1
and P2.
Length = 88
Score = 33.0 bits (76), Expect = 0.018
Identities = 10/22 (45%), Positives = 14/22 (63%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ +E+KKEEEEEEE
Sbjct: 57 AAAAAAAAAAEEEKKEEEEEEE 78
Score = 33.0 bits (76), Expect = 0.021
Identities = 9/22 (40%), Positives = 17/22 (77%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ EEEK++++EEEEE++
Sbjct: 60 AAAAAAAEEEKKEEEEEEEEDD 81
Score = 32.6 bits (75), Expect = 0.023
Identities = 9/22 (40%), Positives = 16/22 (72%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ EE++K++EEEEEE+
Sbjct: 59 AAAAAAAAEEEKKEEEEEEEED 80
Score = 32.2 bits (74), Expect = 0.031
Identities = 11/22 (50%), Positives = 15/22 (68%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ E+EKK+EEEEEEE
Sbjct: 58 AAAAAAAAAEEEKKEEEEEEEE 79
Score = 30.7 bits (70), Expect = 0.14
Identities = 8/24 (33%), Positives = 14/24 (58%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQS 141
++ E++K+EEEEEE+
Sbjct: 56 AAAAAAAAAAAEEEKKEEEEEEEE 79
>gnl|CDD|100109 cd05831, Ribosomal_P1, Ribosomal protein P1. This subfamily
represents the eukaryotic large ribosomal protein P1.
Eukaryotic P1 and P2 are functionally equivalent to the
bacterial protein L7/L12, but are not homologous to
L7/L12. P1 is located in the L12 stalk, with proteins
P2, P0, L11, and 28S rRNA. P1 and P2 are the only
proteins in the ribosome to occur as multimers, always
appearing as sets of heterodimers. Recent data indicate
that eukaryotes have four copies (two heterodimers),
while most archaeal species contain six copies of L12p
(three homodimers) and bacteria may have four or six
copies (two or three homodimers), depending on the
species. Experiments using S. cerevisiae P1 and P2
indicate that P1 proteins are positioned more internally
with limited reactivity in the C-terminal domains, while
P2 proteins seem to be more externally located and are
more likely to interact with other cellular components.
In lower eukaryotes, P1 and P2 are further subdivided
into P1A, P1B, P2A, and P2B, which form P1A/P2B and
P1B/P2A heterodimers. Some plant species have a third
P-protein, called P3, which is not homologous to P1 and
P2. In humans, P1 and P2 are strongly autoimmunogenic.
They play a significant role in the etiology and
pathogenesis of systemic lupus erythematosus (SLE). In
addition, the ribosome-inactivating protein
trichosanthin (TCS) interacts with human P0, P1, and P2,
with its primary binding site located in the C-terminal
region of P2. TCS inactivates the ribosome by
depurinating a specific adenine in the sarcin-ricin loop
of 28S rRNA.
Length = 103
Score = 32.7 bits (75), Expect = 0.031
Identities = 11/22 (50%), Positives = 15/22 (68%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ E +KE+KKEEEEEE
Sbjct: 73 AAAAAAAEAKKEEKKEEEEEES 94
Score = 30.8 bits (70), Expect = 0.17
Identities = 10/26 (38%), Positives = 17/26 (65%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRG 143
++ E +++EKK+EEEEE + G
Sbjct: 74 AAAAAAEAKKEEKKEEEEEESDDDMG 99
Score = 28.8 bits (65), Expect = 0.76
Identities = 9/22 (40%), Positives = 15/22 (68%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ E K+++K+EEEEEE
Sbjct: 72 AAAAAAAAEAKKEEKKEEEEEE 93
>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
Length = 330
Score = 33.7 bits (78), Expect = 0.042
Identities = 13/22 (59%), Positives = 18/22 (81%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
EEEEEE+E+++EE EEE + G
Sbjct: 303 EEEEEEEEEEEEEPSEEEAAAG 324
Score = 32.9 bits (76), Expect = 0.076
Identities = 15/27 (55%), Positives = 21/27 (77%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
EEEEEE+E+++EEE EE++ G AL
Sbjct: 302 EEEEEEEEEEEEEEPSEEEAAAGLGAL 328
Score = 31.8 bits (73), Expect = 0.20
Identities = 12/31 (38%), Positives = 19/31 (61%)
Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
E + + + EE+E+++EEEEEEE S
Sbjct: 287 ELKEVLSAQAQAAAAEEEEEEEEEEEEEEPS 317
Score = 31.0 bits (71), Expect = 0.36
Identities = 11/22 (50%), Positives = 18/22 (81%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
++ EEEEE+E+++EEEE E+
Sbjct: 299 AAAEEEEEEEEEEEEEEPSEEE 320
>gnl|CDD|234311 TIGR03685, L12P_arch, 50S ribosomal protein L12P. This model
represents the L12P protein of the large (50S) subunit
of the archaeal ribosome.
Length = 105
Score = 31.9 bits (73), Expect = 0.059
Identities = 16/27 (59%), Positives = 22/27 (81%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
EEEEEE+E+++EEEE EE++ G AL
Sbjct: 77 EEEEEEEEEEEEEEESEEEAMAGLGAL 103
Score = 31.2 bits (71), Expect = 0.12
Identities = 12/22 (54%), Positives = 18/22 (81%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ EEEE+E+++EEEEEEE
Sbjct: 70 AAAAAAEEEEEEEEEEEEEEEE 91
Score = 30.8 bits (70), Expect = 0.16
Identities = 12/23 (52%), Positives = 19/23 (82%)
Query: 119 GSSEEEEEEKEKKKEEEEEEEQS 141
++ EEE+E+++EEEEEEE+S
Sbjct: 70 AAAAAAEEEEEEEEEEEEEEEES 92
Score = 30.4 bits (69), Expect = 0.18
Identities = 12/27 (44%), Positives = 19/27 (70%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGG 144
++ EEEEE+E+++EEEEEE +
Sbjct: 71 AAAAAEEEEEEEEEEEEEEEESEEEAM 97
Score = 30.0 bits (68), Expect = 0.30
Identities = 13/23 (56%), Positives = 18/23 (78%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
EEEEEE+E+++EEEEE E+
Sbjct: 76 EEEEEEEEEEEEEEEESEEEAMA 98
Score = 27.3 bits (61), Expect = 2.3
Identities = 9/22 (40%), Positives = 15/22 (68%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ E+E+++EEEEEEE
Sbjct: 67 AAAAAAAAAEEEEEEEEEEEEE 88
Score = 26.9 bits (60), Expect = 3.5
Identities = 8/22 (36%), Positives = 14/22 (63%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ +E+++EEEEEEE
Sbjct: 66 AAAAAAAAAAEEEEEEEEEEEE 87
>gnl|CDD|220577 pfam10111, Glyco_tranf_2_2, Glycosyltransferase like family 2.
Members of this family of prokaryotic proteins include
putative glucosyltransferases, which are involved in
bacterial capsule biosynthesis.
Length = 278
Score = 33.1 bits (76), Expect = 0.068
Identities = 15/65 (23%), Positives = 25/65 (38%), Gaps = 2/65 (3%)
Query: 8 AGGLFSIDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVFRKRHP 67
A I+R +F K+G +D GGE+ E+ +R+ + P
Sbjct: 166 ASSCILINRDFFLKIGGFDENFRGHGGEDFELLYRLLLYYKKFPPPKDLLTYD--EYKWP 223
Query: 68 YTFPG 72
T+ G
Sbjct: 224 ITYSG 228
>gnl|CDD|100110 cd05832, Ribosomal_L12p, Ribosomal protein L12p. This subfamily
includes archaeal L12p, the protein that is functionally
equivalent to L7/L12 in bacteria and the P1 and P2
proteins in eukaryotes. L12p is homologous to P1 and P2
but is not homologous to bacterial L7/L12. It is located
in the L12 stalk, with proteins L10, L11, and 23S rRNA.
L12p is the only protein in the ribosome to occur as
multimers, always appearing as sets of dimers. Recent
data indicate that most archaeal species contain six
copies of L12p (three homodimers), while eukaryotes have
four copies (two heterodimers), and bacteria may have
four or six copies (two or three homodimers), depending
on the species. The organization of proteins within the
stalk has been characterized primarily in bacteria,
where L7/L12 forms either two or three homodimers and
each homodimer binds to the extended C-terminal helix of
L10. L7/L12 is attached to the ribosome through L10 and
is the only ribosomal protein that does not directly
interact with rRNA. Archaeal L12p is believed to
function in a similar fashion. However, hybrid ribosomes
containing the large subunit from E. coli with an
archaeal stalk are able to bind archaeal and eukaryotic
elongation factors but not bacterial elongation factors.
In several mesophilic and thermophilic archaeal species,
the binding of 23S rRNA to protein L11 and to the
L10/L12p pentameric complex was found to be
temperature-dependent and cooperative.
Length = 106
Score = 31.7 bits (72), Expect = 0.084
Identities = 16/27 (59%), Positives = 23/27 (85%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
EE+EEEK+K++E+EEEEE++ G AL
Sbjct: 79 EEKEEEKKKEEEKEEEEEEALAGLGAL 105
Score = 28.2 bits (63), Expect = 1.4
Identities = 11/19 (57%), Positives = 17/19 (89%)
Query: 121 SEEEEEEKEKKKEEEEEEE 139
E+ EE++E+KK+EEE+EE
Sbjct: 75 EEKAEEKEEEKKKEEEKEE 93
Score = 27.1 bits (60), Expect = 3.0
Identities = 11/22 (50%), Positives = 17/22 (77%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEE 137
E EEE++++E+K+EEEEE
Sbjct: 76 EKAEEKEEEKKKEEEKEEEEEE 97
Score = 26.7 bits (59), Expect = 4.6
Identities = 10/27 (37%), Positives = 18/27 (66%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRAL 148
EE+ +EK++E+++EEE+ AL
Sbjct: 73 AAEEKAEEKEEEKKKEEEKEEEEEEAL 99
Score = 26.3 bits (58), Expect = 5.2
Identities = 11/22 (50%), Positives = 20/22 (90%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
++EE+ EEKE++K++EEE+E+
Sbjct: 73 AAEEKAEEKEEEKKKEEEKEEE 94
Score = 25.9 bits (57), Expect = 7.7
Identities = 8/18 (44%), Positives = 14/18 (77%)
Query: 122 EEEEEEKEKKKEEEEEEE 139
EE+ E+K+EE+++EE
Sbjct: 72 AAAEEKAEEKEEEKKKEE 89
>gnl|CDD|100111 cd05833, Ribosomal_P2, Ribosomal protein P2. This subfamily
represents the eukaryotic large ribosomal protein P2.
Eukaryotic P1 and P2 are functionally equivalent to the
bacterial protein L7/L12, but are not homologous to
L7/L12. P2 is located in the L12 stalk, with proteins
P1, P0, L11, and 28S rRNA. P1 and P2 are the only
proteins in the ribosome to occur as multimers, always
appearing as sets of heterodimers. Recent data indicate
that eukaryotes have four copies (two heterodimers),
while most archaeal species contain six copies of L12p
(three homodimers). Bacteria may have four or six copies
of L7/L12 (two or three homodimers) depending on the
species. Experiments using S. cerevisiae P1 and P2
indicate that P1 proteins are positioned more internally
with limited reactivity in the C-terminal domains, while
P2 proteins seem to be more externally located and are
more likely to interact with other cellular components.
In lower eukaryotes, P1 and P2 are further subdivided
into P1A, P1B, P2A, and P2B, which form P1A/P2B and
P1B/P2A heterodimers. Some plants have a third
P-protein, called P3, which is not homologous to P1 and
P2. In humans, P1 and P2 are strongly autoimmunogenic.
They play a significant role in the etiology and
pathogenesis of systemic lupus erythematosus (SLE). In
addition, the ribosome-inactivating protein
trichosanthin (TCS) interacts with human P0, P1, and P2,
with its primary binding site in the C-terminal region
of P2. TCS inactivates the ribosome by depurinating a
specific adenine in the sarcin-ricin loop of 28S rRNA.
Length = 109
Score = 30.7 bits (70), Expect = 0.18
Identities = 9/23 (39%), Positives = 13/23 (56%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQ 140
++ +KE+KKEE EEE
Sbjct: 78 AAAAAAAAAKKEEKKEESEEESD 100
Score = 26.8 bits (60), Expect = 4.3
Identities = 7/22 (31%), Positives = 13/22 (59%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
++ K+++K+EE EEE
Sbjct: 77 AAAAAAAAAAKKEEKKEESEEE 98
Score = 26.5 bits (59), Expect = 4.4
Identities = 8/28 (28%), Positives = 15/28 (53%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGGN 145
++ +++EKK+E EEE + G
Sbjct: 79 AAAAAAAAKKEEKKEESEEESDDDMGFG 106
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 32.4 bits (73), Expect = 0.18
Identities = 14/35 (40%), Positives = 21/35 (60%)
Query: 124 EEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRH 158
EE E+EK+KE+E E E+ R RA + S+ +
Sbjct: 598 EEREREKEKEKEREREREREAERAAKASSSSHESR 632
>gnl|CDD|224969 COG2058, RPP1A, Ribosomal protein L12E/L44/L45/RPP1/RPP2
[Translation, ribosomal structure and biogenesis].
Length = 109
Score = 29.7 bits (67), Expect = 0.34
Identities = 10/23 (43%), Positives = 14/23 (60%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
E +E E+E+K+EE EEE
Sbjct: 82 EADEAEEEEKEEEAEEESDDDML 104
Score = 29.7 bits (67), Expect = 0.38
Identities = 11/31 (35%), Positives = 17/31 (54%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
++ E +E +E++KEEE EEE L
Sbjct: 77 AEAAAEADEAEEEEKEEEAEEESDDDMLFGL 107
Score = 28.5 bits (64), Expect = 1.1
Identities = 10/24 (41%), Positives = 17/24 (70%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEEQS 141
G + E +E E++++EEE EE+S
Sbjct: 76 GAEAAAEADEAEEEEKEEEAEEES 99
Score = 27.0 bits (60), Expect = 3.1
Identities = 9/29 (31%), Positives = 18/29 (62%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGG 144
E ++E EEE+++++ EEE ++ G
Sbjct: 78 EAAAEADEAEEEEKEEEAEEESDDDMLFG 106
>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II). Bone
sialoprotein (BSP) is a major structural protein of the
bone matrix that is specifically expressed by
fully-differentiated osteoblasts. The expression of bone
sialoprotein (BSP) is normally restricted to mineralised
connective tissues of bones and teeth where it has been
associated with mineral crystal formation. However, it
has been found that ectopic expression of BSP occurs in
various lesions, including oral and extraoral
carcinomas, in which it has been associated with the
formation of microcrystalline deposits and the
metastasis of cancer cells to bone.
Length = 291
Score = 30.8 bits (69), Expect = 0.35
Identities = 26/79 (32%), Positives = 35/79 (44%), Gaps = 11/79 (13%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRM 179
S E+EEEE+E+++EE E EE +G N ST H + +S D
Sbjct: 135 SDEDEEEEEEEEEEEAEVEENEQGTNGTSTNSTEVDHGNGSSGGDNG-----------EE 183
Query: 180 GEDEEEKEEEGTGRVGEGK 198
GE+E E E G G
Sbjct: 184 GEEESVTEAEAEGTTVAGP 202
>gnl|CDD|177133 MTH00061, ND4L, NADH dehydrogenase subunit 4L; Provisional.
Length = 86
Score = 29.1 bits (66), Expect = 0.38
Identities = 9/33 (27%), Positives = 14/33 (42%), Gaps = 11/33 (33%)
Query: 27 MAMSVWGGENLEISF------RVWQCGGSLEIV 53
M + +E+ RVW+C LE+V
Sbjct: 57 MVIFT-----VEVILGLVVLTRVWECSSLLELV 84
>gnl|CDD|185582 PTZ00373, PTZ00373, 60S Acidic ribosomal protein P2; Provisional.
Length = 112
Score = 29.5 bits (66), Expect = 0.47
Identities = 13/19 (68%), Positives = 15/19 (78%)
Query: 125 EEEKEKKKEEEEEEEQSRG 143
E +KE+KKEEEEEEE G
Sbjct: 89 EAKKEEKKEEEEEEEDDLG 107
Score = 26.8 bits (59), Expect = 4.0
Identities = 9/24 (37%), Positives = 18/24 (75%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEE 139
G+ E ++E++K++EEEEE++
Sbjct: 82 ATAGAKAEAKKEEKKEEEEEEEDD 105
>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32. This family
consists of several mammalian specific proacrosin
binding protein sp32 sequences. sp32 is a sperm specific
protein which is known to bind with 55- and 53-kDa
proacrosins and the 49-kDa acrosin intermediate. The
exact function of sp32 is unclear; it is thought, however,
that the binding of sp32 to proacrosin may be involved
in packaging the acrosin zymogen into the acrosomal
matrix.
Length = 243
Score = 30.0 bits (67), Expect = 0.66
Identities = 25/100 (25%), Positives = 41/100 (41%), Gaps = 2/100 (2%)
Query: 99 AEV-PLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHR 157
AEV P T+P E + S + E EE + S GG+ + + +
Sbjct: 142 AEVQPTTMTLPIAEHPTITENQSFQPWPERLHNNVEELLQSSLSLGGSVQV-KAPKPKQE 200
Query: 158 HQTSFIDQEVAYMKMTSKKYRMGEDEEEKEEEGTGRVGEG 197
S + + + K K+ + ++EEE EEE G+G
Sbjct: 201 QLLSKLQEYLQEHKTEEKQPQEEQEEEEVEEEAKQEEGQG 240
>gnl|CDD|240285 PTZ00135, PTZ00135, 60S acidic ribosomal protein P0; Provisional.
Length = 310
Score = 30.0 bits (68), Expect = 0.84
Identities = 8/25 (32%), Positives = 10/25 (40%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGG 144
++ EEEEEEE G
Sbjct: 282 AAAAAAAAAAAPAEEEEEEEDDMGF 306
Score = 29.2 bits (66), Expect = 1.2
Identities = 7/23 (30%), Positives = 9/23 (39%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGG 144
+EEEEEE+ G
Sbjct: 285 AAAAAAAAPAEEEEEEEDDMGFG 307
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 30.0 bits (67), Expect = 0.91
Identities = 18/83 (21%), Positives = 30/83 (36%), Gaps = 7/83 (8%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQ---TSFIDQEVAYMKM 172
E+G +E E+ + + +EE +G P H +
Sbjct: 4064 EDGFEENVQENEESTEDGVKSDEELEQGE----VPEDQAIDNHPKMDAKSTFASAEADEE 4119
Query: 173 TSKKYRMGEDEEEKEEEGTGRVG 195
+ K +GE+EE EE+G G
Sbjct: 4120 NTDKGIVGENEELGEEDGVRGNG 4142
>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis). This nucleolar
family of proteins are involved in 60S ribosomal
biogenesis. They are specifically involved in the
processing beyond the 27S stage of 25S rRNA maturation.
This family contains sequences that bear similarity to
the glioma tumour suppressor candidate region gene 2
protein (p60). This protein has been found to interact
with herpes simplex type 1 regulatory proteins.
Length = 387
Score = 29.7 bits (67), Expect = 0.96
Identities = 15/76 (19%), Positives = 26/76 (34%), Gaps = 10/76 (13%)
Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
E+E + EKK++E E E+ + A S + + + E
Sbjct: 204 EKEVKAEKKRQELERVEEKKLEKMAPEASRL-DEMSEGLLEESDDD---------GEEES 253
Query: 183 EEEKEEEGTGRVGEGK 198
++E EG E
Sbjct: 254 DDESAWEGFESEYEPI 269
>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
Length = 413
Score = 29.7 bits (67), Expect = 1.1
Identities = 12/35 (34%), Positives = 21/35 (60%)
Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGG 144
E N E+ SE+++++K++KKE + E E G
Sbjct: 59 KEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGF 93
>gnl|CDD|218223 pfam04712, Radial_spoke, Radial spokehead-like protein. This
family includes the radial spoke head proteins RSP4 and
RSP6 from Chlamydomonas reinhardtii, and several
eukaryotic homologues, including mammalian RSHL1, the
protein product of a familial ciliary dyskinesia
candidate gene.
Length = 481
Score = 28.9 bits (65), Expect = 1.7
Identities = 11/22 (50%), Positives = 17/22 (77%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
E+E+EE+E+++EE EE E G
Sbjct: 352 EQEDEEEEEEEEEPEEPEPEEG 373
Score = 26.9 bits (60), Expect = 9.2
Identities = 12/19 (63%), Positives = 17/19 (89%)
Query: 122 EEEEEEKEKKKEEEEEEEQ 140
EEEE+E E+++EEEEE E+
Sbjct: 349 EEEEQEDEEEEEEEEEPEE 367
>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein. The YqfQ-like protein family
includes the B. subtilis YqfQ protein, also known as
VrrA, which is functionally uncharacterized. This family
of proteins is found in bacteria. Proteins in this
family are typically between 146 and 237 amino acids in
length. There are two conserved sequence motifs: QYGP
and PKLY.
Length = 155
Score = 28.2 bits (63), Expect = 2.0
Identities = 8/31 (25%), Positives = 13/31 (41%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNR 146
E + EE +E E++ E + E R
Sbjct: 97 EEEETEEESTDETEQEDPPETKTESKEKKKR 127
Score = 26.6 bits (59), Expect = 6.3
Identities = 13/61 (21%), Positives = 27/61 (44%), Gaps = 6/61 (9%)
Query: 102 PLAKTIPFGETLNLENGGSSEEEEE------EKEKKKEEEEEEEQSRGGNRALRPSTYTR 155
PL + +P + E S +EEEE ++ ++++ E + +S+ + P T
Sbjct: 76 PLVRNLPAMWKIFRELSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTE 135
Query: 156 H 156
Sbjct: 136 K 136
>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region. The myc family
belongs to the basic helix-loop-helix leucine zipper
class of transcription factors, see pfam00010. Myc forms
a heterodimer with Max, and this complex regulates cell
growth through direct activation of genes involved in
cell replication. Mutations in the C-terminal 20
residues of this domain cause unique changes in the
induction of apoptosis, transformation, and G2 arrest.
Length = 329
Score = 28.7 bits (64), Expect = 2.0
Identities = 13/21 (61%), Positives = 18/21 (85%)
Query: 119 GSSEEEEEEKEKKKEEEEEEE 139
GS E EE++E+++EEEEEEE
Sbjct: 223 GSDSESEEDEEEEEEEEEEEE 243
Score = 28.0 bits (62), Expect = 3.8
Identities = 13/21 (61%), Positives = 18/21 (85%)
Query: 118 GGSSEEEEEEKEKKKEEEEEE 138
G SE EE+E+E+++EEEEEE
Sbjct: 223 GSDSESEEDEEEEEEEEEEEE 243
>gnl|CDD|234419 TIGR03965, mycofact_glyco, mycofactocin system glycosyltransferase.
Members of this protein family are putative
glycosyltransferases, members of pfam00535 (glycosyl
transferase family 2). Members appear mostly in the
Actinobacteria, where they appear to be part of a system
for converting a precursor peptide (TIGR03969) into a
novel redox carrier designated mycofactocin. A radical
SAM enzyme, TIGR03962, is proposed to be a key
maturase for mycofactocin.
Length = 467
Score = 28.6 bits (64), Expect = 2.1
Identities = 14/52 (26%), Positives = 29/52 (55%), Gaps = 2/52 (3%)
Query: 14 IDRKYFEKLGKYDMAMSVWGGENLEISFRVWQCGGSLEIVPCSRVGHVFRKR 65
+ R+ ++G +D + V GE++++ +R+ + GG + P + V H R R
Sbjct: 236 VRRRALLEVGGFDERLEV--GEDVDLCWRLCEAGGRVRYEPAAVVAHDHRTR 285
>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
Rpc31. RNA polymerase III contains seventeen subunits
in yeasts and in human cells. Twelve of these are akin
to RNA polymerase I or II and the other five are RNA pol
III-specific, and form the functionally distinct groups
(i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
Rpc34 and Rpc82 form a cluster of enzyme-specific
subunits that contribute to transcription initiation in
S.cerevisiae and H.sapiens. There is evidence that these
subunits are anchored at or near the N-terminal Zn-fold
of Rpc1, itself prolonged by a highly conserved but RNA
polymerase III-specific domain.
Length = 221
Score = 28.2 bits (63), Expect = 2.3
Identities = 12/19 (63%), Positives = 16/19 (84%)
Query: 122 EEEEEEKEKKKEEEEEEEQ 140
E+ +EE EK +EEEEEEE+
Sbjct: 166 EDVDEEDEKDEEEEEEEEE 184
>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
Provisional.
Length = 2849
Score = 28.9 bits (64), Expect = 2.6
Identities = 8/24 (33%), Positives = 18/24 (75%)
Query: 117 NGGSSEEEEEEKEKKKEEEEEEEQ 140
+ +E+E++ +++ +EEEEEE+
Sbjct: 154 DDDDEDEDEDDDDEEDDEEEEEEE 177
Score = 28.5 bits (63), Expect = 2.7
Identities = 10/22 (45%), Positives = 19/22 (86%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
+E+++++E +EEEEEEE+ +G
Sbjct: 161 DEDDDDEEDDEEEEEEEEEIKG 182
Score = 28.5 bits (63), Expect = 3.0
Identities = 9/27 (33%), Positives = 18/27 (66%)
Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQ 140
N E+E+E+ + ++++EEEEE+
Sbjct: 150 NFVIDDDDEDEDEDDDDEEDDEEEEEE 176
Score = 27.3 bits (60), Expect = 7.5
Identities = 8/21 (38%), Positives = 16/21 (76%)
Query: 122 EEEEEEKEKKKEEEEEEEQSR 142
+E+E+E + +E++EEEE+
Sbjct: 157 DEDEDEDDDDEEDDEEEEEEE 177
>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
domain. The centromere protein B (CENP-B) dimerisation
domain is composed of two alpha-helices, which are
folded into an antiparallel configuration. Dimerisation
of CENP-B is mediated by this domain, in which monomers
dimerise to form a symmetrical, antiparallel, four-helix
bundle structure with a large hydrophobic patch in which
23 residues of one monomer form van der Waals contacts
with the other monomer. This CENP-B dimer configuration
may be suitable for capturing two distant CENP-B boxes
during centromeric heterochromatin formation.
Length = 101
Score = 27.4 bits (60), Expect = 2.6
Identities = 10/36 (27%), Positives = 20/36 (55%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
E+ S +EEE+ + + EE+++E+ + PS
Sbjct: 10 EDSDSDSDEEEDDDDEDEEDDDEDDDEDDDEVPVPS 45
>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein. CDC45 is an essential gene
required for initiation of DNA replication in S.
cerevisiae, forming a complex with MCM5/CDC46.
Homologues of CDC45 have been identified in human, mouse
and smut fungus among others.
Length = 583
Score = 28.4 bits (64), Expect = 2.6
Identities = 16/111 (14%), Positives = 39/111 (35%), Gaps = 13/111 (11%)
Query: 46 CGGSLEIV-----PCSRVGHVFRKRHPYTFPGGSGNVFARNTRRAAEVWMDNYKHYYYAE 100
CGG +++ + +V P+ NVF + ++ D +
Sbjct: 60 CGGMVDLEEFLQLDEDVIVYVIDSHRPWNL----DNVFGSDQVV---IFDDGDIEEELQD 112
Query: 101 VPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
P + + ++ +EE+E+ K E++E+++ +
Sbjct: 113 EPRYDDA-YRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDDDDDDIATRE 162
>gnl|CDD|217049 pfam02459, Adeno_terminal, Adenoviral DNA terminal protein. This
protein is covalently attached to the termini of
replicating DNA in vivo.
Length = 548
Score = 28.5 bits (64), Expect = 2.7
Identities = 15/30 (50%), Positives = 20/30 (66%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQSRGGNRALR 149
EEEEEE+ ++EEEEEEE+ R +R
Sbjct: 306 EEEEEEEEEVPEEEEEEEEEEERTFEEEVR 335
Score = 26.9 bits (60), Expect = 7.3
Identities = 14/22 (63%), Positives = 17/22 (77%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
EEEEEE+E +EEEEEEE+
Sbjct: 305 PEEEEEEEEEVPEEEEEEEEEE 326
>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
Length = 1066
Score = 28.3 bits (63), Expect = 2.9
Identities = 20/69 (28%), Positives = 36/69 (52%), Gaps = 4/69 (5%)
Query: 121 SEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMG 180
+EEE E K+KK+E+ +E+E + +A + + + Q + V K + KK R
Sbjct: 14 TEEELERKKKKEEKAKEKELKK--LKAAQKEAKAKLQAQQASDGTNVP--KKSEKKSRKR 69
Query: 181 EDEEEKEEE 189
+ E+E E+
Sbjct: 70 DVEDENPED 78
>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein. This is a family of
fungal and plant proteins and contains many hypothetical
proteins. VID27 is a cytoplasmic protein that plays a
potential role in vacuolar protein degradation.
Length = 794
Score = 28.2 bits (63), Expect = 3.5
Identities = 9/30 (30%), Positives = 17/30 (56%)
Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQSRG 143
+ EEEE+E+E+++E+E+E
Sbjct: 383 DANTERDDEEEEDEEEEEEEDEDEGPSKEH 412
Score = 27.8 bits (62), Expect = 4.2
Identities = 10/30 (33%), Positives = 23/30 (76%)
Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
L +E+ + ++EEE+++++EEEE+E++
Sbjct: 377 SALEIEDANTERDDEEEEDEEEEEEEDEDE 406
Score = 27.0 bits (60), Expect = 7.7
Identities = 11/33 (33%), Positives = 21/33 (63%)
Query: 110 GETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
+ N +EEEE++E+++EE+E+E S+
Sbjct: 378 ALEIEDANTERDDEEEEDEEEEEEEDEDEGPSK 410
>gnl|CDD|222440 pfam13897, GOLD_2, Golgi-dynamics membrane-trafficking. Sec14-like
Golgi-trafficking domain The GOLD domain is always found
combined with lipid- or membrane-association domains.
Length = 136
Score = 27.0 bits (60), Expect = 4.2
Identities = 11/38 (28%), Positives = 21/38 (55%)
Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
++++ EEEEE +E++ E + E S+ +R L
Sbjct: 47 VSVHVSESSDEEEEEEAEEEEAETGDVEAGSKSQSRPL 84
>gnl|CDD|206063 pfam13892, DBINO, DNA-binding domain. DBINO is a DNA-binding
domain found on global transcription activator SNF2L1
proteins and chromatin re-modelling proteins.
Length = 140
Score = 26.8 bits (60), Expect = 4.3
Identities = 9/20 (45%), Positives = 12/20 (60%)
Query: 123 EEEEEKEKKKEEEEEEEQSR 142
E+E E+ K+EEE E R
Sbjct: 92 AEKEALEQAKKEEELREAKR 111
Score = 26.5 bits (59), Expect = 7.5
Identities = 10/23 (43%), Positives = 13/23 (56%), Gaps = 2/23 (8%)
Query: 122 EEEEEEKEKKKEEEEEEE--QSR 142
E+E ++ KKEEE E Q R
Sbjct: 92 AEKEALEQAKKEEELREAKRQQR 114
>gnl|CDD|118278 pfam09746, Membralin, Tumour-associated protein. Membralin is
evolutionarily highly conserved; though it seems to
represent a unique protein family. The protein appears
to contain several transmembrane regions. In humans it
is expressed in certain cancers, particularly ovarian
cancers. Membralin-like gene homologues have been
identified in plants including grape, cotton and tomato.
Length = 375
Score = 27.5 bits (61), Expect = 4.5
Identities = 11/79 (13%), Positives = 25/79 (31%), Gaps = 6/79 (7%)
Query: 113 LNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRAL--RPSTYTRHRHQTSFIDQEVAYM 170
++ + + E E Q+ G L RP+ + + D E
Sbjct: 105 SLKDSYYYGIGPQTRQN---HETLERYQNILGKLGLPVRPTFAYSNESLYYYFDAENILD 161
Query: 171 KMT-SKKYRMGEDEEEKEE 188
+ + +E ++E+
Sbjct: 162 TYSHPNAISLKNEEWDEEQ 180
>gnl|CDD|217503 pfam03344, Daxx, Daxx Family. The Daxx protein (also known as the
Fas-binding protein) is thought to play a role in
apoptosis, but the precise role played by Daxx remains to be
determined. Daxx forms a complex with Axin.
Length = 715
Score = 28.0 bits (62), Expect = 4.6
Identities = 22/68 (32%), Positives = 33/68 (48%), Gaps = 10/68 (14%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGE 181
+ EEEE+ K++E E + SR + + ST S QE S++ E
Sbjct: 397 DTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMAS---QE-------SEEEESVE 446
Query: 182 DEEEKEEE 189
+EEE+EEE
Sbjct: 447 EEEEEEEE 454
Score = 27.2 bits (60), Expect = 6.3
Identities = 15/39 (38%), Positives = 23/39 (58%)
Query: 102 PLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQ 140
P + E ++E EEEEEE+E++ EEEE E++
Sbjct: 432 PSMASQESEEEESVEEEEEEEEEEEEEEQESEEEEGEDE 470
Score = 27.2 bits (60), Expect = 6.5
Identities = 12/22 (54%), Positives = 18/22 (81%)
Query: 120 SSEEEEEEKEKKKEEEEEEEQS 141
+S+E EEE+ ++EEEEEEE+
Sbjct: 435 ASQESEEEESVEEEEEEEEEEE 456
>gnl|CDD|214487 smart00046, DAGKc, Diacylglycerol kinase catalytic domain
(presumed). Diacylglycerol (DAG) is a second messenger
that acts as a protein kinase C activator. DAG can be
produced from the hydrolysis of phosphatidylinositol
4,5-bisphosphate (PIP2) by a phosphoinositide-specific
phospholipase C and by the degradation of
phosphatidylcholine (PC) by a phospholipase C or the
concerted actions of phospholipase D and phosphatidate
phosphohydrolase. This domain is presumed to be the
catalytic domain. Bacterial homologues are known.
Length = 124
Score = 26.9 bits (60), Expect = 4.8
Identities = 6/14 (42%), Positives = 8/14 (57%)
Query: 70 FPGGSGNVFARNTR 83
P G+GN AR+
Sbjct: 84 LPLGTGNDLARSLG 97
>gnl|CDD|216116 pfam00781, DAGK_cat, Diacylglycerol kinase catalytic domain.
Diacylglycerol (DAG) is a second messenger that acts as
a protein kinase C activator. The catalytic domain is
assumed from the finding of bacterial homologues. YegS
is the Escherichia coli protein in this family whose
crystal structure reveals an active site in the
inter-domain cleft formed by four conserved sequence
motifs, revealing a novel metal-binding site. The
residues of this site are conserved across the family.
Length = 127
Score = 26.9 bits (60), Expect = 5.1
Identities = 7/11 (63%), Positives = 8/11 (72%)
Query: 70 FPGGSGNVFAR 80
P G+GN FAR
Sbjct: 88 IPLGTGNDFAR 98
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 27.5 bits (61), Expect = 5.3
Identities = 14/44 (31%), Positives = 29/44 (65%), Gaps = 1/44 (2%)
Query: 99 AEVPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
++ P AKT P E N E+G E+E+E+ +++K++++E+ +
Sbjct: 85 SKGPAAKTKPAKEPKN-ESGKEEEKEKEQVKEEKKKKKEKPKEE 127
>gnl|CDD|239572 cd03490, Topoisomer_IB_N_1, Topoisomer_IB_N_1: A subgroup of the
N-terminal DNA binding fragment found in eukaryotic DNA
topoisomerase (topo) IB. Topo IB proteins include the
monomeric yeast and human topo I and heterodimeric topo
I from Leishmania donovani. Topo I enzymes are divided
into: topo type IA (bacterial) and type IB
(eukaryotic). Topo I relaxes superhelical tension in
duplex DNA by creating a single-strand nick, the broken
strand can then rotate around the unbroken strand to
remove DNA supercoils and, the nick is religated,
liberating topo I. These enzymes regulate the
topological changes that accompany DNA replication,
transcription and other nuclear processes. Human topo I
is the target of a diverse set of anticancer drugs
including camptothecins (CPTs). CPTs bind to the topo
I-DNA complex and inhibit religation of the
single-strand nick, resulting in the accumulation of
topo I-DNA adducts. In addition to differences in
structure and some biochemical properties,
Trypanosomatid parasite topos I differ from human topo I
in their sensitivity to CPTs and other classical topo I
inhibitors. Trypanosomatid topos I have putative roles
in organizing the kinetoplast DNA network unique to
these parasites. This family may represent more than
one structural domain.
Length = 217
Score = 27.2 bits (60), Expect = 5.4
Identities = 9/22 (40%), Positives = 16/22 (72%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRG 143
EE+E++K KEE+E +++ R
Sbjct: 95 EEKEKKKNLNKEEKEAKKKERA 116
>gnl|CDD|100108 cd04411, Ribosomal_P1_P2_L12p, Ribosomal protein P1, P2, and L12p.
Ribosomal proteins P1 and P2 are the eukaryotic proteins
that are functionally equivalent to bacterial L7/L12.
L12p is the archaeal homolog. Unlike other ribosomal
proteins, the archaeal L12p and eukaryotic P1 and P2 do
not share sequence similarity with their bacterial
counterparts. They are part of the ribosomal stalk
(called the L7/L12 stalk in bacteria), along with 28S
rRNA and the proteins L11 and P0 in eukaryotes (23S
rRNA, L11, and L10e in archaea). In bacterial ribosomes,
L7/L12 homodimers bind the extended C-terminal helix of
L10 to anchor the L7/L12 molecules to the ribosome.
Eukaryotic P1/P2 heterodimers and archaeal L12p
homodimers are believed to bind the L10 equivalent
proteins, eukaryotic P0 and archaeal L10e, in a similar
fashion. P1 and P2 (L12p, L7/L12) are the only proteins
in the ribosome to occur as multimers, always appearing
as sets of dimers. Recent data indicate that most
archaeal species contain six copies of L12p (three
homodimers), while eukaryotes have two copies each of P1
and P2 (two heterodimers). Bacteria may have four or six
copies (two or three homodimers), depending on the
species. As in bacteria, the stalk is crucial for
binding of initiation, elongation, and release factors
in eukaryotes and archaea.
Length = 105
Score = 26.5 bits (58), Expect = 5.6
Identities = 9/22 (40%), Positives = 16/22 (72%)
Query: 118 GGSSEEEEEEKEKKKEEEEEEE 139
+E+ EE KE+++EEE+E+
Sbjct: 79 AEPAEKAEEAKEEEEEEEDEDF 100
>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
subunit 1; Provisional.
Length = 319
Score = 27.3 bits (61), Expect = 5.6
Identities = 10/32 (31%), Positives = 21/32 (65%)
Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSR 142
E E+ S E+E+E+++ +EEEE++++
Sbjct: 287 EEEEEEDDYSESEDEDEEDEDEEEEEDDDEGD 318
>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
Length = 1068
Score = 27.7 bits (62), Expect = 5.7
Identities = 10/33 (30%), Positives = 16/33 (48%)
Query: 114 NLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNR 146
L +GG + +E+ K E + E +Q R R
Sbjct: 578 ALFSGGEETKPQEQPAPKAEAKPERQQDRRKPR 610
>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein.
Length = 529
Score = 27.4 bits (61), Expect = 5.9
Identities = 14/20 (70%), Positives = 16/20 (80%)
Query: 120 SSEEEEEEKEKKKEEEEEEE 139
EEEEEEKE+KKEEEE+
Sbjct: 36 PDEEEEEEKEEKKEEEEKTT 55
>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin. Nucleoplasmins are also
known as chromatin decondensation proteins. They bind to
core histones and transfer DNA to them in a reaction
that requires ATP. This is thought to play a role in the
assembly of regular nucleosomal arrays.
Length = 146
Score = 26.5 bits (59), Expect = 6.2
Identities = 10/29 (34%), Positives = 21/29 (72%)
Query: 113 LNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
+ E S ++EE+E+E+ EE+++E++S
Sbjct: 107 VASEEDESDDDEEDEEEEDDEEDDDEDES 135
Score = 26.5 bits (59), Expect = 6.4
Identities = 11/26 (42%), Positives = 18/26 (69%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQS 141
++ EEEE+++E E+E EEE+S
Sbjct: 115 DDDEEDEEEEDDEEDDDEDESEEEES 140
>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682). The
members of this family are all hypothetical eukaryotic
proteins of unknown function. One member is described as
being an adipocyte-specific protein, but no evidence of
this was found.
Length = 322
Score = 27.2 bits (61), Expect = 6.5
Identities = 10/21 (47%), Positives = 17/21 (80%)
Query: 122 EEEEEEKEKKKEEEEEEEQSR 142
EE +E+KE+KK+EE E + ++
Sbjct: 282 EEAQEKKEEKKKEEREAKLAK 302
>gnl|CDD|235302 PRK04456, PRK04456, acetyl-CoA decarbonylase/synthase complex
subunit beta; Reviewed.
Length = 463
Score = 27.3 bits (61), Expect = 6.6
Identities = 11/19 (57%), Positives = 16/19 (84%)
Query: 121 SEEEEEEKEKKKEEEEEEE 139
+ EEEEE+E+++EEEEE
Sbjct: 403 AAEEEEEEEEEEEEEEEPV 421
>gnl|CDD|204985 pfam12624, Chorein_N, N-terminal region of Chorein, a TM
vesicle-mediated sorter. Although mutations in the
full-length vacuolar protein sorting 13A (VPS13A)
protein in vertebrates lead to the disease of
chorea-acanthocytosis, the exact function of any of the
regions within the protein is not yet known. This
region is the proposed leucine zipper at the
N-terminus. The full-length protein is a transmembrane
protein with a presumed role in vesicle-mediated
sorting and intracellular protein transport.
Length = 117
Score = 26.4 bits (59), Expect = 6.7
Identities = 10/27 (37%), Positives = 16/27 (59%), Gaps = 4/27 (14%)
Query: 17 KYFEKLGKYDMAMSVWGG----ENLEI 39
+Y E L K +++S+W G ENL +
Sbjct: 15 EYVENLDKEQLSVSIWSGDVELENLRL 41
>gnl|CDD|227499 COG5171, YRB1, Ran GTPase-activating protein (Ran-binding protein)
[Intracellular trafficking and secretion].
Length = 211
Score = 26.9 bits (59), Expect = 6.7
Identities = 16/70 (22%), Positives = 28/70 (40%)
Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
++ E E +E + N P + H + + E K +K +R E+
Sbjct: 46 QQSPFLENAVPEGDEGKGPESPNIHFEPVVELQRVHLKTNEEDETVLFKARAKLFRFDEE 105
Query: 183 EEEKEEEGTG 192
+E +E GTG
Sbjct: 106 AKEWKERGTG 115
>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
Members of this family are bacterial proteins with a
conserved motif [KR]FYDLN, sometimes flanked by a pair
of CXXC motifs, followed by a long region of low
complexity sequence in which roughly half the residues
are Asp and Glu, including multiple runs of five or more
acidic residues. The function of members of this family
is unknown.
Length = 104
Score = 26.1 bits (58), Expect = 6.7
Identities = 10/32 (31%), Positives = 17/32 (53%), Gaps = 2/32 (6%)
Query: 110 GETLNLENG--GSSEEEEEEKEKKKEEEEEEE 139
GE + E + + E+ KK E+EE+E+
Sbjct: 33 GEEVPPEVAKSRAPAADAEDAAKKDEDEEDED 64
>gnl|CDD|133433 cd05297, GH4_alpha_glucosidase_galactosidase, Glycoside Hydrolases
Family 4; Alpha-glucosidases and alpha-galactosidases.
Glucosidases cleave glycosidic bonds to release glucose
from oligosaccharides. Alpha-glucosidases and
alpha-galactosidases release alpha-D-glucose and
alpha-D-galactose, respectively, via the hydrolysis of
alpha-glycopyranoside bonds. Some bacteria
simultaneously translocate and phosphorylate
disaccharides via the phosphoenolpyruvate-dependent
phosphotransferase system (PEP-PTS). After
translocation, these phospho-disaccharides may be
hydrolyzed by the GH4 glycoside hydrolases such as the
alpha-glucosidases. Other organisms (such as archaea
and Thermotoga maritima) lack the PEP-PTS system, but
have several enzymes normally associated with the
PEP-PTS operon. Alpha-glucosidases and
alpha-galactosidases are part of the NAD(P)-binding
Rossmann fold superfamily, which includes a wide variety
of protein families including the NAD(P)-binding domains
of alcohol dehydrogenases, tyrosine-dependent
oxidoreductases, glyceraldehyde-3-phosphate
dehydrogenases, formate/glycerate dehydrogenases,
siroheme synthases, 6-phosphogluconate dehydrogenases,
aminoacid dehydrogenases, repressor rex, and NAD-binding
potassium channel domains, among others.
Length = 423
Score = 27.1 bits (61), Expect = 7.7
Identities = 17/83 (20%), Positives = 32/83 (38%), Gaps = 12/83 (14%)
Query: 82 TRRAAEVWMDNYKHYYYAEVPLAKTIPFGETLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
+ +E Y +Y E K I +GE E GG EE+ E +++ + E
Sbjct: 248 SEHLSE-----YVPHYRKE---TKKIWYGEFNEDEYGGRDEEQGWEWYEERLKLILAEID 299
Query: 142 RGGNRALRPSTYTRHRHQTSFID 164
+ ++ S + + I+
Sbjct: 300 KEELDPVKRS----GEYASPIIE 318
>gnl|CDD|218652 pfam05602, CLPTM1, Cleft lip and palate transmembrane protein 1
(CLPTM1). This family consists of several eukaryotic
cleft lip and palate transmembrane protein 1 sequences.
Cleft lip with or without cleft palate is a common birth
defect that is genetically complex. The nonsyndromic
forms have been studied genetically using linkage and
candidate-gene association studies with only partial
success in defining the loci responsible for orofacial
clefting. CLPTM1 encodes a transmembrane protein and has
strong homology to two Caenorhabditis elegans genes,
suggesting that CLPTM1 may belong to a new gene family.
This family also contains the human cisplatin resistance
related protein CRR9p which is associated with
CDDP-induced apoptosis.
Length = 437
Score = 26.9 bits (60), Expect = 7.7
Identities = 12/71 (16%), Positives = 26/71 (36%), Gaps = 12/71 (16%)
Query: 75 GNVFAR--------NTRRAAEVWMDNYKHYYYAEVPLAKTIPF--GETLNLENGGSSEEE 124
G+++ + + + + PL +P + NL G S +EE
Sbjct: 110 GSLYLHVYLGLSGYSLDPTDKGYDSGKA--VHFVFPLTTYLPKKKVKKKNLLGGKSEKEE 167
Query: 125 EEEKEKKKEEE 135
EE++ ++
Sbjct: 168 PEEEKTPAPDK 178
>gnl|CDD|227602 COG5277, COG5277, Actin and related proteins [Cytoskeleton].
Length = 444
Score = 27.0 bits (60), Expect = 8.0
Identities = 10/50 (20%), Positives = 17/50 (34%)
Query: 122 EEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMK 171
EE EEE+EK E+ E ++ + + E +
Sbjct: 248 EEFEEEEEKPAEKSTESTFQLSKETSIAKESKELPDGEEIEFGNEERFKA 297
>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain. This
family represents the C-terminus (approximately 300
residues) of proteins that are involved as binding
partners for Prp19 as part of the nuclear pore complex.
The family in Drosophila is necessary for pre-mRNA
splicing, and the human protein has been found in
purifications of the spliceosome. In the past this
family was thought, erroneously, to be associated with
microfibrillin.
Length = 277
Score = 26.8 bits (59), Expect = 8.2
Identities = 23/79 (29%), Positives = 34/79 (43%), Gaps = 4/79 (5%)
Query: 111 ETLNLENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYM 170
E L LE S EEEEE+ EEEEE S +TR + + + ++E
Sbjct: 3 EVLELEEEDESGEEEEEE----SEEEEETDSEDDMEPRLKPVFTRKKDRITIQEREREAA 58
Query: 171 KMTSKKYRMGEDEEEKEEE 189
K + + EE++ E
Sbjct: 59 KEKALEEEAKRKAEERKRE 77
>gnl|CDD|222792 PHA00435, PHA00435, capsid assembly protein.
Length = 306
Score = 26.7 bits (59), Expect = 8.4
Identities = 17/44 (38%), Positives = 23/44 (52%), Gaps = 4/44 (9%)
Query: 102 PLAKTIPFGE----TLNLENGGSSEEEEEEKEKKKEEEEEEEQS 141
P PFGE + + EEEE E+ ++ EEEE EE+S
Sbjct: 56 PYGNPDPFGEDDEGRIEVRISEDGEEEEVEEGEEDEEEEGEEES 99
>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1. All
proteins in this family for which functions are known
are cyclin dependent protein kinases that are components
of TFIIH, a complex that is involved in nucleotide
excision repair and transcription initiation. Also known
as MAT1 (menage a trois 1). This family is based on the
phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
Stanford University) [DNA metabolism, DNA replication,
recombination, and repair].
Length = 309
Score = 26.7 bits (59), Expect = 8.4
Identities = 16/69 (23%), Positives = 33/69 (47%)
Query: 123 EEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQTSFIDQEVAYMKMTSKKYRMGED 182
E+EEE++++ ++EEE+ + R + + T + +A K S K M +
Sbjct: 154 EKEEEEQRRLLLQKEEEEQQMNKRKNKQALLDELETSTLPAAELIAQHKKNSVKLEMQVE 213
Query: 183 EEEKEEEGT 191
+ + E+ T
Sbjct: 214 KPKPEKPNT 222
>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572). Family of
eukaryotic proteins with undetermined function.
Length = 321
Score = 26.6 bits (59), Expect = 8.7
Identities = 22/90 (24%), Positives = 39/90 (43%), Gaps = 15/90 (16%)
Query: 115 LENGGS----SEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQT---------- 160
+E+G + +++ +EE+E++ E+E EEE + + L T R
Sbjct: 100 VESGATRNYEADKLDEEQEERVEKEREEELAGDAMKKLENRTADSKREMEVLERLEELKE 159
Query: 161 -SFIDQEVAYMKMTSKKYRMGEDEEEKEEE 189
+V M +R + EEE+EEE
Sbjct: 160 LQSRRADVDVNSMLEALFRREKKEEEEEEE 189
>gnl|CDD|222648 pfam14283, DUF4366, Domain of unknown function (DUF4366). This
family of proteins is found in bacteria and eukaryotes.
Proteins in this family are typically between 227 and
387 amino acids in length.
Length = 213
Score = 26.5 bits (59), Expect = 8.8
Identities = 10/28 (35%), Positives = 13/28 (46%)
Query: 121 SEEEEEEKEKKKEEEEEEEQSRGGNRAL 148
+E E E + E EEE E+ G L
Sbjct: 135 TECTGPEPEPEPEPEEEPEKKSGMGPLL 162
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 26.2 bits (58), Expect = 9.2
Identities = 7/45 (15%), Positives = 23/45 (51%)
Query: 116 ENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPSTYTRHRHQT 160
+ +++ E+K++K+ E++ E+ ++ + L + + R
Sbjct: 101 KKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLSELKPRKYA 145
>gnl|CDD|217829 pfam03985, Paf1, Paf1. Members of this family are components of
the RNA polymerase II associated Paf1 complex. The Paf1
complex functions during the elongation phase of
transcription in conjunction with Spt4-Spt5 and
Spt16-Pob3i.
Length = 431
Score = 26.6 bits (59), Expect = 9.3
Identities = 9/37 (24%), Positives = 19/37 (51%)
Query: 115 LENGGSSEEEEEEKEKKKEEEEEEEQSRGGNRALRPS 151
L+ E +E+E E++++ +E E+ G + S
Sbjct: 362 LDPIDFEEVDEDEDEEEEQRSDEHEEEEGEDSEEEGS 398
>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
This family represents Cwf15/Cwc15 (from
Schizosaccharomyces pombe and Saccharomyces cerevisiae
respectively) and their homologues. The function of
these proteins is unknown, but they form part of the
spliceosome and are thus thought to be involved in mRNA
splicing.
Length = 241
Score = 26.6 bits (59), Expect = 9.5
Identities = 8/22 (36%), Positives = 16/22 (72%)
Query: 121 SEEEEEEKEKKKEEEEEEEQSR 142
E EE++ +++E+ EEE++R
Sbjct: 156 KERAEEKEREEEEKAAEEEKAR 177
>gnl|CDD|204614 pfam11221, Med21, Subunit 21 of Mediator complex. Med21 has been
known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in
Drosophila. The heterodimer of the two subunits Med7 and
Med21 appears to act as a hinge between the middle and
the tail regions of Mediator.
Length = 132
Score = 25.7 bits (57), Expect = 9.9
Identities = 12/25 (48%), Positives = 15/25 (60%), Gaps = 1/25 (4%)
Query: 119 GSSEEEEEEKEKKKEEE-EEEEQSR 142
SSEEE+ + K+ EEE E E R
Sbjct: 84 ESSEEEQLRRIKELEEELREVEAER 108
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.313 0.132 0.391
Gapped
Lambda K H
0.267 0.0812 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 10,627,356
Number of extensions: 999857
Number of successful extensions: 4512
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3626
Number of HSP's successfully gapped: 358
Length of query: 199
Length of database: 10,937,602
Length adjustment: 92
Effective length of query: 107
Effective length of database: 6,857,034
Effective search space: 733702638
Effective search space used: 733702638
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 56 (25.2 bits)
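Note on the statistics block above: the effective lengths and the bit values shown for the X1-X3 and S1/S2 parameters can be recomputed from the other numbers in the report. The sketch below is not part of the RPS-BLAST output. It assumes the standard Karlin-Altschul conversion, bits = (lambda * raw - ln K) / ln 2, for the S1/S2 cutoffs and bits = lambda * raw / ln 2 for the X dropoffs, with the ungapped lambda/K for X1 and S1 and the gapped lambda/K for X2, X3 and S2. The function names are hypothetical. The printed lambda and K values are rounded, so results agree only to the displayed precision, and per-hit scores in this report may not reproduce with these footer values.

```python
import math


def effective_lengths(query_len, db_len, num_seqs, length_adjustment):
    """Return (effective query length, effective db length, search space).

    Assumes the length adjustment is subtracted once from the query and
    once per database sequence, which matches the figures in this report.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db


def cutoff_bits(raw_score, lam, k):
    """Convert a raw score cutoff to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def dropoff_bits(raw_dropoff, lam):
    """Convert a raw X-dropoff value to bits (no K term for a difference)."""
    return lam * raw_dropoff / math.log(2)


if __name__ == "__main__":
    # Values copied from the footer of this report.
    print(effective_lengths(199, 10_937_602, 44_354, 92))
    print(f"S1 = {cutoff_bits(42, 0.313, 0.132):.1f} bits")   # ungapped
    print(f"S2 = {cutoff_bits(56, 0.267, 0.0812):.1f} bits")  # gapped
    print(f"X1 = {dropoff_bits(16, 0.313):.1f} bits")         # ungapped
    print(f"X2 = {dropoff_bits(38, 0.267):.1f} bits")         # gapped
    print(f"X3 = {dropoff_bits(64, 0.267):.1f} bits")         # gapped
```

With the footer values, the first call gives an effective query length of 107, an effective database length of 6,857,034 and a search space of 733,702,638, matching the lines above.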