RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy2416
(405 letters)
>gnl|CDD|227640 COG5333, CCL1, Cdk activating kinase (CAK)/RNA polymerase II
transcription initiation/nucleotide excision repair
factor TFIIH/TFIIK, cyclin H subunit [Cell division and
chromosome partitioning / Transcription / DNA
replication, recombination, and repair].
Length = 297
Score = 104 bits (260), Expect = 3e-25
Identities = 58/196 (29%), Positives = 94/196 (47%), Gaps = 18/196 (9%)
Query: 40 LDGLDPEV----ETDLRIIGCELIQTAGILLKLPQVAMATGQVLFQRFYYSKSFVRHPME 95
L L+PE+ E +L I +LI L LPQ +AT + F RFY S +
Sbjct: 29 LLVLEPELTLEKELNLVIYYLKLIMDLCTRLNLPQTVLATAILFFSRFYLKNSVEEISLY 88
Query: 96 TTAMGCVCLASKIEEAPRRIRDVINVFHHIRQVMNQKSITPMLLTTQYMTLKTQVIKAER 155
+ CV LA K+E+ PR I L + + + + ++++ E
Sbjct: 89 SVVTTCVYLACKVEDTPRDISIESFEARD-------------LWSEEPKSSRERILEYEF 135
Query: 156 RVLKELGFCVHVKHPHKIIVTYLQVLGCEKNQKLMQLA-NYMNDSLRTDVFVRYDPETIA 214
+L+ L F +HV HP+K + +L+ L + KL+Q+A +ND+LRTD+ + Y P IA
Sbjct: 136 ELLEALDFDLHVHHPYKYLEGFLKDLQEKDKYKLLQIAWKIINDALRTDLCLLYPPHIIA 195
Query: 215 SACIYLTARKLRIPLP 230
A + + L +P+
Sbjct: 196 LAALLIACEVLGMPII 211
>gnl|CDD|238003 cd00043, CYCLIN, Cyclin box fold. Protein binding domain
functioning in cell-cycle and transcription control.
Present in cyclins, TFIIB and Retinoblastoma (RB). The
cyclins consist of 8 classes of cell cycle regulators
that regulate cyclin dependent kinases (CDKs). TFIIB is
a transcription factor that binds the TATA box. Cyclins,
TFIIB and RB contain 2 copies of the domain.
Length = 88
Score = 61.1 bits (149), Expect = 5e-12
Identities = 18/89 (20%), Positives = 34/89 (38%), Gaps = 4/89 (4%)
Query: 51 LRIIGCELIQTAGILLKLPQVAMATGQVLFQRFYYSKSFVRHPMETTAMGCVCLASKIEE 110
+R + ++ L L + L RF S + A + LA+K+EE
Sbjct: 1 MRPTPLDFLRRVAKALGLSPETLTLAVNLLDRFLLDYSVLGRSPSLVAAAALYLAAKVEE 60
Query: 111 APRRIRDVINVFHHIRQVMNQKSITPMLL 139
P ++D+++V ++ I M
Sbjct: 61 IPPWLKDLVHVTG----YATEEEILRMEK 85
Score = 49.2 bits (118), Expect = 8e-08
Identities = 20/91 (21%), Positives = 36/91 (39%), Gaps = 5/91 (5%)
Query: 166 HVKHPHKIIVTYLQVLGCEKNQKLMQLA-NYMNDSLRTDVFVRYDPETIASACIYLTARK 224
P + + LG + + + LA N ++ L + P +A+A +YL A+
Sbjct: 1 MRPTPLDFLRRVAKALGL--SPETLTLAVNLLDRFLLDYSVLGRSPSLVAAAALYLAAKV 58
Query: 225 LRIPLPRNPAWYSLFHVL-ESDIQDVCKRIL 254
IP P + E +I + K +L
Sbjct: 59 EEIP-PWLKDLVHVTGYATEEEILRMEKLLL 88
>gnl|CDD|214641 smart00385, CYCLIN, domain present in cyclins, TFIIB and
Retinoblastoma. A helical domain present in cyclins and
TFIIB (twice) and Retinoblastoma (once). A protein
recognition domain functioning in cell-cycle and
transcription control.
Length = 83
Score = 51.8 bits (125), Expect = 9e-09
Identities = 16/81 (19%), Positives = 30/81 (37%), Gaps = 4/81 (4%)
Query: 59 IQTAGILLKLPQVAMATGQVLFQRFYYSKSFVRHPMETTAMGCVCLASKIEEAPRRIRDV 118
++ L L + L RF F+++ A + LASK EE P +++
Sbjct: 3 LRRVCKALNLDPETLNLAVNLLDRFLSDYKFLKYSPSLIAAAALYLASKTEETPPWTKEL 62
Query: 119 INVFHHIRQVMNQKSITPMLL 139
++ ++ I M
Sbjct: 63 VHYT----GYFTEEEILRMER 79
Score = 48.4 bits (116), Expect = 1e-07
Identities = 18/84 (21%), Positives = 35/84 (41%), Gaps = 5/84 (5%)
Query: 174 IVTYLQVLGCEKNQKLMQLA-NYMNDSLRTDVFVRYDPETIASACIYLTARKLRIPLPRN 232
+ + L + + + LA N ++ L F++Y P IA+A +YL ++ P P
Sbjct: 3 LRRVCKALNL--DPETLNLAVNLLDRFLSDYKFLKYSPSLIAAAALYLASKTEETP-PWT 59
Query: 233 PAWYSLFHVL-ESDIQDVCKRILR 255
E +I + + +L
Sbjct: 60 KELVHYTGYFTEEEILRMERLLLE 83
>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family. This model
represents a subfamily of RNA splicing factors including
the Pad-1 protein (N. crassa), CAPER (M. musculus) and
CC1.3 (H. sapiens). These proteins are characterized by
an N-terminal arginine-rich, low complexity domain
followed by three (or in the case of 4 H. sapiens
paralogs, two) RNA recognition domains (rrm: pfam00076).
These splicing factors are closely related to the U2AF
splicing factor family (TIGR01642). A homologous gene
from Plasmodium falciparum was identified in the course
of the analysis of that genome at TIGR and was included
in the seed.
Length = 457
Score = 45.3 bits (107), Expect = 4e-05
Identities = 21/88 (23%), Positives = 34/88 (38%), Gaps = 10/88 (11%)
Query: 312 KRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSK-----KYSS---RARSRSKSPRSRSR 363
+ + R R R + + K R RSR + + + Y R+RSRS + R R
Sbjct: 3 RDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYRPR 62
Query: 364 TPDRKYKKSHKSHKDSKDY-YTPPSPDR 390
DR Y++ + + T D
Sbjct: 63 G-DRSYRRDDRRSGRNTKEPLTEAERDD 89
Score = 39.1 bits (91), Expect = 0.004
Identities = 24/78 (30%), Positives = 32/78 (41%), Gaps = 3/78 (3%)
Query: 328 KSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHK--SHKDSKDYYTP 385
+ R R R ++S K R+R RS+S R DR Y + + S S + Y
Sbjct: 1 RYRDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYR 60
Query: 386 PSPDRSPYSSHSRSHSRK 403
P DRS Y R R
Sbjct: 61 PRGDRS-YRRDDRRSGRN 77
Score = 35.6 bits (82), Expect = 0.041
Identities = 21/87 (24%), Positives = 32/87 (36%), Gaps = 9/87 (10%)
Query: 281 SKDRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTS-KSRSRSRSPQPP 339
+DR+ ++T S+ + SR + R R R R + RSRSRSP
Sbjct: 2 YRDRERGRLRNDTRRSDKG---RERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRY 58
Query: 340 KHKKSKKYSSRARSRSKSPRSRSRTPD 366
+ + R RS T +
Sbjct: 59 YRPRGDR-----SYRRDDRRSGRNTKE 80
Score = 29.1 bits (65), Expect = 4.4
Identities = 17/58 (29%), Positives = 23/58 (39%), Gaps = 4/58 (6%)
Query: 350 RARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDY-YTPPSPDRSPYSSHS-RSHSRKSS 405
R R R R R+ T R K +S + S+ + DR Y RS SR +
Sbjct: 1 RYRDRE-RGRLRNDTR-RSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPN 56
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 41.4 bits (97), Expect = 6e-04
Identities = 26/93 (27%), Positives = 35/93 (37%), Gaps = 12/93 (12%)
Query: 312 KRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKK 371
+ KSR R R RS + RSR RS +H++S R RS + R R R +
Sbjct: 9 REKSRGRDRDRSSERPRRRSRDRSRFRDRHRRS-----RERSYREDSRPRDR-------R 56
Query: 372 SHKSHKDSKDYYTPPSPDRSPYSSHSRSHSRKS 404
+ S Y+ R SRS
Sbjct: 57 RYDSRSPRSLRYSSVRRSRDRPRRRSRSVRSIE 89
Score = 41.4 bits (97), Expect = 7e-04
Identities = 24/93 (25%), Positives = 34/93 (36%), Gaps = 9/93 (9%)
Query: 309 NNHKRKSRSRSRTRS--------PVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRS 360
+R+SR RSR R SR R R + +S +YSS RSR + PR
Sbjct: 22 ERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDR-PRR 80
Query: 361 RSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPY 393
RSR+ + + S +
Sbjct: 81 RSRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSL 113
Score = 39.5 bits (92), Expect = 0.002
Identities = 25/101 (24%), Positives = 36/101 (35%), Gaps = 2/101 (1%)
Query: 303 KSPSRHNNHKR-KSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR 361
+ P R R + R RS R S+ RSR R ++S + SR R R SR
Sbjct: 3 EEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRR-RYDSR 61
Query: 362 SRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSR 402
S R + + + S ++ RS S
Sbjct: 62 SPRSLRYSSVRRSRDRPRRRSRSVRSIEQHRRRLRDRSPSN 102
Score = 34.1 bits (78), Expect = 0.15
Identities = 22/81 (27%), Positives = 32/81 (39%), Gaps = 3/81 (3%)
Query: 303 KSPSRHNNHKRKSRSRSR---TRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPR 359
+ SR + R+SR RS +R + SRS + + R RSRS
Sbjct: 29 RDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDRPRRRSRSVRSI 88
Query: 360 SRSRTPDRKYKKSHKSHKDSK 380
+ R R S++ KD K
Sbjct: 89 EQHRRRLRDRSPSNQWRKDDK 109
Score = 32.6 bits (74), Expect = 0.39
Identities = 24/80 (30%), Positives = 33/80 (41%), Gaps = 8/80 (10%)
Query: 328 KSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKS-HKDSKDY---- 382
+ R + + + + S R R RS R RSR DR + +S +DS+
Sbjct: 1 RDEEPDREREKSRGRDRDRSSERPRRRS---RDRSRFRDRHRRSRERSYREDSRPRDRRR 57
Query: 383 YTPPSPDRSPYSSHSRSHSR 402
Y SP YSS RS R
Sbjct: 58 YDSRSPRSLRYSSVRRSRDR 77
Score = 29.5 bits (66), Expect = 3.5
Identities = 18/67 (26%), Positives = 24/67 (35%), Gaps = 16/67 (23%)
Query: 307 RHNNHKRKSRS-----RSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR 361
R R RS R+R +SRS Q R R R +SP ++
Sbjct: 55 RRRYDSRSPRSLRYSSVRRSRDRPRRRSRSVRSIEQ-----------HRRRLRDRSPSNQ 103
Query: 362 SRTPDRK 368
R D+K
Sbjct: 104 WRKDDKK 110
Score = 29.1 bits (65), Expect = 4.3
Identities = 19/56 (33%), Positives = 21/56 (37%), Gaps = 5/56 (8%)
Query: 306 SRHNNHKRKSRSRSRTRS-PVTSKS----RSRSRSPQPPKHKKSKKYSSRARSRSK 356
S + R+SR R R RS V S R R RSP K KK S
Sbjct: 65 SLRYSSVRRSRDRPRRRSRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSLWDIKPPG 120
>gnl|CDD|217307 pfam02984, Cyclin_C, Cyclin, C-terminal domain. Cyclins regulate
cyclin dependent kinases (CDKs). Human CCNO is a
Uracil-DNA glycosylase that is related to other cyclins.
Cyclins contain two domains of similar all-alpha fold,
of which this family corresponds with the C-terminal
domain.
Length = 117
Score = 38.4 bits (90), Expect = 8e-04
Identities = 28/100 (28%), Positives = 47/100 (47%), Gaps = 12/100 (12%)
Query: 186 NQKLMQLANY-MNDSLRTDVFVRYDPETIASACIYLTARKLRIPLPRNPA--WYSLFHVL 242
+ + LA Y + SL F++Y P IA+A +YL ARK P Y+ +
Sbjct: 17 DLETRTLAKYLLELSLLDYDFLKYPPSLIAAAAVYL-ARKTLGSPPWTETLEHYTGYS-- 73
Query: 243 ESDIQDVCKRILRLYTRPKANTDELERQIEVIKKEYQLSK 282
E D++ K +L L R + +++ ++K+Y SK
Sbjct: 74 EEDLKPCVKLLLELLLRAPNS------KLQAVRKKYSSSK 107
>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777). This is
a family of eukaryotic proteins of unknown function.
Some of the proteins in this family are putative nucleic
acid binding proteins.
Length = 158
Score = 39.1 bits (91), Expect = 0.001
Identities = 27/72 (37%), Positives = 42/72 (58%), Gaps = 6/72 (8%)
Query: 312 KRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSP---RSRSRTPDRK 368
+R+ RSRSR R + R RSRS + + ++S+ S RS+SP RSRSR+P R+
Sbjct: 13 RRRGRSRSRDRR---ERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSPSRR 69
Query: 369 YKKSHKSHKDSK 380
+ + KD++
Sbjct: 70 RDRKRERDKDAR 81
Score = 37.2 bits (86), Expect = 0.004
Identities = 31/89 (34%), Positives = 39/89 (43%), Gaps = 15/89 (16%)
Query: 317 SRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSH 376
RSR+RSP S+ R RSRS R R + RSRSR DR+ + +S
Sbjct: 2 GRSRSRSPRRSRRRGRSRS--------------RDRRERRRERSRSRERDRRRRSRSRSP 47
Query: 377 KDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
S+ + P RS S SR RK
Sbjct: 48 HRSRRSRS-PRRHRSRSRSPSRRRDRKRE 75
Score = 34.9 bits (80), Expect = 0.027
Identities = 22/62 (35%), Positives = 32/62 (51%), Gaps = 4/62 (6%)
Query: 303 KSPSRHNNHKRKSRSRSRTRSP-VTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR 361
+S SR + +R+SRSRS RS S R RSRS P + + K+ ++ P+ R
Sbjct: 30 RSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSPSRRRDRKR---ERDKDAREPKKR 86
Query: 362 SR 363
R
Sbjct: 87 ER 88
Score = 34.1 bits (78), Expect = 0.052
Identities = 31/75 (41%), Positives = 40/75 (53%), Gaps = 7/75 (9%)
Query: 303 KSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRS 362
+S SR +R+ RSRSR R + RSRSRSP + +S R RSRS+SP SR
Sbjct: 17 RSRSRDRRERRRERSRSRERD---RRRRSRSRSPHRSRRSRSP---RRHRSRSRSP-SRR 69
Query: 363 RTPDRKYKKSHKSHK 377
R R+ K + K
Sbjct: 70 RDRKRERDKDAREPK 84
>gnl|CDD|129660 TIGR00569, ccl1, cyclin ccl1. All proteins in this family for
which functions are known are cyclins that are
components of TFIIH, a complex that is involved in
nucleotide excision repair and transcription initiation.
This family is based on the phylogenomic analysis of JA
Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
metabolism, DNA replication, recombination, and repair].
Length = 305
Score = 40.2 bits (94), Expect = 0.001
Identities = 39/171 (22%), Positives = 74/171 (43%), Gaps = 27/171 (15%)
Query: 68 LPQVAMATGQVLFQRFYYSKSFVRHPMETTAMGCVCLASKIEEAPRRIRDVINVFHHIRQ 127
+P + T + F+RFY + S + + + + CV LA K+EE I Q
Sbjct: 74 MPTSVVGTAIMYFKRFYLNNSVMEYHPKIIMLTCVFLACKVEE----------FNVSIDQ 123
Query: 128 VMNQKSITPMLLTTQYMTLKTQVIKAERRVLKELGFCVHVKHPHKIIVTYLQVLGCEKNQ 187
+ TP+ QV++ E ++++L F + V +P++ + +L + +
Sbjct: 124 FVGNLKETPLKAL-------EQVLEYELLLIQQLNFHLIVHNPYRPLEGFL--IDIKTRL 174
Query: 188 KLMQLANYM--------NDSLRTDVFVRYDPETIASACIYLTARKLRIPLP 230
++ Y+ N +L TD ++ Y P IA A I TA + + +
Sbjct: 175 PGLENPEYLRKHADKFLNRTLLTDAYLLYTPSQIALAAILHTASRAGLNME 225
>gnl|CDD|215740 pfam00134, Cyclin_N, Cyclin, N-terminal domain. Cyclins regulate
cyclin dependent kinases (CDKs). Human cyclin-O is a
Uracil-DNA glycosylase that is related to other cyclins.
Cyclins contain two domains of similar all-alpha fold,
of which this family corresponds with the N-terminal
domain.
Length = 127
Score = 35.2 bits (82), Expect = 0.012
Identities = 26/135 (19%), Positives = 49/135 (36%), Gaps = 21/135 (15%)
Query: 30 EEKLNPTPSMLDGLDPEVETDLRIIGCELIQTAGILLKLPQVAMATGQVLFQRFYYSKSF 89
EE+ P P LD P++ +R I + + KL + RF +
Sbjct: 10 EEEDRPPPDYLDQ-QPDINPKMRAILIDWLVEVHEEFKLLPETLYLAVNYLDRFLSKQPV 68
Query: 90 VRHPMETTAMGCVCLASKIEE-APRRIRDVINVFHHIRQVMNQKSITPMLLTTQYMTLKT 148
R ++ + C+ +A+K EE P + D + + T K
Sbjct: 69 PRTKLQLVGVTCLLIAAKYEEIYPPSVEDFVYI-------------------TDNAYTKE 109
Query: 149 QVIKAERRVLKELGF 163
++++ E +L L +
Sbjct: 110 EILRMELLILSTLNW 124
>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
hydrophilic C-term. This domain is a hydrophilic region
found at the C-terminus of plant and metazoan
pre-mRNA-splicing factor 38 proteins. The function is
not known.
Length = 97
Score = 33.2 bits (76), Expect = 0.041
Identities = 17/79 (21%), Positives = 31/79 (39%)
Query: 296 SNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRS 355
I+ + + + + R RTR + RSR R + +++ R R
Sbjct: 19 EEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRD 78
Query: 356 KSPRSRSRTPDRKYKKSHK 374
+ RSRSR+ R + +
Sbjct: 79 RYDRSRSRSRSRSRDRRRR 97
Score = 30.5 bits (69), Expect = 0.35
Identities = 14/68 (20%), Positives = 25/68 (36%), Gaps = 1/68 (1%)
Query: 335 SPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYS 394
+ + + S R R+R +S R + R+ ++ + +D
Sbjct: 26 RRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRD-RDRARYRDRDDRDRDRYDRSR 84
Query: 395 SHSRSHSR 402
S SRS SR
Sbjct: 85 SRSRSRSR 92
Score = 29.4 bits (66), Expect = 0.98
Identities = 15/76 (19%), Positives = 26/76 (34%), Gaps = 1/76 (1%)
Query: 291 DNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSR 350
+ + R+ R + + R R R+ + + + R
Sbjct: 23 EEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRDRYDR 82
Query: 351 ARSRSKSPRSRSRTPD 366
+RSRS+S RSR R
Sbjct: 83 SRSRSRS-RSRDRRRR 97
>gnl|CDD|221931 pfam13136, DUF3984, Protein of unknown function (DUF3984). This
family of proteins is functionally uncharacterized. This
family of proteins is found in eukaryotes. Proteins in
this family are typically between 393 and 442 amino
acids in length.
Length = 301
Score = 35.1 bits (81), Expect = 0.046
Identities = 30/126 (23%), Positives = 41/126 (32%), Gaps = 24/126 (19%)
Query: 304 SPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSP---QPPKHKKSKKYSSRARSRSKSPRS 360
SPSR HKRK SR S KS+S + H++SK ++ R S S
Sbjct: 61 SPSRSRLHKRKKSSRRSPMSDTLLKSKSSAHLLHHQSTRSHRRSKSGTTSPRKPSSSAHR 120
Query: 361 RSRTPD-------------RKYK--------KSHKSHKDSKDYYTPPSPDRSPYSSHSRS 399
R + R+ K +S S + R S R+
Sbjct: 121 RRNDSEWLLRAGAALASSTREEKGQSWLVKRESSTSLVSEDEEDFEREAAREREHSSRRA 180
Query: 400 HSRKSS 405
R S
Sbjct: 181 SRRGRS 186
Score = 33.5 bits (77), Expect = 0.16
Identities = 19/70 (27%), Positives = 28/70 (40%), Gaps = 3/70 (4%)
Query: 315 SRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSS---RARSRSKSPRSRSRTPDRKYKK 371
S SRS +RS + + +S RSP KSK + +RS T RK
Sbjct: 57 SHSRSPSRSRLHKRKKSSRRSPMSDTLLKSKSSAHLLHHQSTRSHRRSKSGTTSPRKPSS 116
Query: 372 SHKSHKDSKD 381
S ++ +
Sbjct: 117 SAHRRRNDSE 126
Score = 33.1 bits (76), Expect = 0.23
Identities = 26/80 (32%), Positives = 34/80 (42%), Gaps = 8/80 (10%)
Query: 324 PVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRS--RTPDRKYKKSHKSHKDSKD 381
P T S SRSP + K KK S R+ +S+S +S +SH+ SK
Sbjct: 50 PTTPGILSHSRSPSRSRLHKRKKSSRRSPMSDTLLKSKSSAHLLHH---QSTRSHRRSKS 106
Query: 382 YYTPPSPDRSPYSSHSRSHS 401
T P R P SS R +
Sbjct: 107 GTTSP---RKPSSSAHRRRN 123
Score = 31.6 bits (72), Expect = 0.61
Identities = 24/118 (20%), Positives = 46/118 (38%), Gaps = 9/118 (7%)
Query: 253 ILRLYTRPKANTDELERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPNIKSPSRHNNHK 312
R + +++ L R + + K + LV +++ TS S + + R +
Sbjct: 114 PSSSAHRRRNDSEWLLRAGAALASSTREEKGQSWLVKRESS-TSLVSEDEEDFEREAARE 172
Query: 313 RKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSR---SKSPRSRSRTPDR 367
R+ SR +R + RS +P ++ + +SR SR S +P R
Sbjct: 173 REHSSRRASR-----RGRSGYSTPSAALSRRGSRSASRRGSRADLSMTPLEARRADAE 225
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 34.5 bits (79), Expect = 0.095
Identities = 25/144 (17%), Positives = 51/144 (35%), Gaps = 9/144 (6%)
Query: 266 ELERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPN-----IKSPSRHNNHKRKSRSRSR 320
E + + +K + KDRK ++ P K++ R R++
Sbjct: 113 VKEEKKKKKEKPKEEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAK 172
Query: 321 TRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSK 380
+R K + ++ +PP+ +K ++ + A + +++ KD +
Sbjct: 173 SRPKKPPKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDD----GKDRE 228
Query: 381 DYYTPPSPDRSPYSSHSRSHSRKS 404
+P D S SS S S
Sbjct: 229 TTTSPMEEDESRQSSEISRRSSSS 252
Score = 32.2 bits (73), Expect = 0.53
Identities = 18/127 (14%), Positives = 35/127 (27%), Gaps = 14/127 (11%)
Query: 280 LSKD---RKVLVSGDNTPTSNASPNIKSPSR-----------HNNHKRKSRSRSRTRSPV 325
LS D ++V G P + P + + K+K + + +
Sbjct: 71 LSSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKD 130
Query: 326 TSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTP 385
P + +K K+ + + R R + K K P
Sbjct: 131 RKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEP 190
Query: 386 PSPDRSP 392
P ++
Sbjct: 191 PEEEKQR 197
>gnl|CDD|216205 pfam00937, Corona_nucleoca, Coronavirus nucleocapsid protein.
Length = 346
Score = 34.3 bits (79), Expect = 0.10
Identities = 22/88 (25%), Positives = 34/88 (38%), Gaps = 13/88 (14%)
Query: 297 NASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKK-------SKKYSS 349
+S SR + SRS SR S +S++ SR+R+ P +K
Sbjct: 151 GFRGRSRSSSRSS-----SRSNSRGPSRGSSRNNSRNRNSSSPDDLVAAVLAALAKLGFG 205
Query: 350 RARSRSKSPRSRSRTPDRKYKKSHKSHK 377
+ +S SK P SR + +K
Sbjct: 206 KQKSSSKKP-SRVTKKSAAEAAKKQLNK 232
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 34.7 bits (79), Expect = 0.11
Identities = 21/96 (21%), Positives = 33/96 (34%), Gaps = 3/96 (3%)
Query: 294 PTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARS 353
PT + P +H + + R RS + P+ KS K +S
Sbjct: 573 PTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPESPKS 632
Query: 354 -RSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSP 388
+ P R +P+R + K K K +P P
Sbjct: 633 PKRPPPPQRPSSPER--PEGPKIIKSPKPPKSPKPP 666
Score = 32.7 bits (74), Expect = 0.41
Identities = 18/103 (17%), Positives = 35/103 (33%), Gaps = 1/103 (0%)
Query: 303 KSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRS 362
+ + ++S P +K + P P K K K + ++ + + P+
Sbjct: 528 EGEEGEHEDSKESDEPKEGGKPGETKEGEVGKKPGPAKEHKPSKIPTLSK-KPEFPKDPK 586
Query: 363 RTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
D + K K + ++ P SP +S R S
Sbjct: 587 HPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629
>gnl|CDD|219061 pfam06495, Transformer, Fruit fly transformer protein. This family
consists of transformer proteins from several Drosophila
species and also from Ceratitis capitata (Mediterranean
fruit fly). The transformer locus (tra) produces an RNA
processing protein that alternatively splices the
doublesex pre-mRNA in the sex determination hierarchy of
Drosophila melanogaster.
Length = 182
Score = 33.1 bits (75), Expect = 0.13
Identities = 21/62 (33%), Positives = 30/62 (48%)
Query: 304 SPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSR 363
+ R + +SRS+S R+ + RSRSRS + S+ R RS+S SR
Sbjct: 51 TSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRRRSRSRSRYSR 110
Query: 364 TP 365
TP
Sbjct: 111 TP 112
Score = 31.2 bits (70), Expect = 0.68
Identities = 23/73 (31%), Positives = 42/73 (57%), Gaps = 1/73 (1%)
Query: 309 NNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKS-PRSRSRTPDR 367
N +RK++S T S ++RSRSRS ++ +++ SR+RSR++S R RS +
Sbjct: 38 NLRQRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSRHRSTSSTE 97
Query: 368 KYKKSHKSHKDSK 380
+ ++S + S+
Sbjct: 98 RRRRSRSRSRYSR 110
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 34.4 bits (79), Expect = 0.14
Identities = 20/81 (24%), Positives = 30/81 (37%), Gaps = 1/81 (1%)
Query: 289 SGDNTPTSNASPNIKSPSRHNNHKRKSRS-RSRTRSPVTSKSRSRSRSPQPPKHKKSKKY 347
SG + AS + S ++ S S SR + S SRS SP P
Sbjct: 305 SGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSS 364
Query: 348 SSRARSRSKSPRSRSRTPDRK 368
+ S++P S + + R
Sbjct: 365 PRKRPRPSRAPSSPAASAGRP 385
Score = 34.0 bits (78), Expect = 0.19
Identities = 30/117 (25%), Positives = 38/117 (32%), Gaps = 12/117 (10%)
Query: 289 SGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYS 348
SG N P+S P S S S S + +S S S S S
Sbjct: 273 SGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESS------S 326
Query: 349 SRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
S S S+S R + +P +S S+ PP D S R SS
Sbjct: 327 SSTSSSSESSRGAAVSPGP---SPSRSPSPSRP---PPPADPSSPRKRPRPSRAPSS 377
Score = 33.2 bits (76), Expect = 0.27
Identities = 16/92 (17%), Positives = 30/92 (32%)
Query: 284 RKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKK 343
R S + +S++S + S S S SR+ SP + SP+
Sbjct: 313 RASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPS 372
Query: 344 SKKYSSRARSRSKSPRSRSRTPDRKYKKSHKS 375
S A + + R + ++ +
Sbjct: 373 RAPSSPAASAGRPTRRRARAAVAGRARRRDAT 404
Score = 30.5 bits (69), Expect = 1.8
Identities = 24/113 (21%), Positives = 35/113 (30%), Gaps = 11/113 (9%)
Query: 289 SGDNTPTSNASPNIKSPSRHNNHKRKS----RSRSRTRSPVTSKSRSRSRSP-QPPKHKK 343
S + A SPSR + R S R R + S + S +P + +
Sbjct: 331 SSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRA 390
Query: 344 SKKYSSRARSRSKSPRS-RSRTPDRKYKKSHKSHKDSKDYYTPP--SPDRSPY 393
+ RAR R + R R S Y P +P P+
Sbjct: 391 RAAVAGRARRRDATGRFPAGRPRPSPLDAGAAS---GAFYARYPLLTPSGEPW 440
Score = 30.1 bits (68), Expect = 2.6
Identities = 24/110 (21%), Positives = 33/110 (30%), Gaps = 3/110 (2%)
Query: 291 DNTPTSNASPNI-KSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSS 349
+ P +P + + SR S +S R RS SP P S
Sbjct: 253 NECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSS-PRERSPSPSPSSPGSGPA-PS 310
Query: 350 RARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRS 399
R+ S S SR + S S + PS SP +
Sbjct: 311 SPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPA 360
Score = 29.8 bits (67), Expect = 3.3
Identities = 20/90 (22%), Positives = 32/90 (35%), Gaps = 7/90 (7%)
Query: 310 NHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSR------ARSRSKSPRSRSR 363
KRKSRS + +S + P +S + S R+R K+ R R R
Sbjct: 816 ASKRKSRSHTPDGGSESSGPARPPGAAARPPPARSSESSKSKPAAAGGRARGKNGRRRPR 875
Query: 364 TPDRKYKKS-HKSHKDSKDYYTPPSPDRSP 392
P+ + + K + +P P
Sbjct: 876 PPEPRARPGAAAPPKAAAAAPPAGAPAPRP 905
>gnl|CDD|223020 PHA03246, PHA03246, large tegument protein UL36; Provisional.
Length = 3095
Score = 34.2 bits (78), Expect = 0.15
Identities = 33/135 (24%), Positives = 51/135 (37%), Gaps = 15/135 (11%)
Query: 283 DRKVLVSGDNTPTSNASPNIKSPSR--------HNNHKRKSRSRSRTRSPVTSKSRSRSR 334
D LV G +T + + P ++ R H + + S V + S + S
Sbjct: 364 DPSTLVGGASTNINISDPPARTDCRRYSEGSVIHESVDSHIEDVTEATSVVAAWSDAFSD 423
Query: 335 SPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRS--- 391
+ H + A SK+ DR+ + S+ HK + +TPPS +
Sbjct: 424 ISEDYSHLTRPDLPATAHDVSKN--GHDTKSDRRSRGSNSRHKRRRPSWTPPSSSENVSS 481
Query: 392 --PYSSHSRSHSRKS 404
P S SR SRKS
Sbjct: 482 DGPTFSQSRKPSRKS 496
>gnl|CDD|165564 PHA03309, PHA03309, transcriptional regulator ICP4; Provisional.
Length = 2033
Score = 34.1 bits (77), Expect = 0.18
Identities = 25/74 (33%), Positives = 34/74 (45%), Gaps = 6/74 (8%)
Query: 298 ASPNIKSPSRHNNHKRKSRSRSRT-----RSPVTSKSRSRSRSPQPPKHKKSKKYSSRAR 352
A P + RH + +R ++ R RS +S S S RSP ++ + SSR R
Sbjct: 1537 APPGPELADRHADRRRSTKGPQRPGGKRPRSSSSSSSASHDRSPSSSSRRRDGRPSSRRR 1596
Query: 353 -SRSKSPRSRSRTP 365
SR S R SR P
Sbjct: 1597 PSRRMSARPPSRPP 1610
>gnl|CDD|223046 PHA03328, PHA03328, nuclear egress lamina protein UL31;
Provisional.
Length = 316
Score = 33.1 bits (76), Expect = 0.20
Identities = 15/54 (27%), Positives = 22/54 (40%), Gaps = 1/54 (1%)
Query: 301 NIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSR 354
+ S R R+SR R S S+ RSR RS + + + + R R
Sbjct: 9 SSSSLRRSRRAARRSRRDGRVGSRGRSRYRSRRRSSRRS-STRRAELADTERDR 61
>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
This family consists of several hypothetical proteins
from Arabidopsis thaliana and Oryza sativa. The function
of this family is unknown.
Length = 564
Score = 32.9 bits (75), Expect = 0.34
Identities = 33/157 (21%), Positives = 49/157 (31%), Gaps = 12/157 (7%)
Query: 258 TRPKANTDELERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRS 317
+ + I+VIK++ S R+ + S S R + S
Sbjct: 116 VAADSLAFFSDAVIQVIKRKKASSAPRRGSWDSSSKSASIDSSPTVIGPRPRS---FSEL 172
Query: 318 RSRTRSPV-TSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR----SRTPDRKYKKS 372
R+P SRS +P P S RS S R R R
Sbjct: 173 NLTDRTPAKVRSSRSELGAPSPSGGTSCPSSSGGRRSSIGSRRLRGSASLRKKVAVLSAP 232
Query: 373 HKSHKDSKDYYTPP----SPDRSPYSSHSRSHSRKSS 405
K S D + P S +SP+ S + + K+
Sbjct: 233 RKPGSRSSDCKSSPRARSSSAKSPFKSSIQRKATKAL 269
Score = 30.6 bits (69), Expect = 1.8
Identities = 17/82 (20%), Positives = 28/82 (34%), Gaps = 1/82 (1%)
Query: 284 RKVLVSGDN-TPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHK 342
+KV V P S +S SP ++ + S R + S+ R+ K
Sbjct: 224 KKVAVLSAPRKPGSRSSDCKSSPRARSSSAKSPFKSSIQRKATKALSKLSLRASPKDTSK 283
Query: 343 KSKKYSSRARSRSKSPRSRSRT 364
SK + + S S+
Sbjct: 284 SSKSEVAPPKKSEAKVPSSSKK 305
Score = 29.8 bits (67), Expect = 3.1
Identities = 20/90 (22%), Positives = 30/90 (33%), Gaps = 5/90 (5%)
Query: 312 KRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKK 371
++K S R P + S +S KS SS R +K+ S
Sbjct: 223 RKKVAVLSAPRKPGSRSSDCKSSPRARSSSAKSPFKSSIQRKATKALSKLSLR-----AS 277
Query: 372 SHKSHKDSKDYYTPPSPDRSPYSSHSRSHS 401
+ K SK PP + S S+ +
Sbjct: 278 PKDTSKSSKSEVAPPKKSEAKVPSSSKKWT 307
>gnl|CDD|151665 pfam11223, DUF3020, Protein of unknown function (DUF3020). This
family of fungal proteins is conserved towards the
C-terminus of HMG domains. The function is not known.
Length = 49
Score = 29.1 bits (65), Expect = 0.48
Identities = 13/44 (29%), Positives = 18/44 (40%)
Query: 340 KHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYY 383
++ KK+ R+K RSR + KK K KD Y
Sbjct: 1 NRERKKKWREANSERNKDNDLRSRVKKKAKKKFGKEDSKEKDAY 44
>gnl|CDD|215138 PLN02248, PLN02248, cellulose synthase-like protein.
Length = 1135
Score = 32.3 bits (74), Expect = 0.57
Identities = 23/89 (25%), Positives = 34/89 (38%), Gaps = 12/89 (13%)
Query: 308 HNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDR 367
++ + SR + S + S SPQ K AR R+ S R S + D
Sbjct: 2 ASSSSKPSRKSLSSSSSSAGPPSNNSSSPQSVKF---------AR-RTSSGRYVSLSRDD 51
Query: 368 KYKKSHKSHKDSKDY--YTPPSPDRSPYS 394
S D +Y + PP+PD P +
Sbjct: 52 LDLSGELSSSDYLNYTVHIPPTPDNQPMA 80
>gnl|CDD|224323 COG1405, SUA7, Transcription initiation factor TFIIIB, Brf1
subunit/Transcription initiation factor TFIIB
[Transcription].
Length = 285
Score = 31.5 bits (72), Expect = 0.59
Identities = 30/165 (18%), Positives = 58/165 (35%), Gaps = 24/165 (14%)
Query: 66 LKLPQVAMATGQVLFQRFYYSKSFVR-HPMETTAMGCVCLASKIEEAPRRIRDVINVFHH 124
L LP+ T ++ R K +R +E+ A C+ A +I PR + ++
Sbjct: 111 LGLPESVRETAARIY-RKAVDKGLLRGRSIESVAAACIYAACRINGVPRTLDEIAKALG- 168
Query: 125 IRQVMNQKSITPMLLTTQYMTLKTQVIKAERRVLKELGFCVHVKHPHKIIVTYLQVLGCE 184
K ++ + R +++EL + P I + LG
Sbjct: 169 --------------------VSKKEIGRTYRLLVRELKLKIPPVDPSDYIPRFASKLGLS 208
Query: 185 KNQKLMQLANYMNDSLRTDVFVRYDPETIASACIYLTARKLRIPL 229
++ + + + R + P +A+A IYL + L
Sbjct: 209 -DEVRRKAIEIVKKAKRAGLTAGKSPAGLAAAAIYLASLLLGERR 252
>gnl|CDD|220815 pfam10577, UPF0560, Uncharacterized protein family UPF0560. This
family of proteins has no known function.
Length = 805
Score = 31.8 bits (72), Expect = 0.68
Identities = 21/144 (14%), Positives = 44/144 (30%), Gaps = 8/144 (5%)
Query: 266 ELERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPV 325
+L ++V K++ S L+S S+A + SR + + +
Sbjct: 313 QLSSALDVSKRDQATSMSHINLISTHLEMVSSAGEADMHTPMLKSAFSSSRDFTSSEELL 372
Query: 326 TSKSRSRSRSPQPPKH--KKSKKYSSRARSRSKSPRSRSRT------PDRKYKKSHKSHK 377
K+ +S+ PQ + K+ + + SK R
Sbjct: 373 AHKAEDKSQLPQSGESFPLKASRSGEQKEEYSKLETEEYRRGYGTVESSSLENHRDSFGL 432
Query: 378 DSKDYYTPPSPDRSPYSSHSRSHS 401
++++Y+ P S
Sbjct: 433 ANQNHYSAPPTVSIQPPSGPVPPK 456
>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19.
Med19 represents a family of conserved proteins which
are members of the multi-protein co-activator Mediator
complex. Mediator is required for activation of RNA
polymerase II transcription by DNA binding
transactivators.
Length = 178
Score = 31.0 bits (70), Expect = 0.71
Identities = 20/77 (25%), Positives = 32/77 (41%), Gaps = 9/77 (11%)
Query: 302 IKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR 361
I+ P + + HK K + R++ P + S S KHKK K + R + K
Sbjct: 106 IQPPKKKHKHKHK-KHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKK---- 160
Query: 362 SRTPDRKYKKSHKSHKD 378
++K KK S +
Sbjct: 161 ----EKKKKKKRHSPEH 173
>gnl|CDD|216191 pfam00919, UPF0004, Uncharacterized protein family UPF0004. This
family is the N terminal half of the Prosite family. The
C-terminal half has been shown to be related to MiaB
proteins. This domain is nearly always found in
conjunction with pfam04055 and pfam01938 although its
function is uncertain.
Length = 98
Score = 29.8 bits (68), Expect = 0.74
Identities = 10/32 (31%), Positives = 16/32 (50%), Gaps = 4/32 (12%)
Query: 152 KAERRVLKELGFCVHVKHPHKIIVTYLQVLGC 183
KAE++ + + +K+P IV V GC
Sbjct: 50 KAEQKSRQTIRRLKRLKNPDAKIV----VTGC 77
>gnl|CDD|114474 pfam05750, Rubella_Capsid, Rubella capsid protein. Rubella virus
is an enveloped positive-strand RNA virus of the family
Togaviridae. Virions are composed of three structural
proteins: a capsid and two membrane-spanning
glycoproteins, E2 and E1. During virus assembly, the
capsid interacts with genomic RNA to form nucleocapsids.
It has been discovered that capsid phosphorylation
serves to negatively regulate binding of viral genomic
RNA. This may delay the initiation of nucleocapsid
assembly until sufficient amounts of virus glycoproteins
accumulate at the budding site and/or prevent
non-specific binding to cellular RNA when levels of
genomic RNA are low. It follows that at a late stage in
replication, the capsid may undergo dephosphorylation
before nucleocapsid assembly occurs.
Length = 300
Score = 31.0 bits (69), Expect = 1.1
Identities = 21/96 (21%), Positives = 39/96 (40%), Gaps = 7/96 (7%)
Query: 304 SPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSR 363
+P + ++ ++SR + S+SR P+PP+ + S + S PR R
Sbjct: 5 TPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRDSSTSGDDSGRDSGGPRRRRG 64
Query: 364 TPDRKYKKS-------HKSHKDSKDYYTPPSPDRSP 392
R +K + ++S+ P P R+P
Sbjct: 65 NRGRGQRKDWSRAPPPPEERQESRSQTPAPKPSRAP 100
Score = 29.8 bits (66), Expect = 2.3
Identities = 17/77 (22%), Positives = 35/77 (45%), Gaps = 9/77 (11%)
Query: 282 KDRKVLVSGDNTPTSNASPNIKSPSRHNNHKR---------KSRSRSRTRSPVTSKSRSR 332
+ R SGD++ + P + +R ++ + R SR+++P SR+
Sbjct: 41 RQRDSSTSGDDSGRDSGGPRRRRGNRGRGQRKDWSRAPPPPEERQESRSQTPAPKPSRAP 100
Query: 333 SRSPQPPKHKKSKKYSS 349
+ PQPP+ + + S+
Sbjct: 101 PQQPQPPRMQTGRGGSA 117
>gnl|CDD|185588 PTZ00385, PTZ00385, lysyl-tRNA synthetase; Provisional.
Length = 659
Score = 31.2 bits (70), Expect = 1.1
Identities = 18/83 (21%), Positives = 33/83 (39%), Gaps = 7/83 (8%)
Query: 296 SNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRS 355
+ A +K P R+ S S +RSP+ K + S +++ S RS+
Sbjct: 18 TAARQAVKGPLLPGLQLRQVASLSSSRSPLELK---KPISKASATKTVTQEASRAPRSKL 74
Query: 356 KSPRS----RSRTPDRKYKKSHK 374
P + R TP + ++ +
Sbjct: 75 DLPAAYSSFRGITPISEVRERYG 97
>gnl|CDD|222914 PHA02666, PHA02666, hypothetical protein; Provisional.
Length = 287
Score = 30.7 bits (68), Expect = 1.2
Identities = 22/105 (20%), Positives = 37/105 (35%), Gaps = 1/105 (0%)
Query: 301 NIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRS 360
N R + R+ RS RT + +S + + +P + S K SSR+ S S
Sbjct: 36 NSMESRRKSRPSRQHRSAERTPTTASSLTHENNTAPSRHGKQHSCKASSRSSHNRGSTSS 95
Query: 361 RSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
S +H+ SK + P S + + +
Sbjct: 96 -SHNHHAHRGPHQSAHRRSKHDAVRDTYQPCPQSPETDLYKGRLP 139
Score = 28.4 bits (62), Expect = 6.9
Identities = 25/112 (22%), Positives = 46/112 (41%), Gaps = 6/112 (5%)
Query: 277 EYQLSKDRKVLVSGDNTPTSNASP---NIKSPSRH---NNHKRKSRSRSRTRSPVTSKSR 330
+ S+ + S + TPT+ +S N +PSRH ++ K SRS S +S +
Sbjct: 40 SRRKSRPSRQHRSAERTPTTASSLTHENNTAPSRHGKQHSCKASSRSSHNRGSTSSSHNH 99
Query: 331 SRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDY 382
R P H++SK + R + + + + H ++ D+
Sbjct: 100 HAHRGPHQSAHRRSKHDAVRDTYQPCPQSPETDLYKGRLPGETERHYETPDH 151
>gnl|CDD|218941 pfam06217, GAGA_bind, GAGA binding protein-like family. This
family includes gbp a protein from Soybean that binds to
GAGA element dinucleotide repeat DNA. It seems likely
that this domain mediates DNA binding. This putative
domain contains several conserved cysteines and a
histidine suggesting this may be a zinc-binding DNA
interaction domain.
Length = 301
Score = 30.6 bits (69), Expect = 1.5
Identities = 15/44 (34%), Positives = 19/44 (43%), Gaps = 1/44 (2%)
Query: 326 TSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRT-PDRK 368
K + P+ PK KK KK S + K P R+ PD K
Sbjct: 149 KPKKGQSPKVPKAPKPKKPKKKGSVSNRSVKMPGIDPRSKPDWK 192
>gnl|CDD|222011 pfam13257, DUF4048, Domain of unknown function (DUF4048). This
presumed domain is functionally uncharacterized. This
domain family is found in eukaryotes, and is typically
between 228 and 257 amino acids in length.
Length = 242
Score = 30.1 bits (68), Expect = 1.5
Identities = 31/123 (25%), Positives = 45/123 (36%), Gaps = 26/123 (21%)
Query: 303 KSPSRHNNHKRKSRSRSRTRSPVT--SKSRSRSRSPQPPKHKKSKKYSSRARS------- 353
++ + + SRS SR+R + S S SRS K S K S A
Sbjct: 120 RTVPPPRSRRSGSRSTSRSRLRLQGGSLSSSRSSRSSTSKGATSGKDSKSADIDVSFWSE 179
Query: 354 --------RSKSPRSRSRTPDRKYKKSHKSHK---------DSKDYYTPPSPDRSPYSSH 396
+SKSP+ S TP + + D+ D + P P ++ S
Sbjct: 180 FGIDTPGQKSKSPQKASSTPAGNTNQGQSQNAQSSNLLDVDDNWDDWDTPQPKKTHTPSS 239
Query: 397 SRS 399
SRS
Sbjct: 240 SRS 242
>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein. This entry is a highly
conserved protein present in eukaryotes.
Length = 680
Score = 30.7 bits (69), Expect = 1.8
Identities = 30/104 (28%), Positives = 44/104 (42%), Gaps = 5/104 (4%)
Query: 288 VSGDNTPTSNASPNIKSPSRHNNHKRKSR-SRSRTRSPV--TSKSRSRSRSPQPPKHKKS 344
+ +N + +A +++ NH KSR S S T TS S S + K KS
Sbjct: 273 IGINNHHSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAAGSIGSKSSKS 332
Query: 345 KKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSP 388
K+S+R +S S SP+S S S + +SK S
Sbjct: 333 AKHSNRNKSNS-SPKSHSSANGSVPSSSVSDN-ESKQKRASKSS 374
>gnl|CDD|189000 cd08662, M13, Peptidase family M13 includes neprilysin,
endothelin-converting enzyme I. M13 family of
metallopeptidases includes neprilysin (neutral
endopeptidase, NEP, enkephalinase, CD10, CALLA, EC
3.4.24.11), endothelin-converting enzyme I (ECE-1, EC
3.4.24.71), erythrocyte surface antigen KELL (ECE-3),
phosphate-regulating gene on the X chromosome (PHEX),
soluble secreted endopeptidase (SEP), and damage-induced
neuronal endopeptidase (DINE)/X-converting enzyme (XCE).
These proteins consist of a short N-terminal cytoplasmic
domain, a single transmembrane helix, and a larger
C-terminal extracellular domain containing the active
site. Proteins in this family fulfill a broad range of
physiological roles due to the greater variation in the
S2' subsite allowing substrate specificity. NEP is
expressed in a variety of tissues including kidney and
brain, and is involved in many physiological and
pathological processes, including blood pressure and
inflammatory response. It degrades a wide array of
substrates such as substance P, enkephalins,
cholecystokinin, neurotensin and somatostatin. It is an
important enzyme in the regulation of amyloid-beta
(Abeta) protein that forms amyloid plaques that are
associated with Alzheimer's disease (AD). ECE-1 catalyzes
the final rate-limiting step in the biosynthesis of
endothelins via post-translational conversion of the
biologically inactive big endothelins. Like NEP, it also
hydrolyses bradykinin, substance P, neurotensin and
Abeta. Endothelin-1 overproduction has been implicated
in various diseases, including stroke, asthma,
hypertension, and cardiac and renal failure. Kell is a
homolog of NEP and constitutes a major antigen on human
erythrocytes; it preferentially cleaves big endothelin-3
to produce bioactive endothelin-3, but is also known to
cleave substance P and neurokinin A. PHEX forms a
complex interaction with fibroblast growth factor 23
(FGF23) and matrix extracellular phosphoglycoprotein,
causing bone mineralization. A loss-of-function mutation
in PHEX disrupts this interaction leading to
hypophosphatemic rickets; X-linked hypophosphatemic
(XLH) rickets is the most common form of metabolic
rickets. ECEL1 is a brain metalloprotease involved in
the critical role in the nervous regulation of the
respiratory system, while DINE (damage induced neuronal
endopeptidase) is abundantly expressed in the
hypothalamus and its expression responds to nerve injury
as well. Thus, majority of these M13 proteases are prime
therapeutic targets for selective inhibition.
Length = 611
Score = 30.3 bits (69), Expect = 2.1
Identities = 12/44 (27%), Positives = 21/44 (47%)
Query: 228 PLPRNPAWYSLFHVLESDIQDVCKRILRLYTRPKANTDELERQI 271
P+P + + Y F L +++ K IL KA+ E++I
Sbjct: 20 PIPADKSSYGSFSELREKVEERLKEILEEAAAEKASDSSAEQKI 63
>gnl|CDD|219916 pfam08580, KAR9, Yeast cortical protein KAR9. The KAR9 protein in
Saccharomyces cerevisiae is a cytoskeletal protein
required for karyogamy, correct positioning of the
mitotic spindle and for orientation of cytoplasmic
microtubules. KAR9 localises at the shmoo tip in mating
cells and at the tip of the growing bud in anaphase.
Length = 626
Score = 30.2 bits (68), Expect = 2.2
Identities = 26/159 (16%), Positives = 44/159 (27%), Gaps = 21/159 (13%)
Query: 268 ERQIEVIKKEYQLSKDRKVLVSGDNTPTSN-ASPNIKSPSRHN-NHKRKSRS-------R 318
+ ++ D S +TP+S+ +S I +P SR
Sbjct: 371 SKIQQIRDSISVSGSDYSNPGSSIDTPSSSPSSSVIMTPPDSGPGSNVSSRRVGTPGSKS 430
Query: 319 SRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKD 378
R + + + + P K S S + + R P K H
Sbjct: 431 DRVGAVLLRRMNIKPTLASIPDEKPSNISVFEDSETSPNSSTLLRDPPPKKCGEESGHLP 490
Query: 379 SKDYY------------TPPSPDRSPYSSHSRSHSRKSS 405
+ ++ P + SR SR SS
Sbjct: 491 NNPFFNKLKLTLSSIPPLSPRQSIITLPTPSRPASRISS 529
Score = 28.3 bits (63), Expect = 8.8
Identities = 25/127 (19%), Positives = 36/127 (28%), Gaps = 9/127 (7%)
Query: 283 DRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHK 342
++ L P S I P+ R S R S S P K
Sbjct: 496 NKLKLTLSSIPPLSPRQSIITLPTPSRPASRISSLSLRLGSYSG-SIVSPPPYPTLVSRK 554
Query: 343 KSKKYS-SRARSRSKSPRSR------SRTPDRKYKKS-HKSHKDSKDYYTPPSPDRSPYS 394
+ S +R+ S + R +R P +K S + S +P P S
Sbjct: 555 GAAGLSFNRSVSDIEGERIGRYNLLPTRIPALPFKAESTTSSRRSSSLPSPTGVIGFPGS 614
Query: 395 SHSRSHS 401
H
Sbjct: 615 VPRFDHE 621
>gnl|CDD|217835 pfam03999, MAP65_ASE1, Microtubule associated protein (MAP65/ASE1
family).
Length = 619
Score = 30.2 bits (68), Expect = 2.3
Identities = 22/125 (17%), Positives = 36/125 (28%), Gaps = 11/125 (8%)
Query: 292 NTPTSNASPNIKSPSRH-----------NNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPK 340
NTP+ +PN+ S N HK + R T + + SRS +
Sbjct: 484 NTPSLKRTPNLTKSSLSQEASLISKSTGNTHKHSTPRRLTTLPKLPAASRSSKGNLIRSG 543
Query: 341 HKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSH 400
+ + S P + H +K ++ PS + RS
Sbjct: 544 ANGNASSDLSSPGSINSKSPEHSVPLVRVFDIHLRASTTKGRHSTPSTNEKKKRLLKRSP 603
Query: 401 SRKSS 405
Sbjct: 604 LSPPK 608
>gnl|CDD|227367 COG5034, TNG2, Chromatin remodeling protein, contains PhD zinc
finger [Chromatin structure and dynamics].
Length = 271
Score = 29.9 bits (67), Expect = 2.5
Identities = 15/80 (18%), Positives = 28/80 (35%), Gaps = 1/80 (1%)
Query: 261 KANTDELERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSR-S 319
K +++ +IE + + + + S SRH K++
Sbjct: 109 KRPHEKVAARIENCHDAVSRLERNSYSSAARRSSGEHRSAASSQGSRHTKLKKRKNIHNL 168
Query: 320 RTRSPVTSKSRSRSRSPQPP 339
+ RSP S R S + + P
Sbjct: 169 KRRSPELSSKREVSFTLESP 188
>gnl|CDD|215814 pfam00242, DNA_pol_viral_N, DNA polymerase (viral) N-terminal
domain.
Length = 379
Score = 29.8 bits (67), Expect = 2.6
Identities = 18/110 (16%), Positives = 36/110 (32%), Gaps = 11/110 (10%)
Query: 299 SPNIKSPSRHNNHKRKSRSRSRTRSPVT-SKSRSRSRSPQPPKHKKSKKYSSR--ARSRS 355
S +S ++ K + RS + S +R P + SS S
Sbjct: 243 SQIQRSRLGLQANQGKLAHGQQGRSGSIRGRKHSTTRRPFG-----VEPSSSGVTTNRAS 297
Query: 356 KSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
S ++ R+ ++ S S+ + + S S S +++
Sbjct: 298 SSSSCFHQSAVRE--TAYSSLSTSERH-SSSGHAVELRSIPGGSVSSQNA 344
>gnl|CDD|109440 pfam00382, TFIIB, Transcription factor TFIIB repeat.
Length = 71
Score = 27.2 bits (61), Expect = 2.8
Identities = 8/19 (42%), Positives = 13/19 (68%)
Query: 210 PETIASACIYLTARKLRIP 228
PE+IA+AC+Y+ R +
Sbjct: 36 PESIAAACLYIACRLEEVK 54
>gnl|CDD|113413 pfam04642, DUF601, Protein of unknown function, DUF601. This
family represents a conserved region found in several
uncharacterized plant proteins.
Length = 311
Score = 29.6 bits (66), Expect = 2.8
Identities = 21/79 (26%), Positives = 33/79 (41%), Gaps = 8/79 (10%)
Query: 314 KSRSRSRTRSPVT-SKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKS 372
K R+ + + P S SR + K +S +R+ SK+ S+ + KK
Sbjct: 9 KLRAAEKAKQPQAEEDSGSRQKPSTLAG-KNPDAPTSESRTPSKATSSKDPSKRYADKKR 67
Query: 373 HKSHKDSKDYYTPPSPDRS 391
+S KD+ SP RS
Sbjct: 68 KQSEKDA------RSPPRS 80
>gnl|CDD|215556 PLN03064, PLN03064, alpha,alpha-trehalose-phosphate synthase
(UDP-forming); Provisional.
Length = 934
Score = 29.8 bits (67), Expect = 3.2
Identities = 19/72 (26%), Positives = 30/72 (41%), Gaps = 7/72 (9%)
Query: 324 PVTSKSRSRSRSPQPPKHKKSKKYSSR---ARSRSKSPRSRSRTPDRKYKKSHKSHKDSK 380
P S + +RSRSP K ++ S + +RS SK+ + + + KS +H S
Sbjct: 815 PSDSPAIARSRSPDGLKSSGDRRPSGKLPSSRSNSKNSQGKKQRSLLSSAKSGVNHAASH 874
Query: 381 DYYTPPSPDRSP 392
SP
Sbjct: 875 ----GSDRRPSP 882
Score = 29.4 bits (66), Expect = 4.7
Identities = 21/98 (21%), Positives = 36/98 (36%), Gaps = 9/98 (9%)
Query: 268 ERQIEVIKKEYQLSKDRKVLVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTS 327
E ++ S+ L S + S P+ +S S+ N+ +K RS + +
Sbjct: 811 EPELPSDSPAIARSRSPDGLKSSGDRRPSGKLPSSRSNSK-NSQGKKQRSLLSSAKSGVN 869
Query: 328 KSRSRSRSPQPPKHK--------KSKKYSSRARSRSKS 357
+ S +P K K + Y S A R +S
Sbjct: 870 HAASHGSDRRPSPEKIGWSVLDLKGENYFSCAVGRKRS 907
>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein. This family consists of
AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
retardation syndrome) nuclear proteins. These proteins
have been linked to human diseases such as acute
lymphoblastic leukaemia and mental retardation. The
family also contains a Drosophila AF4 protein homologue
Lilliputian which contains an AT-hook domain.
Lilliputian represents a novel pair-rule gene that acts
in cytoskeleton regulation, segmentation and
morphogenesis in Drosophila.
Length = 1154
Score = 29.9 bits (67), Expect = 3.2
Identities = 23/92 (25%), Positives = 35/92 (38%), Gaps = 6/92 (6%)
Query: 291 DNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSK-KYSS 349
+ + + + S + +N + + SR++ + S S S S P+H K
Sbjct: 770 EKSSSCSPSSSSSHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQ 829
Query: 350 RARSRSKSPRSRSRTPDRKYKKSHKSHKDSKD 381
S S P S S T K S KS SK
Sbjct: 830 EDTSSSSGPFSASST-----KSSSKSSSTSKH 856
Score = 28.7 bits (64), Expect = 6.9
Identities = 19/69 (27%), Positives = 26/69 (37%)
Query: 337 QPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSH 396
KK ++ +S SK R + +S K + K PS S + S
Sbjct: 728 SLSAPKKQTSKTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSSSCSPSSSSSHHHSS 787
Query: 397 SRSHSRKSS 405
S SRKSS
Sbjct: 788 SNKESRKSS 796
>gnl|CDD|206363 pfam14195, DUF4316, Domain of unknown function (DUF4316). This
domain is functionally uncharacterized. This domain is
found in bacteria, and is typically between 56 and 95
amino acids in length.
Length = 65
Score = 27.0 bits (60), Expect = 3.6
Identities = 9/37 (24%), Positives = 15/37 (40%)
Query: 336 PQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKS 372
P+ P +K S + + RSR P ++ K
Sbjct: 27 PEAPTVPPEEKPSILDKLKEDKKARRSRKPVKQKKHD 63
>gnl|CDD|201885 pfam01608, I_LWEQ, I/LWEQ domain. I/LWEQ domains bind to actin. It
has been shown that the I/LWEQ domains from mouse talin
and yeast Sla2p interact with F-actin. I/LWEQ domains
can be placed into four major groups based on sequence
similarity: (1) Metazoan talin; (2) Dictyostelium
TalA/TalB and SLA110; (3) metazoan Hip1p; and (4) yeast
Sla2p. The domain has four conserved blocks, the name of
the domain is derived from the initial conserved amino
acid of each of the four blocks.
Length = 194
Score = 28.7 bits (65), Expect = 3.8
Identities = 11/24 (45%), Positives = 18/24 (75%)
Query: 264 TDELERQIEVIKKEYQLSKDRKVL 287
T E+E+Q+E++K E +L + RK L
Sbjct: 160 TQEMEQQVEILKLENELEEARKRL 183
>gnl|CDD|233365 TIGR01347, sucB, 2-oxoglutarate dehydrogenase complex
dihydrolipoamide succinyltransferase (E2 component).
This model describes the TCA cycle 2-oxoglutarate system
E2 component, dihydrolipoamide succinyltransferase. It
is closely related to the pyruvate dehydrogenase E2
component, dihydrolipoamide acetyltransferase. The seed
for this model includes mitochondrial and Gram-negative
bacterial forms. Mycobacterial candidates are highly
derived, differ in having an extra copy of the
lipoyl-binding domain at the N-terminus. They score
below the trusted cutoff, but above the noise cutoff and
above all examples of dihydrolipoamide acetyltransferase
[Energy metabolism, TCA cycle].
Length = 403
Score = 29.3 bits (66), Expect = 3.9
Identities = 16/93 (17%), Positives = 33/93 (35%), Gaps = 10/93 (10%)
Query: 281 SKDRKVLVSGDNTPTSNASPNIKSPS-----RHNNHKRKSRSRSRTRSPVT-----SKSR 330
K+ S PT+ A+ SP+ + + + + VT K+
Sbjct: 91 EKEETPAASAAAAPTAAANRPSLSPAARRLAKEHGIDLSAVPGTGVTGRVTKEDIIKKTE 150
Query: 331 SRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSR 363
+ + + QP +K ++ R + +R R
Sbjct: 151 APASAQQPAPAAAAKAPANFTRPEERVKMTRLR 183
>gnl|CDD|223068 PHA03384, PHA03384, early DNA-binding protein E2A; Provisional.
Length = 445
Score = 29.3 bits (66), Expect = 3.9
Identities = 14/82 (17%), Positives = 21/82 (25%), Gaps = 3/82 (3%)
Query: 317 SRSRTRSPVTSKSRSRSRSPQP---PKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSH 373
R R S + S S S +P P KK ++ S + + Y
Sbjct: 1 MRGRGSSSDSPYSSDDSPSLEPPELPPKKKGRRRVSPVEEEEEEEEAEVVAVGFSYPPVR 60
Query: 374 KSHKDSKDYYTPPSPDRSPYSS 395
S P +
Sbjct: 61 ISRGKDGKRPVRPLKEEKDSEK 82
>gnl|CDD|99740 cd00616, AHBA_syn, 3-amino-5-hydroxybenzoic acid synthase family
(AHBA_syn). AHBA_syn family belongs to pyridoxal
phosphate (PLP)-dependent aspartate aminotransferase
superfamily (fold I). The members of this CD are
involved in various biosynthetic pathways for secondary
metabolites. Some well studied proteins in this CD are
AHBA_synthase, protein product of pleiotropic regulatory
gene degT, Arnb aminotransferase and pilin
glycosylation protein. The prototype of this family, the
AHBA_synthase, is a dimeric PLP dependent enzyme.
AHBA_syn is the terminal enzyme of
3-amino-5-hydroxybenzoic acid (AHBA) formation which is
involved in the biosynthesis of ansamycin antibiotics,
including rifamycin B. Some members of this CD are
involved in 4-amino-6-deoxy-monosaccharide D-perosamine
synthesis. Perosamine is an important element in the
glycosylation of several cell products, such as
antibiotics and lipopolysaccharides of gram-positive and
gram-negative bacteria. The pilin glycosylation protein
encoded by gene pglA, is a galactosyltransferase
involved in pilin glycosylation. Additionally, this CD
consists of ArnB (PmrH) aminotransferase, a
4-amino-4-deoxy-L-arabinose lipopolysaccharide-modifying
enzyme. This CD also consists of several predicted
pyridoxal phosphate-dependent enzymes apparently
involved in regulation of cell wall biogenesis. The
catalytic lysine which is present in all characterized
PLP dependent enzymes is replaced by histidine in some
members of this CD.
Length = 352
Score = 29.0 bits (66), Expect = 4.0
Identities = 20/81 (24%), Positives = 33/81 (40%), Gaps = 11/81 (13%)
Query: 206 VRYDPETIASA---CIYLTARKLR-----IPLPRNPAWYSLFHVLESDI---QDVCKRIL 254
+R DPE S L + PL P + L D+ +D+ +R+L
Sbjct: 272 IRLDPEAGESRDELIEALKEAGIETRVHYPPLHHQPPYKKLLGYPPGDLPNAEDLAERVL 331
Query: 255 RLYTRPKANTDELERQIEVIK 275
L P +E++R IE ++
Sbjct: 332 SLPLHPSLTEEEIDRVIEALR 352
>gnl|CDD|114270 pfam05539, Pneumo_att_G, Pneumovirinae attachment membrane
glycoprotein G.
Length = 408
Score = 29.2 bits (65), Expect = 4.2
Identities = 15/110 (13%), Positives = 36/110 (32%), Gaps = 7/110 (6%)
Query: 288 VSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPP----KHKK 343
+ + T++ + P+ + +S+ ++ T+ R S P
Sbjct: 170 TAVTTSKTTSWPTEVSHPTYPSQVTPQSQPATQGHQTATANQRLSSTEPVGTQGTTTSSN 229
Query: 344 SKKYS---SRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDR 390
+ + R S SP+ T + + + ++ TPP+
Sbjct: 230 PEPQTEPPPSQRGPSGSPQHPPSTTSQDQSTTGDGQEHTQRRKTPPATSN 279
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This
family consists of several bovine specific leukaemia
virus receptors which are thought to function as
transmembrane proteins, although their exact function is
unknown.
Length = 561
Score = 29.3 bits (65), Expect = 4.3
Identities = 15/106 (14%), Positives = 31/106 (29%), Gaps = 6/106 (5%)
Query: 303 KSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPP------KHKKSKKYSSRARSRSK 356
K + K K R + + + KS + P + ++ ++ + +
Sbjct: 201 KPKKKEKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPD 260
Query: 357 SPRSRSRTPDRKYKKSHKSHKDSKDYYTPPSPDRSPYSSHSRSHSR 402
S + + + K HK K + H R H
Sbjct: 261 SEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHHHRCHHS 306
>gnl|CDD|223025 PHA03253, PHA03253, UL35; Provisional.
Length = 609
Score = 29.1 bits (65), Expect = 4.5
Identities = 16/49 (32%), Positives = 21/49 (42%), Gaps = 2/49 (4%)
Query: 287 LVSGDNTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRS 335
S +P+ + S SP + S R RT P+ S SRS S S
Sbjct: 465 RSSSRASPSHSTSTIPYSPPQSGRSTPTSILRQRT--PIRSNSRSSSVS 511
>gnl|CDD|236394 PRK09169, PRK09169, hypothetical protein; Validated.
Length = 2316
Score = 29.3 bits (66), Expect = 5.1
Identities = 18/70 (25%), Positives = 28/70 (40%), Gaps = 4/70 (5%)
Query: 334 RSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHKSHKDSKDYYTP-PSPDRSP 392
R P HK+ + ++ A R PR R R D ++ ++ + P DR P
Sbjct: 1 RGPAHAPHKRRRDAAAPADPR---PRRRPRLGDAPAPRTARADSGATPRGRPRAGADREP 57
Query: 393 YSSHSRSHSR 402
S R + R
Sbjct: 58 TSEQLRDYER 67
>gnl|CDD|217373 pfam03115, Astro_capsid, Astrovirus capsid protein precursor. This
product is encoded by astrovirus ORF2, one of the three
astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein
undergoes an intracellular cleavage to form a 79kD
protein. Subsequently, extracellular trypsin cleavage
yields the three proteins forming the infectious virion.
Length = 787
Score = 29.0 bits (65), Expect = 5.7
Identities = 17/52 (32%), Positives = 26/52 (50%), Gaps = 6/52 (11%)
Query: 310 NHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSR 361
N+ +SRS+SR R S+SR R RS + S+ R R+K ++
Sbjct: 14 NNNGRSRSKSRAR----SQSRGRGRSVK--ITVNSRNKGRRQNGRNKYQSNQ 59
>gnl|CDD|240078 cd04727, pdxS, PdxS is a subunit of the pyridoxal 5'-phosphate
(PLP) synthase, an important enzyme in deoxyxylulose
5-phosphate (DXP)-independent pathway for de novo
biosynthesis of PLP, present in some eubacteria, in
archaea, fungi, plants, plasmodia, and some metazoa.
Together with PdxT, PdxS forms the PLP synthase, a
heteromeric glutamine amidotransferase (GATase), whereby
PdxT produces ammonia from glutamine and PdxS combines
ammonia with five- and three-carbon phosphosugars to
form PLP. PLP is the biologically active form of vitamin
B6, an essential cofactor in many biochemical processes.
PdxS subunits form two hexameric rings.
Length = 283
Score = 28.8 bits (65), Expect = 5.7
Identities = 13/52 (25%), Positives = 20/52 (38%), Gaps = 16/52 (30%)
Query: 102 VCLASKIEEAPRRIR---------------DVINVFHHIRQVMNQ-KSITPM 137
VC A + EA RRI +V+ H+R V + + + M
Sbjct: 116 VCGARNLGEALRRISEGAAMIRTKGEAGTGNVVEAVRHMRAVNGEIRKLQSM 167
>gnl|CDD|152169 pfam11733, NP1-WLL, Non-capsid protein NP1. This family is the
non-capsid protein NP1 of the ssDNA, Parvovirinae virus
Bocavirus of cattle and humans.
Length = 98
Score = 27.1 bits (59), Expect = 5.9
Identities = 27/90 (30%), Positives = 38/90 (42%), Gaps = 2/90 (2%)
Query: 298 ASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSP-QPPKHKKSKKYSSRARSRSK 356
+S N+K R K R R T+ RSRSRSP + + S Y +
Sbjct: 2 SSGNMKDKHRSYKRKGSPERGERKRHWQTTHHRSRSRSPIRHRGERGSGSYHQEHPIKHL 61
Query: 357 SPRSRSRTPDRKYKKSHKSHKDSKDYYTPP 386
S + S+T D K K+ +S K +T P
Sbjct: 62 SSCTASKTSD-KVLKTRESTSGKKTKHTIP 90
>gnl|CDD|219406 pfam07420, DUF1509, Protein of unknown function (DUF1509). This
family consists of several uncharacterized viral
proteins from the Marek's disease-like viruses. Members
of this family are typically around 400 residues in
length. The function of this family is unknown.
Length = 377
Score = 28.5 bits (63), Expect = 6.8
Identities = 14/48 (29%), Positives = 24/48 (50%)
Query: 305 PSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRAR 352
P + +R+ R R+R+ S S+SRS SR + + + S +R
Sbjct: 317 PPHSTSGERRGRRRNRSESRSRSRSRSGSRRYRRRRGRGVPGRRSESR 364
>gnl|CDD|218107 pfam04484, DUF566, Family of unknown function (DUF566). Family of
related proteins that is plant specific.
Length = 313
Score = 28.4 bits (63), Expect = 6.9
Identities = 20/74 (27%), Positives = 29/74 (39%)
Query: 293 TPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRAR 352
P+S+ + N S S K++ S SR R S SR S +S +
Sbjct: 47 PPSSSPARNTSSSSSFGLSKQRPSSLSRGRLSSRFVSPSRGSPSAAASLNGSLATASTSG 106
Query: 353 SRSKSPRSRSRTPD 366
S S S R+ + D
Sbjct: 107 SSSPSRSRRTTSSD 120
>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1). This family
consists of several mammalian dentin matrix protein 1
(DMP1) sequences. The dentin matrix acidic
phosphoprotein 1 (DMP1) gene has been mapped to human
chromosome 4q21. DMP1 is a bone and teeth specific
protein initially identified from mineralised dentin.
DMP1 is primarily localised in the nuclear compartment
of undifferentiated osteoblasts. In the nucleus, DMP1
acts as a transcriptional component for activation of
osteoblast-specific genes like osteocalcin. During the
early phase of osteoblast maturation, Ca(2+) surges into
the nucleus from the cytoplasm, triggering the
phosphorylation of DMP1 by a nuclear isoform of casein
kinase II. This phosphorylated DMP1 is then exported out
into the extracellular matrix, where it regulates
nucleation of hydroxyapatite. DMP1 is a unique molecule
that initiates osteoblast differentiation by
transcription in the nucleus and orchestrates
mineralised matrix formation extracellularly, at later
stages of osteoblast maturation. The DMP1 gene has been
found to be ectopically expressed in lung cancer
although the reason for this is unknown.
Length = 514
Score = 28.5 bits (63), Expect = 7.0
Identities = 23/91 (25%), Positives = 40/91 (43%), Gaps = 5/91 (5%)
Query: 315 SRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDRKYKKSHK 374
S S+S + + S S S + P+ + + SS+ +S S + SR+ + + ++ +
Sbjct: 390 SESQSTEEQADSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSR 449
Query: 375 SHKDSKDYYTPPSPDRSPYSSHSRSHSRKSS 405
S +D D S D S S S SS
Sbjct: 450 SEEDDSD-----SQDSSRSKEDSNSTESASS 475
>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
Length = 333
Score = 28.3 bits (63), Expect = 7.6
Identities = 12/65 (18%), Positives = 28/65 (43%), Gaps = 1/65 (1%)
Query: 317 SRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSS-RARSRSKSPRSRSRTPDRKYKKSHKS 375
++ S ++ S + K SKK ++ A S K+ ++ ++ + K + K+
Sbjct: 200 KKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKAKKTAKKA 259
Query: 376 HKDSK 380
K +
Sbjct: 260 LKKAA 264
>gnl|CDD|221960 pfam13178, DUF4005, Protein of unknown function (DUF4005). This is
a C-terminal region of plant IQ-containing putative
calmodulin-binding proteins.
Length = 105
Score = 26.7 bits (59), Expect = 8.1
Identities = 21/103 (20%), Positives = 37/103 (35%), Gaps = 19/103 (18%)
Query: 292 NTPTSNASPNIKSPSRHNNHKRKSRSRSRTRSPVT-----------SKSRSRSRSP---Q 337
NTP +S + KS ++ KS + + +K++ RS+S +
Sbjct: 1 NTPRLLSSSSSKSSRSSPSNPTKSERDDNSSTSSPSLPNYMAATESAKAKVRSQSAPRQR 60
Query: 338 PPKHKKSKKYSSRAR-----SRSKSPRSRSRTPDRKYKKSHKS 375
P ++ S+ R S S S S K + +S
Sbjct: 61 PETEERESGSSATKRLSLPVSSSSGGSSSSSPRTSGGKGALRS 103
>gnl|CDD|109943 pfam00906, Hepatitis_core, Hepatitis core antigen. The core
antigen of hepatitis viruses possesses a carboxyl
terminus rich in arginine. On this basis it was
predicted that the core antigen would bind DNA. There is
some experimental evidence to support this.
Length = 182
Score = 27.5 bits (61), Expect = 8.9
Identities = 15/49 (30%), Positives = 22/49 (44%), Gaps = 1/49 (2%)
Query: 290 GDNTPTSNASPN-IKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQ 337
N P + P I R + +R++ S R RS + RS+S S Q
Sbjct: 133 PPNAPILSTLPETIVVRRRGRSPRRRTPSPRRRRSQSPRRRRSQSPSSQ 181
>gnl|CDD|179769 PRK04180, PRK04180, pyridoxal biosynthesis lyase PdxS; Provisional.
Length = 293
Score = 27.8 bits (63), Expect = 9.1
Identities = 12/45 (26%), Positives = 18/45 (40%), Gaps = 15/45 (33%)
Query: 102 VCLASKIEEAPRRIR---------------DVINVFHHIRQVMNQ 131
VC A + EA RRI +V+ H+RQ+ +
Sbjct: 125 VCGARNLGEALRRIAEGAAMIRTKGEAGTGNVVEAVRHMRQINGE 169
>gnl|CDD|215397 PLN02744, PLN02744, dihydrolipoyllysine-residue acetyltransferase
component of pyruvate dehydrogenase complex.
Length = 539
Score = 28.3 bits (63), Expect = 9.1
Identities = 15/56 (26%), Positives = 27/56 (48%), Gaps = 3/56 (5%)
Query: 312 KRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRTPDR 367
K K S + +P K++ SP PPK ++ +K +S ++ P + + DR
Sbjct: 196 KFKDYKPSSSAAPAAPKAKP---SPPPPKEEEVEKPASSPEPKASKPSAPPSSGDR 248
>gnl|CDD|112562 pfam03753, HHV6-IE, Human herpesvirus 6 immediate early protein.
The proteins in this family are poorly characterized,
but an investigation has indicated that the immediate
early protein is required for the down-regulation of MHC
class I expression in dendritic cells. Human herpesvirus
6 immediate early protein is also referred to as U90.
Length = 993
Score = 28.5 bits (63), Expect = 9.3
Identities = 35/179 (19%), Positives = 61/179 (34%), Gaps = 14/179 (7%)
Query: 198 DSLRTDVFVRYDPETIASACIYLTARKLRIPLPRNPAWYSLFHVLES--DIQDVCKRILR 255
D+ F R D +T + A I + ++ LF + + D + + +
Sbjct: 523 DNQAGPTFSRTDKKTNSPAGILMERSIFNKDTQDKEQYFELFTMTDGTLDNPLISEMLSF 582
Query: 256 LYTRPKANTDELER-QIEVIKKEYQLSKDRKVLVSG-------DNTPTSNA--SPNIKSP 305
Y + E E + I Y S D + +NTP S + SP +P
Sbjct: 583 GYETDHSAPYESESDNNDEID--YIASVDSGNRTNNIHMNNTNENTPFSKSGKSPPEVTP 640
Query: 306 SRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSKKYSSRARSRSKSPRSRSRT 364
S+ + K + S R ++ ++ + K KK K S + S S
Sbjct: 641 SKTFYKRDKKKDISTNRKVKKRTAKRKTVGYKTDKSKKIKSDSLPTDTNVIVISSESED 699
>gnl|CDD|214607 smart00307, ILWEQ, I/LWEQ domain. Thought to possess an F-actin
binding function.
Length = 200
Score = 27.7 bits (62), Expect = 9.3
Identities = 12/47 (25%), Positives = 23/47 (48%)
Query: 241 VLESDIQDVCKRILRLYTRPKANTDELERQIEVIKKEYQLSKDRKVL 287
+ D + + + + T E+E+Q+E++K E +L RK L
Sbjct: 143 GMIFDEEQEEEEDFSKLSLHEGKTQEMEQQVEILKLENELEAARKKL 189
>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger. [Transport and
binding proteins, Cations and iron carrying compounds].
Length = 1096
Score = 28.4 bits (63), Expect = 10.0
Identities = 21/88 (23%), Positives = 36/88 (40%), Gaps = 10/88 (11%)
Query: 291 DNTPTS-----NASPNIKSPSRHNNHKRKSRSRSRTRSPVTSKSRSRSRSPQPPKHKKSK 345
++TP + N + R ++ K R ++ SP ++ + R +P P +
Sbjct: 142 EDTPATPSRALNHYISTSGRQRVKSYTPKPRGEVKSSSPTQTREKVRKYTPSP----LGR 197
Query: 346 KYSSRARSRSKS-PRSRSRTPDRKYKKS 372
+S A S + PRS TP K S
Sbjct: 198 MVNSYAPSTFMTMPRSHGITPRTTVKDS 225
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.316 0.130 0.377
Gapped
Lambda K H
0.267 0.0783 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 19,972,422
Number of extensions: 1882168
Number of successful extensions: 3136
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2717
Number of HSP's successfully gapped: 229
Length of query: 405
Length of database: 10,937,602
Length adjustment: 99
Effective length of query: 306
Effective length of database: 6,546,556
Effective search space: 2003246136
Effective search space used: 2003246136
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 60 (26.8 bits)
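
Note on the statistics footer: the effective lengths and search space reported above appear to follow from the query length, database size, and length adjustment by simple arithmetic, and the bit values in parentheses are consistent with the usual Karlin-Altschul conversion S' = (lambda*S - ln K) / ln 2. The sketch below is only a sanity check under those assumptions; the variable and function names are illustrative and are not part of RPS-BLAST.

```python
import math

# Values copied from the report footer above.
query_length = 405
db_letters = 10_937_602
db_sequences = 44_354
length_adjustment = 99

# Assumed relationships: the adjustment is subtracted once from the query
# and once per database sequence from the database length.
effective_query = query_length - length_adjustment
effective_db = db_letters - db_sequences * length_adjustment
search_space = effective_query * effective_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits using (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Gapped Lambda and K from the footer; S2 is the raw cutoff score 60.
s2_bits = bit_score(60, 0.267, 0.0783)

print(effective_query, effective_db, search_space)
print(round(s2_bits, 1))
```

Because Lambda and K are printed to only three significant figures, bit values recomputed this way can differ from the reported ones in the last digit.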