RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy11273
(730 letters)
>gnl|CDD|227640 COG5333, CCL1, Cdk activating kinase (CAK)/RNA polymerase II
transcription initiation/nucleotide excision repair
factor TFIIH/TFIIK, cyclin H subunit [Cell division and
chromosome partitioning / Transcription / DNA
replication, recombination, and repair].
Length = 297
Score = 82.9 bits (205), Expect = 2e-17
Identities = 48/177 (27%), Positives = 83/177 (46%), Gaps = 13/177 (7%)
Query: 35 EKELSCRQQAANLIQDMGQRLQVTQLCINTAIVYMHRFYVFHSFTQFHRNSIATAALFLA 94
EKEL+ LI D+ RL + Q + TAI++ RFY+ +S + S+ T ++LA
Sbjct: 39 EKELNLVIYYLKLIMDLCTRLNLPQTVLATAILFFSRFYLKNSVEEISLYSVVTTCVYLA 98
Query: 95 AKVEEQPRKLEHVIRVAQLCLFKNQPPLDPRSEAYQEQAQEIVVNENVLLQTLGFDVGIE 154
KVE+ PR I + S + I+ E LL+ L FD+ +
Sbjct: 99 CKVEDTPRD----ISIESFEARDLWSEEPKSSR------ERILEYEFELLEALDFDLHVH 148
Query: 155 HPHTYVVKCCHLVRA--SKDLAQTSYFMASNSLHLTTMCLQYRSTVVACFCIHLACK 209
HP+ Y+ ++ L Q ++ + +++L T +CL Y ++A + +AC+
Sbjct: 149 HPYKYLEGFLKDLQEKDKYKLLQIAWKIINDALR-TDLCLLYPPHIIALAALLIACE 204
>gnl|CDD|238003 cd00043, CYCLIN, Cyclin box fold. Protein binding domain
functioning in cell-cycle and transcription control.
Present in cyclins, TFIIB and Retinoblastoma (RB). The
cyclins consist of 8 classes of cell cycle regulators
that regulate cyclin dependent kinases (CDKs). TFIIB is
a transcription factor that binds the TATA box. Cyclins,
TFIIB and RB contain 2 copies of the domain.
Length = 88
Score = 74.6 bits (184), Expect = 2e-16
Identities = 20/72 (27%), Positives = 38/72 (52%)
Query: 41 RQQAANLIQDMGQRLQVTQLCINTAIVYMHRFYVFHSFTQFHRNSIATAALFLAAKVEEQ 100
R + ++ + + L ++ + A+ + RF + +S + +A AAL+LAAKVEE
Sbjct: 2 RPTPLDFLRRVAKALGLSPETLTLAVNLLDRFLLDYSVLGRSPSLVAAAALYLAAKVEEI 61
Query: 101 PRKLEHVIRVAQ 112
P L+ ++ V
Sbjct: 62 PPWLKDLVHVTG 73
>gnl|CDD|215740 pfam00134, Cyclin_N, Cyclin, N-terminal domain. Cyclins regulate
cyclin dependent kinases (CDKs). Human cyclin-O is a
Uracil-DNA glycosylase that is related to other cyclins.
Cyclins contain two domains of similar all-alpha fold,
of which this family corresponds with the N-terminal
domain.
Length = 127
Score = 74.9 bits (185), Expect = 5e-16
Identities = 22/138 (15%), Positives = 48/138 (34%), Gaps = 17/138 (12%)
Query: 16 YFTKEQLENTPSRKCGYDAEKELSCRQQAANLIQDMGQRLQVTQLCINTAIVYMHRFYVF 75
E+ + P + R + + ++ + ++ + A+ Y+ RF
Sbjct: 6 LRELEEEDRPPPDYLDQQPDINPKMRAILIDWLVEVHEEFKLLPETLYLAVNYLDRFLSK 65
Query: 76 HSFTQFHRNSIATAALFLAAKVEE-QPRKLEHVIRVAQLCLFKNQPPLDPRSEAYQEQAQ 134
+ + L +AAK EE P +E + + + +
Sbjct: 66 QPVPRTKLQLVGVTCLLIAAKYEEIYPPSVEDFVYITD-NAYT---------------KE 109
Query: 135 EIVVNENVLLQTLGFDVG 152
EI+ E ++L TL +D+
Sbjct: 110 EILRMELLILSTLNWDLS 127
>gnl|CDD|214641 smart00385, CYCLIN, domain present in cyclins, TFIIB and
Retinoblastoma. A helical domain present in cyclins and
TFIIB (twice) and Retinoblastoma (once). A protein
recognition domain functioning in cell-cycle and
transcription control.
Length = 83
Score = 67.6 bits (166), Expect = 5e-14
Identities = 17/67 (25%), Positives = 35/67 (52%)
Query: 46 NLIQDMGQRLQVTQLCINTAIVYMHRFYVFHSFTQFHRNSIATAALFLAAKVEEQPRKLE 105
+ ++ + + L + +N A+ + RF + F ++ + IA AAL+LA+K EE P +
Sbjct: 1 DFLRRVCKALNLDPETLNLAVNLLDRFLSDYKFLKYSPSLIAAAALYLASKTEETPPWTK 60
Query: 106 HVIRVAQ 112
++
Sbjct: 61 ELVHYTG 67
>gnl|CDD|129660 TIGR00569, ccl1, cyclin ccl1. All proteins in this family for
which functions are known are cyclins that are
components of TFIIH, a complex that is involved in
nucleotide excision repair and transcription initiation.
This family is based on the phylogenomic analysis of JA
Eisen (1999, Ph.D. Thesis, Stanford University) [DNA
metabolism, DNA replication, recombination, and repair].
Length = 305
Score = 45.6 bits (108), Expect = 5e-05
Identities = 31/124 (25%), Positives = 54/124 (43%), Gaps = 16/124 (12%)
Query: 35 EKELSCRQQAANLIQDMGQRLQVT--QLCINTAIVYMHRFYVFHSFTQFHRNSIATAALF 92
E+EL + + D + T + TAI+Y RFY+ +S ++H I +F
Sbjct: 50 EEELDLVKYYEKRLLDFCSAFKPTMPTSVVGTAIMYFKRFYLNNSVMEYHPKIIMLTCVF 109
Query: 93 LAAKVEEQPRKLEHVIRVAQLCLFKNQPPLDPRSEAYQEQAQEIVVNENVLLQTLGFDVG 152
LA KVE E + + Q N E + ++++ E +L+Q L F +
Sbjct: 110 LACKVE------EFNVSIDQFV--GNLK------ETPLKALEQVLEYELLLIQQLNFHLI 155
Query: 153 IEHP 156
+ +P
Sbjct: 156 VHNP 159
>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1. Members
of this family are necessary for accurate chromosome
transmission during cell division.
Length = 804
Score = 45.1 bits (107), Expect = 1e-04
Identities = 35/179 (19%), Positives = 51/179 (28%), Gaps = 11/179 (6%)
Query: 315 YRKLMAGGRDMNSRSSTSSTAVPINSMPSANT-NKPPAHVFQTSSSSRVPPPPPPHHHHS 373
+ S +ST P S A+T + + S P P P +
Sbjct: 80 NAPGAPSVGPDSDLSQKTSTFSPCQSGYEASTDPEYIPDLQPDPSLWGTAPKPEPQPPQA 139
Query: 374 SAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPIN 433
P+ +T M + EA L+Q Q P +P + P + P + P
Sbjct: 140 PESQPQPQTPAQKM-LSLEEVEAQLQQRQQAPQLP----QPPQQVLPQGMPPRQAAFPQQ 194
Query: 434 -----PRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSD 487
P PQ P + P P PP P + P S
Sbjct: 195 GPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQ 253
Score = 38.2 bits (89), Expect = 0.018
Identities = 26/133 (19%), Positives = 41/133 (30%), Gaps = 6/133 (4%)
Query: 362 VPPPPPPHHHHSSAHVPKIKTE-HPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEP 420
+P PP P + +P + + P Q P P +
Sbjct: 180 LPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQ 239
Query: 421 HSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPP-PPYMSAAK 479
L + ++ +M PQ P + ++ P P P P P P A
Sbjct: 240 PPPLQQPQFPGLSQQMPPPPPQP----PQQQQQPPQPQAQPPPQNQPTPHPGLPQGQNAP 295
Query: 480 LPPPSHSDVITNV 492
LPPP ++ V
Sbjct: 296 LPPPQQPQLLPLV 308
Score = 37.8 bits (88), Expect = 0.022
Identities = 32/219 (14%), Positives = 44/219 (20%), Gaps = 29/219 (13%)
Query: 289 FAPPHSTSGRVTD----DKRRSEHNGPPPEYRKLMAGGRDMNSRSSTSSTAVPINS---M 341
F + +K + S S+ I
Sbjct: 64 FGGETAKLSAAVRYNQNAPGAPSVGPDSDLSQK-TSTFSPCQSGYEASTDPEYIPDLQPD 122
Query: 342 PSANTNKPPAHVFQTSSSSRVPPPPPPH--------------HHHSSAHVPKIKTEHPP- 386
PS P + P P P + +P+ + P
Sbjct: 123 PSLWGTAPKPEPQPPQAPESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQ 182
Query: 387 MKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPV 446
P P Q + +P LP P PQ
Sbjct: 183 GMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLP---APSQAPAQPPLPPQLPQQ 239
Query: 447 IPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSH 485
P P PPPPP PP P +
Sbjct: 240 PPP---LQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQ 275
Score = 31.3 bits (71), Expect = 2.3
Identities = 24/162 (14%), Positives = 38/162 (23%), Gaps = 10/162 (6%)
Query: 337 PINSMPSANTNKPPAHVFQTSSSSRVP---PPPPPHHHHSSAHVPKIKTEHPPMKVEPVM 393
P +P P F P P PP H + P
Sbjct: 176 PQQVLP--QGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLP--APSQAPAQPP 231
Query: 394 KEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRK 453
L Q P Q+P P P ++ +P P+ P +
Sbjct: 232 LPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQG 291
Query: 454 NGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKE 495
P P P + P + + + ++
Sbjct: 292 Q---NAPLPPPQQPQLLPLVQQPQGQQRGPQFREQLVQLSQQ 330
>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
Length = 3151
Score = 44.9 bits (106), Expect = 2e-04
Identities = 57/306 (18%), Positives = 78/306 (25%), Gaps = 31/306 (10%)
Query: 291 PPHSTSGRVTDDKRRSEHNGPPPEY-RKLMAGGRDMNSRSSTSSTAVPINSMPSANTNKP 349
PP S S + PPPE R A GR R + S P +
Sbjct: 2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRR 2686
Query: 350 --PAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKT---------EHPPMKVEPVMKEALL 398
V +S + PPPPP A V P + P
Sbjct: 2687 AARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAV-- 2744
Query: 399 KQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKP-----QSKPVIPHEVRK 453
P P P + P + P PR + +S+ +P
Sbjct: 2745 PAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDP 2804
Query: 454 NGHDLPVPKPHP---------PPPPPPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEK 504
V P P PPP A PPP + V + +
Sbjct: 2805 ADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRR 2864
Query: 505 ---SSILNHKESSISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKTPFKMKTPTPPSFSP 561
S K ++ + PP ++ P + P P P P P
Sbjct: 2865 RPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPP 2924
Query: 562 IKISPS 567
P
Sbjct: 2925 PPPQPQ 2930
Score = 44.2 bits (104), Expect = 3e-04
Identities = 45/220 (20%), Positives = 64/220 (29%), Gaps = 28/220 (12%)
Query: 285 GNATFAPPHSTSGRVTDDKRRSEHNGPPPEYRKLMAGGRDMNSRSSTSSTAVPINSMPSA 344
G A A P +T+G + GPP R + S S ++P P+
Sbjct: 2753 GPARPARPPTTAGPPAPAPPAAPAAGPPR------RLTRPAVASLSESRESLPSPWDPAD 2806
Query: 345 NTN--KPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVE----------PV 392
PA ++S P PPP S+ PP V
Sbjct: 2807 PPAAVLAPAAALPPAASPAGPLPPPT----SAQPTAPPPPPGPPPPSLPLGGSVAPGGDV 2862
Query: 393 MKEALLKQP---PLIKQEPNVKQ--EPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVI 447
+ + P P P V++ P + S L +P P P +P
Sbjct: 2863 RRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESF-ALPPDQPERPPQPQAPPPPQPQP 2921
Query: 448 PHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSD 487
P P P P PP P + A P +
Sbjct: 2922 QPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQ 2961
Score = 38.0 bits (88), Expect = 0.022
Identities = 45/188 (23%), Positives = 63/188 (33%), Gaps = 32/188 (17%)
Query: 310 GPPPEYRKL---MAGGRDMNSRSSTSSTAVPINSMPSANTNKPPAHVFQTSSSSRVPPPP 366
GPPP L +A G D+ R + S A + P+A +PP R+ P
Sbjct: 2844 GPPPPSLPLGGSVAPGGDVRRRPPSRSPA----AKPAAPA-RPPV--------RRLARPA 2890
Query: 367 PPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPL 426
S A +P + E PP P PP + +P +P P P
Sbjct: 2891 VSRSTESFA-LPPDQPERPPQPQAP--------PPPQPQPQPPPPPQPQPPPPP----PP 2937
Query: 427 RKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLP--VPKPHPPPPPPPPPYMSAAKLPPPS 484
R P+ P +P G +P V P P P P + A PP
Sbjct: 2938 RPQPPLAPTTD-PAGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSREAPASSTPPL 2996
Query: 485 HSDVITNV 492
++ V
Sbjct: 2997 TGHSLSRV 3004
Score = 37.6 bits (87), Expect = 0.030
Identities = 48/296 (16%), Positives = 65/296 (21%), Gaps = 38/296 (12%)
Query: 290 APPHSTSGRVTDDKRRSEHNGPPPEYRKLMAGGRDMNSRSSTSSTAVPINSMPSANTNKP 349
APP S R D R G P + S P P P
Sbjct: 2591 APPQSARPRAPVDDR-----GDPR--------------GPAPPSPLPPDTHAPDPPPPSP 2631
Query: 350 PAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPN 409
+ PPP A P +V + L + P
Sbjct: 2632 SPAANEPDPHPPPTVPPPERPRDDPA----------PGRVSRPRRARRLGRAAQASSPPQ 2681
Query: 410 VKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPP 469
+ + SL L P P P P P P
Sbjct: 2682 RPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAP 2741
Query: 470 PPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEKSSILNHKESSISHPPHLSMSIPQTK 529
P P A P + T + ++ P S+S +
Sbjct: 2742 PAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAG----PPRRLTRPAVASLSESRES 2797
Query: 530 APSIFSPEKNTSPVINKTPFKMKT-----PTPPSFSPIKISPSKSSEGLKAKLEPE 580
PS + P + V+ P PP S +P L
Sbjct: 2798 LPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853
Score = 35.3 bits (81), Expect = 0.14
Identities = 30/158 (18%), Positives = 41/158 (25%), Gaps = 3/158 (1%)
Query: 328 RSSTSSTAVPINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPM 387
RS P S P+ + S+ R P + P H P
Sbjct: 2566 RSVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPD 2625
Query: 388 KVEPVMKEALLKQP-PLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQ--SK 444
P A + P ++ + P R + PQ +
Sbjct: 2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRR 2685
Query: 445 PVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPP 482
V P P P P P P +SA LPP
Sbjct: 2686 RAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPP 2723
>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
RPA34.5. This is a family of proteins conserved from
yeasts to human. Subunit A34.5 of RNA polymerase I is a
non-essential subunit which is thought to help Pol I
overcome topological constraints imposed on ribosomal
DNA during the process of transcription.
Length = 193
Score = 38.2 bits (89), Expect = 0.006
Identities = 15/50 (30%), Positives = 28/50 (56%), Gaps = 1/50 (2%)
Query: 608 KQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRE 657
++ E + K+KKK KE K++ +K K+KK+ + K+K K ++
Sbjct: 143 VEKEAEVEEEEKKEKKKKKEVKKEKKEK-KDKKEKMVEPKGSKKKKKKKK 191
Score = 30.1 bits (68), Expect = 3.2
Identities = 19/80 (23%), Positives = 29/80 (36%), Gaps = 6/80 (7%)
Query: 575 AKLEPELVPVITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDK 634
A P + T K+ ++E E + KK+K ++K DK
Sbjct: 120 APDGPPSELGSESETSEKETTAKVEKEA--EVEEEEKKEKKKKKEVKKEKKEKK----DK 173
Query: 635 KHKEKKKHKDDKHKDKRKDK 654
K K + K K K+K K
Sbjct: 174 KEKMVEPKGSKKKKKKKKKK 193
>gnl|CDD|237862 PRK14948, PRK14948, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 620
Score = 39.6 bits (93), Expect = 0.007
Identities = 37/239 (15%), Positives = 59/239 (24%), Gaps = 48/239 (20%)
Query: 342 PSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQP 401
PSA ++ + + P PPP SA K P + P
Sbjct: 361 PSAFISEIANASAPANPTPAPNPSPPPAPIQPSAPKTKQAATTPSPPPAK--ASPPIPVP 418
Query: 402 PLIKQEPNVKQEPYIKSEPHSLLP------LRKHEPINPRMM------------------ 437
+ + P L L K E + RM+
Sbjct: 419 AEPTEPSPTPPANAANAPPSLNLEELWQQILAKLELPSTRMLLSQQAELVSLDSNRAVIA 478
Query: 438 -----MNKPQSK--------------PVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAA 478
+ QS+ + + ++G K PPP PPP
Sbjct: 479 VSPNWLGMVQSRKPLLEQAFAKVLGRSIKLNLESQSGSASNTAKTPPPPQKSPPPPAPTP 538
Query: 479 KLPPPSHSDVITNVIKEVTYSKQMEKSSILNHKESSISHPPHLSMSIPQTKAPSIFSPE 537
LP P+ ++S + P + T +P+ S
Sbjct: 539 PLPQPT---ATAPPPTPPPPPPTATQASSNAPAQIPADSSPPPPIPEEPTPSPTKDSSP 594
Score = 31.1 bits (71), Expect = 2.4
Identities = 15/66 (22%), Positives = 23/66 (34%)
Query: 326 NSRSSTSSTAVPINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHP 385
++ ++ + P S P P T+ PPPPP SS +I +
Sbjct: 516 SASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSS 575
Query: 386 PMKVEP 391
P P
Sbjct: 576 PPPPIP 581
>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19.
Med19 represents a family of conserved proteins which
are members of the multi-protein co-activator Mediator
complex. Mediator is required for activation of RNA
polymerase II transcription by DNA binding
transactivators.
Length = 178
Score = 37.9 bits (88), Expect = 0.008
Identities = 23/68 (33%), Positives = 30/68 (44%), Gaps = 18/68 (26%)
Query: 615 HSQHPKKKKKHKEK------------------DKDRDKKHKEKKKHKDDKHKDKRKDKHR 656
H Q PKKK KHK K K +KKHK+KK D + K K+K+K +
Sbjct: 105 HIQPPKKKHKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKK 164
Query: 657 ESSHVDPA 664
+ P
Sbjct: 165 KKKRHSPE 172
>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein. This family consists of
AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
retardation syndrome) nuclear proteins. These proteins
have been linked to human diseases such as acute
lymphoblastic leukaemia and mental retardation. The
family also contains a Drosophila AF4 protein homologue
Lilliputian which contains an AT-hook domain.
Lilliputian represents a novel pair-rule gene that acts
in cytoskeleton regulation, segmentation and
morphogenesis in Drosophila.
Length = 1154
Score = 39.1 bits (91), Expect = 0.010
Identities = 57/284 (20%), Positives = 93/284 (32%), Gaps = 24/284 (8%)
Query: 448 PHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEKSSI 507
+ G P P+ P PP K P + + + S E S+
Sbjct: 577 RPKAATKGSRKPSPRKEPKSSVPPAAEKRKYKSPSKIVPKSREFIETDSSSSDSPEDESL 636
Query: 508 LNHKESSISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKTPFKMKTPTPPSFSPIKISPS 567
+S S S+ +P +S K + +P + + +SP
Sbjct: 637 PPSSQSP--GNTESSKES----CASLRTPVCRSSVGSQNDLSKDRLLSPMRETEL-LSPL 689
Query: 568 KSSEGLKAKLEPELVPVITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKE 627
+ S E + K+ D+L++ + P + KK
Sbjct: 690 RDS--------EERYSLWVKIDLDLLSR----IPGHPYKKGVPPKPAEKDSLSAPKKQTS 737
Query: 628 KDKDRDKKHKEKKKHKDDKHKDKRKDK----HRESSHVDPAPIKITIPKDKIIEMPCSSN 683
K K K+KHK+D+ DK + K +SS P+ E SS
Sbjct: 738 KTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSSSCSPSSSSSHHHSSSNKESRKSSR 797
Query: 684 LKKIKMKDSFENPLKIRISK-DFLSKDSKKRERDTDDSDYPSSK 726
K+ +M S +PL K + S+ +R+ DT S P S
Sbjct: 798 NKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTSSSSGPFSA 841
>gnl|CDD|197891 smart00818, Amelogenin, Amelogenins, cell adhesion proteins, play a
role in the biomineralisation of teeth. They seem to
regulate formation of crystallites during the secretory
stage of tooth enamel development and are thought to
play a major role in the structural organisation and
mineralisation of developing enamel. The extracellular
matrix of the developing enamel comprises two major
classes of protein: the hydrophobic amelogenins and the
acidic enamelins. Circular dichroism studies of porcine
amelogenin have shown that the protein consists of 3
discrete folding units: the N-terminal region appears to
contain beta-strand structures, while the C-terminal
region displays characteristics of a random coil
conformation. Subsequent studies on the bovine protein
have indicated the amelogenin structure to contain a
repetitive beta-turn segment and a "beta-spiral" between
Gln112 and Leu138, which sequester a (Pro, Leu, Gln)
rich region. The beta-spiral offers a probable site for
interactions with Ca2+ ions. Mutations in the human
amelogenin gene (AMGX) cause X-linked hypoplastic
amelogenesis imperfecta, a disease characterised by
defective enamel. A 9bp deletion in exon 2 of AMGX
results in the loss of codons for Ile5, Leu6, Phe7 and
Ala8, and replacement by a new threonine codon,
disrupting the 16-residue (Met1-Ala16) amelogenin signal
peptide.
Length = 165
Score = 37.1 bits (86), Expect = 0.012
Identities = 26/84 (30%), Positives = 38/84 (45%), Gaps = 1/84 (1%)
Query: 401 PPLIKQEPNVKQEPYIK-SEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLP 459
P L Q+P V Q+P + HS+ P + H+P P+ Q +P+ P + ++ P
Sbjct: 59 PVLPAQQPVVPQQPLMPVPGQHSMTPTQHHQPNLPQPAQQPFQPQPLQPPQPQQPMQPQP 118
Query: 460 VPKPHPPPPPPPPPYMSAAKLPPP 483
P PP PP PP P P
Sbjct: 119 PVHPIPPLPPQPPLPPMFPMQPLP 142
>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
Length = 434
Score = 37.6 bits (88), Expect = 0.023
Identities = 18/54 (33%), Positives = 26/54 (48%), Gaps = 2/54 (3%)
Query: 607 TKQEPHESHSQHPKKK--KKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRES 658
+ P E + P KK K EK + +K K KK+H+D K+ KR+ S
Sbjct: 379 KTKAPSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTS 432
Score = 34.9 bits (81), Expect = 0.13
Identities = 14/40 (35%), Positives = 18/40 (45%)
Query: 620 KKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESS 659
K KK K ++ +K KEK K K K K R+ S
Sbjct: 390 KPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPS 429
Score = 32.6 bits (75), Expect = 0.74
Identities = 14/43 (32%), Positives = 25/43 (58%), Gaps = 3/43 (6%)
Query: 619 PKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHV 661
P +KK K K K+ ++K+K K+ K K K +HR++ ++
Sbjct: 383 PSEKKTGKPSKKVLAKRAEKKEKEKE---KPKVKKRHRDTKNI 422
Score = 31.1 bits (71), Expect = 2.4
Identities = 20/79 (25%), Positives = 33/79 (41%), Gaps = 10/79 (12%)
Query: 571 EGLKAKLEPELVPVITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDK 630
E LKA++ EL P + KP +E K+ K+K+K K K +
Sbjct: 366 EPLKARVIDELRPKTKAPSEKKTGKPSKKVLAKRAEKKE----------KEKEKPKVKKR 415
Query: 631 DRDKKHKEKKKHKDDKHKD 649
RD K+ K++ ++
Sbjct: 416 HRDTKNIGKRRKPSGTSEE 434
>gnl|CDD|227505 COG5178, PRP8, U5 snRNP spliceosome subunit [RNA processing and
modification].
Length = 2365
Score = 37.7 bits (87), Expect = 0.025
Identities = 16/49 (32%), Positives = 24/49 (48%), Gaps = 2/49 (4%)
Query: 460 VPKPHPPPPPPPPPYM--SAAKLPPPSHSDVITNVIKEVTYSKQMEKSS 506
+P +PPPPPPPP + S PPP +V K+++ + S
Sbjct: 4 LPPGNPPPPPPPPGFEPPSQPPPPPPPGVNVKKRSRKQLSIVGDILGHS 52
>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
Length = 333
Score = 36.9 bits (86), Expect = 0.029
Identities = 25/159 (15%), Positives = 42/159 (26%), Gaps = 19/159 (11%)
Query: 361 RVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEP 420
RV P H +A PP ++QPP + P P
Sbjct: 70 RVNHAPANAQEHEAARPSPQHQYQPPY--ASAQPRQPVQQPPEAQVPPQHAPRP------ 121
Query: 421 HSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKL 480
P + +P+ P P+P P P A+
Sbjct: 122 --AQPAPQPVQQPAY---QPQPEQPLQQPVSP---QVAPAPQPVHSAPQPAQQAFQPAEP 173
Query: 481 PPPSHSDVITNVIKEVTYSKQMEKSSILN---HKESSIS 516
+ + + K+ E ++N H S ++
Sbjct: 174 VAAPQPEPVAEPAPVMDKPKRKEAVIVMNVAAHHGSELN 212
>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
complex aNOP56 subunit; Provisional.
Length = 414
Score = 36.9 bits (86), Expect = 0.038
Identities = 17/55 (30%), Positives = 29/55 (52%), Gaps = 1/55 (1%)
Query: 600 ASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDK 654
E+ E + E ++PK KK +E+ K + +K K+KK+ K K + K+ K
Sbjct: 361 GDELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRK-KKKKRKKKGKKRKKKGRK 414
>gnl|CDD|171499 PRK12438, PRK12438, hypothetical protein; Provisional.
Length = 991
Score = 36.8 bits (85), Expect = 0.047
Identities = 11/34 (32%), Positives = 14/34 (41%)
Query: 463 PHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKEV 496
P PP PPP + PP DV + E+
Sbjct: 920 PPAPPQAVPPPRTTQPPAAPPRGPDVPPAAVAEL 953
>gnl|CDD|221188 pfam11725, AvrE, Pathogenicity factor. This family is secreted by
gram-negative Gammaproteobacteria such as Pseudomonas
syringae of tomato and the fire blight plant pathogen
Erwinia amylovora, amongst others. It is an essential
pathogenicity factor of approximately 198 kDa. Its
injection into the host-plant is dependent upon the
bacterial type III or Hrp secretion system. The family
is long and carries a number of predicted functional
regions, including an ERMS or endoplasmic reticulum
membrane retention signal at both the C- and the
N-termini, a leucine-zipper motif from residues 539-560,
and a nuclear localisation signal at 1358-1361. This
conserved AvrE-family of effectors is among the few that
are required for full virulence of many phytopathogenic
pseudomonads, erwinias and pantoeas.
Length = 1771
Score = 35.9 bits (83), Expect = 0.10
Identities = 52/281 (18%), Positives = 79/281 (28%), Gaps = 44/281 (15%)
Query: 327 SRSSTSSTAVPINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPP 386
S S T + + S N K P VFQ SS+ R PP T P
Sbjct: 32 SESPTQRASHSLASEGKKNRKKMPK-VFQKSSAPRQIQAAPP--------QALNPTAAAP 82
Query: 387 MKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPV 446
LL P + + P + L + E + M + V
Sbjct: 83 QSSRGPTLRELLALPEDDGETQAPESSPSARR-------LTRSEGVARHEMEDLAGRPVV 135
Query: 447 IPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLP----PPSHSDVITNVIKEVTYSKQM 502
P R+ D+ PP +++K+P + + +EV +
Sbjct: 136 KPDADRQLRQDILNKSSSSRRPPVSKEEGTSSKMPATALASAALFKDDEIRQEVDAA--- 192
Query: 503 EKSSILNHKESSISHPPHLSMSIPQTKAPS-IFSPEKNTSPVINKTPFKMKT-----PTP 556
S + LS S A +P + F+ +
Sbjct: 193 ---------RSDQASQSRLSRSRGNPPAIPPDAAPRQPMLTRSAGGRFEGEDENLERNLQ 243
Query: 557 PSFSPIKISPSKSSEGLKAKLEPELVPVITKLGDDILTKPK 597
P SPI + +G K P + L L KP
Sbjct: 244 PQ-SPITL----DKKG-KLDFSGFNPPALNTLLQQTLGKPG 278
>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
Length = 1355
Score = 35.4 bits (81), Expect = 0.12
Identities = 30/137 (21%), Positives = 49/137 (35%), Gaps = 16/137 (11%)
Query: 341 MPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQ 400
P +P A Q + P P + P+ + + P V P + +Q
Sbjct: 750 EPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQ 809
Query: 401 PPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPV 460
P + + Q+P + + +P+ P +PQ + P +R NG P+
Sbjct: 810 PVAPQPQYQQPQQPVAPQPQYQ----QPQQPVAP-----QPQDTLLHPLLMR-NGDSRPL 859
Query: 461 PKPHPPPP------PPP 471
KP P P PPP
Sbjct: 860 HKPTTPLPSLDLLTPPP 876
Score = 33.1 bits (75), Expect = 0.69
Identities = 30/153 (19%), Positives = 53/153 (34%), Gaps = 19/153 (12%)
Query: 337 PINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEA 396
P P+ PA + ++ S + + + P + +
Sbjct: 423 PAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQ 482
Query: 397 LLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQ----SKPVIPHEVR 452
++Q P+++ EP V E + P PL E + + + Q +P IP V+
Sbjct: 483 PVEQQPVVEPEPVV--EETKPARP----PLYYFEEVEEKRAREREQLAAWYQP-IPEPVK 535
Query: 453 KNGHDLPVP-KPHPPPPPPP--PPYMSAAKLPP 482
+ P P K P PP +AA + P
Sbjct: 536 E-----PEPIKSSLKAPSVAAVPPVEAAAAVSP 563
Score = 32.4 bits (73), Expect = 1.2
Identities = 23/110 (20%), Positives = 38/110 (34%), Gaps = 9/110 (8%)
Query: 382 TEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKP 441
H P+ V +QP +Q+ Q+P + + +P+ P+ +P
Sbjct: 739 GPHEPLFTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQ----QPQQPVAPQPQYQQP 794
Query: 442 QSKPVIPHEVRKNGHDLPVPKPHPPPPP----PPPPYMSAAKLPPPSHSD 487
Q +PV P + P+P P P P Y + P D
Sbjct: 795 Q-QPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQD 843
Score = 32.0 bits (72), Expect = 1.4
Identities = 20/137 (14%), Positives = 37/137 (27%), Gaps = 12/137 (8%)
Query: 350 PAHVFQTSSSSRVPPPP--PPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQE 407
P QT P P P ++ V + P++ + +QP Q+
Sbjct: 362 PVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPA---QQ 418
Query: 408 PNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPP 467
P P ++ P +P+ + Q P + + + P
Sbjct: 419 PYYAPAPEQPAQQPYYAP-APEQPVAGNAWQAEEQQSTFAPQSTYQ------TEQTYQQP 471
Query: 468 PPPPPPYMSAAKLPPPS 484
P Y +
Sbjct: 472 AAQEPLYQQPQPVEQQP 488
>gnl|CDD|219339 pfam07223, DUF1421, Protein of unknown function (DUF1421). This
family represents a conserved region approximately 350
residues long within a number of plant proteins of
unknown function.
Length = 357
Score = 34.5 bits (79), Expect = 0.16
Identities = 24/147 (16%), Positives = 37/147 (25%), Gaps = 10/147 (6%)
Query: 340 SMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLK 399
+ + Q S + P PP + + + P P +
Sbjct: 76 APAPQSPQPDQQQQSQAPPSHQYPSQLPPQQ--VQSVPQQPTPQQEPYYPPPSQPQPPPA 133
Query: 400 QPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLP 459
Q P +Q Q P + P P P+ N P P + P
Sbjct: 134 QQPQAQQPQPPPQVP---QQQQYQSP-----PQQPQYQQNPPPQAQSAPQVSGLYPEESP 185
Query: 460 VPKPHPPPPPPPPPYMSAAKLPPPSHS 486
PP P P M+ +
Sbjct: 186 YQPQSYPPNEPLPSSMAMQPPYSGAPP 212
>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR). This
family consists of several bovine specific leukaemia
virus receptors which are thought to function as
transmembrane proteins, although their exact function is
unknown.
Length = 561
Score = 34.7 bits (79), Expect = 0.21
Identities = 27/118 (22%), Positives = 44/118 (37%), Gaps = 4/118 (3%)
Query: 613 ESHSQHPKKK-KKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVDPAPIKITIP 671
E S+ PKKK KK KEK++D+DKK + + D + D A + T+
Sbjct: 196 EKKSKKPKKKEKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVS 255
Query: 672 KDKIIEMPCSSNLKKIKMKDSFENPLKIRISKDFLSKDSKKRERDTDDSDYPSSKRMA 729
P + + + K + K K+ KK+++ S A
Sbjct: 256 GTAPDSEPDEPKDAE---AEETKKSPKHKKKKQRKEKEEKKKKKKHHHHRCHHSDGGA 310
Score = 32.0 bits (72), Expect = 1.1
Identities = 28/126 (22%), Positives = 47/126 (37%), Gaps = 7/126 (5%)
Query: 535 SPEKNTSPVINKTPFKMKTPTPPSFSPIKISPSKSSEGLKAKLEPELVPVITKLGDDILT 594
SPEK P + K K K K K + K K ++ L D +
Sbjct: 186 SPEKGDVPAVEKKSKKPK-------KKEKKEKEKERDKDKKKEVEGFKSLLLALDDSPAS 238
Query: 595 KPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDK 654
+ ++ S + P + K + ++ + KHK+KK+ K+ + K K+K
Sbjct: 239 AASVAEADEASLANTVSGTAPDSEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKH 298
Query: 655 HRESSH 660
H H
Sbjct: 299 HHHRCH 304
>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
Provisional.
Length = 482
Score = 34.1 bits (79), Expect = 0.23
Identities = 16/67 (23%), Positives = 25/67 (37%)
Query: 620 KKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVDPAPIKITIPKDKIIEMP 679
KK K +K K+ +EKK+ K K+K++ E K ++ E
Sbjct: 408 ATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKE 467
Query: 680 CSSNLKK 686
KK
Sbjct: 468 EEEEKKK 474
Score = 29.5 bits (67), Expect = 7.1
Identities = 14/71 (19%), Positives = 38/71 (53%), Gaps = 5/71 (7%)
Query: 592 ILTKPKILASEIISETKQEPH----ESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKH 647
LT K A++ I + ++ E + KK K+K+++ +++ ++K++ K+++
Sbjct: 401 FLTGSKK-ATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEE 459
Query: 648 KDKRKDKHRES 658
++ ++K E
Sbjct: 460 EEAEEEKEEEE 470
>gnl|CDD|225657 COG3115, ZipA, Cell division protein [Cell division and chromosome
partitioning].
Length = 324
Score = 34.1 bits (78), Expect = 0.25
Identities = 17/118 (14%), Positives = 27/118 (22%), Gaps = 14/118 (11%)
Query: 368 PHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLR 427
P+ EH + P + IK + P
Sbjct: 63 EVRVVRKNEAPQFTQEHEAARQSPQHQYQPEYASAQIKIPVPQPPQISD--------PPA 114
Query: 428 KHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSH 485
+P P + +P + P P P P P P + + P
Sbjct: 115 HPQPTQPALDQEQPPEEARQPVL------PQEAPAPQPVHSAAPQPAVQTVQPAVPEQ 166
>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
splicing factor. These splicing factors consist of an
N-terminal arginine-rich low complexity domain followed
by three tandem RNA recognition motifs (pfam00076). The
well-characterized members of this family are auxiliary
components of the U2 small nuclear ribonucleoprotein
splicing factor (U2AF). These proteins are closely
related to the CC1-like subfamily of splicing factors
(TIGR01622). Members of this subfamily are found in
plants, metazoa and fungi.
Length = 509
Score = 34.1 bits (78), Expect = 0.30
Identities = 9/55 (16%), Positives = 29/55 (52%)
Query: 608 KQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVD 662
+ + S+ P+++ + + + +DR ++ +E+ +D + +D+R+ R +
Sbjct: 13 RGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLR 67
>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992). This
bacterial family of proteins has no known function.
However, the cis-regulatory yjdF motif, just upstream
from the gene encoding the proteins for this family, is
a small non-coding RNA, Rfam:RF01764. The yjdF motif is
found in many Firmicutes, including Bacillus subtilis.
In most cases, it resides in potential 5' UTRs of
homologues of the yjdF gene whose function is unknown.
However, in Streptococcus thermophilus, a yjdF RNA motif
is associated with an operon whose protein products
synthesise nicotinamide adenine dinucleotide (NAD+).
Also, the S. thermophilus yjdF RNA lacks typical yjdF
motif consensus features downstream of and including the
P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
S. thermophilus RNAs might sense a distinct compound
that structurally resembles the ligand bound by other
yjdF RNAs. On the other hand, perhaps these RNAs have an
alternative solution forming a similar binding site, as
is observed with some SAM riboswitches.
Length = 132
Score = 32.2 bits (74), Expect = 0.36
Identities = 18/67 (26%), Positives = 35/67 (52%), Gaps = 9/67 (13%)
Query: 596 PKILASEIISETKQEPHESHSQ------HPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKD 649
PK L + E K+ + +Q H + K++ K++ K++ ++ KE+K+ +
Sbjct: 67 PKRLQRQAAKEVKKPGISTKAQQALKLEHERNKQEKKKRSKEKKEEEKERKR---QLKQQ 123
Query: 650 KRKDKHR 656
K+K KHR
Sbjct: 124 KKKAKHR 130
>gnl|CDD|233692 TIGR02031, BchD-ChlD, magnesium chelatase ATPase subunit D. This
model represents one of two ATPase subunits of the
trimeric magnesium chelatase responsible for insertion
of magnesium ion into protoporphyrin IX. This is an
essential step in the biosynthesis of both chlorophyll
and bacteriochlorophyll. This subunit is found in green
plants, photosynthetic algae, cyanobacteria and other
photosynthetic bacteria. Unlike subunit I (TIGR02030),
this subunit is not found in archaea [Biosynthesis of
cofactors, prosthetic groups, and carriers, Chlorophyll
and bacteriochlorophyll].
Length = 589
Score = 33.6 bits (77), Expect = 0.37
Identities = 12/16 (75%), Positives = 13/16 (81%)
Query: 458 LPVPKPHPPPPPPPPP 473
LP P+P PPPPPPPP
Sbjct: 266 LPEPEPQPPPPPPPPE 281
Score = 32.5 bits (74), Expect = 0.88
Identities = 11/28 (39%), Positives = 13/28 (46%)
Query: 446 VIPHEVRKNGHDLPVPKPHPPPPPPPPP 473
V+ + P P P PPPP PP P
Sbjct: 258 VLLPRATRLPEPEPQPPPPPPPPEPPEP 285
Score = 31.7 bits (72), Expect = 1.5
Identities = 10/31 (32%), Positives = 12/31 (38%)
Query: 457 DLPVPKPHPPPPPPPPPYMSAAKLPPPSHSD 487
PPPPPPP P + P +D
Sbjct: 266 LPEPEPQPPPPPPPPEPPEPEEEPDEPDQTD 296
>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
Length = 1352
Score = 34.0 bits (78), Expect = 0.37
Identities = 26/143 (18%), Positives = 42/143 (29%), Gaps = 2/143 (1%)
Query: 430 EPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSDVI 489
P N + P + G P P PPPP PP S P P S+++
Sbjct: 79 APANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEML 138
Query: 490 TNVIKEVTYSKQMEKSSILNH--KESSISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKT 547
V ++ + S + ++ + + + P +
Sbjct: 139 RPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETARAPSSPPAEPPPSTP 198
Query: 548 PFKMKTPTPPSFSPIKISPSKSS 570
P P SPI S S +
Sbjct: 199 PAAASPRPPRRSSPISASASSPA 221
Score = 29.8 bits (67), Expect = 7.5
Identities = 17/92 (18%), Positives = 25/92 (27%), Gaps = 6/92 (6%)
Query: 292 PHSTSGRVTDDKRRSE-HNGPPPEYRKLMAGGRDMNSRSSTSSTAVPINSMPSANTNKPP 350
P GR D + + E G + + +P A+ P
Sbjct: 220 PAPAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIW-EASGWNGP 278
Query: 351 AHVFQTSSSSRVP----PPPPPHHHHSSAHVP 378
+ +SSS P P P P S
Sbjct: 279 SSRPGPASSSSSPRERSPSPSPSSPGSGPAPS 310
>gnl|CDD|237865 PRK14951, PRK14951, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 618
Score = 33.5 bits (77), Expect = 0.39
Identities = 14/133 (10%), Positives = 25/133 (18%), Gaps = 11/133 (8%)
Query: 342 PSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQP 401
+A PA + + +R P + A P P
Sbjct: 367 AAAAEAAAPA---EKKTPARPEAAAPAAAPVAQAAAAPAPAAAPA-AAASAPAAPPAAAP 422
Query: 402 PLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVP 461
P P + + + + + V +
Sbjct: 423 PAPVAAPA--AAAPAAAPAAAPAAVALAPAPPAQ-----AAPETVAIPVRVAPEPAVASA 475
Query: 462 KPHPPPPPPPPPY 474
P P P
Sbjct: 476 APAPAAAPAAARL 488
>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
This is a family of fungal proteins of unknown function.
Length = 182
Score = 32.7 bits (75), Expect = 0.39
Identities = 25/65 (38%), Positives = 35/65 (53%), Gaps = 1/65 (1%)
Query: 593 LTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRK 652
K K LA EI + K+E E KKKK K+KDKD+DKK +K + K + + +
Sbjct: 61 KKKKKELAEEI-EKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAE 119
Query: 653 DKHRE 657
DK +
Sbjct: 120 DKLED 124
Score = 29.3 bits (66), Expect = 4.8
Identities = 17/56 (30%), Positives = 28/56 (50%), Gaps = 3/56 (5%)
Query: 620 KKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVDPAPIKITIPKDKI 675
KKKK K+KDK DKK + +K + + +DK +D + S ++ K +
Sbjct: 91 SKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYS---ETLSTLSELKPRK 143
Score = 28.9 bits (65), Expect = 7.7
Identities = 22/75 (29%), Positives = 31/75 (41%), Gaps = 4/75 (5%)
Query: 588 LGDDILTKPKILASEIISETKQEPHESHSQHPKK----KKKHKEKDKDRDKKHKEKKKHK 643
L D P A ++ K++ + KK K+K K K K KK + K K
Sbjct: 43 LQDRHFATPIYDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKK 102
Query: 644 DDKHKDKRKDKHRES 658
DDK DK + K +
Sbjct: 103 DDKKDDKSEKKDEKE 117
Score = 28.5 bits (64), Expect = 9.5
Identities = 15/44 (34%), Positives = 21/44 (47%)
Query: 619 PKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVD 662
K K K K+ DK DK K+ +K +DK +D K S +
Sbjct: 94 KKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLS 137
>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
Length = 509
Score = 33.4 bits (77), Expect = 0.42
Identities = 31/200 (15%), Positives = 55/200 (27%), Gaps = 20/200 (10%)
Query: 523 MSIPQTKAPSIFSPEKNTSPVINKTPFKMKTP-TPPSFSPIKISPSKSSEGLKAKLEPEL 581
M+ TKA + E+ + K K K+ + SK +
Sbjct: 1 MTTASTKAEL--AAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIF 58
Query: 582 VPVITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKK 641
+ + K DD I + + K K K K +D+ KK
Sbjct: 59 LSGMVKDTDDATES-DIPKKKTKTAAK-----------AAAAKAPAKKKLKDELDSSKKA 106
Query: 642 HKDDKHKDKRKDKHRESSHVDPAPIKITIPKDKIIEMPCSSNLKKIKMKDSFENPLKIRI 701
K + + + V D + + I D E+ +
Sbjct: 107 EKKNALDKDDDLNYVKDIDVLNQ-----ADDDDDDDDDDDLDDDDIDDDDDDEDDDEDDD 161
Query: 702 SKDFLSKDSKKRERDTDDSD 721
D +D +K+E +
Sbjct: 162 DDDVDDEDEEKKEAKELEKL 181
>gnl|CDD|235638 PRK05896, PRK05896, DNA polymerase III subunits gamma and tau;
Validated.
Length = 605
Score = 33.7 bits (77), Expect = 0.45
Identities = 23/106 (21%), Positives = 42/106 (39%), Gaps = 5/106 (4%)
Query: 592 ILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEK--KKHKDDKHKD 649
I TK + E + ++ KKK + +K++D K +E K+ K
Sbjct: 390 ITTKKINIVEESNKNSVHFDTLYKTKIFYHKKKINQNNKEQDIKKEELLEKEFVKKSEKI 449
Query: 650 KRKDKHRESSHVDPAPIKITIPKDKIIEMPCSSNLKKIKMKDSFEN 695
+ D+ ++ ++ A K KD + K K ++S EN
Sbjct: 450 PKNDELLDN--LELAKQKF-FNKDIELSKNMLQKFNKFKNEESAEN 492
>gnl|CDD|223747 COG0675, COG0675, Transposase and inactivated derivatives [DNA
replication, recombination, and repair].
Length = 364
Score = 32.7 bits (74), Expect = 0.63
Identities = 11/52 (21%), Positives = 23/52 (44%)
Query: 610 EPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHV 661
+ +K+ K+ R KK K K ++ +++RKD H+ + +
Sbjct: 203 RKLLKRLKKAQKRLSRKKSRSKRRKKAKLKLARLRERIRNRRKDFHKLAKKL 254
>gnl|CDD|217469 pfam03276, Gag_spuma, Spumavirus gag protein.
Length = 582
Score = 32.6 bits (74), Expect = 0.83
Identities = 31/120 (25%), Positives = 39/120 (32%), Gaps = 24/120 (20%)
Query: 447 IPHEVRKNGHDLPVP-KPHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEKS 505
+ E++ G +P PPPP P LPP S S S
Sbjct: 169 MLVELQIGGRGGNIPGAIQPPPPSSLP------GLPPGSS-------------SLAPSAS 209
Query: 506 SILNHKESSISHPPHLSMSIP-QTKAPSIFSPEKNTSPVINKTPFKMKTPTPPSFSPIKI 564
S ++ +S P L P Q AP P PVI + P PP I I
Sbjct: 210 STPGNRLPRVSFNPFLPGPSPAQPSAPPASIPAPPIPPVI---QYVAPPPVPPPQPIIPI 266
>gnl|CDD|219055 pfam06484, Ten_N, Teneurin Intracellular Region. This family is
found in the intracellular N-terminal region of the
Teneurin family of proteins. These proteins are
'pair-rule' genes and are involved in tissue patterning,
specifically probably neural patterning. The
intracellular domain is cleaved in response to
homophilic interaction of the extracellular domain, and
translocates to the nucleus. Here it probably carries
out some transcriptional regulatory activity. The
length of this region and the conservation suggests that
there may be two structural domains here (personal obs:C
Yeats).
Length = 370
Score = 32.3 bits (73), Expect = 0.84
Identities = 13/41 (31%), Positives = 16/41 (39%), Gaps = 2/41 (4%)
Query: 342 PSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKT 382
P+ + P Q S R PP P H H S H I +
Sbjct: 226 PAQASASPGNF--QNHSRLRTPPLPLSHSHSPSHHAASINS 264
>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
Length = 943
Score = 32.7 bits (74), Expect = 0.88
Identities = 62/297 (20%), Positives = 100/297 (33%), Gaps = 21/297 (7%)
Query: 357 SSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYI 416
S + P P EH P K+ + K+ + P + P +EP
Sbjct: 540 SDEPKEGGKPGETKEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDP---KHPKDPEEPKK 596
Query: 417 KSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMS 476
P S R P +P++ K E K+ P P+ P P P +
Sbjct: 597 PKRPRS--AQRPTRPKSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKII 654
Query: 477 AAKLPPPSHSDVITNVIKEVTYSKQMEKSSILNHKESSISHPPHLSMSIPQTKAPSIFSP 536
+ PP S KE Y ++ ++ ++++ + +T P
Sbjct: 655 KSPKPPKSPKPPFDPKFKEKFYDDYLDAAAKSKETKTTVVLDESFESILKET------LP 708
Query: 537 EKNTSPVINKTPFKMKTPTPPS--FSPIKISPSKSSEGLKAKLEPELVPVITKLGDDILT 594
E +P P K P F PI ++ + ++ PE
Sbjct: 709 ETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFTPPEEERTFFHETPADTP 768
Query: 595 KPKILASE-----IISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDK 646
P ILA E I +ET EP E+ + P +H++K D KK+H+ D
Sbjct: 769 LPDILAEEFKEEDIHAETG-EPDEA-MKRPDSPSEHEDKPPG-DHPSLPKKRHRLDG 822
>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
Length = 434
Score = 32.5 bits (74), Expect = 0.89
Identities = 17/49 (34%), Positives = 27/49 (55%), Gaps = 2/49 (4%)
Query: 606 ETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKK-KHKDDKHKDKRKD 653
E K E S + K K+ K K ++ +KK KE+ + K+ + KD+RK
Sbjct: 387 EVKDETDAS-EEAEAKAKEEKLKQEENEKKQKEQADEDKEKRQKDERKK 434
>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family. Atrophin-1 is the
protein product of the dentatorubral-pallidoluysian
atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
neurodegenerative disorder. It is caused by the
expansion of a CAG repeat in the DRPLA gene on
chromosome 12p. This results in an extended
polyglutamine region in atrophin-1, that is thought to
confer toxicity to the protein, possibly through
altering its interactions with other proteins. The
expansion of a CAG repeat is also the underlying defect
in six other neurodegenerative disorders, including
Huntington's disease. One interaction of expanded
polyglutamine repeats that is thought to be pathogenic
is that with the short glutamine repeat in the
transcriptional coactivator CREB binding protein, CBP.
This interaction draws CBP away from its usual nuclear
location to the expanded polyglutamine repeat protein
aggregates that are characteristic of the polyglutamine
neurodegenerative disorders. This interferes with
CBP-mediated transcription and causes cytotoxicity.
Length = 979
Score = 32.4 bits (73), Expect = 0.97
Identities = 38/155 (24%), Positives = 59/155 (38%), Gaps = 18/155 (11%)
Query: 333 STAVPINSMPSANTNKPPAHVFQTSSSSRVP---PPPPPHHHHSSAHVPKIKTEHPPMKV 389
+T +P S +K P H+ S ++P PPPP SS + T HPP
Sbjct: 351 TTPIPQLPNQS---HKHPPHLQGPSPFPQMPSNLPPPPALKPLSS-----LPTHHPPSAH 402
Query: 390 EPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPH 449
P ++ L+ Q + +V +P + ++ SL P P + + P P H
Sbjct: 403 PPPLQ--LMPQS---QPLQSVPAQPPVLTQSQSLPPKASTHP--HSGLHSGPPQSPFAQH 455
Query: 450 EVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPS 484
G P P P P P +++ PP
Sbjct: 456 PFTSGGLPAIGPPPSLPTSTPAAPPRASSGSQPPG 490
Score = 30.0 bits (67), Expect = 6.3
Identities = 37/149 (24%), Positives = 51/149 (34%), Gaps = 25/149 (16%)
Query: 342 PSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQP 401
P + P QT+S PP P H S+H H P P M AL + P
Sbjct: 234 PQRLPSPHPPLQPQTASQQSPQPPAPSSRHPQSSH-------HGPG---PPMPHALQQGP 283
Query: 402 PLIKQEPNVKQEPY---IKSEPHSLLP--LRKHEPINPRMMMNKPQSKPVIPHEVRKNGH 456
++ + +P+ P LP + H P +PQ P
Sbjct: 284 VFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQPQQPP----------R 333
Query: 457 DLPVPKPHPPPPPPPPPYMSAAKLPPPSH 485
+ P+P P PPP +LP SH
Sbjct: 334 EQPLPPAPSMPHIKPPPTTPIPQLPNQSH 362
>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3. This
protein, which interacts with both microtubules and
TRAF3 (tumour necrosis factor receptor-associated factor
3), is conserved from worms to humans. The N-terminal
region is the microtubule binding domain and is
well-conserved; the C-terminal 100 residues, also
well-conserved, constitute the coiled-coil region which
binds to TRAF3. The central region of the protein is
rich in lysine and glutamic acid and carries KKE motifs
which may also be necessary for tubulin-binding, but
this region is the least well-conserved.
Length = 506
Score = 31.8 bits (72), Expect = 1.4
Identities = 35/172 (20%), Positives = 73/172 (42%), Gaps = 5/172 (2%)
Query: 560 SPIKISPSKSSEGLKAKLEPELVPVITKLGDDILTKPKILASEIISETKQEPHESHSQHP 619
P+ + P+K G + + EL+ + K + L+ + + +K ++
Sbjct: 38 EPLAVKPAKIVAGQEPERTNELLQALAKCAESKLSSDEAVKRVEKGGSKGPAAKTKPAKE 97
Query: 620 KKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSHVDPAPIKITIPKDKIIEMP 679
K + KE++K++++ +EKKK K +K K++ KD+ + + P K K+K E
Sbjct: 98 PKNESGKEEEKEKEQVKEEKKK-KKEKPKEEPKDRKPKEEAKEKRPPK---EKEKEKEKK 153
Query: 680 CSSNLKKIKMKDSFENPLKIRISKDFLSKD-SKKRERDTDDSDYPSSKRMAS 730
+ + K K R K K +KK+E ++ +++
Sbjct: 154 VEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAAREAVK 205
>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator. In
eukaryotes, this family of proteins induces
mitochondrial fission.
Length = 248
Score = 31.3 bits (71), Expect = 1.5
Identities = 12/28 (42%), Positives = 14/28 (50%)
Query: 463 PHPPPPPPPPPYMSAAKLPPPSHSDVIT 490
PPPPPPPPP + S D+I
Sbjct: 182 EVPPPPPPPPPPPPPSLQQSTSAIDLIK 209
Score = 30.5 bits (69), Expect = 2.8
Identities = 16/60 (26%), Positives = 24/60 (40%), Gaps = 3/60 (5%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEKSSILNHKESSISHP 518
P PPPPPPPP L + + ++IKE + +++ K S P
Sbjct: 177 EEPVLEVPPPPPPPPPPPPPSLQQSTSA---IDLIKERKGQRSAAGKTLVLSKPKSPEFP 233
>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
Length = 1832
Score = 32.0 bits (73), Expect = 1.6
Identities = 13/36 (36%), Positives = 20/36 (55%)
Query: 621 KKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHR 656
K + K D +K K+KKK K + + KR++K R
Sbjct: 731 KDAEFKISDSVEEKTKKKKKKEKKKEEEYKREEKAR 766
>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1. This domain
family is found in eukaryotes, and is approximately 40
amino acids in length. The family is found in
association with pfam07719, pfam00515. There is a single
completely conserved residue L that may be functionally
important. NARP1 is the mammalian homologue of a yeast
N-terminal acetyltransferase that regulates entry into
the G(0) phase of the cell cycle.
Length = 516
Score = 31.4 bits (72), Expect = 1.6
Identities = 16/54 (29%), Positives = 26/54 (48%), Gaps = 6/54 (11%)
Query: 619 PKKKKKHKEKDKDRDKKHK--EKKKHKDDK----HKDKRKDKHRESSHVDPAPI 666
P ++KK ++K + +KK + E +K K K K E+ VDP P+
Sbjct: 408 PAERKKLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKGPDGETKKVDPDPL 461
>gnl|CDD|223035 PHA03294, PHA03294, envelope glycoprotein H; Provisional.
Length = 835
Score = 31.6 bits (72), Expect = 1.6
Identities = 18/114 (15%), Positives = 24/114 (21%), Gaps = 20/114 (17%)
Query: 392 VMKEALLKQPPLIKQEPNVKQEPYIKS-EPHSLL-------------PLRKHEPINPRMM 437
V+ ALL QP K P S L L +
Sbjct: 96 VLPTALLGQPTFAKLPARAPTGRLPPPVAPLSGLLGNPNLAPYLRTRHLVDFSVVPD--- 152
Query: 438 MNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITN 491
+ P PPP PP + +D +
Sbjct: 153 PRSLTRWVFDRSDTAATKAH---PSGVALPPPRAPPPRNTTDPATIKPNDHLNP 203
>gnl|CDD|237864 PRK14950, PRK14950, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 585
Score = 31.7 bits (72), Expect = 1.6
Identities = 9/26 (34%), Positives = 10/26 (38%)
Query: 461 PKPHPPPPPPPPPYMSAAKLPPPSHS 486
P PPP PP A +P S
Sbjct: 402 PVRETATPPPVPPRPVAPPVPHTPES 427
Score = 30.2 bits (68), Expect = 5.0
Identities = 11/35 (31%), Positives = 13/35 (37%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITNVI 493
PV + PPP PP P P S + I
Sbjct: 402 PVRETATPPPVPPRPVAPPVPHTPESAPKLTRAAI 436
>gnl|CDD|218181 pfam04621, ETS_PEA3_N, PEA3 subfamily ETS-domain transcription
factor N terminal domain. The N terminus of the PEA3
transcription factors is implicated in transactivation
and in inhibition of DNA binding. Transactivation is
potentiated by activation of the Ras/MAP kinase and
protein kinase A signalling cascades. The N terminal
region contains conserved MAP kinase phosphorylation
sites.
Length = 336
Score = 31.4 bits (71), Expect = 1.7
Identities = 25/152 (16%), Positives = 44/152 (28%), Gaps = 19/152 (12%)
Query: 337 PINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEA 396
P + P + R P P + S P+
Sbjct: 126 PASGFKPPTPPSTPCSPVNPQETVRQLQPSGPLSNSSPPSPHTPLPNQSPL--------- 176
Query: 397 LLKQPPLIKQEPNVKQEPYIK---SEP-HSLLPLRKHEPINPRMMMNKPQSKPVIPH--- 449
PP+ + + E + SEP P + R ++ S+P++P+
Sbjct: 177 ---PPPMSSPDSSYPSEHRFQRQLSEPCLPFPPPPGRGSRDGRPPYHRQMSEPLVPYPPQ 233
Query: 450 EVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLP 481
++ HD + P P P M + P
Sbjct: 234 GFKQEYHDPLYEEAGVPNQGPFPHPMMIKQEP 265
>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
(DUF1754). This is a eukaryotic protein family of
unknown function.
Length = 90
Score = 29.3 bits (66), Expect = 1.7
Identities = 11/41 (26%), Positives = 23/41 (56%)
Query: 620 KKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRESSH 660
KKKKK K+K+K +++ EK++ + + K+ + +
Sbjct: 21 KKKKKKKKKNKSKEEVVTEKEEEEKSSAESDLKEGEEDEDN 61
>gnl|CDD|221012 pfam11169, DUF2956, Protein of unknown function (DUF2956). This
family of proteins with unknown function appears to be
restricted to Gammaproteobacteria.
Length = 103
Score = 29.6 bits (67), Expect = 1.8
Identities = 9/22 (40%), Positives = 14/22 (63%)
Query: 622 KKKHKEKDKDRDKKHKEKKKHK 643
KK+ K K ++ DK K++ K K
Sbjct: 40 KKQQKAKAREADKARKQQLKAK 61
>gnl|CDD|237057 PRK12323, PRK12323, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 700
Score = 31.4 bits (71), Expect = 2.0
Identities = 19/131 (14%), Positives = 25/131 (19%), Gaps = 10/131 (7%)
Query: 354 FQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQE 413
F+ S P + P A P V
Sbjct: 363 FRPGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAA 422
Query: 414 PYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPP 473
P +S L + P P P P P P
Sbjct: 423 PARRSPAPEALAAARQASARGPGGAPAPAPAPA----------AAPAAAARPAAAGPRPV 472
Query: 474 YMSAAKLPPPS 484
+AA P +
Sbjct: 473 AAAAAAAPARA 483
Score = 29.8 bits (67), Expect = 5.7
Identities = 31/201 (15%), Positives = 51/201 (25%), Gaps = 28/201 (13%)
Query: 287 ATFAPPHSTSGRVTDDKRRSEHNGPPPEYRKLMAGGRDMNSRSSTSSTAVPINSMPSANT 346
A A + RRS R+ A G + + A P A
Sbjct: 408 AAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAP------AAA 461
Query: 347 NKPPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQ 406
+P A + +++ P + A + PP + P P +
Sbjct: 462 ARPAAAGPRPVAAAAAAAPARAAPAAAPAPADD---DPPPWEELP----PEFASPAPAQP 514
Query: 407 EPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPP 466
+ S+ +P + + + P P
Sbjct: 515 DA-----APAGWVAESIPDPATADPDDAFETLAPAPAAAPAP----------RAAAATEP 559
Query: 467 PPPPPPPYMSAAKLPPPSHSD 487
P PP SA+ LP D
Sbjct: 560 VVAPRPPRASASGLPDMFDGD 580
>gnl|CDD|177089 CHL00189, infB, translation initiation factor 2; Provisional.
Length = 742
Score = 31.3 bits (71), Expect = 2.2
Identities = 30/125 (24%), Positives = 45/125 (36%), Gaps = 20/125 (16%)
Query: 614 SHSQHPKKKKKHKEKDKDRD--------KKHKEKKKHKDDKHKDKRKDKHRESSHVDPAP 665
H + KK KK + D +D KK +KK H DD + + K+ P
Sbjct: 52 LHEKLDKKNKKFNKTDDLKDSKKTKLKQKKKIKKKLHIDDDYDNFFDSKNNSKQFAGPLA 111
Query: 666 IKITIPKDKIIEMPCSSNLKKIKMKDSFENPLKIRISKDFLSKDSKKRERDTDDSDYPSS 725
I + P + LKK + N K ++ S K E D++ P S
Sbjct: 112 ISLMRKP-----KPKTEKLKKKITVNKSTNKKKKKVL-------SSKDELIKYDNNKPKS 159
Query: 726 KRMAS 730
+ S
Sbjct: 160 ISIHS 164
>gnl|CDD|216368 pfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal.
Length = 313
Score = 30.9 bits (70), Expect = 2.3
Identities = 12/35 (34%), Positives = 16/35 (45%), Gaps = 2/35 (5%)
Query: 453 KNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSD 487
+ P P PPPPPP P +S + + SD
Sbjct: 225 VSSSAPSAPPPPPPPPPPSVPTISNSV--ESASSD 257
Score = 30.2 bits (68), Expect = 3.8
Identities = 26/105 (24%), Positives = 34/105 (32%), Gaps = 19/105 (18%)
Query: 458 LPVPKPHPPPPPPPPPYMSAAKLPPPSHSDVITNVIKEVTYSKQMEKSSI------LNHK 511
LP P PPPPP PPP I+N V + K LN
Sbjct: 222 LPAVSSSAPSAPPPPP------PPPPPSVPTISN---SVESASSDSKGGRGAVFAELNKG 272
Query: 512 ESSISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKTPFKMKTPTP 556
E S ++ + K P + + S + P K P P
Sbjct: 273 EGITSGLKKVTDDMKTHKNPEL----RAQSGPTSSGPKPGKPPAP 313
>gnl|CDD|214832 smart00817, Amelin, Ameloblastin precursor (Amelin). This family
consists of several mammalian Ameloblastin precursor
(Amelin) proteins. Matrix proteins of tooth enamel
consist mainly of amelogenin but also of non-amelogenin
proteins, which, although their volumetric percentage is
low, have an important role in enamel mineralisation.
One of the non-amelogenin proteins is ameloblastin, also
known as amelin and sheathlin. Ameloblastin (AMBN) is
one of the enamel sheath proteins which is thought to
have a role in determining the prismatic structure of
growing enamel crystals.
Length = 411
Score = 31.0 bits (70), Expect = 2.4
Identities = 31/141 (21%), Positives = 45/141 (31%), Gaps = 20/141 (14%)
Query: 355 QTSSSSRVPPPPPPHHHHSSAHVPKIK-----TEHPPMKVEPVMKEAL----LKQPPLIK 405
Q S V PPP P P +K T P + P L QPPL +
Sbjct: 88 QYEYSLPVHPPPLPSQPSLQPQQPGLKPFLQPTALPTNQATPQKNGPQPPMHLGQPPLQQ 147
Query: 406 QEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHP 465
E + S+ + + P PQ+ + + P+P+
Sbjct: 148 AELPMIPPQVAPSD-------KPPQTELPLYDFADPQNPLLFQ--IAHLMSRGPMPQNKQ 198
Query: 466 PPPPPPPPYMS--AAKLPPPS 484
P YMS A +L P+
Sbjct: 199 QHLYPGLFYMSYGANQLGAPA 219
>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
Length = 1388
Score = 31.2 bits (71), Expect = 2.4
Identities = 36/224 (16%), Positives = 67/224 (29%), Gaps = 21/224 (9%)
Query: 500 KQMEKSSILNHKESSISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKTPFKMKTPTPPSF 559
K+ +KSS K++S+ S + K +K+ S ++ + + P
Sbjct: 1179 KKKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKS 1238
Query: 560 SPIKISPSKSSEGLKAKLEPELVPVITKLGDDILTKPKI--LASEIISETKQEPHESHSQ 617
S ++ K++ ++ E PK + P +
Sbjct: 1239 SVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNG 1298
Query: 618 HPKKKKKHKEKDKDRD-----KKHKEKKKHKDDKHKDKRKDKHRESSHVDPAPIKITIPK 672
K K+K K R K+KK K K K K + +++S + + K
Sbjct: 1299 GSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRK 1358
Query: 673 DKIIEMPCSSNLKKIKMKDSFENPLKIRISKDFLSKDSKKRERD 716
K S E+ + D + D
Sbjct: 1359 KKS--------------DSSSEDDDDSEVDDSEDEDDEDDEDDD 1388
>gnl|CDD|152747 pfam12312, NeA_P2, Nepovirus subgroup A polyprotein. This family
of proteins is found in viruses. Proteins in this family
are typically between 259 and 1110 amino acids in
length. The family is found in association with
pfam03688, pfam03689, pfam03391. This family is one of
the polyproteins expressed by Nepoviruses in subgroup A.
Length = 175
Score = 30.3 bits (68), Expect = 2.4
Identities = 22/72 (30%), Positives = 34/72 (47%), Gaps = 6/72 (8%)
Query: 438 MNKPQSKPVIPHEVRKNGHDLPVPKPH-PPPPPPPPPYMSAAKLPPPSHSDVITNVIKEV 496
M KP ++PV EV +P PK P PPP P PY P+ S I ++ +
Sbjct: 1 MGKP-TEPVKADEVV----VVPQPKKVIPSPPPVPTPYFRPVGAFAPTRSGFIRATVERL 55
Query: 497 TYSKQMEKSSIL 508
T ++ +++ L
Sbjct: 56 TREREESRAAAL 67
>gnl|CDD|234402 TIGR03928, T7_EssCb_Firm, type VII secretion protein EssC,
C-terminal domain. This model describes the C-terminal
domain, or longer subunit, of the Firmicutes type VII
secretion protein EssC. This protein (homologous to EccC
in Actinobacteria) and the WXG100 target proteins are
the only homologous parts of type VII secretion between
Firmicutes and Actinobacteria [Protein fate, Protein and
peptide secretion and trafficking].
Length = 1296
Score = 31.1 bits (71), Expect = 2.5
Identities = 9/43 (20%), Positives = 23/43 (53%), Gaps = 1/43 (2%)
Query: 614 SHSQHPKKKKKHKEKDKDRDKKHKEK-KKHKDDKHKDKRKDKH 655
S + + ++KKK+K+ + R++ ++ K + + K +H
Sbjct: 78 STTTYFREKKKYKKDVEKRNRSYRLYLDKKRKELQALSEKQRH 120
>gnl|CDD|237605 PRK14086, dnaA, chromosomal replication initiation protein;
Provisional.
Length = 617
Score = 30.9 bits (70), Expect = 2.5
Identities = 28/151 (18%), Positives = 37/151 (24%), Gaps = 49/151 (32%)
Query: 349 PPAHVFQTSSSSRVPPPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEP 408
P + T S P PPP H EP + P
Sbjct: 80 RPIRIAITVDPSAGEPAPPP--------------PHARRTSEP--------------ELP 111
Query: 409 NVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPP 468
+ PY P ++ P ++P P ++ P P P
Sbjct: 112 RPGRRPYEGYGGPRADDRPPGLPRQDQL----PTARPAYPAYQQR-----PEPGAWPRAA 162
Query: 469 ------------PPPPPYMSAAKLPPPSHSD 487
PP PY S A P D
Sbjct: 163 DDYGWQQQRLGFPPRAPYASPASYAPEQERD 193
>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal. This
domain is found to the N-terminus of bacterial signal
peptidases of the S49 family (pfam01343).
Length = 154
Score = 29.8 bits (68), Expect = 2.6
Identities = 10/34 (29%), Positives = 14/34 (41%)
Query: 622 KKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKH 655
KK+ K +K K K K K + K K +
Sbjct: 65 KKELKAWEKAEKKAEKAKAKAEKKKAKKEEPKPR 98
Score = 29.8 bits (68), Expect = 2.8
Identities = 10/37 (27%), Positives = 16/37 (43%)
Query: 612 HESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHK 648
+ K + EK ++ K EKKK K ++ K
Sbjct: 60 AALLDKKELKAWEKAEKKAEKAKAKAEKKKAKKEEPK 96
Score = 29.4 bits (67), Expect = 3.3
Identities = 10/38 (26%), Positives = 19/38 (50%)
Query: 615 HSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRK 652
+ KK+ K EK + + +K K K + K K ++ +
Sbjct: 60 AALLDKKELKAWEKAEKKAEKAKAKAEKKKAKKEEPKP 97
>gnl|CDD|178391 PLN02794, PLN02794, cardiolipin synthase.
Length = 341
Score = 31.0 bits (70), Expect = 2.6
Identities = 10/44 (22%), Positives = 14/44 (31%), Gaps = 2/44 (4%)
Query: 337 PINSMPSANTNKP--PAHVFQTSSSSRVPPPPPPHHHHSSAHVP 378
+ N + +SSS +P P P H S V
Sbjct: 9 TLIRKNPINKPRSFLTLAAAAAASSSIIPSPFSPLALHFSHRVS 52
>gnl|CDD|235309 PRK04596, minC, septum formation inhibitor; Reviewed.
Length = 248
Score = 30.7 bits (69), Expect = 2.6
Identities = 10/22 (45%), Positives = 12/22 (54%)
Query: 463 PHPPPPPPPPPYMSAAKLPPPS 484
P PPPPPPP A + P+
Sbjct: 119 PPPPPPPPPARAEPAPPVARPA 140
>gnl|CDD|237378 PRK13406, bchD, magnesium chelatase subunit D; Provisional.
Length = 584
Score = 31.1 bits (71), Expect = 2.6
Identities = 10/27 (37%), Positives = 11/27 (40%)
Query: 461 PKPHPPPPPPPPPYMSAAKLPPPSHSD 487
P+ PPPPPPPP D
Sbjct: 265 PEEEPPPPPPPPEDDDDPPEDEEEQDD 291
Score = 30.4 bits (69), Expect = 3.8
Identities = 10/15 (66%), Positives = 10/15 (66%)
Query: 459 PVPKPHPPPPPPPPP 473
P P PPPPPPPP
Sbjct: 262 PQPPEEEPPPPPPPP 276
>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32. This family
consists of several mammalian specific proacrosin
binding protein sp32 sequences. sp32 is a sperm specific
protein which is known to bind with 55- and 53-kDa
proacrosins and the 49-kDa acrosin intermediate. The
exact function of sp32 is unclear, it is thought however
that the binding of sp32 to proacrosin may be involved
in packaging the acrosin zymogen into the acrosomal
matrix.
Length = 243
Score = 30.4 bits (68), Expect = 2.7
Identities = 28/135 (20%), Positives = 55/135 (40%), Gaps = 8/135 (5%)
Query: 516 SHPPHLSMSIPQTKAPSIFSPE--KNTSPVINKTPFKMKTPTPPSFSPIKISPSKSSEGL 573
S+ + + +P ++ SI SP K P P M P I+ ++S +
Sbjct: 112 SNHVYYAKRVPCSQPVSILSPNTLKEAEPSAEVQPTTMTLPIAEHP---TITENQSFQPW 168
Query: 574 KAKL---EPELVPVITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDK 630
+L EL+ LG + K E + QE + H K+ ++ +E+++
Sbjct: 169 PERLHNNVEELLQSSLSLGGSVQVKAPKPKQEQLLSKLQEYLQEHKTEEKQPQEEQEEEE 228
Query: 631 DRDKKHKEKKKHKDD 645
++ +E+ + DD
Sbjct: 229 VEEEAKQEEGQGTDD 243
>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
(vWA) domain [General function prediction only].
Length = 4600
Score = 31.1 bits (70), Expect = 3.1
Identities = 19/81 (23%), Positives = 39/81 (48%), Gaps = 7/81 (8%)
Query: 600 ASEIISETKQEPHESHSQHPKKKKKHKE------KDKDRDKKHKEKKKHKDDKHKDKRKD 653
E + ET+Q+ +E + + + KE +DKDR +K E++ D D+ +
Sbjct: 3912 NEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSDDVGIDDEIQP 3971
Query: 654 KHRESSHVDPAPIK-ITIPKD 673
+E++ P + + +P+D
Sbjct: 3972 DIQENNSQPPPENEDLDLPED 3992
>gnl|CDD|221173 pfam11702, DUF3295, Protein of unknown function (DUF3295). This
family is conserved in fungi but the function is not
known.
Length = 509
Score = 30.7 bits (69), Expect = 3.2
Identities = 24/115 (20%), Positives = 44/115 (38%), Gaps = 5/115 (4%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPPSH-----SDVITNVIKEVTYSKQMEKSSILNHKES 513
P P P PP +A + P P+ ++ + + S+Q ++ + S
Sbjct: 89 ITPPSSEPTPAPPSSESTATRTPDPNQQALESTESTSTTSADCNDSEQSSTPNLNSSDTS 148
Query: 514 SISHPPHLSMSIPQTKAPSIFSPEKNTSPVINKTPFKMKTPTPPSFSPIKISPSK 568
+ S S S+ + +PS S ++ +NK P K+ P + K K
Sbjct: 149 TSSSGALPSTSVVRGFSPSHISSSYRSTAQLNKAPSPTKSAEPTAAPQAKPELPK 203
>gnl|CDD|219321 pfam07174, FAP, Fibronectin-attachment protein (FAP). This family
contains bacterial fibronectin-attachment proteins
(FAP). Family members are rich in alanine and proline,
are approximately 300 long, and seem to be restricted to
mycobacteria. These proteins contain a
fibronectin-binding motif that allows mycobacteria to
bind to fibronectin in the extracellular matrix.
Length = 297
Score = 30.2 bits (68), Expect = 3.4
Identities = 10/25 (40%), Positives = 11/25 (44%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPP 483
P P P PP P +A PPP
Sbjct: 41 PAPPPPPPSTAAAAPAPAAPPPPPP 65
Score = 30.2 bits (68), Expect = 3.8
Identities = 12/25 (48%), Positives = 12/25 (48%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPP 483
P P P P PP P P A PPP
Sbjct: 60 PPPPPPPAAPPAPQPDDPNAAPPPP 84
Score = 29.5 bits (66), Expect = 6.2
Identities = 15/48 (31%), Positives = 16/48 (33%), Gaps = 5/48 (10%)
Query: 441 PQSKPVIPHEVRKNGHDLPVPKPHPPP-----PPPPPPYMSAAKLPPP 483
P + P P P P P PPPPP A PPP
Sbjct: 48 PSTAAAAPAPAAPPPPPPPAAPPAPQPDDPNAAPPPPPADPNAPPPPP 95
>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
Length = 330
Score = 30.6 bits (70), Expect = 3.4
Identities = 13/45 (28%), Positives = 20/45 (44%)
Query: 604 ISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHK 648
++E +E E KK+ K K + KK K++ K K K
Sbjct: 45 LNEQYKEMKEELKAALLDKKELKAWHKAQKKKEKQEAKAAKAKSK 89
Score = 29.0 bits (66), Expect = 10.0
Identities = 8/36 (22%), Positives = 13/36 (36%), Gaps = 2/36 (5%)
Query: 623 KKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRES 658
K K+ HK +KK +K + K +
Sbjct: 57 KAALLDKKELKAWHKAQKKK--EKQEAKAAKAKSKP 90
>gnl|CDD|217298 pfam02948, Amelogenin, Amelogenin. Amelogenins play a role in
biomineralisation. They seem to regulate the formation
of crystallites during the secretory stage of tooth
enamel development. They are thought to play a major role in the
structural organisation and mineralisation of developing
enamel. They are found in the extracellular matrix.
Mutations in X-chromosomal amelogenin can cause
Amelogenesis imperfecta.
Length = 174
Score = 29.5 bits (66), Expect = 3.8
Identities = 21/120 (17%), Positives = 37/120 (30%), Gaps = 19/120 (15%)
Query: 364 PPPPPHHHHSSAHVPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSL 423
P P + PK+ H + + P Q P++ H +
Sbjct: 52 PLSPQMPQQQQSAHPKLTPHHQLLILPP--------QQPMMPVPG-----------HHPM 92
Query: 424 LPLRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPP 483
+P+ +P + Q + ++ H P +P P P P M + PP
Sbjct: 93 VPMTGQQPHLQPPAQHPLQPTYGQNPQPQQPTHTQPPVQPQQPADPQPGQPMFPMQPLPP 152
>gnl|CDD|221509 pfam12287, Caprin-1_C, Cytoplasmic
activation/proliferation-associated protein-1 C term.
This family of proteins is found in eukaryotes. Proteins
in this family are typically between 343 and 708 amino
acids in length. This family is the C terminal region of
caprin-1. Caprin-1 is a protein involved in regulating
cellular proliferation. In mutated phenotypes, the G1
phase of the cell cycle is greatly lengthened, impairing
normal proliferation. The C terminal region of caprin-1
contains RGG motifs which are characteristic of RNA
binding domains. It is possible that caprin-1 functions
through an RNA binding mechanism.
Length = 319
Score = 29.9 bits (67), Expect = 4.1
Identities = 25/111 (22%), Positives = 36/111 (32%), Gaps = 9/111 (8%)
Query: 378 PKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLL---PL-RKHEPIN 433
P + P M PV E+ L QP + +P Q P + PL +
Sbjct: 31 PAQSMDLPQMVCPPVHSESRLSQPSAVPVQPEPTQVPMVSPTSEGYTSSPPLYQPSHTAE 90
Query: 434 PRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPS 484
PR PQ+ P+ P + + + P P P P S
Sbjct: 91 PR-----PQTDPIDPIQASMSLNSEQTPTSSSLPAASQPQVFQTGSKPLHS 136
>gnl|CDD|240246 PTZ00053, PTZ00053, methionine aminopeptidase 2; Provisional.
Length = 470
Score = 30.1 bits (68), Expect = 4.4
Identities = 22/112 (19%), Positives = 39/112 (34%), Gaps = 12/112 (10%)
Query: 606 ETKQEPHESHSQHPKKKKKHKEKDKDRD--------KKHKEKKKHKDDKHKDKRKDKHRE 657
+Q+ KK KK K+ D D + + + K + K K K+K K ++
Sbjct: 10 VKQQKQQNKQKGTKKKNKKSKKDVDDDDAFLAELISENQEAENKQNNKKKKKKKKKKKKK 69
Query: 658 SSHVDPAPIKITIPKDKIIEMPCSSNLKKIKMKDSFE----NPLKIRISKDF 705
+ +S+++K+ E P I +SK F
Sbjct: 70 NLGEAYDLAYDLPVVWSSAAFQDNSHIRKLGNWPEQEWKQTQPPTIPVSKQF 121
>gnl|CDD|219621 pfam07890, Rrp15p, Rrp15p. Rrp15p is required for the formation of
60S ribosomal subunits.
Length = 132
Score = 28.9 bits (65), Expect = 4.5
Identities = 25/91 (27%), Positives = 44/91 (48%), Gaps = 18/91 (19%)
Query: 597 KILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHR 656
KILAS++ + +++P S S KK K K+K K + K K++ + +K + +K R
Sbjct: 8 KILASKLPASKRKDPILSRS---KKLLKAKKKLKSEKLEKKAKRQLRAEKR--QALEKGR 62
Query: 657 ESSHVDPAPIKITIPKDKIIEMPCSSNLKKI 687
+K +P+D E L+K+
Sbjct: 63 ---------VKPVLPEDLEKE----RRLRKV 80
>gnl|CDD|113398 pfam04625, DEC-1_N, DEC-1 protein, N-terminal region. The
defective chorion-1 gene (dec-1) in Drosophila encodes
follicle cell proteins necessary for proper eggshell
assembly. Multiple products of the dec-1 gene are formed
by alternative RNA splicing and proteolytic processing.
Cleavage products include S80 (80 kDa) which is
incorporated into the eggshell, and further proteolysis
of S80 gives S60 (60 kDa).
Length = 407
Score = 30.2 bits (67), Expect = 4.7
Identities = 13/40 (32%), Positives = 16/40 (40%)
Query: 445 PVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPS 484
P +P G PVP P P P PP + A P +
Sbjct: 94 PAMPSMPGLLGAAAPVPAPAPAPAAAPPAAPAPAADTPAA 133
>gnl|CDD|237541 PRK13881, PRK13881, conjugal transfer protein TrbI; Provisional.
Length = 472
Score = 30.1 bits (68), Expect = 4.8
Identities = 12/49 (24%), Positives = 19/49 (38%), Gaps = 4/49 (8%)
Query: 441 PQSKPVIPHEVRKNGHD--LPVPKPHPPPPPPPPPYMSAAKLPPPSHSD 487
+ P+ E+ LP+ +P P PP PP + P + D
Sbjct: 87 EPASPLKVPEMPTGPASAPLPIARPDNPDAPPTPP--ANPGNPGQVNDD 133
>gnl|CDD|224415 COG1498, SIK1, Protein implicated in ribosomal biogenesis, Nop56p
homolog [Translation, ribosomal structure and
biogenesis].
Length = 395
Score = 29.7 bits (67), Expect = 5.5
Identities = 9/52 (17%), Positives = 24/52 (46%)
Query: 602 EIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKD 653
E + + ++ E + P K K ++K + + ++KK+ K + ++
Sbjct: 344 EELEKRIEKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRGLQN 395
>gnl|CDD|217310 pfam02993, MCPVI, Minor capsid protein VI. This minor capsid
protein may act as a link between the external capsid
and the internal DNA-protein core. The C-terminal 11
residues may function as a protease cofactor leading to
enzyme activation.
Length = 238
Score = 29.4 bits (66), Expect = 5.5
Identities = 25/100 (25%), Positives = 33/100 (33%), Gaps = 9/100 (9%)
Query: 385 PPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSK 444
P + +AL +P +E V P EP S K P + +P
Sbjct: 117 PQEETVADPIQALQPRPRPDVEEVLVPAAP----EPPSYEETIKPGPA----PVEEPVDS 168
Query: 445 PVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPS 484
I L +P P P PPPP P S + S
Sbjct: 169 MAIAVPAIDTPVTLELP-PAPQPPPPVVPQPSTMVVHRRS 207
>gnl|CDD|221790 pfam12819, Malectin_like, Carbohydrate-binding protein of the ER.
Malectin is a membrane-anchored protein of the
endoplasmic reticulum that recognises and binds
Glc2-N-glycan. The domain is found on a number of plant
receptor kinases.
Length = 335
Score = 29.6 bits (67), Expect = 5.7
Identities = 8/33 (24%), Positives = 15/33 (45%)
Query: 324 DMNSRSSTSSTAVPINSMPSANTNKPPAHVFQT 356
+ S ST++ ++ + PP+ V QT
Sbjct: 199 FSSPGWSQISTSLSVDISSNNAPYIPPSAVLQT 231
>gnl|CDD|240274 PTZ00112, PTZ00112, origin recognition complex 1 protein;
Provisional.
Length = 1164
Score = 30.0 bits (67), Expect = 5.9
Identities = 16/122 (13%), Positives = 47/122 (38%), Gaps = 2/122 (1%)
Query: 537 EKNTSPVINKTPFKMKTPTPPSFSPIKISPSKSSEGLKAKLEPELVPVITKLGDDILTKP 596
++S + + +P S +S S SS+ ++ + D+
Sbjct: 124 SSSSSSISSSLTNISFFSSPTSIYSC-LSNSLSSKHSPKVIKENQSTHVNISSDNSPRNK 182
Query: 597 KILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHR 656
+I +++ + + H + ++ ++ K+ ++K + DK+ K +D +
Sbjct: 183 EI-SNKQLKKQTNVTHTTCYDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNIKKDRDGDK 241
Query: 657 ES 658
++
Sbjct: 242 QT 243
>gnl|CDD|220596 pfam10138, Tellurium_res, Tellurium resistance protein. Members of
this family confer resistance to the metalloid element
tellurium and its salts.
Length = 98
Score = 27.7 bits (62), Expect = 6.4
Identities = 10/15 (66%), Positives = 10/15 (66%)
Query: 459 PVPKPHPPPPPPPPP 473
PVP P P PP P PP
Sbjct: 4 PVPPPAPAPPAPAPP 18
>gnl|CDD|165527 PHA03269, PHA03269, envelope glycoprotein C; Provisional.
Length = 566
Score = 29.7 bits (66), Expect = 7.0
Identities = 18/83 (21%), Positives = 27/83 (32%), Gaps = 5/83 (6%)
Query: 327 SRSSTSSTAVPINSMPSANTNKPPAHVFQTSSSSRVPPPPPPHHHHSS-----AHVPKIK 381
+ S P + SA + KP T ++S P P H +S A P++
Sbjct: 46 PHQAASRAPDPAVAPTSAASRKPDLAQAPTPAASEKFDPAPAPHQAASRAPDPAVAPQLA 105
Query: 382 TEHPPMKVEPVMKEALLKQPPLI 404
P E A + P
Sbjct: 106 AAPKPDAAEAFTSAAQAHEAPAD 128
>gnl|CDD|237866 PRK14952, PRK14952, DNA polymerase III subunits gamma and tau;
Provisional.
Length = 584
Score = 29.5 bits (66), Expect = 7.1
Identities = 13/63 (20%), Positives = 21/63 (33%)
Query: 426 LRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSH 485
L++ E I R+ M+ P + + H P P P P P+
Sbjct: 373 LQRVERIETRLDMSIPANLLHNAPQAAPAPSAAAPEPKHQPAPEPRPVLAPTPASGEPNA 432
Query: 486 SDV 488
+ V
Sbjct: 433 AAV 435
>gnl|CDD|227458 COG5129, MAK16, Nuclear protein with HMG-like acidic region
[General function prediction only].
Length = 303
Score = 29.2 bits (65), Expect = 7.2
Identities = 20/73 (27%), Positives = 34/73 (46%), Gaps = 2/73 (2%)
Query: 585 ITKLGDDILTKPKILASEIISETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKD 644
+T + TK K L + S+ E ES + + + ++D+D D K K +K+ D
Sbjct: 217 VTDDSEKEKTKKKDLEKWLGSDQSMETSESEEE--ESSESESDEDEDEDNKGKIRKRKTD 274
Query: 645 DKHKDKRKDKHRE 657
D K ++ H E
Sbjct: 275 DAKKSRKPHIHIE 287
>gnl|CDD|233421 TIGR01453, grpIintron_endo, group I intron endonuclease. This
model represents one subfamily of endonucleases
containing the endo/excinuclease amino terminal domain,
pfam01541 at its amino end. A distinct subfamily
includes excinuclease abc subunit c (uvrC). Members of
pfam01541 are often termed GIY-YIG endonucleases after
conserved motifs near the amino end. This subfamily in
this model is found in open reading frames of group I
introns in both phage and mitochondria. The closely
related endonucleases of phage T4: segA, segB, segC,
segD and segE, score below the trusted cutoff for the
family.
Length = 214
Score = 28.9 bits (65), Expect = 7.3
Identities = 17/62 (27%), Positives = 24/62 (38%), Gaps = 9/62 (14%)
Query: 605 SETKQEPHESHSQ---HPKKKKKHKEKDKDRDKKHKEKKKHKDD----KHKDKRKDKHRE 657
ETK + + S +P K H E+ K K K K + KH ++ K K E
Sbjct: 101 EETKAKMSKLFSGKKNNPWYGKTHSEETK--AKISKNKLGENNPFFGKKHSEETKKKISE 158
Query: 658 SS 659
Sbjct: 159 KE 160
>gnl|CDD|215533 PLN02983, PLN02983, biotin carboxyl carrier protein of acetyl-CoA
carboxylase.
Length = 274
Score = 29.0 bits (65), Expect = 7.7
Identities = 18/60 (30%), Positives = 21/60 (35%), Gaps = 3/60 (5%)
Query: 426 LRKHEPINPRMMMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSH 485
L + P P +MM P + P P P PPP P AK P SH
Sbjct: 141 LPQPPPPAPVVMMQPPPPHAMPPASPPAAQ---PAPSAPASSPPPTPASPPPAKAPKSSH 197
>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34. This family represents
herpes virus protein U79 and cytomegalovirus early
phosphoprotein P34 (UL112).
Length = 238
Score = 29.1 bits (65), Expect = 7.9
Identities = 13/45 (28%), Positives = 23/45 (51%), Gaps = 3/45 (6%)
Query: 614 SHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRES 658
S Q K++ + +K K+ +K +E+K+ D+ DKR S
Sbjct: 163 SGKQKEKRRVEDSQKHKEDRRKKQEEKRRNDE---DKRPGGGGGS 204
>gnl|CDD|222301 pfam13665, DUF4150, Domain of unknown function (DUF4150).
Length = 110
Score = 27.9 bits (63), Expect = 7.9
Identities = 8/17 (47%), Positives = 9/17 (52%)
Query: 465 PPPPPPPPPYMSAAKLP 481
PP PP P PY + A
Sbjct: 15 PPGPPVPIPYPNFAMSA 31
>gnl|CDD|172376 PRK13855, PRK13855, type IV secretion system protein VirB10;
Provisional.
Length = 376
Score = 29.1 bits (65), Expect = 8.1
Identities = 7/28 (25%), Positives = 8/28 (28%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPPSHS 486
P P PP PP + P
Sbjct: 75 PAPIDVPPDPPAAQEAVQPTAPPSAQSE 102
>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
Length = 2084
Score = 29.7 bits (66), Expect = 8.3
Identities = 12/52 (23%), Positives = 22/52 (42%)
Query: 606 ETKQEPHESHSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDKHKDKRKDKHRE 657
E K++ E+ + KKK +E K + E + D+ + K + E
Sbjct: 1319 EAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAE 1370
>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
Length = 80
Score = 27.1 bits (60), Expect = 8.5
Identities = 13/47 (27%), Positives = 20/47 (42%), Gaps = 3/47 (6%)
Query: 603 IISETKQEPHES---HSQHPKKKKKHKEKDKDRDKKHKEKKKHKDDK 646
+I E K E S + P+ ++ D D DK K+ DD+
Sbjct: 31 LIEEEKVEIARSVMIEGEEPEDDDDDEDDDDDDDKDDKDDDDDDDDE 77
Score = 27.1 bits (60), Expect = 9.3
Identities = 6/28 (21%), Positives = 9/28 (32%)
Query: 627 EKDKDRDKKHKEKKKHKDDKHKDKRKDK 654
E+ +D D + D D D
Sbjct: 48 EEPEDDDDDEDDDDDDDKDDKDDDDDDD 75
>gnl|CDD|236940 PRK11633, PRK11633, cell division protein DedD; Provisional.
Length = 226
Score = 28.8 bits (65), Expect = 8.8
Identities = 28/107 (26%), Positives = 36/107 (33%), Gaps = 19/107 (17%)
Query: 377 VPKIKTEHPPMKVEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRM 436
VPK + + P +AL QPP E + + SL P + P
Sbjct: 44 VPK-PGDRDEPDMMPAATQALPTQPP----EGAAEAVRAGDAAAPSLDPAT----VAPPN 94
Query: 437 MMNKPQSKPVIPHEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPP 483
+P+ PV P P PKP P P P P P P
Sbjct: 95 TPVEPEPAPVEP----------PKPKPVEKPKPKPKPQQKVEAPPAP 131
>gnl|CDD|223880 COG0810, TonB, Periplasmic protein TonB, links inner and outer
membranes [Cell envelope biogenesis, outer membrane].
Length = 244
Score = 29.0 bits (65), Expect = 9.0
Identities = 26/100 (26%), Positives = 34/100 (34%), Gaps = 10/100 (10%)
Query: 389 VEPVMKEALLKQPPLIKQEPNVKQEPYIKSEPHSLLPLRKHEPINPRMMMNKPQSKPVIP 448
V + K +EP + EP + P EP P KP+ KP
Sbjct: 37 VPLAVFLLAAKVLEAPTEEPQPEPEPPEEQPKPPTEPETPPEPTPP-----KPKEKPKPE 91
Query: 449 HEVRKNGHDLPVPKPHPPPPPPPPPYMSAAKLPPPSHSDV 488
+ +K P PKP P P P P PPS +
Sbjct: 92 KKPKK-----PKPKPKPKPKPKPKVKPQPKPKKPPSKTAA 126
>gnl|CDD|227106 COG4765, COG4765, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 164
Score = 28.3 bits (63), Expect = 9.2
Identities = 14/80 (17%), Positives = 24/80 (30%), Gaps = 4/80 (5%)
Query: 81 FHRNSIATAALFLAAKVEEQPRKLEHVIRVAQLCLFKNQPPLDPRSEAYQEQAQ----EI 136
IA + V E +E R Q+ L P + A + + +
Sbjct: 4 LSIIFIAGICVARQIGVAENLESVERSARDVQIPLDTAHPITNAVDPAEAARFKNYVAKF 63
Query: 137 VVNENVLLQTLGFDVGIEHP 156
+ + + FDV I
Sbjct: 64 SALDKITGRITEFDVYIGET 83
>gnl|CDD|233848 TIGR02400, trehalose_OtsA, alpha,alpha-trehalose-phosphate synthase
[UDP-forming]. This enzyme catalyzes the key,
penultimate step in biosynthesis of trehalose, a
compatible solute made as an osmoprotectant in some
species in all three domains of life. The gene symbol
OtsA stands for osmotically regulated trehalose
synthesis A. Trehalose helps protect against both
osmotic and thermal stresses, and is made from two
glucose subunits. This model excludes
glucosylglycerol-phosphate synthase, an enzyme of an
analogous osmoprotectant system in many cyanobacterial
strains. This model does not identify archaeal examples,
as they are more divergent than
glucosylglycerol-phosphate synthase. Sequences that
score in the gray zone between the trusted and noise
cutoffs include a number of yeast multidomain proteins
in which the N-terminal domain may be functionally
equivalent to this family. The gray zone also includes
the OtsA of Corynebacterium glutamicum (and related
species), shown to be responsible for synthesis of only
trace amounts of trehalose while the majority is
synthesized by the TreYZ pathway; the significance of
OtsA in this species is unclear (see Wolf, et al.,
PMID:12890033) [Cellular processes, Adaptations to
atypical conditions].
Length = 456
Score = 29.2 bits (66), Expect = 9.4
Identities = 6/22 (27%), Positives = 10/22 (45%)
Query: 64 TAIVYMHRFYVFHSFTQFHRNS 85
T I Y++R Y +R +
Sbjct: 335 TPIRYLNRSYDREELMALYRAA 356
>gnl|CDD|240577 cd12950, RRP7_Rrp7p, RRP7 domain ribosomal RNA-processing protein 7
(Rrp7p) and similar proteins. This CD corresponds to
the RRP7 domain of Rrp7p. Rrp7p is encoded by YCL031C
gene from Saccharomyces cerevisiae. It is an essential
yeast protein involved in pre-rRNA processing and
ribosome assembly. Rrp7p contains an N-terminal RNA
recognition motif (RRM), also termed RBD (RNA binding
domain) or RNP (ribonucleoprotein domain), and a
C-terminal RRP7 domain.
Length = 128
Score = 28.0 bits (63), Expect = 9.6
Identities = 20/53 (37%), Positives = 29/53 (54%), Gaps = 12/53 (22%)
Query: 613 ESHSQHPKKKKKHKEKDKD------RDKKHKE----KKKHKDDKHK-DKRKDK 654
E + KKKKK KE +D R+KK +E KK ++DK + +K K+K
Sbjct: 76 EEKKEKEKKKKKKKEL-EDFYRFQLREKKKEEQADLLKKFEEDKERVEKMKEK 127
>gnl|CDD|177952 PLN02318, PLN02318, phosphoribulokinase/uridine kinase.
Length = 656
Score = 29.1 bits (65), Expect = 9.6
Identities = 24/98 (24%), Positives = 40/98 (40%), Gaps = 12/98 (12%)
Query: 227 YIDKEVTQEQLEQLTEEFLAIFDKCPSKLKKRICSISSNQNSTLMAAFDGDSKKMSGLGN 286
YI+ Q QLE+L E +A+ + +KL +SS + + A+ D +K +
Sbjct: 404 YIE----QIQLEKLVNEVMALPEDLKTKLSLDDDLVSSPKEALSRASADRRNKNLKS--- 456
Query: 287 ATFAPPHSTSGRVTDDKRRSEHNGPPPEYRKLMAGGRD 324
HS S + DK S+ G R+ +
Sbjct: 457 ---GLSHSYSTQ--RDKNLSKLTGLAVTNRRFDERNSE 489
>gnl|CDD|237030 PRK12270, kgd, alpha-ketoglutarate decarboxylase; Reviewed.
Length = 1228
Score = 29.1 bits (66), Expect = 9.9
Identities = 6/25 (24%), Positives = 7/25 (28%)
Query: 459 PVPKPHPPPPPPPPPYMSAAKLPPP 483
P P P +AA P
Sbjct: 84 PKPAAAAAAAAAPAAPPAAAAAAAP 108
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.313 0.128 0.382
Gapped
Lambda K H
0.267 0.0723 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 36,115,245
Number of extensions: 3453919
Number of successful extensions: 8065
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6462
Number of HSP's successfully gapped: 389
Length of query: 730
Length of database: 10,937,602
Length adjustment: 104
Effective length of query: 626
Effective length of database: 6,324,786
Effective search space: 3959316036
Effective search space used: 3959316036
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 63 (28.1 bits)