RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy13778
         (295 letters)



>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 46.6 bits (110), Expect = 1e-05
 Identities = 30/97 (30%), Positives = 48/97 (49%), Gaps = 6/97 (6%)

Query: 67  PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQG 126
           PTL + P   + P  P+ P  P+ P   + P   Q PT  + P LP+   +P+ P  P+ 
Sbjct: 573 PTLSKKP---EFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629

Query: 127 PTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGP 163
           P  P+ P   Q P+  + P   +GP +++ P  P+ P
Sbjct: 630 PKSPKRPPPPQRPSSPERP---EGPKIIKSPKPPKSP 663



 Score = 45.8 bits (108), Expect = 2e-05
 Identities = 38/135 (28%), Positives = 55/135 (40%), Gaps = 19/135 (14%)

Query: 79  PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQG 138
           PTL + P  P+ P   +DP  P+ P   + P   Q PT P+ P LP+   +P+ P   + 
Sbjct: 573 PTLSKKPEFPKDPKHPKDPEEPKKP---KRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629

Query: 139 PTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPT 198
           P   + P   Q P+    P  P+GP             K    P+ P  P+ P  P+   
Sbjct: 630 PKSPKRPPPPQRPS---SPERPEGP-------------KIIKSPKPPKSPKPPFDPKFKE 673

Query: 199 LLQDPTLLQGPRYKE 213
              D  L    + KE
Sbjct: 674 KFYDDYLDAAAKSKE 688



 Score = 45.8 bits (108), Expect = 2e-05
 Identities = 32/111 (28%), Positives = 50/111 (45%), Gaps = 12/111 (10%)

Query: 55  PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
           PTL + P  P+ P   + P   + P  P+ P   Q PT  + P LP+   + + P  P+ 
Sbjct: 573 PTLSKKPEFPKDP---KHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPES 629

Query: 115 PTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTL 165
           P  P+ P  PQ P+ P+ P         +GP +++ P   + P  P  P  
Sbjct: 630 PKSPKRPPPPQRPSSPERP---------EGPKIIKSPKPPKSPKPPFDPKF 671



 Score = 44.3 bits (104), Expect = 6e-05
 Identities = 24/81 (29%), Positives = 38/81 (46%)

Query: 53  QGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLP 112
           + P   + P  P+ P   + P   Q PT P+ P LP+   + + P  P+ P   + P  P
Sbjct: 580 EFPKDPKHPKDPEEPKKPKRPRSAQRPTRPKSPKLPELLDIPKSPKRPESPKSPKRPPPP 639

Query: 113 QGPTLPQGPTLPQGPTLPQGP 133
           Q P+ P+ P  P+    P+ P
Sbjct: 640 QRPSSPERPEGPKIIKSPKPP 660


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 40.1 bits (94), Expect = 0.001
 Identities = 42/160 (26%), Positives = 46/160 (28%), Gaps = 7/160 (4%)

Query: 52  LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQG-PTLPQGPTLLQDPTLPQGPTLLQGPT 110
           LQ          P    L QG    Q     QG P  P G            P  +Q   
Sbjct: 163 LQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPP----QGHPEQVQPQQ 218

Query: 111 LPQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPT 170
               P+       P  P LPQ P  LQ P        +  P         Q P     P 
Sbjct: 219 FLPAPS-QAPAQPPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPP 277

Query: 171 LLQGPRKGPTLPQGPTLPQGP-TLPQDPTLLQDPTLLQGP 209
               P   P LPQG   P  P   PQ   L+Q P   Q  
Sbjct: 278 PQNQPTPHPGLPQGQNAPLPPPQQPQLLPLVQQPQGQQRG 317



 Score = 35.9 bits (83), Expect = 0.025
 Identities = 29/173 (16%), Positives = 37/173 (21%), Gaps = 11/173 (6%)

Query: 53  QGPTLLQGPTLPQG---------PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGP 103
                     LPQG                           P   Q    L  P+     
Sbjct: 170 PQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGYPQPPQGHPEQVQPQQFLPAPSQAPAQ 229

Query: 104 TLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGP 163
             L      Q P L Q         +P  P               Q P   Q    P  P
Sbjct: 230 PPLPPQLPQQPPPLQQPQFPGLSQQMPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLP 289

Query: 164 TLLQGPT-LLQGPRKGPTLPQGPTLPQGPTLPQDPT-LLQDPTLLQGPRYKEK 214
                P    Q P+  P + Q     +GP   +    L Q           ++
Sbjct: 290 QGQNAPLPPPQQPQLLPLVQQPQGQQRGPQFREQLVQLSQQQREALSQEEAKR 342



 Score = 35.5 bits (82), Expect = 0.033
 Identities = 42/156 (26%), Positives = 46/156 (29%), Gaps = 7/156 (4%)

Query: 59  QGPTLPQGPTLLQGPTLLQG-PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
              + PQ  T  Q    L+      Q               LPQG    Q     QGP  
Sbjct: 139 APESQPQPQTPAQKMLSLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGP-- 196

Query: 118 PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGP-R 176
           P+ P  P  P  PQG      P                 P LPQ P  LQ P       +
Sbjct: 197 PEQP--PGYPQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQPPPLQQPQFPGLSQQ 254

Query: 177 KGPTLPQGPTLPQGPTLPQD-PTLLQDPTLLQGPRY 211
             P  PQ P   Q P  PQ  P     PT   G   
Sbjct: 255 MPPPPPQPPQQQQQPPQPQAQPPPQNQPTPHPGLPQ 290


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 39.2 bits (91), Expect = 0.002
 Identities = 29/134 (21%), Positives = 37/134 (27%)

Query: 62   TLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGP 121
                 P     P     PT P  P  P  P+L    ++  G  + + P        P  P
Sbjct: 2819 PPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAP 2878

Query: 122  TLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTL 181
              P    L +        +    P   + P   Q P  PQ       P   Q P   P  
Sbjct: 2879 ARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPR 2938

Query: 182  PQGPTLPQGPTLPQ 195
            PQ P  P       
Sbjct: 2939 PQPPLAPTTDPAGA 2952



 Score = 38.4 bits (89), Expect = 0.004
 Identities = 27/140 (19%), Positives = 34/140 (24%), Gaps = 1/140 (0%)

Query: 60   GPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQ 119
                P  P     P +  GP  P GP  P  P     P  P  P         +      
Sbjct: 2729 RQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAV 2788

Query: 120  GPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQ-GPTLLQGPRKG 178
                    +LP        P  +  P     P       LP   +     P    GP   
Sbjct: 2789 ASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPP 2848

Query: 179  PTLPQGPTLPQGPTLPQDPT 198
                 G   P G    + P+
Sbjct: 2849 SLPLGGSVAPGGDVRRRPPS 2868



 Score = 36.8 bits (85), Expect = 0.014
 Identities = 35/150 (23%), Positives = 40/150 (26%)

Query: 59   QGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLP 118
              PT    P  L   T L           P  P     P +P GP    GP  P  P   
Sbjct: 2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTT 2763

Query: 119  QGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKG 178
             GP  P  P  P      +              +L         P  +  P     P   
Sbjct: 2764 AGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAS 2823

Query: 179  PTLPQGPTLPQGPTLPQDPTLLQDPTLLQG 208
            P  P  P     PT P  P     P+L  G
Sbjct: 2824 PAGPLPPPTSAQPTAPPPPPGPPPPSLPLG 2853



 Score = 36.1 bits (83), Expect = 0.024
 Identities = 31/147 (21%), Positives = 35/147 (23%)

Query: 53   QGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLP 112
              PT    P      T L           P  P  P  P +   P  P GP     P   
Sbjct: 2704 PPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTT 2763

Query: 113  QGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLL 172
             GP  P  P  P      +              +L         P     P     P   
Sbjct: 2764 AGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAS 2823

Query: 173  QGPRKGPTLPQGPTLPQGPTLPQDPTL 199
                  P     PT P  P  P  P+L
Sbjct: 2824 PAGPLPPPTSAQPTAPPPPPGPPPPSL 2850



 Score = 34.1 bits (78), Expect = 0.10
 Identities = 34/162 (20%), Positives = 42/162 (25%), Gaps = 14/162 (8%)

Query: 61   PTLPQGPTLLQGPTLLQGPTLPQGP---TLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
            P +P GP    GP     P    GP     P  P       L +        +    P  
Sbjct: 2742 PAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLP-S 2800

Query: 118  PQGPTLPQGPTLPQGPTL--LQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQG- 174
            P  P  P    L     L     P     P     PT    P  P  P+L  G ++  G 
Sbjct: 2801 PWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGG 2860

Query: 175  -------PRKGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGP 209
                    R     P  P  P    L +        +    P
Sbjct: 2861 DVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPP 2902



 Score = 34.1 bits (78), Expect = 0.11
 Identities = 35/143 (24%), Positives = 42/143 (29%), Gaps = 5/143 (3%)

Query: 58   LQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
             Q P        +   T L  P  P     P    L+    LP GP   +  + P  P  
Sbjct: 2680 PQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQAS-PALPAA 2738

Query: 118  PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL-LQGPR 176
            P  P +P GP  P GP     P    GP     P     P       L +     L   R
Sbjct: 2739 PAPPAVPAGPATPGGPARPARPPTTAGPP---APAPPAAPAAGPPRRLTRPAVASLSESR 2795

Query: 177  KGPTLPQGPTLPQGPTLPQDPTL 199
            +    P  P  P    L     L
Sbjct: 2796 ESLPSPWDPADPPAAVLAPAAAL 2818



 Score = 32.6 bits (74), Expect = 0.26
 Identities = 35/207 (16%), Positives = 53/207 (25%), Gaps = 13/207 (6%)

Query: 52   LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTL 111
                  L G   P G    + P+       P  P  P    L +        +    P  
Sbjct: 2846 PPPSLPLGGSVAPGGDVRRRPPSR-SPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQ 2904

Query: 112  PQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL 171
            P+ P  PQ P  PQ    P  P   Q P                     +    +  P L
Sbjct: 2905 PERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWL 2964

Query: 172  LQGPRKGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGPRY----KEKELVERSAYGETE- 226
                     +P+       P+     +     T     R         L E +       
Sbjct: 2965 GALVPGRVAVPRFRVPQPAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPPPVSL 3024

Query: 227  -------DEEEESDTEDIRDDDDDDME 246
                   D+ E+SD + + D D +  +
Sbjct: 3025 KQTLWPPDDTEDSDADSLFDSDSERSD 3051



 Score = 32.2 bits (73), Expect = 0.39
 Identities = 29/128 (22%), Positives = 37/128 (28%), Gaps = 3/128 (2%)

Query: 79   PTLPQGPTLPQGPTL--LQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLL 136
            P  P    L     L     P  P  P     PT P  P  P  P+LP G ++  G  + 
Sbjct: 2804 PADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVR 2863

Query: 137  QGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQD 196
            + P           P       L + P + +       P   P  P  P  P  P     
Sbjct: 2864 RRPPSRSPAAKPAAPARPPVRRLAR-PAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQ 2922

Query: 197  PTLLQDPT 204
            P     P 
Sbjct: 2923 PPPPPQPQ 2930



 Score = 29.5 bits (66), Expect = 2.4
 Identities = 29/139 (20%), Positives = 38/139 (27%), Gaps = 2/139 (1%)

Query: 73   PTLLQGPTLPQGPTLP--QGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLP 130
            P       L     LP    P     P     PT    P  P  P+LP G ++  G  + 
Sbjct: 2804 PADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVR 2863

Query: 131  QGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQG 190
            + P           P       L +        +    P   + P +    P     PQ 
Sbjct: 2864 RRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQP 2923

Query: 191  PTLPQDPTLLQDPTLLQGP 209
            P  PQ       P   Q P
Sbjct: 2924 PPPPQPQPPPPPPPRPQPP 2942


>gnl|CDD|218556 pfam05327, RRN3, RNA polymerase I specific transcription initiation
           factor RRN3.  This family consists of several eukaryotic
           proteins which are homologous to the yeast RRN3 protein.
           RRN3 is one of the RRN genes specifically required for
           the transcription of rDNA by RNA polymerase I (Pol I) in
           Saccharomyces cerevisiae.
          Length = 554

 Score = 38.8 bits (91), Expect = 0.003
 Identities = 19/60 (31%), Positives = 29/60 (48%), Gaps = 5/60 (8%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGTRELLTMKLDSCLH 283
           + +DEEEE    D  DDD+DDM  +     D E+       +R    +E+   KLD+ + 
Sbjct: 224 DIDDEEEERVLADEDDDDEDDMFDM--DDDDEEESDPE--VERTSTIKEVSE-KLDAIMD 278


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 37.5 bits (87), Expect = 0.003
 Identities = 19/48 (39%), Positives = 28/48 (58%), Gaps = 3/48 (6%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIR--DDDDDDMEIVVCTSQDSED 257
           KE E V+     E ++EEEE D E+I   +D +D+ EIV    ++ ED
Sbjct: 38  KENEDVKDEK-QEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEED 84


>gnl|CDD|214818 smart00784, SPT2, SPT2 chromatin protein.  This entry includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 106

 Score = 36.2 bits (84), Expect = 0.003
 Identities = 17/52 (32%), Positives = 25/52 (48%), Gaps = 15/52 (28%)

Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIR---------------DDDDDDME 246
            Y E+E  +   + E +DEE++ D ++I                DDDDDDME
Sbjct: 14  DYDEEEDEDMDDFIEDDDEEDDYDRDEIWAMFNKGRKRYAYRDDDDDDDDME 65



 Score = 33.1 bits (76), Expect = 0.044
 Identities = 10/37 (27%), Positives = 17/37 (45%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIV 248
               L       +  DEEE+ D +D  +DDD++ +  
Sbjct: 1   TSPRLERSRRSRDDYDEEEDEDMDDFIEDDDEEDDYD 37


>gnl|CDD|218303 pfam04874, Mak16, Mak16 protein C-terminal region.  The precise
           function of this eukaryotic protein family is unknown.
           The yeast orthologues have been implicated in cell cycle
           progression and biogenesis of 60S ribosomal subunits.
           The Schistosoma mansoni Mak16 has been shown to target
           protein transport to the nucleolus.
          Length = 97

 Score = 34.8 bits (80), Expect = 0.009
 Identities = 21/76 (27%), Positives = 30/76 (39%), Gaps = 36/76 (47%)

Query: 213 EKELVER---SAYG----------------------------ETEDEEEESDTEDIRDDD 241
           EKEL+ER     YG                            E E+EE+E + E + DD+
Sbjct: 27  EKELLERLKQGTYGDEPYNISQSAFKKALEAEESEENDEEEEEEEEEEDEGEIEYVSDDE 86

Query: 242 DDDMEIVVCTSQDSED 257
           + + EI     +D ED
Sbjct: 87  ELEEEI-----EDLED 97


>gnl|CDD|165146 PHA02781, PHA02781, hypothetical protein; Provisional.
          Length = 78

 Score = 33.9 bits (77), Expect = 0.011
 Identities = 17/40 (42%), Positives = 24/40 (60%), Gaps = 3/40 (7%)

Query: 210 RYKEKE--LVERSAYGETED-EEEESDTEDIRDDDDDDME 246
           + KEK+  L +  A  E +D + EE +  DI DDDD D+E
Sbjct: 37  KKKEKDVLLAQSVAVEEAKDVKVEEKNIIDIEDDDDMDVE 76


>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
          Length = 80

 Score = 33.2 bits (76), Expect = 0.022
 Identities = 15/32 (46%), Positives = 19/32 (59%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDD 244
           EK  + RS   E E+ E++ D ED  DDDD D
Sbjct: 35  EKVEIARSVMIEGEEPEDDDDDEDDDDDDDKD 66



 Score = 32.5 bits (74), Expect = 0.035
 Identities = 11/23 (47%), Positives = 17/23 (73%)

Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
           + ED++++ D +D  DDDDDD E
Sbjct: 55  DDEDDDDDDDKDDKDDDDDDDDE 77



 Score = 31.7 bits (72), Expect = 0.068
 Identities = 14/38 (36%), Positives = 20/38 (52%), Gaps = 9/38 (23%)

Query: 223 GET---EDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
           GE    +D++E+ D +D +DD DDD         D ED
Sbjct: 47  GEEPEDDDDDEDDDDDDDKDDKDDD------DDDDDED 78


>gnl|CDD|218889 pfam06088, TLP-20, Nucleopolyhedrovirus telokin-like protein-20
           (TLP20).  This family consists of several
           Nucleopolyhedrovirus telokin-like protein-20 (TLP20)
           sequences. The function of this family is unknown but
           TLP20 is known to share some antigenic similarities to
           the smooth muscle protein telokin although the amino
           acid sequence shows no homologies to telokin.
          Length = 162

 Score = 34.2 bits (79), Expect = 0.030
 Identities = 10/47 (21%), Positives = 17/47 (36%), Gaps = 7/47 (14%)

Query: 220 SAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
           SA       E+E+D ++  + D   +E          D   P   K+
Sbjct: 114 SAPVPHHSSEDENDEDEEDNADRAGIE-------SGIDDSAPPSPKK 153


>gnl|CDD|117592 pfam09026, Cenp-B_dimeris, Centromere protein B dimerisation
           domain.  The centromere protein B (CENP-B) dimerisation
           domain is composed of two alpha-helices, which are
           folded into an antiparallel configuration. Dimerisation
           of CENP-B is mediated by this domain, in which monomers
           dimerise to form a symmetrical, antiparallel, four-helix
           bundle structure with a large hydrophobic patch in which
           23 residues of one monomer form van der Waals contacts
           with the other monomer. This CENP-B dimer configuration
           may be suitable for capturing two distant CENP-B boxes
           during centromeric heterochromatin formation.
          Length = 101

 Score = 33.2 bits (75), Expect = 0.035
 Identities = 18/41 (43%), Positives = 24/41 (58%), Gaps = 7/41 (17%)

Query: 227 DEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRM 267
           DEEE+ D ED  DDD+DD E       D +++P PS  + M
Sbjct: 17  DEEEDDDDEDEEDDDEDDDE-------DDDEVPVPSFGEAM 50



 Score = 28.6 bits (63), Expect = 1.3
 Identities = 12/26 (46%), Positives = 19/26 (73%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVV 249
           E ED+++E + +D  DDD+DD E+ V
Sbjct: 18  EEEDDDDEDEEDDDEDDDEDDDEVPV 43


>gnl|CDD|235033 PRK02363, PRK02363, DNA-directed RNA polymerase subunit delta;
           Reviewed.
          Length = 129

 Score = 32.7 bits (75), Expect = 0.077
 Identities = 9/45 (20%), Positives = 22/45 (48%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
            +E  ++      + +++  D + + DDD D+ ++     +D ED
Sbjct: 83  LEEKFDKKKKKFMDGDDDIIDDDILPDDDFDEEDLDEEDDEDEED 127



 Score = 29.6 bits (67), Expect = 0.89
 Identities = 7/23 (30%), Positives = 15/23 (65%)

Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
             +D+ +E D ++  D+D++D E
Sbjct: 107 LPDDDFDEEDLDEEDDEDEEDEE 129


>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
           Members of this family are bacterial proteins with a
           conserved motif [KR]FYDLN, sometimes flanked by a pair
           of CXXC motifs, followed by a long region of low
           complexity sequence in which roughly half the residues
           are Asp and Glu, including multiple runs of five or more
           acidic residues. The function of members of this family
           is unknown.
          Length = 104

 Score = 31.1 bits (71), Expect = 0.16
 Identities = 11/36 (30%), Positives = 22/36 (61%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEI 247
           K+ E  E       +D++++ D +D+ D DDDD+++
Sbjct: 55  KKDEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDVDL 90



 Score = 27.7 bits (62), Expect = 3.3
 Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 5/48 (10%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
           K +     +     +DE+EE + + + DDDDDD +       D  DL 
Sbjct: 42  KSRAPAADAEDAAKKDEDEEDEDDVVLDDDDDDDD-----DDDLPDLD 84



 Score = 27.3 bits (61), Expect = 3.5
 Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
            E E  E     + +D++++ D     DDDD D++
Sbjct: 57  DEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDVDLD 91



 Score = 27.3 bits (61), Expect = 4.1
 Identities = 7/41 (17%), Positives = 15/41 (36%)

Query: 209 PRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVV 249
              ++    +     E +   ++ D +D  DD  D  +  V
Sbjct: 48  ADAEDAAKKDEDEEDEDDVVLDDDDDDDDDDDLPDLDDDDV 88


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 33.1 bits (76), Expect = 0.16
 Identities = 19/66 (28%), Positives = 29/66 (43%), Gaps = 15/66 (22%)

Query: 205 LLQGPRYKEKELVERSA-YGETED---------EEEESDTEDIRD---DDDDDMEIVVCT 251
           LL+  ++KE+E  ++ A  G   D         E EE +  D      D + D EI    
Sbjct: 74  LLE--KWKEEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGEWIDVESDKEIESSD 131

Query: 252 SQDSED 257
           S+D E+
Sbjct: 132 SEDEEE 137


>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
           subunit 1; Provisional.
          Length = 319

 Score = 32.7 bits (75), Expect = 0.18
 Identities = 16/57 (28%), Positives = 28/57 (49%), Gaps = 5/57 (8%)

Query: 201 QDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
            +P ++ G     +EL+E     + E+EEEE D  +  D+D++D +       D  D
Sbjct: 267 GEPEVVGGDEEDLEELLE-----KAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEGD 318



 Score = 32.7 bits (75), Expect = 0.19
 Identities = 14/35 (40%), Positives = 22/35 (62%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
           K +E  E   Y E+EDE+EE + E+  +DDD+  +
Sbjct: 285 KAEEEEEEDDYSESEDEDEEDEDEEEEEDDDEGDK 319


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 32.8 bits (75), Expect = 0.21
 Identities = 37/203 (18%), Positives = 57/203 (28%), Gaps = 27/203 (13%)

Query: 61  PTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG----PT 116
           P  P GPT+   P  L    +P   T     + L  P   +    LQ           P+
Sbjct: 104 PAGPAGPTIQTEPGQLYPVQVPVMVTQNPANSPLDQPAQQRALQQLQQRYGAPASGQLPS 163

Query: 117 LPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPR 176
             Q         L Q P     P    G       +           TL Q     +G  
Sbjct: 164 QQQSAQKNDESQLQQQPNGETPPQQTDGA--GDDESEALVRLREADGTLEQRIKGAEGGG 221

Query: 177 KGPTLPQGPTLPQGPTLPQDPTLLQDPTLLQGPRYKEKELVER-SAYGETEDEEEESDTE 235
                                 L Q     +  + +    ++   +  E +  +++ D +
Sbjct: 222 ------------------AMKVLKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDDED 263

Query: 236 DIRDDDDDDMEIVVCTSQDSEDL 258
            I  D DD  + V    +D EDL
Sbjct: 264 AIESDLDDSDDDVS--DEDGEDL 284



 Score = 31.3 bits (71), Expect = 0.50
 Identities = 41/240 (17%), Positives = 59/240 (24%), Gaps = 45/240 (18%)

Query: 52  LQGPTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTL 111
            Q P  L  P   Q    L      Q  T    P      T       P GPT+   P  
Sbjct: 60  AQLPQPLPQPPPTQALQALPAGDQQQHNTPTGSPAANPPATFALPAG-PAGPTIQTEPGQ 118

Query: 112 PQGPTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTL 171
                +P   T     +    P   +    LQ           Q P+  Q         L
Sbjct: 119 LYPVQVPVMVTQNPANSPLDQPAQQRALQQLQQRYGAPA--SGQLPSQQQSAQKNDESQL 176

Query: 172 LQGPRKGPTLPQGPTLPQGPTLP------------------------QDPTLLQDPTLLQ 207
            Q P      P   T   G                                L Q     +
Sbjct: 177 QQQP--NGETPPQQTDGAGDDESEALVRLREADGTLEQRIKGAEGGGAMKVLKQPKKQAK 234

Query: 208 GPRYKEKELVERSAYGETEDEEEESDTED-----IRDDDDD----------DME-IVVCT 251
             + +    ++     +  D  ++ D ED     + D DDD          D + +++C 
Sbjct: 235 SSKRRTIAQIDGIDSDDEGDGSDDDDDEDAIESDLDDSDDDVSDEDGEDLFDTDNVMLCQ 294


>gnl|CDD|217783 pfam03896, TRAP_alpha, Translocon-associated protein (TRAP), alpha
           subunit.  The alpha-subunit of the TRAP complex (TRAP
           alpha) is a single-spanning membrane protein of the
           endoplasmic reticulum (ER) which is found in proximity
           of nascent polypeptide chains translocating across the
           membrane.
          Length = 281

 Score = 32.1 bits (73), Expect = 0.29
 Identities = 15/49 (30%), Positives = 21/49 (42%), Gaps = 4/49 (8%)

Query: 218 ERSAYGETEDEEEESDTEDIRDDDD----DDMEIVVCTSQDSEDLPGPS 262
             SA   TEDEE E D  D  ++D+    +D   +    +D E     S
Sbjct: 28  FASAQDLTEDEEAEDDVVDEDEEDEAVVEEDENELTEEEEDEEGEVKAS 76



 Score = 28.6 bits (64), Expect = 3.5
 Identities = 10/44 (22%), Positives = 22/44 (50%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSE 256
           ++E  +     + EDE    + E+   ++++D E  V  S D++
Sbjct: 37  DEEAEDDVVDEDEEDEAVVEEDENELTEEEEDEEGEVKASPDAD 80


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 32.3 bits (74), Expect = 0.29
 Identities = 12/35 (34%), Positives = 16/35 (45%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDL 258
             EDEEEE D  D  D++DDD ++           
Sbjct: 319 GEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSD 353



 Score = 30.4 bits (69), Expect = 1.5
 Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 223 GETEDEEEESDTEDIRDDDDDDME 246
           GE ED+EEE D+++  DD DD+ E
Sbjct: 284 GEEEDDEEEEDSKESADDLDDEFE 307



 Score = 29.6 bits (67), Expect = 2.4
 Identities = 13/34 (38%), Positives = 17/34 (50%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
           + E+EE+  D ED  DDDDD  E         E+
Sbjct: 322 DEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEE 355



 Score = 28.4 bits (64), Expect = 5.6
 Identities = 11/47 (23%), Positives = 22/47 (46%), Gaps = 4/47 (8%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGT 270
           E E++ + SD E+  +D+D D E      ++ E+       K+   +
Sbjct: 344 EEEEDVDLSDEEEDEEDEDSDDE----DDEEEEEEEKEKKKKKSAES 386



 Score = 28.0 bits (63), Expect = 6.9
 Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 222 YGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPS 262
             E ED  ++ D ED  DD +++ E V  + ++ ++    S
Sbjct: 323 EEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDS 363


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 32.4 bits (73), Expect = 0.31
 Identities = 38/137 (27%), Positives = 44/137 (32%), Gaps = 9/137 (6%)

Query: 64  PQGPTLLQGPTLLQGPTLPQGPTL------PQGPTLLQDPTLPQGPTLLQGPTLPQGPTL 117
           PQGP  +Q P           PT       PQG  +   P            +L   P+L
Sbjct: 175 PQGPPSIQVPPGAALAPSAPPPTPSAQAVPPQGSPIAAQPAPQPQQPSPL--SLISAPSL 232

Query: 118 PQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRK 177
                LP      Q  T  Q       P+     +   GP  P    L QGP  LQ P  
Sbjct: 233 -HPQRLPSPHPPLQPQTASQQSPQPPAPSSRHPQSSHHGPGPPMPHALQQGPVFLQHPSS 291

Query: 178 GPTLPQGPTLPQGPTLP 194
            P  P G    Q P LP
Sbjct: 292 NPPQPFGLAQSQVPPLP 308



 Score = 29.7 bits (66), Expect = 2.1
 Identities = 35/142 (24%), Positives = 43/142 (30%), Gaps = 11/142 (7%)

Query: 61  PTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQG 120
                 P  L     L    LP      Q  T  Q    P  P+       PQ      G
Sbjct: 217 QPQQPSPLSLISAPSLHPQRLPSPHPPLQPQTASQQSPQPPAPS----SRHPQSSHHGPG 272

Query: 121 PTLPQGPTLPQGPTLLQGPTLL----QGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPR 176
           P +P    L QGP  LQ P+       G    Q P  L  P+  Q  +          P+
Sbjct: 273 PPMPH--ALQQGPVFLQHPSSNPPQPFGLAQSQVP-PLPLPSQAQPHSHTPPSQSALQPQ 329

Query: 177 KGPTLPQGPTLPQGPTLPQDPT 198
           + P     P  P  P +   PT
Sbjct: 330 QPPREQPLPPAPSMPHIKPPPT 351



 Score = 29.3 bits (65), Expect = 2.8
 Identities = 27/94 (28%), Positives = 34/94 (36%), Gaps = 4/94 (4%)

Query: 63  LPQGPTLLQGPTLL----QGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLP 118
           L QGP  LQ P+       G    Q P LP           P   + LQ    P+   LP
Sbjct: 279 LQQGPVFLQHPSSNPPQPFGLAQSQVPPLPLPSQAQPHSHTPPSQSALQPQQPPREQPLP 338

Query: 119 QGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPT 152
             P++P     P  P         + P  LQGP+
Sbjct: 339 PAPSMPHIKPPPTTPIPQLPNQSHKHPPHLQGPS 372


>gnl|CDD|221490 pfam12253, CAF1A, Chromatin assembly factor 1 subunit A.  The CAF-1
           or chromatin assembly factor-1 consists of three
           subunits, and this is the first, or A. The A domain is
           uniquely required for the progression of S phase in
           mouse cells, independent of its ability to promote
           histone deposition but dependent on its ability to
           interact with HP1 - heterochromatin protein 1-rich
           heterochromatin domains next to centromeres that are
           crucial for chromosome segregation during mitosis. This
           HP1-CAF-1 interaction module functions as a built-in
           replication control for heterochromatin, which, like a
           control barrier, has an impact on S-phase progression in
           addition to DNA-based checkpoints.
          Length = 76

 Score = 29.9 bits (68), Expect = 0.32
 Identities = 9/22 (40%), Positives = 15/22 (68%), Gaps = 1/22 (4%)

Query: 226 EDE-EEESDTEDIRDDDDDDME 246
           + E EEE + ED+  +D++D E
Sbjct: 45  DAEWEEEEEGEDLESEDEEDEE 66



 Score = 26.8 bits (60), Expect = 3.3
 Identities = 9/23 (39%), Positives = 17/23 (73%)

Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
           E E+E E+ ++ED  D+++DD +
Sbjct: 49  EEEEEGEDLESEDEEDEEEDDDD 71



 Score = 26.8 bits (60), Expect = 3.6
 Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 226 EDEEEESDTEDIRDDDDD 243
           E E+EE + ED  DD D 
Sbjct: 58  ESEDEEDEEEDDDDDMDG 75



 Score = 26.0 bits (58), Expect = 6.2
 Identities = 10/22 (45%), Positives = 13/22 (59%)

Query: 223 GETEDEEEESDTEDIRDDDDDD 244
           GE  + E+E D E+  DDD D 
Sbjct: 54  GEDLESEDEEDEEEDDDDDMDG 75



 Score = 26.0 bits (58), Expect = 6.3
 Identities = 8/23 (34%), Positives = 15/23 (65%)

Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
           E E+EEE  D E   ++D+++ +
Sbjct: 47  EWEEEEEGEDLESEDEEDEEEDD 69


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 32.1 bits (73), Expect = 0.37
 Identities = 14/55 (25%), Positives = 23/55 (41%), Gaps = 2/55 (3%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRM 267
           E E     +  +  D+EEE D++       D M  +V +S + E    PS +   
Sbjct: 178 EDESKSEESAEDDSDDEEEEDSDSEDYSQYDGM--LVDSSDEEEGEEAPSINYNE 230


>gnl|CDD|240226 PTZ00007, PTZ00007, (NAP-L) nucleosome assembly protein -L;
           Provisional.
          Length = 337

 Score = 31.7 bits (72), Expect = 0.39
 Identities = 9/53 (16%), Positives = 17/53 (32%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRGTRELLTM 276
           +   +E++ D +     D    +       + ED  G   S   +     LT 
Sbjct: 285 DYSSDEDDDDYDSYDSSDSASSDSNSDVDTNEEDDRGEKESNGAKSNELHLTS 337


>gnl|CDD|145949 pfam03066, Nucleoplasmin, Nucleoplasmin.  Nucleoplasmins are also
           known as chromatin decondensation proteins. They bind to
           core histones and transfer DNA to them in a reaction
           that requires ATP. This is thought to play a role in the
           assembly of regular nucleosomal arrays.
          Length = 146

 Score = 30.8 bits (70), Expect = 0.46
 Identities = 10/24 (41%), Positives = 17/24 (70%)

Query: 223 GETEDEEEESDTEDIRDDDDDDME 246
            E++D+EE+ + ED  +DDD+D  
Sbjct: 112 DESDDDEEDEEEEDDEEDDDEDES 135



 Score = 30.0 bits (68), Expect = 0.79
 Identities = 13/25 (52%), Positives = 16/25 (64%), Gaps = 2/25 (8%)

Query: 222 YGETEDEEEESDTEDIRDDDDDDME 246
             + EDEEEE D ED  DD+D+  E
Sbjct: 115 DDDEEDEEEEDDEED--DDEDESEE 137



 Score = 27.3 bits (61), Expect = 6.7
 Identities = 7/21 (33%), Positives = 15/21 (71%)

Query: 226 EDEEEESDTEDIRDDDDDDME 246
           E++E + D ED  ++DD++ +
Sbjct: 110 EEDESDDDEEDEEEEDDEEDD 130


>gnl|CDD|148139 pfam06346, Drf_FH1, Formin Homology Region 1.  This region is found
           in some of the Diaphanous related formins (Drfs). It
           consists of low complexity repeats of around 12
           residues.
          Length = 160

 Score = 30.7 bits (69), Expect = 0.48
 Identities = 44/144 (30%), Positives = 54/144 (37%), Gaps = 1/144 (0%)

Query: 55  PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
           P L  G  +P  P L  G  +   P LP G  +P  P L     +P  P L     +P  
Sbjct: 4   PPLPGGVGIPPPPPLPGGVCIPPPPPLPGGTGIPPPPPLPGGAAIPPPPPLPGVAGIPPP 63

Query: 115 PTLPQGPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPT-LPQGPTLLQGPTLLQ 173
           P LP    +P  P LP    +   P L  G  +   P  L G   +P  P L  GP +  
Sbjct: 64  PPLPGATAIPPPPPLPGAAGIPPPPPLPGGAGIPPPPPPLPGGAAVPPPPPLPGGPGVPP 123

Query: 174 GPRKGPTLPQGPTLPQGPTLPQDP 197
            P   P  P  P  P G   P  P
Sbjct: 124 PPPPFPGAPGIPPPPPGMGSPPPP 147


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S. cerevisiae and H. sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 30.9 bits (70), Expect = 0.53
 Identities = 15/30 (50%), Positives = 19/30 (63%)

Query: 215 ELVERSAYGETEDEEEESDTEDIRDDDDDD 244
           ++ E     E E+EEEE + ED  DDDDDD
Sbjct: 167 DVDEEDEKDEEEEEEEEEEDEDFDDDDDDD 196



 Score = 30.9 bits (70), Expect = 0.58
 Identities = 11/21 (52%), Positives = 16/21 (76%)

Query: 226 EDEEEESDTEDIRDDDDDDME 246
           E+EEEE + ++  DDDDDD +
Sbjct: 177 EEEEEEEEEDEDFDDDDDDDD 197



 Score = 30.1 bits (68), Expect = 1.0
 Identities = 16/37 (43%), Positives = 22/37 (59%), Gaps = 2/37 (5%)

Query: 212 KEKELV-ERSAYGETEDEEEESDTEDIRDD-DDDDME 246
           K KEL  E     + +DEEEE + E+  +D DDDD +
Sbjct: 159 KLKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDDDD 195



 Score = 29.3 bits (66), Expect = 2.0
 Identities = 13/22 (59%), Positives = 16/22 (72%)

Query: 223 GETEDEEEESDTEDIRDDDDDD 244
            E E+EEE+ D +D  DDDDDD
Sbjct: 178 EEEEEEEEDEDFDDDDDDDDDD 199


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 31.0 bits (70), Expect = 0.59
 Identities = 14/34 (41%), Positives = 20/34 (58%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
           E E++E     E+ +EEEE   E+   D +DDME
Sbjct: 1   ETEVLELEEEDESGEEEEEESEEEEETDSEDDME 34


>gnl|CDD|219922 pfam08595, RXT2_N, RXT2-like, N-terminal.  The family represents
           the N-terminal region of RXT2-like proteins. In S.
           cerevisiae, RXT2 has been demonstrated to be involved in
           conjugation with cellular fusion (mating) and invasive
           growth. A high throughput localisation study has
           localised RXT2 to the nucleus.
          Length = 141

 Score = 30.1 bits (68), Expect = 0.61
 Identities = 16/64 (25%), Positives = 28/64 (43%), Gaps = 12/64 (18%)

Query: 209 PRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDM-------EIVVCTSQDSEDLPGP 261
           PR  E          + +DE++E D E   +DDD++        EI+   +  S+    P
Sbjct: 45  PRIDEDGGDI-----DDDDEDDEDDEEADAEDDDENPYKLIRLEEILAPLTHPSDLPTHP 99

Query: 262 SHSK 265
           + S+
Sbjct: 100 AISR 103


>gnl|CDD|225880 COG3343, RpoE, DNA-directed RNA polymerase, delta subunit
           [Transcription].
          Length = 175

 Score = 30.5 bits (69), Expect = 0.62
 Identities = 12/29 (41%), Positives = 17/29 (58%), Gaps = 1/29 (3%)

Query: 220 SAYGETEDEEEESDTEDIRDDDDDDMEIV 248
               E +DE +  D E+  D+D+DD EIV
Sbjct: 123 DKEEEEDDEVDSLDDEN-DDEDEDDDEIV 150


>gnl|CDD|218333 pfam04931, DNA_pol_phi, DNA polymerase phi.  This family includes
           the fifth essential DNA polymerase in yeast EC:2.7.7.7.
           Pol5p is localised exclusively to the nucleolus and
           binds near or at the enhancer region of rRNA-encoding
           DNA repeating units.
          Length = 784

 Score = 31.4 bits (71), Expect = 0.63
 Identities = 15/46 (32%), Positives = 22/46 (47%), Gaps = 1/46 (2%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
           K  E   R      E EEE+ D  +  DDD+D+ E +  +  +SE 
Sbjct: 633 KADENKSRHQ-QLFEGEEEDEDDLEETDDDEDECEAIEDSESESES 677



 Score = 29.1 bits (65), Expect = 3.6
 Identities = 13/64 (20%), Positives = 22/64 (34%), Gaps = 4/64 (6%)

Query: 213 EKELVERSAYGETEDEEEES---DTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMRG 269
           E E  +     ET+D+E+E    +  +   + D +         D+E   G         
Sbjct: 646 EGEEEDEDDLEETDDDEDECEAIEDSESESESDGEDGEEDEQEDDAEANEGVVPI-DKAV 704

Query: 270 TREL 273
            R L
Sbjct: 705 RRAL 708


>gnl|CDD|130706 TIGR01645, half-pint, poly-U binding splicing factor, half-pint
           family.  The proteins represented by this model contain
           three RNA recognition motifs (rrm: pfam00076) and have
           been characterized as poly-pyrimidine tract binding
           proteins associated with RNA splicing factors. In the
           case of PUF60 (GP|6176532), in complex with p54, and in
           the presence of U2AF, facilitates association of U2
           snRNP with pre-mRNA.
          Length = 612

 Score = 31.2 bits (70), Expect = 0.74
 Identities = 26/107 (24%), Positives = 38/107 (35%), Gaps = 11/107 (10%)

Query: 125 QGPTLPQG--PTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLP 182
           Q P  P    PT +    ++              P LPQ    +  P  ++ P   P  P
Sbjct: 329 QSPATPSSSLPTDIGNKAVVSSAKKEAEEV----PPLPQAAPAVVKPGPMEIPT--PVPP 382

Query: 183 QGPTLPQGPTLPQ--DPTLLQDPTLLQGPRYKEKELVERSAYGETED 227
            G  +P     P    PT + +P+ L  PR K K       +G  +D
Sbjct: 383 PGLAIPSLVAPPGLVAPTEI-NPSFLASPRKKMKREKLPVTFGALDD 428



 Score = 30.0 bits (67), Expect = 1.4
 Identities = 23/108 (21%), Positives = 33/108 (30%), Gaps = 14/108 (12%)

Query: 59  QGPTLPQG--PTLLQGPTLLQGP-TLPQG-PTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
           Q P  P    PT +    ++       +  P LPQ    +  P   + PT    P  P G
Sbjct: 329 QSPATPSSSLPTDIGNKAVVSSAKKEAEEVPPLPQAAPAVVKPGPMEIPT----PVPPPG 384

Query: 115 PTLPQGPTLPQ--GPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLP 160
             +P     P    PT    P+ L  P         +   +  G    
Sbjct: 385 LAIPSLVAPPGLVAPTEIN-PSFLASPRK---KMKREKLPVTFGALDD 428



 Score = 28.1 bits (62), Expect = 7.5
 Identities = 17/75 (22%), Positives = 22/75 (29%), Gaps = 3/75 (4%)

Query: 55  PTLLQGPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQG 114
           P L Q       P  ++ PT +  P L     +   P  L  PT    P+ L  P     
Sbjct: 359 PPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVA--PPGLVAPTEIN-PSFLASPRKKMK 415

Query: 115 PTLPQGPTLPQGPTL 129
                        TL
Sbjct: 416 REKLPVTFGALDDTL 430


>gnl|CDD|218598 pfam05470, eIF-3c_N, Eukaryotic translation initiation factor 3
           subunit 8 N-terminus.  The largest of the mammalian
           translation initiation factors, eIF3, consists of at
           least eight subunits ranging in mass from 35 to 170 kDa.
           eIF3 binds to the 40 S ribosome in an early step of
           translation initiation and promotes the binding of
           methionyl-tRNAi and mRNA.
          Length = 593

 Score = 30.9 bits (70), Expect = 0.77
 Identities = 13/35 (37%), Positives = 19/35 (54%), Gaps = 2/35 (5%)

Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIRDDDDDD 244
           RY+E    E     E EDE+++ D  D  D+D+D 
Sbjct: 130 RYREDP--ESEDEEEEEDEDDDDDGSDDEDEDEDG 162



 Score = 29.4 bits (66), Expect = 2.3
 Identities = 18/45 (40%), Positives = 23/45 (51%), Gaps = 1/45 (2%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKRMR 268
           E+EDEEEE D ED  DD  DD +        +E++   S S   R
Sbjct: 136 ESEDEEEEED-EDDDDDGSDDEDEDEDGVGATEEVAASSESGVDR 179


>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa).  Members of this
           family of proteins are part of the yeast nuclear pore
           complex-associated pre-60S ribosomal subunit. The family
           functions as a highly conserved exonuclease that is
           required for the 5'-end maturation of 5.8S and 25S
           rRNAs, demonstrating that 5'-end processing also has a
           redundant pathway. Nop25 binds late pre-60S ribosomes,
           accompanying them from the nucleolus to the nuclear
           periphery; and there is evidence for both physical and
           functional links between late 60S subunit processing and
           export.
          Length = 134

 Score = 29.6 bits (67), Expect = 0.79
 Identities = 8/33 (24%), Positives = 18/33 (54%), Gaps = 2/33 (6%)

Query: 214 KELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
           KE ++       ++E+ E++  +  D +DD+ E
Sbjct: 73  KEALKLLEEENDDEEDAETEDTE--DVEDDEWE 103


>gnl|CDD|219912 pfam08574, DUF1762, Protein of unknown function (DUF1762).  This is
           a family of proteins of unknown function. Yeast IWR1 is
           known to interact with RNA polymerase II and deletion of
           this protein results in hypersensitivity to the K1
           killer toxin.
          Length = 77

 Score = 28.5 bits (64), Expect = 0.88
 Identities = 11/31 (35%), Positives = 19/31 (61%)

Query: 227 DEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
            +E+E   ED+ +D+DDD + V+   +DS  
Sbjct: 35  IDEDEEYHEDLANDEDDDADQVLSDDEDSNA 65


>gnl|CDD|215601 PLN03142, PLN03142, Probable chromatin-remodeling complex ATPase
           chain; Provisional.
          Length = 1033

 Score = 30.9 bits (70), Expect = 0.90
 Identities = 13/34 (38%), Positives = 23/34 (67%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
           E E V RSA  +++D+E  ++ ED  ++DD++ E
Sbjct: 17  ELEAVARSAGSDSDDDEVPAEDEDEDEEDDEEAE 50


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 30.5 bits (69), Expect = 1.1
 Identities = 10/42 (23%), Positives = 21/42 (50%)

Query: 216 LVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
            VE+    E  DEEEE + E+ +++++   +      ++ E 
Sbjct: 26  WVEKEVEKEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEK 67


>gnl|CDD|237047 PRK12298, obgE, GTPase CgtA; Reviewed.
          Length = 390

 Score = 30.2 bits (69), Expect = 1.2
 Identities = 9/31 (29%), Positives = 19/31 (61%)

Query: 218 ERSAYGETEDEEEESDTEDIRDDDDDDMEIV 248
            R    E E+E+++   +D  +DDD+ +E++
Sbjct: 357 HREQLEEVEEEDDDDWDDDWDEDDDEGVEVI 387


>gnl|CDD|203043 pfam04546, Sigma70_ner, Sigma-70, non-essential region.  The domain
           is found in the primary vegetative sigma factor. The
           function of this domain is unclear, and it can be removed
           without loss of function.
          Length = 211

 Score = 29.8 bits (68), Expect = 1.3
 Identities = 10/21 (47%), Positives = 13/21 (61%)

Query: 224 ETEDEEEESDTEDIRDDDDDD 244
             E E +E D ED  DDD+D+
Sbjct: 43  AIESELDEEDLEDDDDDDEDE 63



 Score = 28.7 bits (65), Expect = 3.0
 Identities = 11/37 (29%), Positives = 16/37 (43%), Gaps = 4/37 (10%)

Query: 223 GETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
               DEE+  D +D  +D+D+D E       D    P
Sbjct: 45  ESELDEEDLEDDDDDDEDEDEDDE----EEADLGPDP 77



 Score = 27.9 bits (63), Expect = 4.6
 Identities = 7/21 (33%), Positives = 10/21 (47%)

Query: 224 ETEDEEEESDTEDIRDDDDDD 244
               E E  + +   DDDDD+
Sbjct: 41  AAAIESELDEEDLEDDDDDDE 61


>gnl|CDD|235124 PRK03427, PRK03427, cell division protein ZipA; Provisional.
          Length = 333

 Score = 30.0 bits (68), Expect = 1.3
 Identities = 14/53 (26%), Positives = 16/53 (30%)

Query: 79  PTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQ 131
             + Q P     P     P  P    + Q    PQ     Q P  PQ    PQ
Sbjct: 103 QPVQQPPEAQVPPQHAPRPAQPAPQPVQQPAYQPQPEQPLQQPVSPQVAPAPQ 155


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 30.0 bits (68), Expect = 1.5
 Identities = 19/62 (30%), Positives = 27/62 (43%), Gaps = 7/62 (11%)

Query: 209 PRYKEK--ELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
           PRY +   +L E     E  DEE+E  ++   D+DDDD +       D       S  +R
Sbjct: 114 PRYDDAYRDLEEDDDDDEESDEEDEESSKSEDDEDDDDDD-----DDDDIATRERSLERR 168

Query: 267 MR 268
            R
Sbjct: 169 RR 170


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be a transcription factor.
          Length = 238

 Score = 29.7 bits (67), Expect = 1.8
 Identities = 10/25 (40%), Positives = 14/25 (56%)

Query: 222 YGETEDEEEESDTEDIRDDDDDDME 246
           +   E+EEEE    D  D +DD+ E
Sbjct: 43  FEIEEEEEEEEVDSDFDDSEDDEPE 67


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 30.0 bits (67), Expect = 1.8
 Identities = 15/68 (22%), Positives = 23/68 (33%), Gaps = 9/68 (13%)

Query: 200 LQDPTLLQGPRYKEKELVER----SAYGETEDEEEESDTEDI-----RDDDDDDMEIVVC 250
           LQ+  +L     K  E +          + +D E   D  D       D  DD+ E    
Sbjct: 65  LQNKPILDDLNQKYVEFLINKEHIRVLAKLQDSESHEDGSDGSDMDSEDSADDEEEEEED 124

Query: 251 TSQDSEDL 258
            S + E +
Sbjct: 125 ESLEDEMI 132


>gnl|CDD|184885 PRK14891, PRK14891, 50S ribosomal protein L24e/unknown domain
           fusion protein; Provisional.
          Length = 131

 Score = 28.8 bits (64), Expect = 1.9
 Identities = 11/53 (20%), Positives = 21/53 (39%), Gaps = 3/53 (5%)

Query: 208 GPRYKEKELVERSAYGETEDEEE---ESDTEDIRDDDDDDMEIVVCTSQDSED 257
           GP        E +   E  D +E   E+   D  D+ D++ E      + +++
Sbjct: 61  GPAAAATAAAEAAEEAEAADADEDADEAAEADAADEADEEEETDEAVDETADE 113


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 30.0 bits (68), Expect = 2.0
 Identities = 38/216 (17%), Positives = 49/216 (22%), Gaps = 29/216 (13%)

Query: 60  GPTLPQGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQ 119
           G   P  P     P     P  P  P  P  P        P   +    P +      P+
Sbjct: 596 GGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPK 655

Query: 120 GPTLPQGPTLPQGPTLLQGPTLLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGP 179
              +P       G     G      P     P     P           P     P  G 
Sbjct: 656 HVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPA--ATPPAGQ 713

Query: 180 TLPQGPTLPQGPT--------------LPQDPTLLQDPTLLQGPRYKEKELVERSAYGET 225
                   PQ                 LP +P    DP                +A    
Sbjct: 714 ADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAA 773

Query: 226 -------------EDEEEESDTEDIRDDDDDDMEIV 248
                        ED+    D ED RD ++  ME++
Sbjct: 774 PPPSPPSEEEEMAEDDAPSMDDEDRRDAEEVAMELL 809


>gnl|CDD|222843 PHA02030, PHA02030, hypothetical protein.
          Length = 336

 Score = 29.6 bits (66), Expect = 2.1
 Identities = 12/50 (24%), Positives = 15/50 (30%), Gaps = 1/50 (2%)

Query: 84  GPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGP 133
           G  LP  P +  D      P +            P  P +P    LP  P
Sbjct: 263 GSNLPAVPNVAADAGSAAAPAV-PAAAAAVAQAAPSVPQVPNVAVLPDVP 311


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 29.7 bits (67), Expect = 2.3
 Identities = 15/58 (25%), Positives = 25/58 (43%), Gaps = 5/58 (8%)

Query: 200 LQDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
           LQ   L      ++ E  + S    ++ EE++ D ED   D D +M  V    +  E+
Sbjct: 403 LQKEPL-----EEKPENKDESVEEISDAEEDDEDEEDEDGDGDVEMSAVDNDEEKEEE 455


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 28.8 bits (65), Expect = 2.3
 Identities = 10/38 (26%), Positives = 22/38 (57%)

Query: 210 RYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEI 247
           R K ++L+      + E+EEEE + E++ +++  D  +
Sbjct: 87  RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELL 124


>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
           unknown].
          Length = 869

 Score = 29.7 bits (66), Expect = 2.4
 Identities = 9/29 (31%), Positives = 12/29 (41%)

Query: 224 ETEDEEEESDTEDIRDDDDDDMEIVVCTS 252
           E+E E  +SD +D   D   D       S
Sbjct: 172 ESEIESSDSDHDDENSDSKLDNLRNYIVS 200


>gnl|CDD|219419 pfam07462, MSP1_C, Merozoite surface protein 1 (MSP1) C-terminus.
           This family represents the C-terminal region of
           merozoite surface protein 1 (MSP1), which is found in a
           number of Plasmodium species. MSP-1 is a 200-kDa protein
           expressed on the surface of the P. vivax merozoite.
           MSP-1 of Plasmodium species is synthesised as a
           high-molecular-weight precursor and then processed into
           several fragments. At the time of red cell invasion by
           the merozoite, only the 19-kDa C-terminal fragment
           (MSP-119), which contains two epidermal growth
           factor-like domains, remains on the surface. Antibodies
           against MSP-119 inhibit merozoite entry into red cells,
           and immunisation with MSP-119 protects monkeys from
           challenging infections. Hence, MSP-119 is considered a
           promising vaccine candidate.
          Length = 574

 Score = 29.5 bits (66), Expect = 2.4
 Identities = 24/117 (20%), Positives = 34/117 (29%), Gaps = 13/117 (11%)

Query: 141 LLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPTLL 200
           L +G T     T +  P                         Q PT           T L
Sbjct: 258 LPKGTTQEAKVTTVVTPPQADAAPSPLSVRPAGSSGSASGSTQIPTSGSVLGPGAAATEL 317

Query: 201 QDPTLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSED 257
           Q    LQ    ++  LV    +G               DDDD+D++ V     +SE+
Sbjct: 318 QQVVQLQNYDEEDDSLVVLPIFGND-------------DDDDEDLDQVATGEAESEE 361


>gnl|CDD|115196 pfam06524, NOA36, NOA36 protein.  This family consists of several
           NOA36 proteins which contain 29 highly conserved
           cysteine residues. The function of this protein is
           unknown.
          Length = 314

 Score = 29.2 bits (65), Expect = 3.0
 Identities = 12/28 (42%), Positives = 16/28 (57%), Gaps = 3/28 (10%)

Query: 222 YGETEDEEEES---DTEDIRDDDDDDME 246
           YG   D++E S   D ++  D DDDD E
Sbjct: 269 YGYESDDDEGSSSNDYDEEEDGDDDDNE 296


>gnl|CDD|221323 pfam11931, DUF3449, Domain of unknown function (DUF3449).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 181 and 207 amino acids in length. This domain
           has two conserved sequence motifs: PIP and CEICG. The
           domain carries a zinc-finger domain of the C2H2-type.
          Length = 187

 Score = 28.4 bits (64), Expect = 3.4
 Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 5/42 (11%)

Query: 218 ERSAYGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
           ER A  +   E+   D  D   DDD++  I      +  +LP
Sbjct: 35  ERQASADESSEDASEDGSDDDSDDDEEEPIY-----NPLNLP 71



 Score = 28.4 bits (64), Expect = 3.7
 Identities = 19/58 (32%), Positives = 24/58 (41%), Gaps = 10/58 (17%)

Query: 205 LLQGPRYKEKELVER-SA--YGETEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLP 259
           LL+  R    E VER  A    E +   +ES  +   D  DDD       S D E+ P
Sbjct: 13  LLKKEREDTIENVERKQALTEEERQASADESSEDASEDGSDDD-------SDDDEEEP 63


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 29.2 bits (65), Expect = 3.8
 Identities = 23/85 (27%), Positives = 34/85 (40%), Gaps = 14/85 (16%)

Query: 175  PRKGPTLPQGPTLPQGPTLPQDPTL------------LQDPTLLQGPRYKEKELVERSAY 222
            P       Q P   +   LP+D  L            L+D  +      KE+   E+   
Sbjct: 3971 PDIQENNSQPPPENEDLDLPEDLKLDEKEGDVSKDSDLEDMDMEAADENKEEADAEKDEP 4030

Query: 223  GETEDEEEESDT--EDIRDDDDDDM 245
             + ED  EE++T  EDI+ DD  D+
Sbjct: 4031 MQDEDPLEENNTLDEDIQQDDFSDL 4055


>gnl|CDD|217861 pfam04050, Upf2, Up-frameshift suppressor 2.  Transcripts
           harbouring premature signals for translation termination
           are recognised and rapidly degraded by eukaryotic cells
           through a pathway known as nonsense-mediated mRNA decay.
           In Saccharomyces cerevisiae, three trans-acting factors
           (Upf1 to Upf3) are required for nonsense-mediated mRNA
           decay.
          Length = 171

 Score = 28.1 bits (63), Expect = 4.5
 Identities = 13/36 (36%), Positives = 21/36 (58%), Gaps = 3/36 (8%)

Query: 223 GETEDEEEESDTEDIRDD--DDDDMEIVVCTSQDSE 256
            E+ DEEE    +D +D+  D ++ +I V T Q+ E
Sbjct: 23  DESSDEEEVDLPDDEQDEESDSEEEQIFV-TRQEEE 57


>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457).  This is
           a family of uncharacterized proteins.
          Length = 449

 Score = 28.4 bits (63), Expect = 4.5
 Identities = 14/35 (40%), Positives = 21/35 (60%)

Query: 212 KEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
           K  +  E  A  E +D+EE+ D +D  D+DDDD +
Sbjct: 40  KLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDD 74


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 28.2 bits (63), Expect = 4.6
 Identities = 10/23 (43%), Positives = 14/23 (60%)

Query: 224 ETEDEEEESDTEDIRDDDDDDME 246
           E  DE E +D+ED  D+DD   +
Sbjct: 211 ELGDEPESADSEDNEDEDDPKED 233


>gnl|CDD|236669 PRK10263, PRK10263, DNA translocase FtsK; Provisional.
          Length = 1355

 Score = 28.5 bits (63), Expect = 4.9
 Identities = 32/162 (19%), Positives = 43/162 (26%), Gaps = 16/162 (9%)

Query: 83  QGPTLPQGPTLLQDPTLPQGPTLLQGPTLPQGPTLPQGPTLPQGPTLPQGPTLLQGP--T 140
           Q P +         PT+   P  + GP   +    P     PQ     Q       P   
Sbjct: 342 QTPPVASVDVPPAQPTVAWQP--VPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQ 399

Query: 141 LLQGPTLLQGPTLLQGPTLPQGPTLLQGPTLLQGPRKGPTLPQGPTLPQGPTLPQDPTLL 200
            +Q       P   Q    P      + P         P  P      Q     Q  T  
Sbjct: 400 PVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQPVAGNAWQAE--EQQSTFA 457

Query: 201 QDPT----------LLQGPRYKEKELVERSAYGETEDEEEES 232
              T            Q P Y++ + VE+    E E   EE+
Sbjct: 458 PQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPVVEET 499


>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
          Length = 1463

 Score = 28.6 bits (63), Expect = 4.9
 Identities = 17/52 (32%), Positives = 25/52 (48%), Gaps = 2/52 (3%)

Query: 169 PTLLQGPRKGPTLPQGPTLPQGPTLP--QDPTLLQDPTLLQGPRYKEKELVE 218
           P + +  R  P LP  P  P+GP  P  ++P   Q+P   Q P +   E+ E
Sbjct: 830 PAVPETDRDNPLLPPCPITPEGPPCPPREEPQQPQEPQEPQSPSFHISEIGE 881


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 28.5 bits (64), Expect = 5.0
 Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 2/42 (4%)

Query: 205 LLQGPRYKEKELV--ERSAYGETEDEEEESDTEDIRDDDDDD 244
           L QG   + K           + + + EE D +D  DDDD +
Sbjct: 308 LRQGEELRRKIEGKSVSEEDEDEDSDSEEEDEDDDEDDDDGE 349



 Score = 28.5 bits (64), Expect = 5.2
 Identities = 9/36 (25%), Positives = 20/36 (55%), Gaps = 3/36 (8%)

Query: 212 KEKELVER-SAYGETEDEEEESDTEDIRDDDDDDME 246
           + +EL  +      +E++E+E    +  ++D+DD E
Sbjct: 310 QGEELRRKIEGKSVSEEDEDEDSDSE--EEDEDDDE 343


>gnl|CDD|185603 PTZ00415, PTZ00415, transmission-blocking target antigen s230;
           Provisional.
          Length = 2849

 Score = 28.5 bits (63), Expect = 5.2
 Identities = 11/36 (30%), Positives = 19/36 (52%)

Query: 226 EDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGP 261
           +DE+E+ D +D  DD++++ E       D ED    
Sbjct: 156 DDEDEDEDDDDEEDDEEEEEEEEEIKGFDDEDEEDE 191


>gnl|CDD|226907 COG4530, COG4530, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 129

 Score = 27.1 bits (60), Expect = 5.6
 Identities = 14/41 (34%), Positives = 18/41 (43%), Gaps = 5/41 (12%)

Query: 222 YGETEDEEEESDT-----EDIRDDDDDDMEIVVCTSQDSED 257
            G+ ED + + D      ED  DDDDD   I+     D E 
Sbjct: 89  LGDDEDVDLDDDDDDTFLEDEEDDDDDVSGIIGVPGDDEEV 129


>gnl|CDD|218003 pfam04281, Tom22, Mitochondrial import receptor subunit Tom22.  The
           mitochondrial protein translocase family, which is
           responsible for movement of nuclear encoded pre-proteins
           into mitochondria, is very complex with at least 19
           components. These proteins include several chaperone
           proteins, four proteins of the outer membrane
           translocase (Tom) import receptor, five proteins of the
           Tom channel complex, five proteins of the inner membrane
           translocase (Tim) and three "motor" proteins. This
           family represents the Tom22 proteins. The N terminal
           region of Tom22 has been shown to have chaperone-like
           activity, and the C terminal region faces the
           intermembrane face.
          Length = 136

 Score = 26.8 bits (60), Expect = 7.4
 Identities = 13/39 (33%), Positives = 22/39 (56%), Gaps = 3/39 (7%)

Query: 211 YKEKELVERSAYGETEDEEEESDTE---DIRDDDDDDME 246
           ++EK    ++   E  D+++E DT+   DI DD D + E
Sbjct: 12  FQEKPAAPKNLAQEESDDDDEDDTDTDSDISDDSDFENE 50


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E. coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 28.0 bits (62), Expect = 7.5
 Identities = 10/42 (23%), Positives = 17/42 (40%), Gaps = 6/42 (14%)

Query: 225 TEDEEEESDTEDIRDDDDDDMEIVVCTSQDSEDLPGPSHSKR 266
           T+ E E  + E ++ D DD  +      +  +D   P    R
Sbjct: 247 TDRESESGEEEMVQSDQDDLPD------ESDDDSETPGEGAR 282


>gnl|CDD|143416 cd07098, ALDH_F15-22, Aldehyde dehydrogenase family 15A1 and
           22A1-like.  Aldehyde dehydrogenase family members
           ALDH15A1 (Saccharomyces cerevisiae YHR039C) and ALDH22A1
           (Arabidopsis thaliana, EC=1.2.1.3), and similar
           sequences, are in this CD. Significant improvement of
           stress tolerance in tobacco plants was observed by
           overexpressing the ALDH22A1 gene from maize (Zea mays)
           and was accompanied by a reduction of malondialdehyde
           derived from cellular lipid peroxidation.
          Length = 465

 Score = 28.0 bits (63), Expect = 7.6
 Identities = 15/34 (44%), Positives = 16/34 (47%), Gaps = 2/34 (5%)

Query: 65  QGPTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPT 98
           +G  LL G      P  PQG   P  PTLL D T
Sbjct: 327 KGARLLAGGKRYPHPEYPQGHYFP--PTLLVDVT 358


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 27.8 bits (63), Expect = 7.6
 Identities = 12/43 (27%), Positives = 17/43 (39%)

Query: 204 TLLQGPRYKEKELVERSAYGETEDEEEESDTEDIRDDDDDDME 246
                P  +E      S   E +D+E+E + ED  DD     E
Sbjct: 171 DGFVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAADE 213


>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2.  PPP4R2 (protein phosphatase 4 core
           regulatory subunit R2) is the regulatory subunit of the
           histone H2A phosphatase complex. It has been shown to
           confer resistance to the anticancer drug cisplatin in
           yeast, and may confer resistance in higher eukaryotes.
          Length = 285

 Score = 27.5 bits (61), Expect = 7.9
 Identities = 10/30 (33%), Positives = 17/30 (56%)

Query: 213 EKELVERSAYGETEDEEEESDTEDIRDDDD 242
           + + VE     E E+EEE  + E+  D+D+
Sbjct: 256 DGDYVEEKELKEDEEEEETEEEEEEEDEDE 285


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta; the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 27.9 bits (62), Expect = 8.1
 Identities = 14/57 (24%), Positives = 27/57 (47%), Gaps = 8/57 (14%)

Query: 217 VERSAYGETEDEEEESDTEDIR----DDDDDDMEIVV----CTSQDSEDLPGPSHSK 265
            ERS   E  +E+E+   + ++    D+D+D+   +V       ++SE+   P   K
Sbjct: 278 GERSDSEEETEEKEKEKRKRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPK 334


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 27.4 bits (61), Expect = 8.5
 Identities = 18/103 (17%), Positives = 34/103 (33%), Gaps = 16/103 (15%)

Query: 204 TLLQGPRYKEKELVERSAYGETEDE--------EEESDTEDIRDDDDDDMEIVVCTSQDS 255
           ++L+    +EK+  E     E E          E E D     D+D +D E       D+
Sbjct: 171 SMLEALFRREKKEEEEEE-EEDEALIKSLSFGPETEEDRRRADDEDSEDDE----EDNDN 225

Query: 256 EDLPGPSHSKRMRGT---RELLTMKLDSCLHFPRQRGSPWTVS 295
              P    S   + T   ++    + ++      ++ S     
Sbjct: 226 TPSPKSGSSSPAKPTSILKKSAAKRSEAPSSSKAKKNSRGIPK 268


>gnl|CDD|183115 PRK11394, PRK11394, 23S rRNA pseudouridine synthase E; Provisional.
          Length = 217

 Score = 27.4 bits (60), Expect = 9.3
 Identities = 17/48 (35%), Positives = 24/48 (50%), Gaps = 2/48 (4%)

Query: 67  PTLLQGPTLLQGPTLPQGPTLPQGPTLLQDPT--LPQGPTLLQGPTLP 112
           PT      L  G TL  GPTLP G  L+ +P    P+ P + +  ++P
Sbjct: 117 PTQDALEALRNGVTLNDGPTLPAGAELVDEPAWLWPRNPPIRERKSIP 164


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.137    0.435 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,461,746
Number of extensions: 1499183
Number of successful extensions: 2767
Number of sequences better than 10.0: 1
Number of HSP's gapped: 2104
Number of HSP's successfully gapped: 245
Length of query: 295
Length of database: 10,937,602
Length adjustment: 96
Effective length of query: 199
Effective length of database: 6,679,618
Effective search space: 1329243982
Effective search space used: 1329243982
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 59 (26.7 bits)