RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy2133
         (207 letters)



>gnl|CDD|238017 cd00060, FHA, Forkhead associated domain (FHA); found in eukaryotic
           and prokaryotic proteins. Putative nuclear signalling
           domain. FHA domains may bind phosphothreonine,
           phosphoserine and sometimes phosphotyrosine. In
           eukaryotes, many FHA domain-containing proteins localize
           to the nucleus, where they participate in establishing
           or maintaining cell cycle checkpoints, DNA repair, or
           transcriptional regulation. Members of the FHA family
           include: Dun1, Rad53, Cds1, Mek1,
           KAPP (kinase-associated protein phosphatase), and Ki-67 (a
           human nuclear protein related to cell proliferation).
          Length = 102

 Score = 74.7 bits (184), Expect = 8e-18
 Identities = 31/102 (30%), Positives = 50/102 (49%), Gaps = 8/102 (7%)

Query: 2   LKSGQIVNTIDLSTRSFYCVGRE-RNTHLNLLHPTVSRYHAILQYKSTFDEKDPARGFYV 60
           L          L     Y +GR+  N  + L  P+VSR HA+++Y       D   G  +
Sbjct: 7   LSGDASGRRYYLDPGGTYTIGRDSDNCDIVLDDPSVSRRHAVIRY-------DGDGGVVL 59

Query: 61  YDLGSTHGTFLNRCKIKPKMYVRIHVGHMLSFGSSTRFFILQ 102
            DLGST+GTF+N  ++ P   VR+  G ++  G+++  F  +
Sbjct: 60  IDLGSTNGTFVNGQRVSPGEPVRLRDGDVIRLGNTSISFRFE 101


>gnl|CDD|215951 pfam00498, FHA, FHA domain.  The FHA (Forkhead-associated) domain
          is a phosphopeptide binding motif.
          Length = 67

 Score = 71.1 bits (175), Expect = 8e-17
 Identities = 26/75 (34%), Positives = 41/75 (54%), Gaps = 8/75 (10%)

Query: 19 YCVGRERNTHLNLLHPTVSRYHAILQYKSTFDEKDPARGFYVYDLGSTHGTFLNRCKIKP 78
            +GR  +  + L  P+VSR HA ++Y            FY+ DLGST+GTF+N  ++ P
Sbjct: 1  VTIGRSPDCDIVLDDPSVSRRHAEIRYDGG-------GRFYLEDLGSTNGTFVNGQRLGP 53

Query: 79 KMYVRIHVGHMLSFG 93
          +  VR+  G ++  G
Sbjct: 54 E-PVRLRDGDVIRLG 67


>gnl|CDD|214578 smart00240, FHA, Forkhead associated domain.  Found in eukaryotic
          and prokaryotic proteins. Putative nuclear signalling
          domain.
          Length = 52

 Score = 54.1 bits (131), Expect = 2e-10
 Identities = 21/57 (36%), Positives = 31/57 (54%), Gaps = 8/57 (14%)

Query: 21 VGRERNT-HLNLLHPTVSRYHAILQYKSTFDEKDPARGFYVYDLGSTHGTFLNRCKI 76
          +GR      + L  P++SR HA++ Y            FY+ DLGST+GTF+N  +I
Sbjct: 3  IGRSSEDCDIQLDGPSISRRHAVIVYDGG-------GRFYLIDLGSTNGTFVNGKRI 52


>gnl|CDD|224630 COG1716, COG1716, FOG: FHA domain [Signal transduction mechanisms].
          Length = 191

 Score = 48.4 bits (115), Expect = 3e-07
 Identities = 28/107 (26%), Positives = 49/107 (45%), Gaps = 13/107 (12%)

Query: 21  VGRERNTHLNLLHPTVSRYHAILQYKSTFDEKDPARGFYVYDLGSTHGTFLNRCKIKPKM 80
           +GR+ +  + L    VSR HA L+ +            ++ DLGST+GT++N  K++ + 
Sbjct: 93  IGRDPDNDIVLDDDVVSRRHAELRREGNE--------VFLEDLGSTNGTYVNGEKVRQR- 143

Query: 81  YVRIHVGHMLSFGSSTR---FFILQGPSEDEEEESELSVSELKEQRR 124
            V +  G ++  G +       IL     D  +    S  EL+E+  
Sbjct: 144 -VLLQDGDVIRLGGTLAERLRIILTELEIDGVDPVATSAKELEEEGL 189


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 36.0 bits (83), Expect = 0.009
 Identities = 25/81 (30%), Positives = 36/81 (44%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           E  EEE E    E +E++  E+E+ E E  E+ +E +  +EEE EG S G GD  E E D
Sbjct: 443 ESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEED 502

Query: 166 LSENPYASTNNEELYLDDPKK 186
                        +      +
Sbjct: 503 AERRNSEMAGISRMSEGQQPR 523


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely on
           conserved operon structure. The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 34.4 bits (79), Expect = 0.027
 Identities = 13/43 (30%), Positives = 22/43 (51%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           + EE +    E K ++  E + K     EK  ++EAK + E+E
Sbjct: 113 QAEEKQKQAEEAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEE 155



 Score = 32.9 bits (75), Expect = 0.088
 Identities = 13/48 (27%), Positives = 24/48 (50%), Gaps = 3/48 (6%)

Query: 106 EDEEEESELSVSELKEQRRQEK---EKKEREALEKSLEQEAKTEEEDE 150
           E +++  E    +  E + + +   EKK +E  +K  E+EAK +   E
Sbjct: 116 EKQKQAEEAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAE 163



 Score = 31.0 bits (70), Expect = 0.38
 Identities = 9/45 (20%), Positives = 23/45 (51%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
             ++ +++ E    E ++QR  E+ +++      + E+ AK  E+
Sbjct: 65  KEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQ 109



 Score = 29.4 bits (66), Expect = 1.4
 Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 2/48 (4%)

Query: 105 SEDEEEESELSVSELKEQRRQE--KEKKEREALEKSLEQEAKTEEEDE 150
            + E++  E       EQ RQ+  +++   E   K  EQ AK  EE +
Sbjct: 71  KKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQ 118



 Score = 27.5 bits (61), Expect = 4.9
 Identities = 12/35 (34%), Positives = 19/35 (54%), Gaps = 4/35 (11%)

Query: 120 KEQRRQEKEKKEREALEKSL----EQEAKTEEEDE 150
            EQ  ++ E+K+++A E       E +AK E E E
Sbjct: 107 AEQAAKQAEEKQKQAEEAKAKQAAEAKAKAEAEAE 141



 Score = 27.1 bits (60), Expect = 6.8
 Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 118 ELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           EL+++   EK  K+ E   K  E++ K  EE
Sbjct: 93  ELEQRAAAEKAAKQAEQAAKQAEEKQKQAEE 123


>gnl|CDD|221122 pfam11490, DNA_pol3_alph_N, DNA polymerase III polC-type
           N-terminus.  This is an N-terminal domain of DNA
           polymerase III polC subunit A that is found only in
           Firmicutes. DNA polymerase polC-type III enzyme
           functions as the 'replicase' in low G + C Gram-positive
           bacteria. Purine asymmetry is a characteristic of
           organisms with a heterodimeric DNA polymerase III
           alpha-subunit constituted by polC which probably plays a
           direct role in the maintenance of strand-biased gene
           distribution; since, among prokaryotic genomes, the
           distribution of genes on the leading and lagging strands
           of the replication fork is known to be biased. The
           domain is associated with DNA_pol3_alpha pfam07733.
          Length = 180

 Score = 33.8 bits (78), Expect = 0.027
 Identities = 16/44 (36%), Positives = 30/44 (68%)

Query: 107 DEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           DE+E SE  + E +EQ+ +E+ K   EALE   ++EA+ +++++
Sbjct: 136 DEDESSEEEIEEFEEQKEEEEAKLAEEALEALKKKEAEKKKKEK 179


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 32.4 bits (74), Expect = 0.099
 Identities = 17/44 (38%), Positives = 29/44 (65%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
            +D E+E+   + EL++ +++  E+KERE  EK+ E+E   EEE
Sbjct: 137 DDDSEDETAALLRELEKIKKERAEEKEREEEEKAAEEEKAREEE 180



 Score = 29.7 bits (67), Expect = 1.0
 Identities = 12/48 (25%), Positives = 20/48 (41%), Gaps = 2/48 (4%)

Query: 105 SEDEEEESELSVSELKEQ--RRQEKEKKEREALEKSLEQEAKTEEEDE 150
              +++  +    +      R  EK KKER   ++  E+E   EEE  
Sbjct: 129 DSSDDDSDDDDSEDETAALLRELEKIKKERAEEKEREEEEKAAEEEKA 176


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 33.0 bits (76), Expect = 0.100
 Identities = 17/61 (27%), Positives = 25/61 (40%), Gaps = 13/61 (21%)

Query: 104 PSEDEEEESELSVSEL-------------KEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           P   E EE E     L             K +++ EKE+ E+ A +K  E  AK  +  +
Sbjct: 391 PLLAEGEEEEGENGNLSPAERKKLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKGPD 450

Query: 151 G 151
           G
Sbjct: 451 G 451


>gnl|CDD|218397 pfam05044, Prox1, Homeobox prospero-like protein (PROX1).  The
           homeobox gene Prox1 is expressed in a subpopulation of
           endothelial cells that, after budding from veins, gives
           rise to the mammalian lymphatic system. Prox1 has been
           found to be an early specific marker for the developing
           liver and pancreas in the mammalian foregut endoderm.
           This family contains an atypical homeobox domain.
          Length = 908

 Score = 33.1 bits (75), Expect = 0.11
 Identities = 22/99 (22%), Positives = 48/99 (48%), Gaps = 14/99 (14%)

Query: 90  LSFGSSTRFFILQGPSEDEEEESELSVSELKEQRRQEK----------EKKEREALEKSL 139
           L+   + R  +L  P + ++   +L VS  KEQ+R+E+          +K+ R+  +K +
Sbjct: 267 LNSRENKRKQML--PQQQQQSFDQL-VSPRKEQKREERRQLKQQLRDMQKQLRQLQQKYV 323

Query: 140 E-QEAKTEEEDEGISWGMGDDAEEETDLSENPYASTNNE 177
           +  ++  +  D+ I   +G+ +E+    S +  AS  + 
Sbjct: 324 QIYDSTDDSTDDDIHEDIGNLSEDSPSRSNSLDASAPDN 362


>gnl|CDD|226920 COG4547, CobT, Cobalamin biosynthesis protein CobT
           (nicotinate-mononucleotide:5, 6-dimethylbenzimidazole
           phosphoribosyltransferase) [Coenzyme metabolism].
          Length = 620

 Score = 32.9 bits (75), Expect = 0.12
 Identities = 20/75 (26%), Positives = 31/75 (41%), Gaps = 8/75 (10%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
            +E E S+ S  +  E    E E+ E +A E S + E+   +ED   +   G+DA     
Sbjct: 251 REESEGSDESEEDEAEATDGEGEEGEMDAAEASEDSESDESDED---TETPGEDA----- 302

Query: 166 LSENPYASTNNEELY 180
               P+     E  Y
Sbjct: 303 RPATPFTELMEEVDY 317


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 32.9 bits (75), Expect = 0.13
 Identities = 18/83 (21%), Positives = 37/83 (44%), Gaps = 5/83 (6%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAE 161
               +    E EL   + KE++ +EKE K+ +A +K  E +AK + +       +   +E
Sbjct: 6   SEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQK--EAKAKLQAQQASDGTNVPKKSE 63

Query: 162 EETDLSENPYASTNNEELYLDDP 184
           +++   +       N E ++D  
Sbjct: 64  KKSRKRD---VEDENPEDFIDPD 83


>gnl|CDD|237063 PRK12329, nusA, transcription elongation factor NusA; Provisional.
          Length = 449

 Score = 32.1 bits (73), Expect = 0.17
 Identities = 16/76 (21%), Positives = 28/76 (36%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEE 163
            +E ++E  +  V+EL  QR +E+  +         EQ  + EE+          + E E
Sbjct: 373 SAEYDQEAEDAKVAELISQREEEEALQREAEERLEAEQAERAEEDARLRELYPLPEDEFE 432

Query: 164 TDLSENPYASTNNEEL 179
            +           EE 
Sbjct: 433 DEDELEEAQPEEEEEA 448


>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
           subunit.  This is a family of proteins which are
           subunits of the eukaryotic translation initiation factor
           3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
           cerevisiae protein eIF3j (HCR1) has been shown to be
           required for processing of 20S pre-rRNA and binds to 18S
           rRNA and eIF3 subunits Rpg1p and Prt1p.
          Length = 242

 Score = 31.9 bits (73), Expect = 0.17
 Identities = 23/68 (33%), Positives = 33/68 (48%), Gaps = 7/68 (10%)

Query: 106 EDEEEESELSVSELK-------EQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGD 158
           EDEE+E E +    K       + + +EKEK +RE  EK L +  +   EDE        
Sbjct: 41  EDEEKEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEKLRLR 100

Query: 159 DAEEETDL 166
             +EE+DL
Sbjct: 101 KLQEESDL 108


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S. cerevisiae and H. sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 31.6 bits (72), Expect = 0.17
 Identities = 15/64 (23%), Positives = 33/64 (51%), Gaps = 4/64 (6%)

Query: 120 KEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYASTNNEEL 179
            +++    EKK +E   + +++E + +EE+E       +D +++ D  ++ Y    N E 
Sbjct: 149 IDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEEDEDFDDDDDDDDDDY----NAEN 204

Query: 180 YLDD 183
           Y D+
Sbjct: 205 YFDN 208


>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
           selection and in elongation by RNA polymerase II
           [Transcription].
          Length = 521

 Score = 31.9 bits (72), Expect = 0.21
 Identities = 14/54 (25%), Positives = 23/54 (42%)

Query: 110 EESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEE 163
           E  E   SE   + ++ K+ KE E  E+ L++E      +E +      D   E
Sbjct: 156 EREERLYSERHIELQRFKDYKELEESEQGLQEEYTPSYAEEAVEDISRTDDFAE 209


>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein.  This is a family of
           fungal and plant proteins and contains many hypothetical
           proteins. VID27 is a cytoplasmic protein that plays a
           potential role in vacuolar protein degradation.
          Length = 794

 Score = 31.6 bits (72), Expect = 0.27
 Identities = 22/76 (28%), Positives = 33/76 (43%), Gaps = 5/76 (6%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEG-ISWGMGDDAEEE 163
            E E++    + S L+     E    ER+  E+  E+E + E+EDEG       D+  EE
Sbjct: 365 KETEQDYILDAFSALE----IEDANTERDDEEEEDEEEEEEEDEDEGPSKEHSDDEEFEE 420

Query: 164 TDLSENPYASTNNEEL 179
            D+      S  N  L
Sbjct: 421 DDVESKYEDSDGNSSL 436


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 31.5 bits (71), Expect = 0.30
 Identities = 14/31 (45%), Positives = 21/31 (67%)

Query: 103 GPSEDEEEESELSVSELKEQRRQEKEKKERE 133
           G SE+EEEE E    E +E+  +E+E++E E
Sbjct: 861 GDSEEEEEEEEEEEEEEEEEEEEEEEEEENE 891



 Score = 30.3 bits (68), Expect = 0.86
 Identities = 20/84 (23%), Positives = 28/84 (33%), Gaps = 7/84 (8%)

Query: 102 QGPSEDEEEESELS-------VSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISW 154
            G    EE E            S  + ++  E E K     E  +  E K E+E EG   
Sbjct: 641 TGERTGEEGERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERKGEQEGEGEIE 700

Query: 155 GMGDDAEEETDLSENPYASTNNEE 178
               D + ET+  E  +      E
Sbjct: 701 AKEADHKGETEAEEVEHEGETEAE 724



 Score = 27.3 bits (60), Expect = 7.4
 Identities = 13/31 (41%), Positives = 20/31 (64%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALE 136
           E+EEEE E    E +E+  +E+E +E  +LE
Sbjct: 867 EEEEEEEEEEEEEEEEEEEEEEENEEPLSLE 897



 Score = 27.3 bits (60), Expect = 8.3
 Identities = 16/73 (21%), Positives = 29/73 (39%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
             E++E E       +    + E  E+E   ++  +  + E+  +G     G D+EEE +
Sbjct: 809 AGEKDEHEGQSETQADDTEVKDETGEQELNAENQGEAKQDEKGVDGGGGSDGGDSEEEEE 868

Query: 166 LSENPYASTNNEE 178
             E        EE
Sbjct: 869 EEEEEEEEEEEEE 881



 Score = 26.9 bits (59), Expect = 10.0
 Identities = 26/110 (23%), Positives = 40/110 (36%), Gaps = 14/110 (12%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE---------EDEGI 152
           +   E E +    S  E+  +R+ E+E  E E   K  + + +TE          E EG 
Sbjct: 668 EQEGETETKGENESEGEIPAERKGEQEG-EGEIEAKEADHKGETEAEEVEHEGETEAEGT 726

Query: 153 SW----GMGDDAEEETDLSENPYASTNNEELYLDDPKKTLRGWFDREGKG 198
                   G++ EE  D  E      +  E   D  +    G  + EGK 
Sbjct: 727 EDEGEIETGEEGEEVEDEGEGEAEGKHEVETEGDRKETEHEGETEAEGKE 776


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 31.2 bits (71), Expect = 0.36
 Identities = 23/78 (29%), Positives = 33/78 (42%), Gaps = 2/78 (2%)

Query: 103 GPSEDE-EEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE-GISWGMGDDA 160
           GP   E E ES+    E K + +++KE  E E LE   E + +         S     + 
Sbjct: 426 GPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEE 485

Query: 161 EEETDLSENPYASTNNEE 178
           EEE    ENP+  T +  
Sbjct: 486 EEEELDEENPWLKTTSSV 503



 Score = 28.1 bits (63), Expect = 4.3
 Identities = 15/82 (18%), Positives = 25/82 (30%), Gaps = 5/82 (6%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           E+ +EE     S+    RR+   +   +  E    ++    E  E       +  EEE  
Sbjct: 405 EESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKEKK-----ESDEEEEL 459

Query: 166 LSENPYASTNNEELYLDDPKKT 187
             E            L   +K 
Sbjct: 460 EDEEEAKVEKVANKLLKRSEKA 481


>gnl|CDD|218311 pfam04888, SseC, Secretion system effector C (SseC) like family.
           SseC is a secreted protein that forms a complex together
           with SecB and SecD on the surface of Salmonella. All
           these proteins are secreted by the type III secretion
           system. Many mucosal pathogens use type III secretion
           systems for the injection of effector proteins into
           target cells. SecB, SseC and SecD are inserted into the
           target cell membrane, where they form a small pore or
           translocon. In addition to SseC, this family includes
           the bacterial secreted proteins PopB, PepB, YopB and
           EspD which are thought to be directly involved in pore
           formation, and type III secretion system translocon.
          Length = 303

 Score = 30.9 bits (70), Expect = 0.40
 Identities = 7/42 (16%), Positives = 28/42 (66%), Gaps = 2/42 (4%)

Query: 105 SEDEEEESELSVSELKE--QRRQEKEKKEREALEKSLEQEAK 144
           S+  E++++  + +L+    ++++K ++ +E ++K++E+  +
Sbjct: 10  SKLAEKQAKSKLQQLERARDKQEKKAEEYQEQIKKAIEKAEE 51


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 31.2 bits (70), Expect = 0.41
 Identities = 21/73 (28%), Positives = 40/73 (54%), Gaps = 8/73 (10%)

Query: 114 LSVSELKEQRRQEKEKKEREALEKSL-EQEAKTEEEDEGISWGMGDDAEEETDLSENPYA 172
           L+ S+L ++R +  EK +REA +K+  E+E + E+E E        + E E +      A
Sbjct: 572 LASSKLAKKREEAVEKAKREAEQKAREEREREKEKEKE-------REREREREAERAAKA 624

Query: 173 STNNEELYLDDPK 185
           S+++ E  + +P+
Sbjct: 625 SSSSHESRMSEPQ 637



 Score = 27.3 bits (60), Expect = 7.0
 Identities = 19/80 (23%), Positives = 35/80 (43%), Gaps = 6/80 (7%)

Query: 109 EEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSE 168
           EE  E +  E +++ R+E+E+      EK  E+E + E E E         +  E+ +SE
Sbjct: 582 EEAVEKAKREAEQKAREERER------EKEKEKEREREREREAERAAKASSSSHESRMSE 635

Query: 169 NPYASTNNEELYLDDPKKTL 188
              +   +     + P  T+
Sbjct: 636 PQLSGPAHMRPSFEPPPTTI 655


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3i.
          Length = 431

 Score = 30.9 bits (70), Expect = 0.46
 Identities = 11/60 (18%), Positives = 27/60 (45%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           EDE+EE E    E +E+  ++ E++  ++ E    + +     D          ++ +++
Sbjct: 372 EDEDEEEEQRSDEHEEEEGEDSEEEGSQSREDGSSESSSDVGSDSESKADKESASDSDSE 431


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positive-charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled coil domains.
          Length = 493

 Score = 30.8 bits (70), Expect = 0.46
 Identities = 13/42 (30%), Positives = 22/42 (52%), Gaps = 1/42 (2%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTE 146
            E+     E   + L++Q R E E+ E+E L K  E++ + E
Sbjct: 206 EEELLARLESKEAALEKQLRLEFER-EKEELRKKYEEKLRQE 246


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 30.4 bits (69), Expect = 0.49
 Identities = 18/59 (30%), Positives = 29/59 (49%), Gaps = 3/59 (5%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEET 164
           E+ EEE E    E + + R+E+ +KE+E  EK    ++    E+EG      D+   E 
Sbjct: 46  EELEEEREKKKEEEERKEREEQARKEQEEYEK---LKSSFVVEEEGTDKLSADEESNEL 101


>gnl|CDD|221466 pfam12220, U1snRNP70_N, U1 small nuclear ribonucleoprotein of 70kDa
           MW N terminal.  This domain is found in eukaryotes. This
           domain is about 90 amino acids in length. This domain is
           found associated with pfam00076. This domain is part of
           U1 snRNP, which is the pre-mRNA binding protein of the
           penta-snRNP spliceosome complex. It extends over a
           distance of 180 A from its RNA binding domain, wraps
           around the core domain of U1 snRNP consisting of the
           seven Sm proteins and finally contacts U1-C, which is
           crucial for 5'-splice-site recognition.
          Length = 94

 Score = 29.2 bits (66), Expect = 0.50
 Identities = 12/40 (30%), Positives = 26/40 (65%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKT 145
           +D ++E     +E   ++R+ ++++++E LEK LE+E K 
Sbjct: 48  KDYKDEPPPEPTETWLEKREREKREKKEKLEKKLEEELKE 87


>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
           and chromosome partitioning].
          Length = 420

 Score = 30.5 bits (69), Expect = 0.55
 Identities = 12/49 (24%), Positives = 21/49 (42%), Gaps = 7/49 (14%)

Query: 106 EDEEEESELSVSELKEQR------RQEKEKKEREALEKSLEQEAKTEEE 148
             E+ E    +SE + Q+       +E++K   + L   L  + K  EE
Sbjct: 181 AAEQAELTTLLSEQRAQQAKLAQLLEERKKTLAQ-LNSELSADQKKLEE 228


>gnl|CDD|234767 PRK00448, polC, DNA polymerase III PolC; Validated.
          Length = 1437

 Score = 30.6 bits (70), Expect = 0.57
 Identities = 12/43 (27%), Positives = 27/43 (62%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           E ++S+  + + + Q+ +E EK  +EALE   + EA+ +++ +
Sbjct: 164 EIDDSKEELEKFEAQKEEEDEKLAKEALEAMKKLEAEKKKQSK 206


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 30.4 bits (69), Expect = 0.58
 Identities = 14/44 (31%), Positives = 24/44 (54%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           S D E+E E   +  K +   ++E  E +  E + E+EA+ E+E
Sbjct: 129 SSDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKE 172



 Score = 28.5 bits (64), Expect = 2.7
 Identities = 15/37 (40%), Positives = 23/37 (62%)

Query: 133 EALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSEN 169
           E LEK  E+E K +E ++G+     DD EEE ++ E+
Sbjct: 73  ELLEKWKEEERKKKEAEQGLESDDDDDEEEEWEVEED 109



 Score = 27.7 bits (62), Expect = 4.8
 Identities = 17/46 (36%), Positives = 23/46 (50%), Gaps = 1/46 (2%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
             D+E ES  S  E +E+    K+ KE    E S E E +  EE+E
Sbjct: 122 ESDKEIESSDSEDE-EEKDEAAKKAKEDSDEELSEEDEEEAAEEEE 166


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
           entry is characterized by proteins with alternating
           conserved and low-complexity regions. Bud13 together
           with Snu17p and a newly identified factor,
           Pml1p/Ylr016c, form a novel trimeric complex called the
           RES complex (pre-mRNA retention and splicing complex).
           Subunits of this complex are not essential for viability
           of yeasts but they are required for efficient splicing
           in vitro and in vivo. Furthermore, inactivation of this
           complex causes pre-mRNA leakage from the nucleus. Bud13
           contains a unique, phylogenetically conserved C-terminal
           region of unknown function.
          Length = 141

 Score = 29.6 bits (67), Expect = 0.60
 Identities = 21/75 (28%), Positives = 31/75 (41%), Gaps = 11/75 (14%)

Query: 123 RRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDD--------AEEETDLSENPYAST 174
           R  + E+K  E   +  E+E K E+E E   WG G           EE       P A  
Sbjct: 10  RIIDIEEKREEKEREKEEKERKEEKEKE---WGKGLVQKEEREKRLEELEKAKNKPLARY 66

Query: 175 NNEELYLDDPKKTLR 189
            ++E Y ++ K+  R
Sbjct: 67  ADDEDYDEELKEQER 81


>gnl|CDD|177296 PHA00728, PHA00728, hypothetical protein.
          Length = 151

 Score = 29.9 bits (67), Expect = 0.60
 Identities = 20/61 (32%), Positives = 28/61 (45%)

Query: 128 EKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYASTNNEELYLDDPKKT 187
           +  E E L+K  E+  K   E E +      + +EE    ENPY  TN     L +PK T
Sbjct: 3   KLTEVEQLKKENEELKKKLAELEALMNNESAEEDEELQEIENPYTVTNRAISELVEPKDT 62

Query: 188 L 188
           +
Sbjct: 63  M 63


>gnl|CDD|220413 pfam09805, Nop25, Nucleolar protein 12 (25kDa).  Members of this
           family of proteins are part of the yeast nuclear pore
           complex-associated pre-60S ribosomal subunit. The family
           functions as a highly conserved exonuclease that is
           required for the 5'-end maturation of 5.8S and 25S
           rRNAs, demonstrating that 5'-end processing also has a
           redundant pathway. Nop25 binds late pre-60S ribosomes,
           accompanying them from the nucleolus to the nuclear
           periphery; and there is evidence for both physical and
           functional links between late 60S subunit processing and
           export.
          Length = 134

 Score = 29.2 bits (66), Expect = 0.69
 Identities = 21/72 (29%), Positives = 38/72 (52%), Gaps = 4/72 (5%)

Query: 118 ELKEQRRQ--EKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYASTN 175
            ++E+R+Q  EK+ KER+   K LE+E   ++E++  +    D  ++E +    P  +  
Sbjct: 56  RIREERKQELEKQLKERKEALKLLEEEN--DDEEDAETEDTEDVEDDEWEGFPEPTVTDY 113

Query: 176 NEELYLDDPKKT 187
            EE   +D  KT
Sbjct: 114 EEEYIDEDKYKT 125


>gnl|CDD|205436 pfam13256, DUF4047, Domain of unknown function (DUF4047).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in bacteria, and is approximately
           130 amino acids in length. There are two conserved
           sequence motifs: TEA and FPKT.
          Length = 123

 Score = 29.1 bits (65), Expect = 0.70
 Identities = 15/51 (29%), Positives = 27/51 (52%)

Query: 99  FILQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEED 149
            IL      ++E    S+  L+++    KE++E+ A+E+   Q+  TE ED
Sbjct: 42  VILHEYEGMKKEVKATSIEVLEQRLVSWKEQREKVAVEREALQKIYTEIED 92


>gnl|CDD|225368 COG2811, NtpF, Archaeal/vacuolar-type H+-ATPase subunit H [Energy
           production and conversion].
          Length = 108

 Score = 28.9 bits (65), Expect = 0.76
 Identities = 17/55 (30%), Positives = 34/55 (61%)

Query: 111 ESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           ++E+S  E  E+ ++E E+  +EA E++ E   + EEE E ++  + ++A EE +
Sbjct: 14  KAEISADEEIEEAKEEAEQIIKEAREEAREIIEEAEEEAEKLAQEILEEAREEAE 68


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
           includes the B. subtilis YqfQ protein, also known as
           VrrA, which is functionally uncharacterized. This family
           of proteins is found in bacteria. Proteins in this
           family are typically between 146 and 237 amino acids in
           length. There are two conserved sequence motifs: QYGP
           and PKLY.
          Length = 155

 Score = 29.3 bits (66), Expect = 0.80
 Identities = 13/44 (29%), Positives = 20/44 (45%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
              +EE   E    +  E + + KEKK+RE  +   E+E    E
Sbjct: 99  EETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTE 142



 Score = 28.6 bits (64), Expect = 1.4
 Identities = 16/47 (34%), Positives = 25/47 (53%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            S+DEEEE+E   ++  EQ    + K E +  +K    + KTE+E  
Sbjct: 93  SSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKP 139


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 29.8 bits (67), Expect = 0.86
 Identities = 13/51 (25%), Positives = 24/51 (47%)

Query: 121 EQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPY 171
           E    + E  E E   K  E + + EEE+ G S  + +D++  ++  E+  
Sbjct: 215 EPESADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDASSEEMESGE 265


>gnl|CDD|152492 pfam12057, DUF3538, Domain of unknown function (DUF3538).  This
          presumed domain is functionally uncharacterized. This
          domain is found in eukaryotes. This domain is about 120
          amino acids in length. This domain is found associated
          with pfam00240. This domain has a conserved SDL
          sequence motif.
          Length = 118

 Score = 29.0 bits (65), Expect = 0.87
 Identities = 12/22 (54%), Positives = 14/22 (63%)

Query: 32 LHPTVSRYHAILQYKSTFDEKD 53
          L P + RY+ ILQ   TFDE D
Sbjct: 15 LQPFLQRYYDILQNDPTFDEND 36


>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
           TolA; Provisional.
          Length = 387

 Score = 29.8 bits (67), Expect = 0.91
 Identities = 13/43 (30%), Positives = 22/43 (51%), Gaps = 5/43 (11%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           +++  EL   +  EQ R ++ +KER A      QE K + E+ 
Sbjct: 86  QQQAEELQQKQAAEQERLKQLEKERLA-----AQEQKKQAEEA 123



 Score = 29.0 bits (65), Expect = 1.8
 Identities = 10/43 (23%), Positives = 22/43 (51%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           +   E+  L   E +    QE++K+  EA +++  ++ + EE 
Sbjct: 95  KQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAALKQKQAEEA 137



 Score = 28.6 bits (64), Expect = 2.5
 Identities = 10/45 (22%), Positives = 21/45 (46%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           +  EE  +   +E +  ++ EKE+   +  +K  E+ AK     +
Sbjct: 87  QQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAALKQ 131



 Score = 27.1 bits (60), Expect = 6.6
 Identities = 10/41 (24%), Positives = 15/41 (36%)

Query: 110 EESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           E      +   ++   E +KK      K    EAK + E E
Sbjct: 150 EAEAKRAAAAAKKAAAEAKKKAEAEAAKKAAAEAKKKAEAE 190


>gnl|CDD|219978 pfam08701, GN3L_Grn1, GNL3L/Grn1 putative GTPase.  Grn1 (yeast) and
           GNL3L (human) are putative GTPases which are required
           for growth and play a role in processing of nucleolar
           pre-rRNA. This family contains a potential nuclear
           localisation signal.
          Length = 80

 Score = 28.0 bits (63), Expect = 0.96
 Identities = 8/31 (25%), Positives = 19/31 (61%)

Query: 118 ELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           E++E++R+++E+KER    +  E+    +  
Sbjct: 50  EIEEKKRKQEEEKERRKEARKAERAEARKRG 80


>gnl|CDD|130712 TIGR01651, CobT, cobaltochelatase, CobT subunit.  This model
           describes Pseudomonas denitrificans CobT gene product,
           which is a cobalt chelatase subunit that functions in
           cobalamin biosynthesis. Cobalamin (vitamin B12) can be
           synthesized via several pathways, including an aerobic
           pathway (found in Pseudomonas denitrificans) and an
           anaerobic pathway (found in P. shermanii and Salmonella
           typhimurium). These pathways differ in the point of
           cobalt insertion during corrin ring formation. There are
           apparently a number of variations on these two pathways,
           where the major differences seem to be concerned with
           the process of ring contraction. Confusion regarding the
           functions of enzymes found in the aerobic vs. anaerobic
           pathways has arisen because nonhomologous genes in these
           different pathways were given the same gene symbols.
           Thus, cobT in the aerobic pathway (P. denitrificans) is
           not a homolog of cobT in the anaerobic pathway (S.
           typhimurium). It should be noted that E. coli
           synthesizes cobalamin only when it is supplied with the
           precursor cobinamide, which is a complex intermediate.
           Additionally, all E. coli cobalamin synthesis genes
           (cobU, cobS and cobT) were named after their Salmonella
           typhimurium homologs which function in the anaerobic
           cobalamin synthesis pathway. This model describes the
           aerobic cobalamin pathway Pseudomonas denitrificans CobT
           gene product, which is a cobalt chelatase subunit, with
           a MW ~70 kDa. The aerobic pathway cobalt chelatase is a
           heterotrimeric, ATP-dependent enzyme that catalyzes
           cobalt insertion during cobalamin biosynthesis. The
           other two subunits are the P. denitrificans CobS
           (TIGR01650) and CobN (pfam02514 CobN/Magnesium
           Chelatase) proteins. To avoid potential confusion with
           the nonhomologous Salmonella typhimurium/E. coli cobT
           gene product, the P. denitrificans gene symbol is not
           used in the name of this model [Biosynthesis of
           cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 600

 Score = 29.9 bits (67), Expect = 0.98
 Identities = 23/90 (25%), Positives = 35/90 (38%), Gaps = 14/90 (15%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEE 163
            SEDEE+  +   +E  EQ  Q + + E +      E EA   E + G    +  D ++ 
Sbjct: 208 ESEDEEDGDDDQPTE-NEQEEQGEGEGEGQEGSAPQESEATDRESESGEEEMVQSDQDDL 266

Query: 164 TDLSE-------------NPYASTNNEELY 180
            D S+              P+ ST  E  Y
Sbjct: 267 PDESDDDSETPGEGARPARPFTSTGGEPDY 296



 Score = 28.0 bits (62), Expect = 4.2
 Identities = 19/99 (19%), Positives = 35/99 (35%), Gaps = 6/99 (6%)

Query: 103 GPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEED------EGISWGM 156
           G   + E+E +    +  E  ++E+ + E E  E S  QE++  + +      E +    
Sbjct: 204 GDDTESEDEEDGDDDQPTENEQEEQGEGEGEGQEGSAPQESEATDRESESGEEEMVQSDQ 263

Query: 157 GDDAEEETDLSENPYASTNNEELYLDDPKKTLRGWFDRE 195
            D  +E  D SE P         +     +     F   
Sbjct: 264 DDLPDESDDDSETPGEGARPARPFTSTGGEPDYKVFTTA 302


>gnl|CDD|233137 TIGR00811, sit, silicon transporter.  Marine diatoms such as
           Cylindrotheca fusiformis encode at least six silicon
           transport protein homologues which exhibit similar size
           and topology. One characterized member of the family
           (Sit1) functions in the energy-dependent uptake of
           either Silicic acid [Si(OH)4] or Silicate [Si(OH)3O-] by
           a Na+ symport mechanism. The system is found in marine
           diatoms which make their "glass houses" out of silicon
           [Transport and binding proteins, Other].
          Length = 545

 Score = 29.9 bits (67), Expect = 1.1
 Identities = 24/103 (23%), Positives = 39/103 (37%), Gaps = 14/103 (13%)

Query: 84  IHVGHMLSFGSSTRFFILQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEA 143
           +H G   + G++    +L  P+  EE  + LS  E    RR+   K  RE   + +   A
Sbjct: 448 VHAGREFTMGTN----VLNDPANWEEAIANLSALETFSVRRERMLKNIREL--REMINNA 501

Query: 144 KTEEEDEGISWGMGDDAEEETDLSENPYASTNNEELYLDDPKK 186
            ++EE       +  + +    L        N EE      KK
Sbjct: 502 ISDEEKTTFEAALAIEVKALDKL--------NAEEEEEATNKK 536


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 30.1 bits (67), Expect = 1.1
 Identities = 13/43 (30%), Positives = 24/43 (55%)

Query: 106  EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
            E ++ E E  +   +E ++ E++KK+ E  +K+ E E K  E 
Sbjct: 1651 ELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEA 1693



 Score = 29.7 bits (66), Expect = 1.4
 Identities = 18/68 (26%), Positives = 35/68 (51%)

Query: 101  LQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDA 160
            L+   E++++  +L   E +E+++ E+ KK  E  +    +EAK  EED+  +       
Sbjct: 1625 LKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAE 1684

Query: 161  EEETDLSE 168
            E+E   +E
Sbjct: 1685 EDEKKAAE 1692



 Score = 29.3 bits (65), Expect = 1.7
 Identities = 19/68 (27%), Positives = 36/68 (52%)

Query: 101  LQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDA 160
             +   EDE++ +E    E +E ++ E+ KK+    +K  E+  K EEE++  +     +A
Sbjct: 1680 AKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEA 1739

Query: 161  EEETDLSE 168
            EE+   +E
Sbjct: 1740 EEDKKKAE 1747



 Score = 28.2 bits (62), Expect = 4.3
 Identities = 16/60 (26%), Positives = 29/60 (48%), Gaps = 5/60 (8%)

Query: 108  EEEESELSVSELK-----EQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEE 162
             EEE+++  +E       ++++ E+ KK  E  +K+ E   K  EE +        +AEE
Sbjct: 1655 AEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEE 1714



 Score = 27.8 bits (61), Expect = 4.7
 Identities = 20/91 (21%), Positives = 39/91 (42%)

Query: 105  SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEET 164
            +E+ ++  EL   E +E+++ E+ KK  E  +   E+  K  EED+  +     D EE+ 
Sbjct: 1698 AEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKK 1757

Query: 165  DLSENPYASTNNEELYLDDPKKTLRGWFDRE 195
             ++          E    + +  +    D E
Sbjct: 1758 KIAHLKKEEEKKAEEIRKEKEAVIEEELDEE 1788


>gnl|CDD|237035 PRK12280, rplW, 50S ribosomal protein L23; Reviewed.
          Length = 158

 Score = 29.0 bits (65), Expect = 1.2
 Identities = 9/43 (20%), Positives = 24/43 (55%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           E+ E+E +    E +E+   + +K+++E  EK + ++   ++ 
Sbjct: 93  EESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKS 135


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 28.8 bits (65), Expect = 1.2
 Identities = 11/32 (34%), Positives = 21/32 (65%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREAL 135
             E  +E  E  +++LK ++R+E E+K++E L
Sbjct: 116 EEEQIDELLEKELAKLKREKRRENERKQKEIL 147



 Score = 27.6 bits (62), Expect = 2.8
 Identities = 10/41 (24%), Positives = 22/41 (53%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTE 146
           E EE + E  + EL E+   + ++++R   E+  ++  K +
Sbjct: 110 EVEELDEEEQIDELLEKELAKLKREKRRENERKQKEILKEQ 150


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 29.5 bits (67), Expect = 1.4
 Identities = 17/82 (20%), Positives = 38/82 (46%), Gaps = 7/82 (8%)

Query: 107 DEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDL 166
            ++   ++     K ++++E+EKKE++    + +++ + EEE++       +  EEE + 
Sbjct: 405 SKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEK-------EKKEEEKEE 457

Query: 167 SENPYASTNNEELYLDDPKKTL 188
            E        EE      + TL
Sbjct: 458 EEEEAEEEKEEEEEKKKKQATL 479


>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
           Provisional.
          Length = 695

 Score = 29.5 bits (67), Expect = 1.5
 Identities = 14/49 (28%), Positives = 24/49 (48%), Gaps = 9/49 (18%)

Query: 106 EDEEEESELSVSELKE-----QRRQEKEKKEREALEKSLEQEAKTEEED 149
           E E++++E +    K      Q R E+EK  REA  K   +    +++D
Sbjct: 442 EQEKKKAEEA----KARFEARQARLEREKAAREARHKKAAEARAAKDKD 486



 Score = 28.0 bits (63), Expect = 4.3
 Identities = 15/85 (17%), Positives = 28/85 (32%), Gaps = 4/85 (4%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDD----AE 161
           E E+   E    +  E R  + +     AL +   ++A   +     +    D+    A 
Sbjct: 463 EREKAAREARHKKAAEARAAKDKDAVAAALARVKAKKAAATQPIVIKAGARPDNSAVIAA 522

Query: 162 EETDLSENPYASTNNEELYLDDPKK 186
            E   ++        +     DPKK
Sbjct: 523 REARKAQARARQAEKQAAAAADPKK 547


>gnl|CDD|216289 pfam01080, Presenilin, Presenilin.  Mutations in presenilin-1 are a
           major cause of early onset Alzheimer's disease. It has
           been found that presenilin-1 binds to beta-catenin
           in vivo. This family also contains SPE proteins from
           C. elegans.
          Length = 403

 Score = 29.4 bits (66), Expect = 1.5
 Identities = 18/55 (32%), Positives = 26/55 (47%), Gaps = 1/55 (1%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE-GISWGMGD 158
           S+DE + SE           +E   ++ E    SL    K EEE+E G+  G+GD
Sbjct: 278 SDDESDSSETESQSDSSLAPEEDAAEQPEVQSNSLPSNEKREEEEERGVKLGLGD 332


>gnl|CDD|237744 PRK14521, rpsP, 30S ribosomal protein S16; Provisional.
          Length = 186

 Score = 28.6 bits (64), Expect = 1.5
 Identities = 13/64 (20%), Positives = 25/64 (39%)

Query: 115 SVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYAST 174
            +S+ K+  ++   + E++  E   E  A+ +  +          A EE +  E P    
Sbjct: 120 KLSKAKKAAKKAALEAEKKVNEARAEAVAEKKAAEAAAVAAEEAAAAEEEEAEEAPAEEA 179

Query: 175 NNEE 178
             EE
Sbjct: 180 PAEE 183


>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 197

 Score = 28.9 bits (64), Expect = 1.6
 Identities = 17/55 (30%), Positives = 30/55 (54%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEE 162
           E E  E  + E ++Q  QE+E++E+   E+S + E +  E+DE  +    D  E+
Sbjct: 143 EWERREEKIDEREDQEEQEREREEQTIEEQSDDSEHEIIEQDESETESDDDKTEK 197



 Score = 28.1 bits (62), Expect = 3.0
 Identities = 20/60 (33%), Positives = 36/60 (60%), Gaps = 4/60 (6%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           E+ EE  E    E K   R+++E++ERE  E+++E+++  + E E I     D++E E+D
Sbjct: 136 EELEERMEWERREEKIDEREDQEEQEREREEQTIEEQSD-DSEHEIIE---QDESETESD 191



 Score = 27.0 bits (59), Expect = 6.0
 Identities = 13/29 (44%), Positives = 20/29 (68%)

Query: 122 QRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           Q + EK ++ERE LE+ +E E + E+ DE
Sbjct: 125 QEQIEKARQEREELEERMEWERREEKIDE 153


>gnl|CDD|234352 TIGR03779, Bac_Flav_CT_M, Bacteroides conjugative transposon TraM
           protein.  Members of this protein family are designated
           TraM and are found in a proposed transfer region of a
           class of conjugative transposon found in the Bacteroides
           lineage [Cellular processes, DNA transformation].
          Length = 410

 Score = 28.9 bits (65), Expect = 1.8
 Identities = 16/50 (32%), Positives = 23/50 (46%), Gaps = 4/50 (8%)

Query: 99  FILQGPSEDEEEES---ELSVSELKEQRRQEKEKKEREAL-EKSLEQEAK 144
           F     +++E+E     E   S L  +     E +E+ AL EKS E  AK
Sbjct: 136 FYEYPKTDEEKELLREVEELESRLATEPSPAPELEEQLALMEKSYELAAK 185


>gnl|CDD|237869 PRK14962, PRK14962, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 472

 Score = 29.0 bits (65), Expect = 1.8
 Identities = 17/65 (26%), Positives = 35/65 (53%), Gaps = 2/65 (3%)

Query: 83  RIHVGHMLSFGSSTRFFILQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQE 142
           +  V  + S   +TRF        D EE+++ S  + KE++++E + KE +  ++ +E E
Sbjct: 319 KRLVCKLGSASIATRFSSPNVQENDVEEKNDNSNVQQKEKKKEESKAKEEK--QEDIEFE 376

Query: 143 AKTEE 147
            + +E
Sbjct: 377 KRFKE 381


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 29.2 bits (66), Expect = 1.8
 Identities = 11/31 (35%), Positives = 18/31 (58%)

Query: 120 KEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           KE++   KE+K    L+K  E++ K  E+ E
Sbjct: 557 KEEKEALKEQKRLRKLKKQEEKKKKELEKLE 587



 Score = 28.1 bits (63), Expect = 4.3
 Identities = 17/37 (45%), Positives = 22/37 (59%), Gaps = 4/37 (10%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAK 144
           E+EE E     LKEQ+R  K KK+ E  +K LE+  K
Sbjct: 556 EKEEKE----ALKEQKRLRKLKKQEEKKKKELEKLEK 588



 Score = 26.9 bits (60), Expect = 8.4
 Identities = 18/81 (22%), Positives = 33/81 (40%), Gaps = 13/81 (16%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           ED+ +   +   + KE+ ++EKE+KE    +K L +  K EE+ +        + E+   
Sbjct: 536 EDKPDGPSVWKLDDKEELQREKEEKEALKEQKRLRKLKKQEEKKK-------KELEKLEK 588

Query: 166 LSENPYASTNNEELYLDDPKK 186
               P       E +     K
Sbjct: 589 AKIPP------AEFFKRQEDK 603


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 29.0 bits (65), Expect = 2.0
 Identities = 19/78 (24%), Positives = 32/78 (41%), Gaps = 3/78 (3%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQ-EAKTEEEDEGISWGMGDDAEE 162
              D   E E +     ++   E E K       S E+ E K +E+ + +   M D+ E+
Sbjct: 249 GKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDED 308

Query: 163 E--TDLSENPYASTNNEE 178
           E    + E+P     +EE
Sbjct: 309 EEMEIVPESPVEEEESEE 326



 Score = 27.1 bits (60), Expect = 7.2
 Identities = 23/70 (32%), Positives = 34/70 (48%), Gaps = 6/70 (8%)

Query: 103 GPSEDEEEESELSVSEL--KEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDA 160
           G  EDE+E+      E    E+  +EKEK++R+ L+K +E     E+EDE +        
Sbjct: 264 GLDEDEDEDEPKPSGERSDSEEETEEKEKEKRKRLKKMME----DEDEDEEMEIVPESPV 319

Query: 161 EEETDLSENP 170
           EEE      P
Sbjct: 320 EEEESEEPEP 329


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 28.8 bits (64), Expect = 2.0
 Identities = 10/41 (24%), Positives = 24/41 (58%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           + +E +    E ++Q + E++++E +A + + EQ+ K E  
Sbjct: 112 KAQEQQKQAEEAEKQAQLEQKQQEEQARKAAAEQKKKAEAA 152



 Score = 27.6 bits (61), Expect = 4.5
 Identities = 14/51 (27%), Positives = 27/51 (52%), Gaps = 5/51 (9%)

Query: 101 LQGPSEDEEEESELSVSELKEQRRQEKEK-KEREALEKSLEQEAKTEEEDE 150
           L+     E+E     + +L+++R + +E+ K+ E  EK  + E K +EE  
Sbjct: 92  LKPKQAAEQER----LKQLEKERLKAQEQQKQAEEAEKQAQLEQKQQEEQA 138


>gnl|CDD|227446 COG5116, RPN2, 26S proteasome regulatory complex component
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 926

 Score = 29.1 bits (65), Expect = 2.1
 Identities = 13/68 (19%), Positives = 28/68 (41%), Gaps = 3/68 (4%)

Query: 104 PSEDEEEESELSV--SELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAE 161
            S     +   +V  + +K   R +++ KE+   +K ++ E+ + E  EG    +    E
Sbjct: 774 ASGKSVRKVNTAVLSTTIKAAARAKQKPKEKGPNDKEIKIESPSVET-EGERCTIKQREE 832

Query: 162 EETDLSEN 169
           +  D    
Sbjct: 833 KGIDAPAI 840


>gnl|CDD|227466 COG5137, COG5137, Histone chaperone involved in gene silencing
           [Transcription / Chromatin structure and dynamics].
          Length = 279

 Score = 28.8 bits (64), Expect = 2.2
 Identities = 18/58 (31%), Positives = 27/58 (46%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEE 163
             EEEE E   S+   +  +E  ++E E  E S + E   + E E I    G++ E E
Sbjct: 187 GREEEEDEEVGSDSYGEGNRELNEEEEEEAEGSDDGEDVVDYEGERIDKKQGEEEEME 244


>gnl|CDD|178515 PLN02927, PLN02927, antheraxanthin epoxidase/zeaxanthin epoxidase.
          Length = 668

 Score = 28.9 bits (64), Expect = 2.2
 Identities = 19/67 (28%), Positives = 29/67 (43%), Gaps = 13/67 (19%)

Query: 36  VSRYHAILQYKSTFDEKDPARGFYVYDLGSTHGTFLN-----RCKIKPKMYVRIHVGHML 90
           VS+ HA + YK           F++ DL S HGT++      R +  P    R     ++
Sbjct: 581 VSKMHARVIYKDG--------AFFLMDLRSEHGTYVTDNEGRRYRATPNFPARFRSSDII 632

Query: 91  SFGSSTR 97
            FGS  +
Sbjct: 633 EFGSDKK 639


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins is involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 28.5 bits (64), Expect = 2.3
 Identities = 11/29 (37%), Positives = 15/29 (51%)

Query: 120 KEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           K QR +EK +KE E   K  +Q  K   +
Sbjct: 280 KAQRNKEKRRKELEREAKEEKQLKKKLAQ 308



 Score = 27.7 bits (62), Expect = 4.5
 Identities = 18/49 (36%), Positives = 28/49 (57%)

Query: 120 KEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSE 168
            E++RQE E+ E + LEK   + ++ +E  EG+     DD EEE+D   
Sbjct: 209 AEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEESDDES 257


>gnl|CDD|184860 PRK14858, tatA, twin arginine translocase protein A; Provisional.
          Length = 108

 Score = 27.5 bits (61), Expect = 2.3
 Identities = 11/34 (32%), Positives = 19/34 (55%), Gaps = 2/34 (5%)

Query: 118 ELKEQRRQEKEK--KEREALEKSLEQEAKTEEED 149
           + + +  +EKEK  K  E  +++   EAK EE+ 
Sbjct: 51  QEESRTAEEKEKAEKLAETKKEAEAPEAKAEEDQ 84


>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
          Length = 330

 Score = 28.7 bits (65), Expect = 2.4
 Identities = 11/51 (21%), Positives = 24/51 (47%), Gaps = 3/51 (5%)

Query: 107 DEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMG 157
            +++  +  + E+   + Q    +E E  E   E+E + E  +E  + G+G
Sbjct: 279 ADKDALDEELKEVLSAQAQAAAAEEEEEEE---EEEEEEEPSEEEAAAGLG 326


>gnl|CDD|216670 pfam01733, Nucleoside_tran, Nucleoside transporter.  This is a
           family of nucleoside transporters. In mammalian cells
           nucleoside transporters transport nucleoside across the
           plasma membrane and are essential for nucleotide
           synthesis via the salvage pathways for cells that lack
           their own de novo synthesis pathways. Also in this
           family is mouse and human nucleolar protein HNP36, a
           protein of unknown function; although it has been
           hypothesised to be a plasma membrane nucleoside
           transporter.
          Length = 305

 Score = 28.5 bits (64), Expect = 2.5
 Identities = 12/49 (24%), Positives = 22/49 (44%), Gaps = 1/49 (2%)

Query: 114 LSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEE 162
           +S++ LK+ R   K   + +  E + E E   EE       G  D+++ 
Sbjct: 78  ISLNVLKKLRFY-KYYWQLKERETNEELEQVDEESSGSNLNGTFDNSKP 125


>gnl|CDD|221756 pfam12757, DUF3812, Protein of unknown function (DUF3812).  This is
           a family of fungal proteins whose function is not known.
          Length = 126

 Score = 27.6 bits (62), Expect = 2.5
 Identities = 11/30 (36%), Positives = 17/30 (56%)

Query: 121 EQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            QR +++EKK  E   K   +EAK  E ++
Sbjct: 95  AQRARDEEKKLDEEEAKRQHEEAKEREREK 124


>gnl|CDD|221250 pfam11831, Myb_Cef, pre-mRNA splicing factor component.  This
           family is a region of the Myb-Related Cdc5p/Cef1
           proteins, in fungi, and is part of the pre-mRNA splicing
           factor complex.
          Length = 363

 Score = 28.5 bits (64), Expect = 2.5
 Identities = 11/34 (32%), Positives = 19/34 (55%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEK 137
             E+ EEE E   ++   ++R  +E KE+E L +
Sbjct: 156 EPEEMEEELEEDAADRDARKRAAEEAKEQEELRR 189


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 28.6 bits (64), Expect = 2.6
 Identities = 12/67 (17%), Positives = 30/67 (44%), Gaps = 4/67 (5%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALE---KSLEQEAKTEEEDEGISWGMGDDA 160
           P   +E     +    K+++ + + K++RE L        ++ +   +  GI   +G+D 
Sbjct: 27  PGSTKESRDAAAYENWKKRQEEAEAKRKREELREKIAKAREKRERNSKLGGIK-TLGEDD 85

Query: 161 EEETDLS 167
           +++ D  
Sbjct: 86  DDDDDTK 92


>gnl|CDD|218380 pfam05010, TACC, Transforming acidic coiled-coil-containing protein
           (TACC).  This family contains the proteins TACC 1, 2 and
           3 the genes for which are found concentrated in the
           centrosomes of eukaryotes and may play a conserved role
           in organising centrosomal microtubules. The human TACC
           proteins have been linked to cancer and TACC2 has been
           identified as a possible tumour suppressor (AZU-1). The
           functional homologue (Alp7) in Schizosaccharomyces pombe
           has been shown to be required for organisation of
           bipolar spindles.
          Length = 207

 Score = 28.2 bits (63), Expect = 2.7
 Identities = 13/44 (29%), Positives = 22/44 (50%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
            E  +  S+        Q    KE+ + ++LE++LEQ+ K  EE
Sbjct: 150 EEIAQVRSKAKAETAALQASLRKEQMKVQSLEETLEQKNKENEE 193


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 28.4 bits (64), Expect = 2.9
 Identities = 18/44 (40%), Positives = 26/44 (59%), Gaps = 7/44 (15%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSL--EQEAKTEE 147
             EEE  E    E +E +++EK+K+EREA    L  E++ K EE
Sbjct: 275 AAEEERQE----EAQE-KKEEKKKEEREAKLAKLSPEEQRKLEE 313


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
           family consists of several hypothetical bacterial
           proteins of around 200 residues in length. The function
           of this family is unknown.
          Length = 214

 Score = 28.2 bits (63), Expect = 2.9
 Identities = 17/65 (26%), Positives = 33/65 (50%), Gaps = 3/65 (4%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETD 165
           E +EEE E + SE KE +   +++ E E+ E++ E++ ++ +E+E  +            
Sbjct: 64  EVKEEEKEAANSEDKEDKGDAEKEDE-ESEEENEEEDEESSDENEKET--EEKTESNVEK 120

Query: 166 LSENP 170
              NP
Sbjct: 121 EITNP 125


>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457).  This is
           a family of uncharacterized proteins.
          Length = 449

 Score = 28.4 bits (63), Expect = 3.0
 Identities = 17/64 (26%), Positives = 34/64 (53%), Gaps = 1/64 (1%)

Query: 124 RQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYASTNNEELYLDD 183
           R+  ++ E EA+E+  + E   +++D+       DD +E+ D  E+   ST +++   DD
Sbjct: 39  RKLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDDDDED-DEDEDDDDSTLHDDSSADD 97

Query: 184 PKKT 187
             +T
Sbjct: 98  GNET 101



 Score = 28.0 bits (62), Expect = 3.8
 Identities = 23/101 (22%), Positives = 42/101 (41%), Gaps = 11/101 (10%)

Query: 103 GPSEDEEEESELSVSELKEQ----RRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGD 158
             SE E+++      + KE+       +KE   R+ L K  E+EA  EE+D+       D
Sbjct: 6   ASSELEDDDWVRGSLDYKEKLTLNDTMKKENAIRK-LGKEAEEEAMEEEDDDE-----ED 59

Query: 159 DAEEETDLSENPYASTNNEELYLDDPKKTLRGWFDREGKGF 199
           D +++ +  ++     + ++   DD   TL         G 
Sbjct: 60  DDDDDDEDEDDDDDDDDEDDEDEDDDDSTLH-DDSSADDGN 99


>gnl|CDD|226396 COG3879, COG3879, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 247

 Score = 28.1 bits (63), Expect = 3.0
 Identities = 13/71 (18%), Positives = 23/71 (32%), Gaps = 9/71 (12%)

Query: 89  MLSFGSSTRFFILQ---------GPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSL 139
           MLS   +     +          G S     + +L       Q++      E E LE  L
Sbjct: 21  MLSISLAMLLAGVMLAAVFQTSKGESVRRARDLDLVKELRSLQKKVNTLAAEVEDLENKL 80

Query: 140 EQEAKTEEEDE 150
           +   ++   D+
Sbjct: 81  DSVRRSVLTDD 91


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 28.4 bits (63), Expect = 3.2
 Identities = 13/62 (20%), Positives = 29/62 (46%), Gaps = 2/62 (3%)

Query: 106  EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE--GISWGMGDDAEEE 163
            ED +++    ++E  E+  ++  ++  +  E+S E   K++EE E   +      D   +
Sbjct: 4045 EDIQQDDFSDLAEDDEKMNEDGFEENVQENEESTEDGVKSDEELEQGEVPEDQAIDNHPK 4104

Query: 164  TD 165
             D
Sbjct: 4105 MD 4106


>gnl|CDD|220634 pfam10220, DUF2146, Uncharacterized conserved protein (DUF2146).
           This is a family of proteins conserved from plants to
           humans. In Dictyostelium it is annotated as Mss11p but
           this could not be confirmed. Mss11p is required for the
           activation of pseudo-hyphal and invasive growth by
           Ste12p in yeast.
          Length = 890

 Score = 28.3 bits (63), Expect = 3.2
 Identities = 12/59 (20%), Positives = 20/59 (33%), Gaps = 1/59 (1%)

Query: 90  LSFGSSTRFFILQGPSEDEEE-ESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
            S  S   F          EE ++    +       +E  K  RE   ++L ++  T E
Sbjct: 593 PSDASDLNFSTASSSEASSEESDNYARPTSRSGTDEEEASKTAREKRPQALARQPSTTE 651


>gnl|CDD|224307 COG1389, COG1389, DNA topoisomerase VI, subunit B [DNA replication,
           recombination, and repair].
          Length = 538

 Score = 28.1 bits (63), Expect = 3.5
 Identities = 18/64 (28%), Positives = 31/64 (48%), Gaps = 13/64 (20%)

Query: 103 GPSEDEEEESELSVSE--------LKEQRRQEKEKKEREALEKSLEQEAK-----TEEED 149
               + E E  L++ E        L  +RR+ +E+K+R+ +EK L + AK      E+ +
Sbjct: 438 ADVPEIENEIRLALMEVARKLKLYLSRKRREMEERKKRKTIEKYLPEIAKKLAEILEKPE 497

Query: 150 EGIS 153
           E I 
Sbjct: 498 EDIR 501


>gnl|CDD|223394 COG0317, SpoT, Guanosine polyphosphate
           pyrophosphohydrolases/synthetases [Signal transduction
           mechanisms / Transcription].
          Length = 701

 Score = 28.0 bits (63), Expect = 3.7
 Identities = 12/32 (37%), Positives = 17/32 (53%)

Query: 119 LKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            K+Q R E  +  RE LEK L +    +E +E
Sbjct: 475 FKKQDRDENVEAGRELLEKELSRLGLPKELEE 506


>gnl|CDD|239286 cd02988, Phd_like_VIAF, Phosducin (Phd)-like family, Viral
           inhibitor of apoptosis (IAP)-associated factor (VIAF)
           subfamily; VIAF is a Phd-like protein that functions in
           caspase activation during apoptosis. It was identified
           as an IAP binding protein through a screen of a human
           B-cell library using a prototype IAP. VIAF lacks a
           consensus IAP binding motif and while it does not
           function as an IAP antagonist, it still plays a
           regulatory role in the complete activation of caspases.
           VIAF itself is a substrate for IAP-mediated
           ubiquitination, suggesting that it may be a target of
           IAPs in the prevention of cell death. The similarity of
           VIAF to Phd points to a potential role distinct from
           apoptosis regulation. Phd functions as a cytosolic
           regulator of G protein by specifically binding to G
           protein betagamma (Gbg)-subunits. The C-terminal domain
           of Phd adopts a thioredoxin fold, but it does not
           contain a CXXC motif. Phd interacts with G protein beta
           mostly through the N-terminal helical domain.
          Length = 192

 Score = 27.6 bits (62), Expect = 3.7
 Identities = 14/44 (31%), Positives = 21/44 (47%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
           P E+EEE  EL++ E  E   ++K   E +      E +   EE
Sbjct: 20  PKEEEEEALELAIQEAHENALEKKLLDELDEELDEEEDDRFLEE 63


>gnl|CDD|236729 PRK10636, PRK10636, putative ABC transporter ATP-binding protein;
           Provisional.
          Length = 638

 Score = 28.2 bits (63), Expect = 3.7
 Identities = 18/74 (24%), Positives = 33/74 (44%), Gaps = 9/74 (12%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKE--------KKEREALEKSLEQ-EAKTEEEDEGISW 154
             E  +E +  S    K+Q+R+E E        +KE   LEK +E+  A+  + +E +  
Sbjct: 529 TDEAPKENNANSAQARKDQKRREAELRTQTQPLRKEIARLEKEMEKLNAQLAQAEEKLGD 588

Query: 155 GMGDDAEEETDLSE 168
               D   + +L+ 
Sbjct: 589 SELYDQSRKAELTA 602


>gnl|CDD|240329 PTZ00248, PTZ00248, eukaryotic translation initiation factor 2
           subunit 1; Provisional.
          Length = 319

 Score = 27.7 bits (62), Expect = 4.0
 Identities = 11/44 (25%), Positives = 21/44 (47%), Gaps = 3/44 (6%)

Query: 126 EKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSEN 169
             E+   E LE   + E + EE+D   S    ++ E+E +  ++
Sbjct: 274 GDEEDLEELLE---KAEEEEEEDDYSESEDEDEEDEDEEEEEDD 314


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 28.1 bits (63), Expect = 4.3
 Identities = 14/60 (23%), Positives = 29/60 (48%), Gaps = 5/60 (8%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKS-LEQEAKTEEEDEGISWGMGDDAEEE 163
              + E +E + ++ ++Q+   +E++ R   EK   +QEAK    +E        + E+E
Sbjct: 655 ESQQAEVTEKARTQDEQQQAPRRERQRRRNDEKRQAQQEAKALNVEEQSV----QETEQE 710


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 27.3 bits (61), Expect = 4.3
 Identities = 10/29 (34%), Positives = 18/29 (62%)

Query: 120 KEQRRQEKEKKEREALEKSLEQEAKTEEE 148
           KE+ R++K K ++E  EK  E+E +  + 
Sbjct: 69  KEKLRRDKLKAKKEEAEKEKEKEERFMKA 97


>gnl|CDD|218333 pfam04931, DNA_pol_phi, DNA polymerase phi.  This family includes
           the fifth essential DNA polymerase in yeast EC:2.7.7.7.
           Pol5p is localised exclusively to the nucleolus and
           binds near or at the enhancer region of rRNA-encoding
           DNA repeating units.
          Length = 784

 Score = 27.9 bits (62), Expect = 4.4
 Identities = 23/84 (27%), Positives = 32/84 (38%), Gaps = 9/84 (10%)

Query: 101 LQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDA 160
            Q   E EEE+ +    +L+E    E E +  E  E   E + +  EEDE       DDA
Sbjct: 641 HQQLFEGEEEDED----DLEETDDDEDECEAIEDSESESESDGEDGEEDEQ-----EDDA 691

Query: 161 EEETDLSENPYASTNNEELYLDDP 184
           E    +     A        L+ P
Sbjct: 692 EANEGVVPIDKAVRRALPKVLNLP 715


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 27.7 bits (61), Expect = 4.8
 Identities = 12/45 (26%), Positives = 20/45 (44%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           E+ EE   ++ SE K   R  +E ++ E   +  E+E       E
Sbjct: 124 EEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLE 168


>gnl|CDD|239202 cd02808, GltS_FMN, Glutamate synthase (GltS) FMN-binding domain. 
          GltS is a complex iron-sulfur flavoprotein that
          catalyzes the reductive synthesis of L-glutamate from
          2-oxoglutarate and L-glutamine via intramolecular
          channelling of ammonia, a reaction in the plant, yeast
          and bacterial pathway for ammonia assimilation. It is a
          multifunctional enzyme that functions through three
          distinct active centers, carrying out  L-glutamine
          hydrolysis, conversion of 2-oxoglutarate into
          L-glutamate, and electron uptake from an electron
          donor.
          Length = 392

 Score = 27.5 bits (62), Expect = 5.1
 Identities = 5/22 (22%), Positives = 10/22 (45%), Gaps = 1/22 (4%)

Query: 73 RCKIKPKMYVRIHVGHMLSFGS 94
            +   K+    ++  M SFG+
Sbjct: 69 NAEKPLKLDSPFNISAM-SFGA 89


>gnl|CDD|216108 pfam00769, ERM, Ezrin/radixin/moesin family.  This family of
           proteins contains a band 4.1 domain (pfam00373) at their
           amino terminus. This family represents the rest of these
           proteins.
          Length = 244

 Score = 27.4 bits (61), Expect = 5.1
 Identities = 20/64 (31%), Positives = 31/64 (48%), Gaps = 5/64 (7%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEET 164
           ++ E EE E +  EL+E+ +QE+E  E + LEK      + EEE+  +        EE  
Sbjct: 24  AQKELEEYEETALELEEKLKQEEE--EAQLLEKK---ADELEEENRRLEEEAAASEEERE 78

Query: 165 DLSE 168
            L  
Sbjct: 79  RLEA 82


>gnl|CDD|218970 pfam06278, DUF1032, Protein of unknown function (DUF1032).  This
           family consists of several conserved eukaryotic proteins
           of unknown function.
          Length = 565

 Score = 27.6 bits (61), Expect = 5.2
 Identities = 19/76 (25%), Positives = 29/76 (38%), Gaps = 7/76 (9%)

Query: 114 LSVSELKEQRRQEKEKKEREALEKSLEQEAK----TEE---EDEGISWGMGDDAEEETDL 166
           L    +KE+   +++   R   E+ L  E +     EE   ED     G  DD  +  D 
Sbjct: 340 LYWKHVKERLETQRQMLRRRGAERWLPDEEQKLWPLEEDRLEDSVEDDGDADDFSDPEDY 399

Query: 167 SENPYASTNNEELYLD 182
            E P      E+ + D
Sbjct: 400 LEPPEGLDPEEQAFQD 415


>gnl|CDD|227458 COG5129, MAK16, Nuclear protein with HMG-like acidic region
           [General function prediction only].
          Length = 303

 Score = 27.3 bits (60), Expect = 5.2
 Identities = 24/91 (26%), Positives = 39/91 (42%), Gaps = 11/91 (12%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSL---EQEAKTEEEDEGISWGMGD 158
           +     EEEE   +  E       EKEK +++ LEK L   +    +E E+E       +
Sbjct: 199 EKERYVEEEEESDTELEA-VTDDSEKEKTKKKDLEKWLGSDQSMETSESEEE-------E 250

Query: 159 DAEEETDLSENPYASTNNEELYLDDPKKTLR 189
            +E E+D  E+        +   DD KK+ +
Sbjct: 251 SSESESDEDEDEDNKGKIRKRKTDDAKKSRK 281


>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing.  This is a family of
           proteins that are involved in rRNA processing. In a
           localisation study they were found to localise to the
           nucleus and nucleolus. The family also includes other
           metazoa members from plants to mammals where the protein
           has been named BR22 and is associated with TTF-1,
           thyroid transcription factor 1. In the lungs, the family
           binds TTF-1 to form a complex which influences the
           expression of the key lung surfactant protein-B (SP-B)
           and -C (SP-C), the small hydrophobic surfactant proteins
           that maintain surface tension in alveoli.
          Length = 150

 Score = 26.8 bits (59), Expect = 5.5
 Identities = 15/49 (30%), Positives = 28/49 (57%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           +G +  E+E +E  V   KE R+ EK+KK  E  E + +++ +  E++ 
Sbjct: 45  EGYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKEL 93


>gnl|CDD|218391 pfam05029, TIMELESS_C, Timeless protein C terminal region.  The
           timeless (tim) gene is essential for circadian function
           in Drosophila. Putative homologues of Drosophila tim
           have been identified in both mice and humans (mTim and
           hTIM, respectively). Mammalian TIM is not the true
           orthologue of Drosophila TIM, but is the likely
           orthologue of a fly gene, timeout (also called tim-2).
           mTim has been shown to be essential for embryonic
           development, but does not have substantiated circadian
           function. Some family members contain a SANT domain in
           this region.
          Length = 507

 Score = 27.8 bits (61), Expect = 5.5
 Identities = 14/48 (29%), Positives = 22/48 (45%)

Query: 103 GPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
             +   E  + +S   L++Q +QEK       L+  L + A   EEDE
Sbjct: 270 VSAFQVEGSTLISAENLRQQLKQEKTSWPLLWLQSCLIRAADDREEDE 317


>gnl|CDD|233055 TIGR00617, rpa1, replication factor-a protein 1 (rpa1).  All
           proteins in this family for which functions are known
           are part of a multiprotein complex made up of homologs
           of RPA1, RPA2 and RPA3 that bind ssDNA and function in
           the recognition of DNA damage for nucleotide excision
           repair. This family is based on the phylogenomic analysis
           of JA Eisen (1999, Ph.D. Thesis, Stanford University)
           [DNA metabolism, DNA replication, recombination, and
           repair].
          Length = 608

 Score = 27.4 bits (61), Expect = 5.7
 Identities = 8/13 (61%), Positives = 10/13 (76%)

Query: 188 LRGWFDREGKGFP 200
           L+GW+D EGKG  
Sbjct: 407 LKGWYDNEGKGTM 419


>gnl|CDD|235665 PRK05996, motB, flagellar motor protein MotB; Validated.
          Length = 423

 Score = 27.4 bits (61), Expect = 5.7
 Identities = 12/49 (24%), Positives = 20/49 (40%), Gaps = 1/49 (2%)

Query: 125 QEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAE-EETDLSENPYA 172
           + ++K  +   E+    E  +    +  +   GD     E DL  NPYA
Sbjct: 91  EGEQKPGKSKFEEDQRVEGSSAVTGDDTTRTSGDQTNYSEADLFRNPYA 139


>gnl|CDD|223520 COG0443, DnaK, Molecular chaperone [Posttranslational modification,
           protein turnover, chaperones].
          Length = 579

 Score = 27.7 bits (62), Expect = 5.7
 Identities = 18/89 (20%), Positives = 35/89 (39%), Gaps = 11/89 (12%)

Query: 100 ILQGPSEDEEEESELSVSELKEQRRQEKEKKER-----------EALEKSLEQEAKTEEE 148
            ++  S   +EE E  V + +     +K+ +E             +LEK+L++  K  EE
Sbjct: 476 TIKASSGLSDEEIERMVEDAEANAALDKKFRELVEARNEAESLIYSLEKALKEIVKVSEE 535

Query: 149 DEGISWGMGDDAEEETDLSENPYASTNNE 177
           ++        D EE  +  +    +   E
Sbjct: 536 EKEKIEEAITDLEEALEGEKEEIKAKIEE 564


>gnl|CDD|221775 pfam12794, MscS_TM, Mechanosensitive ion channel inner membrane
           domain 1.  The small mechanosensitive channel, MscS, is
           a part of the turgor-driven solute efflux system that
           protects bacteria from lysis in the event of osmotic
           shock. The MscS protein alone is sufficient to form a
           functional mechanosensitive channel gated directly by
           tension in the lipid bilayer. The MscS proteins are
           heptamers of three transmembrane subunits with seven
           converging M3 domains, and this domain is one of the
           inner membrane domains.
          Length = 339

 Score = 27.2 bits (61), Expect = 5.7
 Identities = 11/29 (37%), Positives = 15/29 (51%)

Query: 122 QRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            +R E   +  E  E+S E  A+T EE E
Sbjct: 265 AKRAEILAQRAEEEEESSEGAAETIEEPE 293


>gnl|CDD|222867 PHA02546, 47, endonuclease subunit; Provisional.
          Length = 340

 Score = 27.3 bits (61), Expect = 5.8
 Identities = 12/36 (33%), Positives = 17/36 (47%), Gaps = 4/36 (11%)

Query: 50  DEKDPARGFYVYDLGSTHGTFLNRCKIKPKMYVRIH 85
           DE DP RGF+V+D  +    F+         + RI 
Sbjct: 210 DENDP-RGFWVFDTETHKLEFIANPTT---WHRRIT 241


>gnl|CDD|171561 PRK12528, PRK12528, RNA polymerase sigma factor; Provisional.
          Length = 161

 Score = 26.7 bits (59), Expect = 5.9
 Identities = 12/26 (46%), Positives = 17/26 (65%)

Query: 123 RRQEKEKKEREALEKSLEQEAKTEEE 148
           RRQ+ E+   EAL +  E+ A +EEE
Sbjct: 71  RRQDLERAYLEALAQLPERVAPSEEE 96


>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1).  This family
           consists of several mammalian dentin matrix protein 1
           (DMP1) sequences. The dentin matrix acidic
           phosphoprotein 1 (DMP1) gene has been mapped to human
           chromosome 4q21. DMP1 is a bone and teeth specific
           protein initially identified from mineralised dentin.
           DMP1 is primarily localised in the nuclear compartment
           of undifferentiated osteoblasts. In the nucleus, DMP1
           acts as a transcriptional component for activation of
           osteoblast-specific genes like osteocalcin. During the
           early phase of osteoblast maturation, Ca(2+) surges into
           the nucleus from the cytoplasm, triggering the
           phosphorylation of DMP1 by a nuclear isoform of casein
           kinase II. This phosphorylated DMP1 is then exported out
           into the extracellular matrix, where it regulates
           nucleation of hydroxyapatite. DMP1 is a unique molecule
           that initiates osteoblast differentiation by
           transcription in the nucleus and orchestrates
           mineralised matrix formation extracellularly, at later
           stages of osteoblast maturation. The DMP1 gene has been
           found to be ectopically expressed in lung cancer
           although the reason for this is unknown.
          Length = 514

 Score = 27.3 bits (60), Expect = 6.3
 Identities = 22/78 (28%), Positives = 40/78 (51%), Gaps = 1/78 (1%)

Query: 107 DEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDL 166
           + +E+SE + S+   Q  Q+   +  +  +    QE  +E ++E +S   GD+ +  T  
Sbjct: 311 ESQEDSEENQSQEDSQEVQDPSSESSQEADLP-SQENSSESQEEVVSESRGDNPDNTTSH 369

Query: 167 SENPYASTNNEELYLDDP 184
           SE+   S ++EE  LD P
Sbjct: 370 SEDQEDSESSEEDSLDTP 387


>gnl|CDD|224308 COG1390, NtpE, Archaeal/vacuolar-type H+-ATPase subunit E [Energy
           production and conversion].
          Length = 194

 Score = 27.0 bits (60), Expect = 6.8
 Identities = 17/45 (37%), Positives = 26/45 (57%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            + EEE+E  + E +E+  + KE+ +REA E   E   K E+E E
Sbjct: 13  REAEEEAEEILEEAREEAEKIKEEAKREAEEAIEEILRKAEKEAE 57


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 27.2 bits (60), Expect = 6.9
 Identities = 16/50 (32%), Positives = 29/50 (58%), Gaps = 4/50 (8%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQ----EAKTEEEDE 150
           ++DE EE E    +L+E +R +++++ERE +E+   +       TEEE  
Sbjct: 108 TDDENEEEEYEAWKLRELKRIKRDREEREEMEREKAEIEKMRNMTEEERR 157


>gnl|CDD|236335 PRK08724, fliD, flagellar capping protein; Validated.
          Length = 673

 Score = 27.1 bits (60), Expect = 6.9
 Identities = 13/41 (31%), Positives = 23/41 (56%)

Query: 110 EESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           E+ EL+  + K+  R + E +ERE LEK  + +A  ++   
Sbjct: 384 EKGELTPEQAKQIARAKLEPEERERLEKIDKAQAALKQAQS 424


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 27.1 bits (60), Expect = 6.9
 Identities = 13/48 (27%), Positives = 24/48 (50%), Gaps = 3/48 (6%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEED 149
                 EE E++    +LK++   E EKK++E  ++  E+  K E + 
Sbjct: 390 DETDASEEAEAKAKEEKLKQE---ENEKKQKEQADEDKEKRQKDERKK 434


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 27.3 bits (60), Expect = 7.5
 Identities = 21/64 (32%), Positives = 27/64 (42%)

Query: 105 SEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEET 164
           S D+EEE E   S   E    E E       E SLE  +  E ED+       ++A EE 
Sbjct: 114 SADDEEEEEEDESLEDEMIDDEDEADLFNESESSLEDLSDDETEDDEEKKMEEEEAGEEK 173

Query: 165 DLSE 168
           +  E
Sbjct: 174 ESVE 177


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 27.0 bits (61), Expect = 7.7
 Identities = 20/42 (47%), Positives = 23/42 (54%), Gaps = 1/42 (2%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
           E  EEE E    EL EQ++QE EKKE E  E   EQ  + E 
Sbjct: 106 EKREEELEKKEKEL-EQKQQELEKKEEELEELIEEQLQELER 146


>gnl|CDD|146016 pfam03179, V-ATPase_G, Vacuolar (H+)-ATPase G subunit.  This family
           represents the eukaryotic vacuolar (H+)-ATPase
           (V-ATPase) G subunit. V-ATPases generate an acidic
           environment in several intracellular compartments.
           Correspondingly, they are found as membrane-attached
           proteins in several organelles. They are also found in
           the plasma membranes of some specialised cells.
           V-ATPases consist of peripheral (V1) and membrane
           integral (V0) heteromultimeric complexes. The G subunit
           is part of the V1 subunit, but is also thought to be
           strongly attached to the V0 complex. It may be involved
           in the coupling of ATP degradation to H+ translocation.
          Length = 105

 Score = 26.0 bits (58), Expect = 7.8
 Identities = 14/44 (31%), Positives = 24/44 (54%), Gaps = 6/44 (13%)

Query: 109 EEESELSVSELKEQRRQEKEKKE------REALEKSLEQEAKTE 146
           +EE+E  + E + QR  E ++ E      R  LEK +E+E + +
Sbjct: 35  KEEAEKEIEEYRAQREAEFKEFEAEHSGSRGELEKKIEKETEEK 78


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 27.2 bits (61), Expect = 7.9
 Identities = 13/43 (30%), Positives = 20/43 (46%)

Query: 108 EEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
            E E EL     + QR + +  +  E L++ +E   K EE  E
Sbjct: 65  AELERELKERRNELQRLERRLLQREETLDRKMESLDKKEENLE 107


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 26.6 bits (59), Expect = 7.9
 Identities = 16/48 (33%), Positives = 26/48 (54%)

Query: 103 GPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           GP  +   ESE S  E   +  +E E +E E  EK  ++E K E++++
Sbjct: 123 GPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEK 170


>gnl|CDD|236912 PRK11448, hsdR, type I restriction enzyme EcoKI subunit R;
           Provisional.
          Length = 1123

 Score = 27.2 bits (61), Expect = 8.0
 Identities = 17/79 (21%), Positives = 32/79 (40%), Gaps = 8/79 (10%)

Query: 79  KMYVRIHVGHMLSFGSSTRF----FILQGPSEDEEEESELSVSELKEQRRQ-EKEKKERE 133
           K+  R+ V    ++G    F    F+   P ED E        E+   ++Q E + +E+ 
Sbjct: 110 KLAFRLAVWFHRTYGKDWDFKPGPFV---PPEDPENLLHALQQEVLTLKQQLELQAREKA 166

Query: 134 ALEKSLEQEAKTEEEDEGI 152
             +   E + +     EG+
Sbjct: 167 QSQALAEAQQQELVALEGL 185


>gnl|CDD|234029 TIGR02830, spore_III_AG, stage III sporulation protein AG.  A
           comparative genome analysis of all sequenced genomes
           shows a number of proteins conserved strictly among the
           endospore-forming subset of the Firmicutes. This
           protein, a member of this panel, is found in a spore
           formation operon and is designated stage III sporulation
           protein AG [Cellular processes, Sporulation and
           germination].
          Length = 186

 Score = 26.5 bits (59), Expect = 8.3
 Identities = 12/53 (22%), Positives = 22/53 (41%), Gaps = 4/53 (7%)

Query: 91  SFGSSTRFFILQGPSEDEEEESEL--SVSELKEQRRQEKEKKEREALEKSLEQ 141
            F SS        P+ ++ E   +   V +  E    EK+ +    L++ LE+
Sbjct: 22  FFSSSEDIEESDTPNNEKTEPEFVQGEVQKEDEISDYEKQYENE--LKEILEK 72


>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
          Length = 413

 Score = 26.9 bits (59), Expect = 8.5
 Identities = 15/86 (17%), Positives = 28/86 (32%), Gaps = 9/86 (10%)

Query: 102 QGPSEDEEEESELSVSELKEQRRQEKEKK---------EREALEKSLEQEAKTEEEDEGI 152
           +  +E    ++E      +E R  EKE++                +       EE   G+
Sbjct: 59  RAVAEQGHRQTESDTETAEESRHGEKEERGQGGPSGSGSESVGSPTPSPSGSAEELASGL 118

Query: 153 SWGMGDDAEEETDLSENPYASTNNEE 178
           S      +  E+  S +P  S  +  
Sbjct: 119 SPENTSGSSPESPASHSPPPSPPSHP 144


>gnl|CDD|225281 COG2425, COG2425, Uncharacterized protein containing a von
           Willebrand factor type A (vWA) domain [General function
           prediction only].
          Length = 437

 Score = 27.0 bits (60), Expect = 8.6
 Identities = 12/43 (27%), Positives = 21/43 (48%)

Query: 110 EESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGI 152
           E  E  + +L+ +  ++  + ERE L    ++E     E EGI
Sbjct: 110 ERWEELLQDLQREGSEDFLEGEREGLLSEKQEEISLSGEMEGI 152


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 26.4 bits (58), Expect = 8.6
 Identities = 15/51 (29%), Positives = 24/51 (47%), Gaps = 3/51 (5%)

Query: 100 ILQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           +     ED+EEE E    E++E    E  + E E +E   E+E   E+  +
Sbjct: 43  VKDEKQEDDEEEEEEDEEEIEEP---EDIEDEEEIVEDEEEEEEDEEDNVD 90


>gnl|CDD|135898 PRK06397, PRK06397, V-type ATP synthase subunit H; Validated.
          Length = 111

 Score = 26.0 bits (57), Expect = 8.7
 Identities = 14/41 (34%), Positives = 22/41 (53%)

Query: 110 EESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDE 150
           +E E S+ +     + E+E + +EA  K  E+  KTEEE  
Sbjct: 16  KEKEESIDKEIANIKNEQENEIKEAKSKYEEKAKKTEEESL 56


>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355).  This
           family of proteins is found in bacteria and viruses.
           Proteins in this family are typically between 180 and
           214 amino acids in length.
          Length = 125

 Score = 26.1 bits (58), Expect = 8.8
 Identities = 10/42 (23%), Positives = 23/42 (54%), Gaps = 1/42 (2%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
           E+++ E+E  ++++  + + E E ++ E   + LE E    E
Sbjct: 29  EEKKSEAE-KLAKMSAEEKAEYELEKLEKELEELEAELARRE 69


>gnl|CDD|234055 TIGR02907, spore_VI_D, stage VI sporulation protein D.  SpoVID, the
           stage VI sporulation protein D, is restricted to
           endospore-forming members of the bacteria, all of which
           are found among the Firmicutes. It is widely distributed
           but not quite universal in this group. Between
           well-conserved N-terminal and C-terminal domains is a
           poorly conserved, low-complexity region of variable
           length, rich enough in glutamic acid to cause spurious
           BLAST search results unless a filter is used. The seed
           alignment for this model was trimmed, in effect, by
           choosing member sequences in which these regions are
           relatively short. SpoVID is involved in spore coat
           assembly by the mother cell compartment late in the
           process of sporulation [Cellular processes, Sporulation
           and germination].
          Length = 338

 Score = 26.8 bits (59), Expect = 8.9
 Identities = 18/82 (21%), Positives = 34/82 (41%), Gaps = 5/82 (6%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEE 163
           P+ ++EEE E   +E +   ++E   +E    E  +E EA  + E         DD +E 
Sbjct: 146 PAREDEEEEESFSAEFEHPAQEETAGEEERTDEPKVEHEAHEQHEQPAD-----DDPDEW 200

Query: 164 TDLSENPYASTNNEELYLDDPK 185
              +  P+   +  E   ++  
Sbjct: 201 KISASEPFQLESEVEASPEEEN 222


>gnl|CDD|234309 TIGR03683, A-tRNA_syn_arch, alanyl-tRNA synthetase.  This family of
           alanyl-tRNA synthetases is limited to the archaea, and
           is a subset of those sequences identified by the model
           pfam07973 covering the second additional domain (SAD) of
           alanyl and threonyl tRNA synthetases.
          Length = 902

 Score = 26.9 bits (60), Expect = 8.9
 Identities = 15/48 (31%), Positives = 24/48 (50%)

Query: 100 ILQGPSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEE 147
           IL+ P E   E  +    E KEQR++ +  K++ A  K  E  ++ E 
Sbjct: 755 ILKVPPEQLPETVKRFFEEWKEQRKEIERLKKKLAELKIYELISEAER 802


>gnl|CDD|221389 pfam12037, DUF3523, Domain of unknown function (DUF3523).  This
           presumed domain is functionally uncharacterized. This
           domain is found in eukaryotes. This domain is typically
           between 257 and 277 amino acids in length. This domain is
           found associated with pfam00004. This domain has a
           conserved LER sequence motif.
          Length = 276

 Score = 26.7 bits (59), Expect = 9.1
 Identities = 11/43 (25%), Positives = 25/43 (58%)

Query: 106 EDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQEAKTEEE 148
            +E  ++    ++ ++QR Q +++  R+  +K LEQ+ +  EE
Sbjct: 91  AEERRKTLQEQTQQEQQRAQYQDELARKRYQKELEQQRRQNEE 133


>gnl|CDD|216292 pfam01086, Clathrin_lg_ch, Clathrin light chain. 
          Length = 225

 Score = 26.6 bits (59), Expect = 9.1
 Identities = 10/39 (25%), Positives = 20/39 (51%), Gaps = 3/39 (7%)

Query: 104 PSEDEEEESELSVSELKEQRRQEKEKKEREALEKSLEQE 142
               E EE E S+ + +E  R++   +ER+   +  ++E
Sbjct: 103 ADRVEGEEPE-SIRKWRE--RRDLRIEERDEASEKKKEE 138


>gnl|CDD|132187 TIGR03143, AhpF_homolog, putative alkyl hydroperoxide reductase F
           subunit.  This family of thioredoxin reductase homologs
           is found adjacent to alkylhydroperoxide reductase C
           subunit predominantly in cases where there is only one C
           subunit in the genome and that genome is lacking the F
           subunit partner (also a thioredoxin reductase homolog)
           that is usually found (TIGR03140).
          Length = 555

 Score = 27.1 bits (60), Expect = 9.3
 Identities = 22/82 (26%), Positives = 30/82 (36%), Gaps = 13/82 (15%)

Query: 124 RQEKEKKEREALEKSLEQEAKTEEEDEGISWGMGDDAEEETDLSENPYASTNNEELYLDD 183
           R  KE KE+  + +  E+E   E             A E +     P A+T    L  D 
Sbjct: 306 RYVKELKEKLGIAEEYEEEEAKE-------------ASEASAAETTPAATTKKGSLLDDS 352

Query: 184 PKKTLRGWFDREGKGFPLFTFL 205
            ++ L G F R      L  FL
Sbjct: 353 LRQQLVGIFGRLENPVTLLLFL 374


>gnl|CDD|239316 cd03018, PRX_AhpE_like, Peroxiredoxin (PRX) family, AhpE-like
           subfamily; composed of proteins similar to Mycobacterium
           tuberculosis AhpE. AhpE is described as a 1-cys PRX
           because of the absence of a resolving cysteine. The
           structure and sequence of AhpE, however, show greater
           similarity to 2-cys PRXs than 1-cys PRXs. PRXs are
           thiol-specific antioxidant (TSA) proteins that confer a
           protective role in cells through their peroxidase
           activity in which hydrogen peroxide, peroxynitrite, and
           organic hydroperoxides are reduced and detoxified using
           reducing equivalents derived from thioredoxin,
           glutathione, trypanothione, or AhpF. The first step of
           catalysis is the nucleophilic attack by the peroxidatic
           cysteine on the peroxide leading to the formation of a
           cysteine sulfenic acid intermediate. The absence of a
           resolving cysteine suggests that functional AhpE is
           regenerated by an external reductant. The solution
           behavior and crystal structure of AhpE show that it
           forms dimers and octamers.
          Length = 149

 Score = 26.1 bits (58), Expect = 9.5
 Identities = 8/19 (42%), Positives = 10/19 (52%)

Query: 183 DPKKTLRGWFDREGKGFPL 201
           D   +LR W +  G  FPL
Sbjct: 71  DSPFSLRAWAEENGLTFPL 89


>gnl|CDD|216833 pfam01991, vATP-synt_E, ATP synthase (E/31 kDa) subunit.  This
           family includes the vacuolar ATP synthase E subunit, as
           well as the archaebacterial ATP synthase E subunit.
          Length = 195

 Score = 26.5 bits (59), Expect = 10.0
 Identities = 17/46 (36%), Positives = 21/46 (45%), Gaps = 3/46 (6%)

Query: 99  FILQGPSEDEEE---ESELSVSELKEQRRQEKEKKEREALEKSLEQ 141
           FI Q   E  EE   E+E      K +  +E EKK  E  EK  +Q
Sbjct: 1   FIRQEAEEKAEEIRAEAEEEFEIEKAEAVEEAEKKIEEIYEKKEKQ 46


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.313    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0677    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 10,585,871
Number of extensions: 991289
Number of successful extensions: 3724
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3106
Number of HSP's successfully gapped: 538
Length of query: 207
Length of database: 10,937,602
Length adjustment: 92
Effective length of query: 115
Effective length of database: 6,857,034
Effective search space: 788558910
Effective search space used: 788558910
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 57 (25.8 bits)