RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy7022
         (124 letters)



>gnl|CDD|197732 smart00451, ZnF_U1, U1-like zinc finger.  Family of C2H2-type
          zinc fingers, present in matrin, U1 small nuclear
          ribonucleoprotein C and other RNA-binding proteins.
          Length = 35

 Score = 43.8 bits (104), Expect = 2e-07
 Identities = 12/30 (40%), Positives = 20/30 (66%)

Query: 3  GYYCNVCDCVVKDSINFLDHINGKKHQRNL 32
          G+YC +C+    D I+   H+ GKKH++N+
Sbjct: 3  GFYCKLCNVTFTDEISVEAHLKGKKHKKNV 32


>gnl|CDD|205121 pfam12874, zf-met, Zinc-finger of C2H2 type.  This is a
          zinc-finger domain with the CxxCx(12)Hx(6)H motif,
          found in multiple copies in a wide range of proteins
          from plants to metazoans. Some member proteins,
          particularly those from plants, are annotated as being
          RNA-binding.
          Length = 25

 Score = 37.9 bits (89), Expect = 4e-05
 Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 4  YYCNVCDCVVKDSINFLDHINGKKH 28
          +YC +C+           H+ GKKH
Sbjct: 1  FYCELCNVTFTSESQLKSHLRGKKH 25


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 39.3 bits (91), Expect = 2e-04
 Identities = 21/57 (36%), Positives = 30/57 (52%), Gaps = 2/57 (3%)

Query: 54  KKKYE-QKKKDYDIEQRMRELKE-EEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
           KK+ E  +K   + EQ+ RE +E E+EK KE  RER +E +R         E   S+
Sbjct: 579 KKREEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSHESRMSE 635



 Score = 33.5 bits (76), Expect = 0.022
 Identities = 14/52 (26%), Positives = 31/52 (59%), Gaps = 5/52 (9%)

Query: 60  KKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAA 111
           KK++  +E+  RE +++  + +E  +E+ KE++R     E E E +++  A+
Sbjct: 579 KKREEAVEKAKREAEQKAREEREREKEKEKERER-----EREREAERAAKAS 625


>gnl|CDD|204841 pfam12171, zf-C2H2_jaz, Zinc-finger double-stranded RNA-binding. 
          This domain family is found in archaea and eukaryotes,
          and is approximately 30 amino acids in length. The
          mammalian members of this group occur multiple times
          along the protein, joined by flexible linkers, and are
          referred to as JAZ - dsRNA-binding ZF protein -
          zinc-fingers. The JAZ proteins are expressed in all
          tissues tested and localise in the nucleus,
          particularly the nucleolus. JAZ preferentially binds to
          double-stranded (ds) RNA or RNA/DNA hybrids rather than
          DNA. In addition to binding double-stranded RNA, these
          zinc-fingers are required for nucleolar localisation.
          Length = 27

 Score = 34.9 bits (81), Expect = 5e-04
 Identities = 9/26 (34%), Positives = 13/26 (50%)

Query: 4  YYCNVCDCVVKDSINFLDHINGKKHQ 29
          +YC  CD   K      +H+  KKH+
Sbjct: 2  FYCVACDKYFKSENALENHLKSKKHK 27


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 33.6 bits (77), Expect = 0.015
 Identities = 16/61 (26%), Positives = 29/61 (47%), Gaps = 2/61 (3%)

Query: 53  NKKKYEQKKKDYDI--EQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLA 110
             K YE+ K + D+  ++R   LKE  +KL++Y ++  + K       E E +  +  L 
Sbjct: 349 LTKLYEEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDASEEAEAKAKEEKLK 408

Query: 111 A 111
            
Sbjct: 409 Q 409


>gnl|CDD|203945 pfam08439, Peptidase_M3_N, Oligopeptidase F.  This domain is found
           to the N-terminus of the pfam01432 domain in bacterial
           and archaeal proteins including Oligoendopeptidase F. An
           example of this protein is Lactococcus lactis PepF.
          Length = 70

 Score = 31.0 bits (71), Expect = 0.028
 Identities = 14/38 (36%), Positives = 23/38 (60%), Gaps = 3/38 (7%)

Query: 67  EQRMRELKEEEEKLKEYR---RERRKEKKRKLDDGEEE 101
           E+++  L EE+ +LK YR    E R++K   L + EE+
Sbjct: 7   EEKLEALLEEDPELKPYRFYLEEIRRQKPHTLSEEEEK 44


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 33.0 bits (75), Expect = 0.030
 Identities = 27/81 (33%), Positives = 38/81 (46%), Gaps = 8/81 (9%)

Query: 37  RVERSSLDQVKKRFDMNKKKY----EQKKKDYDIEQRM-RELKEEEEKLKEYRRERRKEK 91
           R  +S L Q+KK  DM + K       K+KD    Q M + LK E +      ++  +EK
Sbjct: 444 RSLKSDLGQLKKENDMLQTKLNSMVSAKQKDKQSMQSMEKRLKSEADSRVNAEKQLAEEK 503

Query: 92  KRKLDDGEEEEEGDQSDLAAI 112
           KRK    EEEE   ++   A 
Sbjct: 504 KRK---KEEEETAARAAAQAA 521


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 32.4 bits (73), Expect = 0.042
 Identities = 20/70 (28%), Positives = 38/70 (54%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            E    D+ KK+ +  KK  E KKK  + +++  E K+  E  K+    ++ E+ +K D+ 
Sbjct: 1468 EAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEA 1527

Query: 99   EEEEEGDQSD 108
            ++ EE  ++D
Sbjct: 1528 KKAEEAKKAD 1537



 Score = 29.7 bits (66), Expect = 0.45
 Identities = 20/70 (28%), Positives = 40/70 (57%), Gaps = 1/70 (1%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            E    D+ KK+ +  KKK ++ KK  + +++  E K+ EE  K+    ++ E+ +K D+ 
Sbjct: 1481 EAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEA-KKADEAKKAEEAKKADEA 1539

Query: 99   EEEEEGDQSD 108
            ++ EE  ++D
Sbjct: 1540 KKAEEKKKAD 1549



 Score = 29.0 bits (64), Expect = 0.68
 Identities = 18/70 (25%), Positives = 35/70 (50%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            E+   D+ KK+ +  KK  E KKK  + +++    K++ E+ K+     + E +   D+ 
Sbjct: 1300 EKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEA 1359

Query: 99   EEEEEGDQSD 108
            E  EE  ++ 
Sbjct: 1360 EAAEEKAEAA 1369



 Score = 27.4 bits (60), Expect = 2.2
 Identities = 16/65 (24%), Positives = 29/65 (44%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            E    D+ KK+ +  KKK +  KK  +  ++  E  + E +      E  +EK    +  
Sbjct: 1313 EAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKK 1372

Query: 99   EEEEE 103
            +EE +
Sbjct: 1373 KEEAK 1377



 Score = 27.0 bits (59), Expect = 3.1
 Identities = 18/68 (26%), Positives = 39/68 (57%), Gaps = 3/68 (4%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLK---EYRRERRKEKKRKL 95
             +   +++KK  +  KK  + KKK+ + +++  ELK+ EE+ K       ++ +E K+K 
Sbjct: 1618 AKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKA 1677

Query: 96   DDGEEEEE 103
            ++ ++ EE
Sbjct: 1678 EEAKKAEE 1685



 Score = 27.0 bits (59), Expect = 3.6
 Identities = 19/66 (28%), Positives = 40/66 (60%), Gaps = 1/66 (1%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEE-EEKLKEYRRERRKEKKRKLDD 97
            E+   D+ KK+ + +KKK ++ KK    +++  E K++ EEK K    +++ E+ +K D+
Sbjct: 1389 EKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADE 1448

Query: 98   GEEEEE 103
             +++ E
Sbjct: 1449 AKKKAE 1454



 Score = 26.6 bits (58), Expect = 3.9
 Identities = 22/69 (31%), Positives = 42/69 (60%), Gaps = 4/69 (5%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEE----KLKEYRRERRKEKKRK 94
            E+ + + +KK  +  KK  E KKK+ + +++  ELK+ EE    K +E ++E  ++KK+ 
Sbjct: 1687 EKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKA 1746

Query: 95   LDDGEEEEE 103
             +  ++EEE
Sbjct: 1747 EEAKKDEEE 1755



 Score = 26.6 bits (58), Expect = 4.9
 Identities = 19/62 (30%), Positives = 40/62 (64%), Gaps = 2/62 (3%)

Query: 44   DQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK--EKKRKLDDGEEE 101
            ++ KK+ D  KKK E+KKK  + +++  E K++ ++LK+    ++K  E K+K ++ ++ 
Sbjct: 1374 EEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKA 1433

Query: 102  EE 103
            +E
Sbjct: 1434 DE 1435



 Score = 26.3 bits (57), Expect = 5.9
 Identities = 26/78 (33%), Positives = 45/78 (57%), Gaps = 3/78 (3%)

Query: 26   KKHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRR 85
            KK    L  +    +   ++ KK  + +KKK E+ KKD + ++++  LK+EEEK  E   
Sbjct: 1715 KKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAE--- 1771

Query: 86   ERRKEKKRKLDDGEEEEE 103
            E RKEK+  +++  +EE+
Sbjct: 1772 EIRKEKEAVIEEELDEED 1789



 Score = 25.9 bits (56), Expect = 9.2
 Identities = 18/65 (27%), Positives = 35/65 (53%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            E+   D+ KK+ +  KK  E KKK  + ++     K+ EE  K    +++ E+ +K D+ 
Sbjct: 1429 EKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEA 1488

Query: 99   EEEEE 103
            +++ E
Sbjct: 1489 KKKAE 1493



 Score = 25.9 bits (56), Expect = 9.4
 Identities = 19/66 (28%), Positives = 36/66 (54%), Gaps = 1/66 (1%)

Query: 39   ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK-EKKRKLDD 97
             +   D  KK+ +  KK  E KKK  + +++  ELK+     K+    ++K E+K+K D+
Sbjct: 1376 AKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADE 1435

Query: 98   GEEEEE 103
             +++ E
Sbjct: 1436 AKKKAE 1441


>gnl|CDD|179310 PRK01622, PRK01622, OxaA-like protein precursor; Validated.
          Length = 256

 Score = 32.0 bits (73), Expect = 0.052
 Identities = 13/54 (24%), Positives = 27/54 (50%), Gaps = 4/54 (7%)

Query: 23  INGKKHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEE 76
           ++  K QR +   M V +  LD+++ +  + K   +QK    + ++ M EL + 
Sbjct: 80  VSQYKSQRGMQEKMAVMKPELDKIQAKLKVTKDLEKQK----EYQKEMMELYKS 129


>gnl|CDD|189014 cd09607, M3B_PepF_2, Peptidase family M3B Oligopeptidase F (PepF). 
           Peptidase family M3B Oligopeptidase F (PepF;
           Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and
           includes oligoendopeptidase F from Lactococcus lactis.
           This enzyme hydrolyzes peptides containing between 7 and
           17 amino acids with fairly broad specificity. The PepF
           gene is duplicated in L. lactis on the plasmid that
           bears it, while a shortened second copy is found in
           Bacillus subtilis. Most bacterial PepFs are cytoplasmic
           endopeptidases; however, the PepF Bacillus
           amyloliquefaciens oligopeptidase is a secreted protein
           and may facilitate the process of sporulation.
           Specifically, the yjbG gene encoding the homolog of the
           PepF1 and PepF2 oligoendopeptidases of Lactococcus
           lactis has been identified in Bacillus subtilis as an
           inhibitor of sporulation initiation when over expressed
           from a multicopy plasmid.
          Length = 581

 Score = 31.7 bits (73), Expect = 0.062
 Identities = 24/84 (28%), Positives = 35/84 (41%), Gaps = 19/84 (22%)

Query: 40  RSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYR---RERRKEKKRKLD 96
            +SL+Q+    D  KK     ++D D       L  + E LKE+     ERR++ K  L 
Sbjct: 92  SASLEQLLTLLD--KKLAALSEEDLD------ALLADPE-LKEHAFFLEERRRQAKHLLS 142

Query: 97  DGEEEEEGDQSDLAAIMGFSGFGG 120
           + EEE       L A +   G   
Sbjct: 143 EEEEE-------LIAKLSVDGLHA 159


>gnl|CDD|189015 cd09608, M3B_PepF_3, Peptidase family M3B Oligopeptidase F (PepF). 
           Peptidase family M3B Oligopeptidase F (PepF;
           Pz-peptidase B; EC 3.4.24.-) is mostly bacterial and
           includes oligoendopeptidase F from Lactococcus lactis.
           This enzyme hydrolyzes peptides containing between 7 and
           17 amino acids with fairly broad specificity. The PepF
           gene is duplicated in L. lactis on the plasmid that
           bears it, while a shortened second copy is found in
           Bacillus subtilis. Most bacterial PepFs are cytoplasmic
           endopeptidases; however, the PepF Bacillus
           amyloliquefaciens oligopeptidase is a secreted protein
           and may facilitate the process of sporulation.
           Specifically, the yjbG gene encoding the homolog of the
           PepF1 and PepF2 oligoendopeptidases of Lactococcus
           lactis has been identified in Bacillus subtilis as an
           inhibitor of sporulation initiation when over expressed
           from a multicopy plasmid.
          Length = 538

 Score = 31.7 bits (73), Expect = 0.066
 Identities = 13/38 (34%), Positives = 21/38 (55%), Gaps = 3/38 (7%)

Query: 67  EQRMRELKEEEEKLKEYRR---ERRKEKKRKLDDGEEE 101
           E+++    +EE +LK+YR    E  ++K   L   EEE
Sbjct: 65  EEKLESFLKEEPELKDYRHYLEEILRQKPHTLSAEEEE 102


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 31.4 bits (72), Expect = 0.084
 Identities = 13/52 (25%), Positives = 34/52 (65%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGD 105
           KK  ++ KK  +  ++ RE +++E+K K +  ++++E++ +  + +EEE+ +
Sbjct: 406 KKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEE 457



 Score = 28.3 bits (64), Expect = 1.2
 Identities = 18/64 (28%), Positives = 37/64 (57%)

Query: 46  VKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGD 105
           V+K     +++ ++KKK     ++  E +EEE++ KE  +E  +E+  +  + EEE++  
Sbjct: 416 VEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEKKKK 475

Query: 106 QSDL 109
           Q+ L
Sbjct: 476 QATL 479


>gnl|CDD|135173 PRK04654, PRK04654, sec-independent translocase; Provisional.
          Length = 214

 Score = 31.3 bits (70), Expect = 0.089
 Identities = 17/74 (22%), Positives = 35/74 (47%), Gaps = 4/74 (5%)

Query: 27  KHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRE----LKEEEEKLKE 82
           K  R  G+ +R  R   D VK+  +   +  E K+   D++  +RE    L+  ++++++
Sbjct: 27  KAARFAGLWVRRARMQWDSVKQELERELEAEELKRSLQDVQASLREAEDQLRNTQQQVEQ 86

Query: 83  YRRERRKEKKRKLD 96
             R    +  R +D
Sbjct: 87  GARALHDDVSRDID 100


>gnl|CDD|224340 COG1422, COG1422, Predicted membrane protein [Function unknown].
          Length = 201

 Score = 30.8 bits (70), Expect = 0.11
 Identities = 11/45 (24%), Positives = 25/45 (55%), Gaps = 4/45 (8%)

Query: 69  RMRELKEE----EEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDL 109
           +M+EL++     +++ +E +     +K +KL + + E   DQ +L
Sbjct: 73  KMKELQKMMKEFQKEFREAQESGDMKKLKKLQEKQMEMMDDQREL 117


>gnl|CDD|113290 pfam04514, BTV_NS2, Bluetongue virus non-structural protein NS2.
           This family includes NS2 proteins from other members of
           the Orbivirus genus. NS2 is a non-specific
           single-stranded RNA-binding protein that forms large
           homomultimers and accumulates in viral inclusion bodies
           of infected cells. Three RNA binding regions have been
           identified in Bluetongue virus serotype 17 at residues
           2-11, 153-166 and 274-286. NS2 multimers also possess
           nucleotidyl phosphatase activity. The precise function
           of NS2 is not known, but it may be involved in the
           transport and condensation of viral mRNAs.
          Length = 363

 Score = 31.0 bits (70), Expect = 0.12
 Identities = 17/74 (22%), Positives = 33/74 (44%), Gaps = 8/74 (10%)

Query: 35  SMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
            M  E S  DQ ++R    +++          E         EE+ ++ R+E+  E+  +
Sbjct: 202 DMTPETSKQDQKEERRAAVERRLA--------ELVEMINWNLEERRRDLRKEQELEENVE 253

Query: 95  LDDGEEEEEGDQSD 108
            D  +E+E G+ S+
Sbjct: 254 RDSDDEDEHGEDSE 267


>gnl|CDD|217933 pfam04156, IncA, IncA protein.  Chlamydia trachomatis is an
           obligate intracellular bacterium that develops within a
           parasitophorous vacuole termed an inclusion. The
           inclusion is non-fusogenic with lysosomes but intercepts
           lipids from a host cell exocytic pathway. Initiation of
           chlamydial development is concurrent with modification
           of the inclusion membrane by a set of C.
           trachomatis-encoded proteins collectively designated
           Incs. One of these Incs, IncA, is functionally
           associated with the homotypic fusion of inclusions. This
           family probably includes members of the wider Inc family
           rather than just IncA.
          Length = 186

 Score = 30.1 bits (68), Expect = 0.18
 Identities = 18/78 (23%), Positives = 41/78 (52%), Gaps = 8/78 (10%)

Query: 43  LDQVKKRFDMNKKKYEQKKKDY--------DIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
           L+ +++R    + + E  K+D          +E+R+  L+E  ++L +  RE R++ + +
Sbjct: 95  LEDLEERIAELESELEDLKEDLQLLRELLKSLEERLESLEESIKELAKELRELRQDLREE 154

Query: 95  LDDGEEEEEGDQSDLAAI 112
           +++  EE E  Q +L  +
Sbjct: 155 VEELREELERLQENLQRL 172


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 30.6 bits (70), Expect = 0.20
 Identities = 17/80 (21%), Positives = 38/80 (47%), Gaps = 10/80 (12%)

Query: 40  RSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRER--------RKEK 91
              L+Q  +  +   K+ E+ K++   E++  +L+EEE+KL E   +         +KE 
Sbjct: 529 ERELEQKAEEAEALLKEAEKLKEEL--EEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEA 586

Query: 92  KRKLDDGEEEEEGDQSDLAA 111
              + +  + ++G  + + A
Sbjct: 587 DEIIKELRQLQKGGYASVKA 606



 Score = 27.5 bits (62), Expect = 1.9
 Identities = 13/54 (24%), Positives = 33/54 (61%), Gaps = 2/54 (3%)

Query: 41  SSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
           +SL+++++  +   ++ E   K+   E+   EL+E++EKL+E   +  +E +++
Sbjct: 523 ASLEELERELEQKAEEAEALLKEA--EKLKEELEEKKEKLQEEEDKLLEEAEKE 574


>gnl|CDD|185246 PRK15348, PRK15348, type III secretion system lipoprotein SsaJ;
           Provisional.
          Length = 249

 Score = 30.3 bits (68), Expect = 0.21
 Identities = 20/71 (28%), Positives = 36/71 (50%), Gaps = 6/71 (8%)

Query: 22  HINGKKHQRNLGMSMRVERSSLDQVKKRFDMN---KKKYEQKKKDYDIEQRMRELKEEEE 78
           HI+ +K Q   G+++RVE+S      +   +N    +++    K +   Q +   +EE++
Sbjct: 43  HIDAEKKQEEDGVTLRVEQSQFINAVELLRLNGYPHRQFTTADKMFPANQLVVSPQEEQQ 102

Query: 79  K---LKEYRRE 86
           K   LKE R E
Sbjct: 103 KINFLKEQRIE 113


>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
           Provisional.
          Length = 695

 Score = 30.3 bits (69), Expect = 0.25
 Identities = 13/60 (21%), Positives = 28/60 (46%), Gaps = 10/60 (16%)

Query: 37  RVERS---SLDQVKKRFDMNKKKYEQKKKDYDIEQRM-RELKEEEEKLKEYRRERRKEKK 92
           R  ++   +++Q KK+ +  K ++E ++       R+ RE    E + K+    R  + K
Sbjct: 432 RQAKAEIRAIEQEKKKAEEAKARFEARQ------ARLEREKAAREARHKKAAEARAAKDK 485


>gnl|CDD|216807 pfam01956, DUF106, Integral membrane protein DUF106.  This
          archaebacterial protein family has no known function.
          Members are predicted to be integral membrane proteins.
          Length = 168

 Score = 29.5 bits (67), Expect = 0.26
 Identities = 13/40 (32%), Positives = 27/40 (67%), Gaps = 2/40 (5%)

Query: 51 DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKE 90
          D   +KY++++K+   ++R REL++  +KL   + E+R+E
Sbjct: 39 DRKMEKYQKREKEI--QKRARELRKNGDKLSPKKFEKRQE 76


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 30.1 bits (69), Expect = 0.28
 Identities = 14/60 (23%), Positives = 32/60 (53%), Gaps = 3/60 (5%)

Query: 54  KKKYEQKKKDY--DIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAA 111
           K++  + + ++  ++ +R  EL++ E++L + + E    K   L+  EEE E  + +L  
Sbjct: 63  KEEIHKLRNEFEKELRERRNELQKLEKRLLQ-KEENLDRKLELLEKREEELEKKEKELEQ 121



 Score = 29.4 bits (67), Expect = 0.39
 Identities = 14/48 (29%), Positives = 36/48 (75%), Gaps = 1/48 (2%)

Query: 44  DQVKKRFDMNKKKYEQ-KKKDYDIEQRMRELKEEEEKLKEYRRERRKE 90
           + + ++ ++ +K+ E+ +KK+ ++EQ+ +EL+++EE+L+E   E+ +E
Sbjct: 96  ENLDRKLELLEKREEELEKKEKELEQKQQELEKKEEELEELIEEQLQE 143



 Score = 27.4 bits (62), Expect = 2.1
 Identities = 18/68 (26%), Positives = 43/68 (63%), Gaps = 5/68 (7%)

Query: 37  RVERSSLDQVKKRFDMNKKKYEQ-KKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKL 95
           R  R+ L +++KR     +K E   +K   +E+R  EL+++E++L++ +++  ++K+ +L
Sbjct: 78  RERRNELQKLEKR---LLQKEENLDRKLELLEKREEELEKKEKELEQ-KQQELEKKEEEL 133

Query: 96  DDGEEEEE 103
           ++  EE+ 
Sbjct: 134 EELIEEQL 141


>gnl|CDD|227000 COG4653, COG4653, Predicted phage phi-C31 gp36 major capsid-like
           protein [General function prediction only].
          Length = 422

 Score = 29.8 bits (67), Expect = 0.29
 Identities = 12/53 (22%), Positives = 20/53 (37%), Gaps = 3/53 (5%)

Query: 59  QKKKDYDIEQRMRELKEEEEKLKEYRRER---RKEKKRKLDDGEEEEEGDQSD 108
                    QR + L      ++  +R R    +E+ R+L    E   GD+ D
Sbjct: 8   AGDLTGAAAQRFQALTRHATAIRAEQRRRGEEAEEENRRLLADIERVGGDKLD 60


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 30.0 bits (68), Expect = 0.33
 Identities = 24/78 (30%), Positives = 33/78 (42%), Gaps = 23/78 (29%)

Query: 53  NKKKYEQKKKDYDIEQRMREL-------------------KEEEEKLK--EYRRERRKEK 91
                E+K  +YD  QR+REL                   KEE E+LK  E  R RR   
Sbjct: 227 PPMTPEEKDDEYD--QRVRELTFDRRAQPTDRTKTEEELAKEEAERLKKLEAERLRRMRG 284

Query: 92  KRKLDDGEEEEEGDQSDL 109
           + + D+ EE+ +    DL
Sbjct: 285 EEEDDEEEEDSKESADDL 302


>gnl|CDD|114912 pfam06220, zf-U1, U1 zinc finger.  This family consists of
          several U1 small nuclear ribonucleoprotein C (U1-C)
          proteins. The U1 small nuclear ribonucleoprotein (U1
          snRNP) binds to the pre-mRNA 5' splice site (ss) at
          early stages of spliceosome assembly. Recruitment of U1
          to a class of weak 5' ss is promoted by binding of the
          protein TIA-1 to uridine-rich sequences immediately
          downstream from the 5' ss. Binding of TIA-1 in the
          vicinity of a 5' ss helps to stabilise U1 snRNP
          recruitment, at least in part, via a direct interaction
          with U1-C, thus providing one molecular mechanism for
          the function of this splicing regulator. This domain is
          probably zinc-binding. It is found in multiple copies
          in some members of the family.
          Length = 38

 Score = 27.4 bits (61), Expect = 0.33
 Identities = 12/33 (36%), Positives = 17/33 (51%), Gaps = 2/33 (6%)

Query: 1  MLGYYCNVCDCVVKDSINFLD--HINGKKHQRN 31
          M  YYC+ CDC +      +   H  G+KH+ N
Sbjct: 1  MPKYYCDYCDCYLTHDSPSVRKSHNGGRKHKDN 33


>gnl|CDD|221937 pfam13148, DUF3987, Protein of unknown function (DUF3987).  A
           family of uncharacterized proteins found by clustering
           human gut metagenomic sequences.
          Length = 379

 Score = 29.5 bits (67), Expect = 0.42
 Identities = 12/49 (24%), Positives = 26/49 (53%), Gaps = 5/49 (10%)

Query: 55  KKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           ++YE++ K+Y+ E+ + E      + K   ++ +K  K+  D+    EE
Sbjct: 72  EEYEEELKEYEAEKEIWEA-----EKKGLEKKAKKAIKKGKDEEALAEE 115



 Score = 27.2 bits (61), Expect = 2.4
 Identities = 17/58 (29%), Positives = 29/58 (50%), Gaps = 11/58 (18%)

Query: 64  YDIEQRMRELKEEEEKLKEYRRERR---------KEKKRKLDDGEEEEEGDQSDLAAI 112
            +IE+ +RE  E EE+LKEY  E+          ++K +K     ++EE    +L  +
Sbjct: 64  EEIEEELRE--EYEEELKEYEAEKEIWEAEKKGLEKKAKKAIKKGKDEEALAEELLEL 119


>gnl|CDD|221049 pfam11262, Tho2, Transcription factor/nuclear export subunit
          protein 2.  THO and TREX form a eukaryotic complex
          which functions in messenger ribonucleoprotein
          metabolism and plays a role in preventing the
          transcription-associated genetic instability. Tho2,
          along with four other subunits forms THO.
          Length = 296

 Score = 29.2 bits (66), Expect = 0.52
 Identities = 12/45 (26%), Positives = 25/45 (55%), Gaps = 1/45 (2%)

Query: 39 ERSSLDQVKKRFDMNKKKYEQKKKDY-DIEQRMRELKEEEEKLKE 82
          E   L++  K  D +    ++KKK+   ++  +++L+EE +K  E
Sbjct: 32 EIERLEKQIKELDSSSSGIDKKKKEKKRLKSLIKKLEEELKKHIE 76


>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32.  This family
           consists of several mammalian specific proacrosin
           binding protein sp32 sequences. sp32 is a sperm specific
           protein which is known to bind with 55- and 53-kDa
           proacrosins and the 49-kDa acrosin intermediate. The
           exact function of sp32 is unclear, it is thought however
           that the binding of sp32 to proacrosin may be involved
           in packaging the acrosin zymogen into the acrosomal
           matrix.
          Length = 243

 Score = 28.9 bits (64), Expect = 0.64
 Identities = 18/69 (26%), Positives = 34/69 (49%), Gaps = 4/69 (5%)

Query: 39  ERSSLDQVKKRFDMNKKKYEQK--KKDYDIEQRMRELKEEE--EKLKEYRRERRKEKKRK 94
           E  S     +R   N ++  Q        ++ +  + K+E+   KL+EY +E + E+K+ 
Sbjct: 161 ENQSFQPWPERLHNNVEELLQSSLSLGGSVQVKAPKPKQEQLLSKLQEYLQEHKTEEKQP 220

Query: 95  LDDGEEEEE 103
            ++ EEEE 
Sbjct: 221 QEEQEEEEV 229


>gnl|CDD|184611 PRK14297, PRK14297, chaperone protein DnaJ; Provisional.
          Length = 380

 Score = 29.0 bits (65), Expect = 0.69
 Identities = 26/81 (32%), Positives = 35/81 (43%), Gaps = 13/81 (16%)

Query: 41  SSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEE 100
           +S D++KK F     KY   K   +        KE EEK KE       E  + L D ++
Sbjct: 16  ASDDEIKKAFRKLAIKYHPDKNKGN--------KEAEEKFKEI-----NEAYQVLSDPQK 62

Query: 101 EEEGDQSDLAAIMGFSGFGGG 121
           + + DQ   A   G  GFG G
Sbjct: 63  KAQYDQFGTADFNGAGGFGSG 83


>gnl|CDD|193580 cd09891, NGN_Bact_1, Bacterial N-Utilization Substance G (NusG)
          N-terminal (NGN) domain, subgroup 1.  The N-Utilization
          Substance G (NusG) protein is involved in transcription
          elongation and termination in bacteria. NusG is
          essential in Escherichia coli and associates with RNA
          polymerase elongation and Rho-termination. Homologs of
          the NusG gene exist in all bacteria. The NusG
          N-terminal domain (NGN) is similar in all NusG
          homologs, but its C-terminal domain and the linker that
          separates these two domains are different. The domain
          organization of NusG suggests that the common
          properties of NusG and its homologs are due to their
          similar NGN domains.
          Length = 107

 Score = 27.8 bits (63), Expect = 0.73
 Identities = 14/48 (29%), Positives = 25/48 (52%), Gaps = 10/48 (20%)

Query: 57 YEQKKKDYDIEQRMRELKEEE---------EKLKEYRRERRKEKKRKL 95
          YE K K+  +E+R+     E+         E++ E +  ++K K+RKL
Sbjct: 11 YENKVKEN-LEKRIESEGLEDYIGEVLVPTEEVVEVKNGKKKVKERKL 57


>gnl|CDD|234125 TIGR03156, GTP_HflX, GTP-binding protein HflX.  This protein family
           is one of a number of homologous small, well-conserved
           GTP-binding proteins with pleiotropic effects. Bacterial
           members are designated HflX, following the naming
           convention in Escherichia coli where HflX is encoded
           immediately downstream of the RNA chaperone Hfq, and
           immediately upstream of HflKC, a membrane-associated
           protease pair with an important housekeeping function.
           Over large numbers of other bacterial genomes, the
           pairing with hfq is more significant than with hflK and
           hflC. The gene from Homo sapiens in this family has been
           named PGPL (pseudoautosomal GTP-binding protein-like)
           [Unknown function, General].
          Length = 351

 Score = 28.6 bits (65), Expect = 0.74
 Identities = 11/29 (37%), Positives = 21/29 (72%)

Query: 66  IEQRMRELKEEEEKLKEYRRERRKEKKRK 94
           I +R+ +LK+E EK+++ R  +R+ +KR 
Sbjct: 159 IRERIAQLKKELEKVEKQRERQRRRRKRA 187


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 28.9 bits (65), Expect = 0.75
 Identities = 16/74 (21%), Positives = 35/74 (47%), Gaps = 2/74 (2%)

Query: 37   RVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLK--EYRRERRKEKKRK 94
            +    +  +VKKR + +    ++KKK      R ++ K   ++    +  R  R+ +K+K
Sbjct: 1301 KPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKKK 1360

Query: 95   LDDGEEEEEGDQSD 108
             D   E+++  + D
Sbjct: 1361 SDSSSEDDDDSEVD 1374


>gnl|CDD|217838 pfam04004, Leo1, Leo1-like protein.  Members of this family are
           part of the Paf1/RNA polymerase II complex. The Paf1
           complex probably functions during the elongation phase
           of transcription. The Leo1 subunit of the yeast
           Paf1-complex binds RNA and contributes to complex
           recruitment. The subunit acts by co-ordinating
           co-transcriptional chromatin modifications and helping
           recruitment of mRNA 3prime-end processing factors.
          Length = 312

 Score = 28.7 bits (64), Expect = 0.80
 Identities = 19/57 (33%), Positives = 28/57 (49%), Gaps = 3/57 (5%)

Query: 62  KDYDIEQRMRELKEEEEKLKEYRRE--RRKEKKRKLDDGEEEEEGDQSDLAAIMGFS 116
           KD + E+R RE K+EE+KL+  RR   R K K +  +        D +   A   +S
Sbjct: 233 KDPEHEKRERE-KKEEQKLRARRRRQNREKMKNKPPNRPGHGSGSDSNVAKAATTYS 288


>gnl|CDD|111006 pfam02064, MAS20, MAS20 protein import receptor. 
          Length = 184

 Score = 28.5 bits (63), Expect = 0.80
 Identities = 18/59 (30%), Positives = 32/59 (54%), Gaps = 2/59 (3%)

Query: 50  FDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
           FD  ++   + +K   + QR +E  + EE+  E+ +E + +K R+  D E  E+G  SD
Sbjct: 29  FDYKRRNDPEFRKQ--LRQRAKEQAKMEEEAAEHAKEAKLQKIREFLDMEAAEDGFPSD 85


>gnl|CDD|234241 TIGR03517, GldM_gliding, gliding motility-associated protein GldM. 
           This protein family, GldM, is named for the member from
           Flavobacterium johnsoniae, which is required for a type
           of rapid gliding motility found in certain members of
           the Bacteroidetes. However, members are found also in
           several members of the Bacteroidetes that appear not to
           be motile. The best conserved region, toward the
           N-terminus, is centered on a highly hydrophobic probable
           transmembrane helix. Two paralogs are found in Cytophaga
           hutchinsonii.
          Length = 523

 Score = 28.6 bits (64), Expect = 0.81
 Identities = 19/76 (25%), Positives = 37/76 (48%), Gaps = 5/76 (6%)

Query: 52  MNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEE----EEGDQS 107
           +NK   +   KD    +  ++++ + + L +Y  + ++E  RK D  +E+       ++ 
Sbjct: 61  LNKAVAKAPGKDKAWSESAQKVRTKSDSLMDYMNDLKEEIIRKADGEKEDGGPKGAKEKD 120

Query: 108 DLAAIM-GFSGFGGGK 122
           DL A+M G  G   GK
Sbjct: 121 DLEAVMVGTLGPINGK 136


>gnl|CDD|216652 pfam01698, FLO_LFY, Floricaula / Leafy protein.  This family
           consists of various plant development proteins which are
           homologues of floricaula (FLO) and Leafy (LFY) proteins
           which are floral meristem identity proteins. Mutations
           in the sequences of these proteins affect flower and
           leaf development.
          Length = 382

 Score = 28.4 bits (64), Expect = 0.90
 Identities = 10/39 (25%), Positives = 21/39 (53%)

Query: 70  MRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
           +     + EK K+ +++RRK  K   +D +++E+ D   
Sbjct: 174 VPGHSSDSEKKKQRKKQRRKRSKELREDDDDDEDEDDDG 212


>gnl|CDD|225962 COG3428, COG3428, Predicted membrane protein [Function unknown].
          Length = 494

 Score = 28.6 bits (64), Expect = 0.90
 Identities = 8/49 (16%), Positives = 14/49 (28%), Gaps = 5/49 (10%)

Query: 74  KEEEEKLKEYRRERRKEKKRK-LDD----GEEEEEGDQSDLAAIMGFSG 117
            E    ++E R ++    +    D       E     QS +       G
Sbjct: 136 FELAALVREARVKKLDALELAEADTPEEEVAEVLARSQSSVLRYPMNKG 184


>gnl|CDD|153281 cd07597, BAR_SNX8, The Bin/Amphiphysin/Rvs (BAR) domain of Sorting
           Nexin 8.  BAR domains are dimerization, lipid binding
           and curvature sensing modules found in many different
           proteins with diverse functions. Sorting nexins (SNXs)
           are Phox homology (PX) domain containing proteins that
           are involved in regulating membrane traffic and protein
           sorting in the endosomal system. SNXs differ from each
           other in their lipid-binding specificity, subcellular
           localization and specific function in the endocytic
           pathway. A subset of SNXs also contain BAR domains. The
           PX-BAR structural unit determines the specific membrane
           targeting of SNXs. SNX8 and the yeast counterpart Mvp1p
           are involved in sorting and delivery of late-Golgi
           proteins, such as carboxypeptidase Y, to vacuoles. BAR
           domains form dimers that bind to membranes, induce
           membrane bending and curvature, and may also be involved
           in protein-protein interactions.
          Length = 246

 Score = 28.0 bits (63), Expect = 0.97
 Identities = 15/49 (30%), Positives = 25/49 (51%), Gaps = 9/49 (18%)

Query: 37  RVERSSLDQV---KKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKE 82
           R E+ SL+ +    KR ++NKKK E  +   D++        E +KL+ 
Sbjct: 138 RHEKLSLNNIQRLLKRIELNKKKLESLRAKPDVKG------AEVDKLEA 180


>gnl|CDD|206666 cd01878, HflX, HflX GTPase family.  HflX subfamily. A distinct
          conserved domain with a glycine-rich segment N-terminal
          of the GTPase domain characterizes the HflX subfamily.
          The E. coli HflX has been implicated in the control of
          the lambda cII repressor proteolysis, but the actual
          biological functions of these GTPases remain unclear.
          HflX is widespread, but not universally represented in
          all three superkingdoms.
          Length = 204

 Score = 28.2 bits (64), Expect = 1.1
 Identities = 11/28 (39%), Positives = 20/28 (71%)

Query: 66 IEQRMRELKEEEEKLKEYRRERRKEKKR 93
          I +R+ +L++E EK+K+ R  +R  +KR
Sbjct: 11 IRERIAKLRKELEKVKKQRELQRARRKR 38


>gnl|CDD|218351 pfam04961, FTCD_C, Formiminotransferase-cyclodeaminase.  Members
          of this family are thought to be Formiminotransferase-
          cyclodeaminase enzymes EC:4.3.1.4. This domain is found
          in the C-terminus of the bifunctional animal members of
          the family.
          Length = 176

 Score = 27.9 bits (63), Expect = 1.1
 Identities = 13/34 (38%), Positives = 21/34 (61%), Gaps = 6/34 (17%)

Query: 49 RFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKE 82
             + KKKYE      D+E+ M+E+ E+ E+L+E
Sbjct: 39 NLTIGKKKYE------DVEEEMKEILEKAEELRE 66


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 27.9 bits (62), Expect = 1.4
 Identities = 21/62 (33%), Positives = 36/62 (58%), Gaps = 13/62 (20%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKEEEEKL--KEYRRERRKEKKRKLDDGEEEEEGDQSD 108
            M++  +E    D+ +E++ REL    EKL  +E  RER+ E++R+    EEE+   ++D
Sbjct: 240 GMDEHSFE----DFLLEEKRREL----EKLAKEEAERERQAEEQRRR---EEEKAAMEAD 288

Query: 109 LA 110
            A
Sbjct: 289 RA 290


>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992).  This
           bacterial family of proteins has no known function.
           However, the cis-regulatory yjdF motif, just upstream
           from the gene encoding the proteins for this family, is
           a small non-coding RNA, Rfam:RF01764. The yjdF motif is
           found in many Firmicutes, including Bacillus subtilis.
           In most cases, it resides in potential 5' UTRs of
           homologues of the yjdF gene whose function is unknown.
           However, in Streptococcus thermophilus, a yjdF RNA motif
           is associated with an operon whose protein products
           synthesise nicotinamide adenine dinucleotide (NAD+).
           Also, the S. thermophilus yjdF RNA lacks typical yjdF
           motif consensus features downstream of and including the
           P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
           S. thermophilus RNAs might sense a distinct compound
           that structurally resembles the ligand bound by other
           yjdF RNAs. On the other hand, perhaps these RNAs have an
           alternative solution forming a similar binding site, as
           is observed with some SAM riboswitches.
          Length = 132

 Score = 27.2 bits (61), Expect = 1.5
 Identities = 24/82 (29%), Positives = 43/82 (52%), Gaps = 7/82 (8%)

Query: 13  VKDSINFLDHINGKKHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRE 72
           VK SI     IN K+ QR     ++    S     K     K ++E+ K++   ++R +E
Sbjct: 55  VKVSIKKQKKINPKRLQRQAAKEVKKPGIS----TKAQQALKLEHERNKQEK--KKRSKE 108

Query: 73  LKEEEEKLK-EYRRERRKEKKR 93
            KEEE++ K + +++++K K R
Sbjct: 109 KKEEEKERKRQLKQQKKKAKHR 130


>gnl|CDD|227396 COG5064, SRP1, Karyopherin (importin) alpha [Intracellular
           trafficking and secretion].
          Length = 526

 Score = 27.9 bits (62), Expect = 1.6
 Identities = 13/46 (28%), Positives = 24/46 (52%)

Query: 61  KKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQ 106
           K  +  ++  R  +E++ +L++ +RE    K+R L D  EE E   
Sbjct: 17  KGRFSADELRRRREEQQVELRKQKREELLNKRRNLADVSEEAESSF 62


>gnl|CDD|220623 pfam10186, Atg14, UV radiation resistance protein and
           autophagy-related subunit 14.  The Atg14 or Apg14
           proteins are hydrophilic proteins with a predicted
           molecular mass of 40.5 kDa, and have a coiled-coil motif
           at the N terminus region. Yeast cells with mutant Atg14
           are defective not only in autophagy but also in sorting
           of carboxypeptidase Y (CPY), a vacuolar-soluble
           hydrolase, to the vacuole. Subcellular fractionation
           indicates that Apg14p and Apg6p are peripherally
           associated with a membrane structure(s). Apg14p was
           co-immunoprecipitated with Apg6p, suggesting that they
           form a stable protein complex. These results imply that
           Apg6/Vps30p has two distinct functions: in the
           autophagic process and in the vacuolar protein sorting
           pathway. Apg14p may be a component specifically required
           for the function of Apg6/Vps30p through the autophagic
           pathway. There are 17 auto-phagosomal component proteins
           which are categorized into six functional units, one of
           which is the AS-PI3K complex (Vps30/Atg6 and Atg14). The
           AS-PI3K complex and the Atg2-Atg18 complex are essential
           for nucleation, and the specific function of the AS-PI3K
           apparently is to produce phosphatidylinositol
           3-phosphate (PtdIns(3)P) at the pre-autophagosomal
           structure (PAS). The localisation of this complex at the
           PAS is controlled by Atg14. Autophagy mediates the
           cellular response to nutrient deprivation, protein
           aggregation, and pathogen invasion in humans, and
           malfunction of autophagy has been implicated in multiple
           human diseases including cancer. This effect seems to be
           mediated through direct interaction of the human Atg14
           with Beclin 1 in the human phosphatidylinositol 3-kinase
           class III complex.
          Length = 307

 Score = 27.7 bits (62), Expect = 1.7
 Identities = 10/44 (22%), Positives = 23/44 (52%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
            +  +   +K++   I  R+ +LKEE E+ +E   E ++   ++
Sbjct: 61  LLKLEVARKKERLNQIRARISQLKEEIEQKRERIEELKRALAQR 104


>gnl|CDD|184042 PRK13415, PRK13415, flagella biosynthesis protein FliZ;
           Provisional.
          Length = 219

 Score = 27.4 bits (61), Expect = 1.9
 Identities = 23/91 (25%), Positives = 38/91 (41%), Gaps = 4/91 (4%)

Query: 13  VKDSINFLDHINGKKHQRNLGMSMRVER--SSLDQVKKRFDMNKKKYEQKKKDYDIEQRM 70
           V +SI  L  I  +K +    ++   ER  S  +  +    +  +   + K+        
Sbjct: 130 VGESIQLLKEIEDEK-EIEEILAQHEERLESKAEWSRWGQKLRDQWKGKSKQKQTTLPSF 188

Query: 71  RE-LKEEEEKLKEYRRERRKEKKRKLDDGEE 100
              LKEE ++LKE R E  K  K+K    +E
Sbjct: 189 SALLKEELKELKEKRSEGLKRLKKKGTAHDE 219


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 27.7 bits (61), Expect = 1.9
 Identities = 21/69 (30%), Positives = 36/69 (52%), Gaps = 6/69 (8%)

Query: 58  EQKKKDYDIEQRMRELKEEEEKLKE---YRRERRKEKKRKLDDGEEEEEGDQSDLAAIMG 114
           E KKK    E+R + L+EEE++ K+    R+ R +E+KR+L +  E    + ++    + 
Sbjct: 218 ELKKKR---EERRKVLEEEEQRRKQEEADRKSREEEEKRRLKEEIERRRAEAAEKRQKVP 274

Query: 115 FSGFGGGKK 123
             G    KK
Sbjct: 275 EDGLSEDKK 283


>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 197

 Score = 27.3 bits (60), Expect = 1.9
 Identities = 28/95 (29%), Positives = 53/95 (55%), Gaps = 14/95 (14%)

Query: 23  INGKKHQRNLGMSMRVERSSLDQVKKRFDMNKKKY---------EQKKKDYDIEQRMREL 73
           IN   +   + +++R+ R  L  VKK  D+ KKK          + +++  ++E+RM E 
Sbjct: 88  INRAAYDTGVPLAIRI-RDQLRSVKKD-DIKKKKSLIIRQEQIEKARQEREELEERM-EW 144

Query: 74  KEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
           +  EEK+ E  RE ++E++R+ ++   EE+ D S+
Sbjct: 145 ERREEKIDE--REDQEEQEREREEQTIEEQSDDSE 177


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 27.6 bits (62), Expect = 1.9
 Identities = 14/59 (23%), Positives = 30/59 (50%), Gaps = 7/59 (11%)

Query: 45  QVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           +V ++ D  +++ E+K     I +   E ++EE + K  + E++KE++         EE
Sbjct: 256 EVLRKVDKTREEEEEK-----ILKAAEEERQEEAQEK--KEEKKKEEREAKLAKLSPEE 307


>gnl|CDD|178635 PLN03086, PLN03086, PRLI-interacting factor K; Provisional.
          Length = 567

 Score = 27.5 bits (61), Expect = 1.9
 Identities = 15/62 (24%), Positives = 34/62 (54%), Gaps = 5/62 (8%)

Query: 50  FDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK-----EKKRKLDDGEEEEEG 104
           F++ + + + +++  + +QR +   E E K KE   ++R+     ++ R+LD  E + + 
Sbjct: 3   FELRRAREKLEREQRERKQRAKLKLERERKAKEEAAKQREAIEAAQRSRRLDAIEAQIKA 62

Query: 105 DQ 106
           DQ
Sbjct: 63  DQ 64


>gnl|CDD|118668 pfam10140, YukC, WXG100 protein secretion system (Wss), protein
           YukC.  Members of this family of proteins include
           predicted membrane proteins homologous to YukC in B.
           subtilis. The YukC protein family would participate in
           the formation of a translocon required for the secretion
           of WXG100 proteins (pfam06013) in monoderm bacteria, the
           WXG100 protein secretion system (Wss). This family
           includes EssB in Staphylococcus aureus.
          Length = 359

 Score = 27.6 bits (62), Expect = 1.9
 Identities = 15/35 (42%), Positives = 22/35 (62%), Gaps = 2/35 (5%)

Query: 55  KKYEQKKKDYDI--EQRMRELKEEEEKLKEYRRER 87
           K  EQ K D D+  ++R  +L E E++L EY +ER
Sbjct: 325 KYREQVKNDDDLSGDERQEKLDELEDELDEYWKER 359


>gnl|CDD|130161 TIGR01089, fucI, L-fucose isomerase.  This enzyme catalyzes the
           first step in fucose metabolism, and has been
           characterized in Escherichia coli and Bacteroides
           thetaiotaomicron [Energy metabolism, Sugars].
          Length = 587

 Score = 27.6 bits (61), Expect = 2.1
 Identities = 16/70 (22%), Positives = 28/70 (40%), Gaps = 9/70 (12%)

Query: 34  MSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKR 93
           + MR E   + ++++R D         +K YD E+    L   ++  K    E  KE +R
Sbjct: 198 LGMRNEAVDMTEIRRRID---------QKIYDEEELEMALAWADKYCKYGEDENNKEYQR 248

Query: 94  KLDDGEEEEE 103
             +      E
Sbjct: 249 NAEQSRAVWE 258


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 27.6 bits (62), Expect = 2.1
 Identities = 16/59 (27%), Positives = 29/59 (49%), Gaps = 3/59 (5%)

Query: 54  KKKYEQKKKDYDIE--QRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLA 110
           K++  + + + + E  +R  EL+  E +L + R E    K   LD  EE  E  + +L+
Sbjct: 57  KEEVHKLRAELERELKERRNELQRLERRLLQ-REETLDRKMESLDKKEENLEKKEKELS 114



 Score = 26.8 bits (60), Expect = 3.3
 Identities = 15/47 (31%), Positives = 31/47 (65%)

Query: 47  KKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKR 93
           +K   ++KK+   +KK+ ++  + + L E+EE+L+E   E+R+E +R
Sbjct: 94  RKMESLDKKEENLEKKEKELSNKEKNLDEKEEELEELIAEQREELER 140


>gnl|CDD|148701 pfam07246, Phlebovirus_NSM, Phlebovirus nonstructural protein NS-M.
            This family consists of several Phlebovirus
           nonstructural NS-M proteins which represent the
           N-terminal region of the M polyprotein precursor. The
           function of this family is unknown.
          Length = 264

 Score = 27.2 bits (59), Expect = 2.2
 Identities = 15/79 (18%), Positives = 34/79 (43%), Gaps = 4/79 (5%)

Query: 28  HQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRER 87
             RN+    +  R   +++KK  +      ++K K+ + + R +    E ++ K   ++ 
Sbjct: 128 FIRNITTGEQSPRVDYEKLKKNAEEKDATIQRKTKEMEEDSRNQIAHHEIQQKKNEIQKL 187

Query: 88  RKEKKRKLDDGEEEEEGDQ 106
           R + KR    G+E  +   
Sbjct: 188 RNDLKR----GQEHRDAKL 202


>gnl|CDD|220535 pfam10037, MRP-S27, Mitochondrial 28S ribosomal protein S27.
           Members of this family of small ribosomal proteins
           possess one of three conserved blocks of sequence found
           in proteins that stimulate the dissociation of guanine
           nucleotides from G-proteins, leaving open the
           possibility that MRP-S27 might be a functional partner
           of GTP-binding ribosomal proteins.
          Length = 417

 Score = 27.4 bits (61), Expect = 2.2
 Identities = 11/44 (25%), Positives = 25/44 (56%)

Query: 60  KKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           +++   IE+   EL E+ ++L  +   +  ++K+KL +  +EE 
Sbjct: 365 EERLPTIEKEDLELYEQRQQLWFFENRKLWQRKKKLREQADEEY 408


>gnl|CDD|221489 pfam12252, SidE, Dot/Icm substrate protein.  This family of proteins
            is found in bacteria. Proteins in this family are
            typically between 397 and 1543 amino acids in length.
            This family is the SidE protein in the Dot/Icm pathway of
            Legionella pneumophila bacteria. There is little
            literature describing the family.
          Length = 1443

 Score = 27.5 bits (61), Expect = 2.3
 Identities = 19/58 (32%), Positives = 32/58 (55%), Gaps = 8/58 (13%)

Query: 40   RSSLDQVK-KRFDMNKKKYEQK-------KKDYDIEQRMRELKEEEEKLKEYRRERRK 89
            RSSL+Q+K K F+M +K+ +Q        +K  D      +L+E+  KLK+    ++K
Sbjct: 1266 RSSLNQMKPKTFEMQEKEIQQNFELLAKLEKTLDKSDTAEKLREDIPKLKDLLIAKQK 1323


>gnl|CDD|236892 PRK11281, PRK11281, hypothetical protein; Provisional.
          Length = 1113

 Score = 27.2 bits (61), Expect = 2.3
 Identities = 8/32 (25%), Positives = 14/32 (43%), Gaps = 2/32 (6%)

Query: 83  YRR--ERRKEKKRKLDDGEEEEEGDQSDLAAI 112
           YRR   +R+   ++  +G E  E     L  +
Sbjct: 748 YRRALAKRQNLVKEGAEGAEPVEEPTLALEQV 779


>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family.  Members of this family are
           coiled-coil proteins that are involved in pre-rRNA
           processing.
          Length = 105

 Score = 26.6 bits (59), Expect = 2.4
 Identities = 14/50 (28%), Positives = 31/50 (62%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           +K+ E++ +   I+ R +ELK+E+E  ++ R +  KE++   ++ E  E+
Sbjct: 31  EKRMEKRLEQQAIKAREKELKDEKEAERQRRIQAIKERRAAKEEKERYEK 80


>gnl|CDD|112836 pfam04037, DUF382, Domain of unknown function (DUF382).  This
          domain is specific to the human splicing factor 3b
          subunit 2 and its orthologues. Splicing factor 3b
          subunit 2 or SAP145 is a suppressor of U2 snRNA
          mutations. Pre-mRNA splicing is catalyzed by a large
          ribonucleoprotein complex called the spliceosome.
          Spliceosomes are multi-component enzymes that catalyze
          pre-mRNA splicing and form step-wise by the ordered
          interaction of UsnRNPs and non-snRNP proteins with
          short conserved regions of the pre-mRNA at the 5' and
          3' splice sites and branch site.
          Length = 129

 Score = 26.8 bits (60), Expect = 2.4
 Identities = 18/62 (29%), Positives = 30/62 (48%), Gaps = 11/62 (17%)

Query: 45 QVKKRFDMNKKKYEQKK-------KDYDIEQRMREL---KEEEEKLKEYRRERRKEKKRK 94
            K+++   K+  E+         +   I + MR+    KE E+ LK+ +RER + K  K
Sbjct: 37 SQKRKYLSGKRGIEKPPFELPDFIEATGIAE-MRDALLEKEAEKTLKQKQRERVQPKMGK 95

Query: 95 LD 96
          LD
Sbjct: 96 LD 97


>gnl|CDD|227084 COG4741, COG4741, Predicted secreted endonuclease distantly
          related to archaeal Holliday junction resolvase
          [Nucleotide transport and metabolism].
          Length = 175

 Score = 27.0 bits (60), Expect = 2.4
 Identities = 13/58 (22%), Positives = 31/58 (53%), Gaps = 3/58 (5%)

Query: 39 ERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRE---LKEEEEKLKEYRRERRKEKKR 93
           R+ +  ++ + +   ++ E+  +  + E+ + E    KEEE KLKE+  ++ +E + 
Sbjct: 20 LRAYIRSLQGKVESKARELEETLQKAERERLVNEAQARKEEEWKLKEWIEKKIEEARE 77


>gnl|CDD|227666 COG5374, COG5374, Uncharacterized conserved protein [Function
           unknown].
          Length = 192

 Score = 27.1 bits (60), Expect = 2.5
 Identities = 8/29 (27%), Positives = 14/29 (48%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKE 82
           +K  E+  K  D    +RE  ++E   K+
Sbjct: 163 QKNQEELFKLLDKYNELREQVQKESSKKK 191


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 27.0 bits (60), Expect = 2.6
 Identities = 12/50 (24%), Positives = 33/50 (66%)

Query: 57  YEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQ 106
            E KKK  ++ + + ++K+E E+ ++++ +++K KK+K  D +++++   
Sbjct: 58  TEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKD 107


>gnl|CDD|148285 pfam06584, DIRP, DIRP.  DIRP (Domain in Rb-related Pathway) is
           postulated to be involved in the Rb-related pathway,
           which is encoded by multiple eukaryotic genomes and is
           present in proteins including lin-9 of Caenorhabditis
           elegans, aly of fruit fly and mustard weed. Studies of
           lin-9 and aly of fruit fly proteins containing DIRP
           suggest that this domain might be involved in
           development. Aly, lin-9, act in parallel to, or
           downstream of, activation of MAPK by the RTK-Ras
           signalling pathway.
          Length = 109

 Score = 26.5 bits (59), Expect = 2.6
 Identities = 12/31 (38%), Positives = 17/31 (54%)

Query: 73  LKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           LKEE EKL+  R + R+ ++ KL   E    
Sbjct: 54  LKEEREKLERKREKIRQLQQLKLHYKELNVG 84


>gnl|CDD|197850 smart00738, NGN, In Spt5p, this domain may confer affinity for
          Spt4p. It possesses a RNP-like fold.  In Spt5p, this
          domain may confer affinity for Spt4p.
          Length = 106

 Score = 26.2 bits (58), Expect = 2.9
 Identities = 13/45 (28%), Positives = 25/45 (55%)

Query: 54 KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
           +  E+K +   +E ++  +    E++KE RR ++K  +RKL  G
Sbjct: 16 AENLERKAEALGLEDKIVSILVPTEEVKEIRRGKKKVVERKLFPG 60


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 27.1 bits (60), Expect = 3.0
 Identities = 14/45 (31%), Positives = 27/45 (60%)

Query: 59  QKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
            + +D D  +   E  + EE+ +E  +E+RK  K+ ++D +E+EE
Sbjct: 266 DEDEDEDEPKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDEDEE 310


>gnl|CDD|215590 PLN03123, PLN03123, poly [ADP-ribose] polymerase; Provisional.
          Length = 981

 Score = 27.1 bits (60), Expect = 3.0
 Identities = 9/37 (24%), Positives = 17/37 (45%)

Query: 46  VKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKE 82
            K   D++      +KK  D+E ++    +E   LK+
Sbjct: 221 AKTDRDVSTSTAASQKKSSDLESKLEAQSKELWSLKD 257


>gnl|CDD|182730 PRK10787, PRK10787, DNA-binding ATP-dependent protease La;
           Provisional.
          Length = 784

 Score = 27.2 bits (60), Expect = 3.0
 Identities = 21/82 (25%), Positives = 42/82 (51%), Gaps = 6/82 (7%)

Query: 34  MSMRVERSSLDQVKKRF-DMNKKKYEQKKKDYDIEQRM----RELKEEEEKLKEYRRERR 88
           M+M      L QV+KR  +  KK+ E+ +++Y + ++M    +EL E ++   E    +R
Sbjct: 197 MAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDAPDENEALKR 256

Query: 89  K-EKKRKLDDGEEEEEGDQSDL 109
           K +  +   + +E+ E +   L
Sbjct: 257 KIDAAKMPKEAKEKAEAELQKL 278


>gnl|CDD|149875 pfam08941, USP8_interact, USP8 interacting.  This domain
          interacts with the UBP deubiquitinating enzyme USP8.
          Length = 179

 Score = 26.7 bits (59), Expect = 3.1
 Identities = 14/38 (36%), Positives = 17/38 (44%)

Query: 51 DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERR 88
          D   +  E K    D E ++ E K E E LK Y R  R
Sbjct: 8  DQATEIAELKHTQVDHEIQINEQKRELELLKYYIRALR 45


>gnl|CDD|188441 TIGR03926, T7_EssB, type VII secretion protein EssB.  Members of
           this family are associated with type VII secretion of
           WXG100 family targets in the Firmicutes, but not in the
           Actinobacteria. This protein is designated YukC in
           Bacillus subtilis and EssB in Staphylococcus aureus
           [Protein fate, Protein and peptide secretion and
           trafficking].
          Length = 377

 Score = 26.9 bits (60), Expect = 3.3
 Identities = 13/34 (38%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 55  KKYEQKKKDYDI--EQRMRELKEEEEKLKEYRRE 86
           K  EQ K D D+  ++R  +L E E++L EY ++
Sbjct: 344 KYREQVKNDTDLSGDERQEKLDELEKELDEYLKK 377


>gnl|CDD|232965 TIGR00414, serS, seryl-tRNA synthetase.  This model represents the
           seryl-tRNA synthetase found in most organisms. This
           protein is a class II tRNA synthetase, and is recognized
           by the pfam model tRNA-synt_2b. The seryl-tRNA
           synthetases of two archaeal species, Methanococcus
           jannaschii and Methanobacterium thermoautotrophicum,
           differ considerably and are included in a different
           model [Protein synthesis, tRNA aminoacylation].
          Length = 418

 Score = 26.9 bits (60), Expect = 3.3
 Identities = 15/65 (23%), Positives = 33/65 (50%), Gaps = 5/65 (7%)

Query: 36  MRVERSSLDQVKKRF-----DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKE 90
            +   S +++++ +       + K K ++K K  +I++ ++ELKEE  +L    +    E
Sbjct: 39  RKKLLSEIEELQAKRNELSKQIGKAKGQKKDKIEEIKKELKELKEELTELSAALKALEAE 98

Query: 91  KKRKL 95
            + KL
Sbjct: 99  LQDKL 103


>gnl|CDD|222366 pfam13764, E3_UbLigase_R4, E3 ubiquitin-protein ligase UBR4.  This
           is a family of E3 ubiquitin ligase enzymes.
          Length = 794

 Score = 26.9 bits (60), Expect = 3.3
 Identities = 12/31 (38%), Positives = 17/31 (54%), Gaps = 3/31 (9%)

Query: 67  EQRMRELKEEE---EKLKEYRRERRKEKKRK 94
           E  +  L E+E   +K+ E R E R EK+R 
Sbjct: 394 ENLLETLAEKEGVAKKIDEVRDETRAEKRRL 424


>gnl|CDD|215532 PLN02982, PLN02982, galactinol-raffinose
           galactosyltransferase/hydrolase, hydrolyzing O-glycosyl
           compounds.
          Length = 865

 Score = 27.1 bits (60), Expect = 3.4
 Identities = 32/132 (24%), Positives = 51/132 (38%), Gaps = 34/132 (25%)

Query: 16  SINFLDHINGKKHQRNL---GMSMRVERSSLDQVKK---------------RFDMNKKK- 56
           SINF D  N  +  +NL   G  M       D+ +K                FD  K K 
Sbjct: 266 SINF-DGDNPNEDAKNLVLGGTQMTARLYRFDECEKFRNYKGGSMLGPDPPHFDPKKPKM 324

Query: 57  ---------YEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQS 107
                    + +K +   IE  + +L E + K+K+ ++E        + DGEE+    +S
Sbjct: 325 LIYKAIEREHAEKARKKAIESGVTDLSEFDAKIKQLKKEL-----DAMFDGEEKSVSSES 379

Query: 108 DLAAIMGFSGFG 119
           + +     SG G
Sbjct: 380 ESSGSCKVSGSG 391


>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
          Length = 1832

 Score = 27.0 bits (60), Expect = 3.4
 Identities = 17/49 (34%), Positives = 26/49 (53%), Gaps = 5/49 (10%)

Query: 47  KKRFDMN--KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKR 93
            K+   N   K  E K  D   E+  ++ K+E++K +EY+RE   EK R
Sbjct: 721 MKKIFRNWNGKDAEFKISDSVEEKTKKKKKKEKKKEEEYKRE---EKAR 766



 Score = 25.8 bits (57), Expect = 9.1
 Identities = 14/46 (30%), Positives = 25/46 (54%), Gaps = 6/46 (13%)

Query: 64  YDIEQRMREL------KEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           +DI   M+++      K+ E K+ +   E+ K+KK+K    EEE +
Sbjct: 715 FDISGLMKKIFRNWNGKDAEFKISDSVEEKTKKKKKKEKKKEEEYK 760


>gnl|CDD|234350 TIGR03766, TIGR03766, conserved hypothetical integral membrane
           protein.  Models TIGR03110, TIGR03111, and TIGR03112
           describe a three-gene system found in several
           Gram-positive bacteria, where TIGR03110 (XrtG) is
           distantly related to a putative transpeptidase,
           exosortase (TIGR02602). This model describes a small
           clade that correlates by both gene clustering and
           phyletic pattern, although imperfectly, to the three
           gene system. Both this narrow clade, and the larger set
           of full-length homologous integral membrane proteins,
           have an especially well-conserved region near the
           C-terminus with an invariant tyrosine. The function is
           unknown.
          Length = 483

 Score = 26.9 bits (60), Expect = 3.5
 Identities = 7/25 (28%), Positives = 11/25 (44%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKE 75
                  + +K  Y I++  R LKE
Sbjct: 325 ATLSLPTKAEKNKYSIKEIKRRLKE 349


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 26.9 bits (59), Expect = 3.6
 Identities = 14/34 (41%), Positives = 20/34 (58%)

Query: 75  EEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
            EEEK ++  R+RRK KK   D  ++  E + SD
Sbjct: 128 VEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSD 161


>gnl|CDD|216096 pfam00749, tRNA-synt_1c, tRNA synthetases class I (E and Q),
           catalytic domain.  Other tRNA synthetase sub-families
           are too dissimilar to be included. This family includes
           only glutamyl and glutaminyl tRNA synthetases. In some
           organisms, a single glutamyl-tRNA synthetase
           aminoacylates both tRNA(Glu) and tRNA(Gln).
          Length = 314

 Score = 26.5 bits (59), Expect = 3.6
 Identities = 17/62 (27%), Positives = 27/62 (43%), Gaps = 8/62 (12%)

Query: 48  KRFDMNKKKYEQKKK------DYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEE 101
            RFD+  K  E+  +       +   + + E +EE+E L    R R  E+  +L   EEE
Sbjct: 76  DRFDIYYKYAEELIEKGLAYVCFCTPEELEEEREEQEALGSPERPRYDEECLRL--FEEE 133

Query: 102 EE 103
             
Sbjct: 134 MR 135


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S. cerevisiae and H. sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 26.6 bits (59), Expect = 3.6
 Identities = 18/56 (32%), Positives = 26/56 (46%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDL 109
           K K +      + E    +L   E+KLKE   E   E+  K ++ EEEEE +  D 
Sbjct: 134 KFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEEEEDEDF 189


>gnl|CDD|225171 COG2262, HflX, GTPases [General function prediction only].
          Length = 411

 Score = 26.8 bits (60), Expect = 3.7
 Identities = 12/39 (30%), Positives = 22/39 (56%), Gaps = 2/39 (5%)

Query: 58  EQKKKDYD--IEQRMRELKEEEEKLKEYRRERRKEKKRK 94
           E + +     I +R+ +LK E E +++ R  RRK++ R 
Sbjct: 152 ETQLETDRRRIRRRIAKLKRELENVEKAREPRRKKRSRS 190


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 26.8 bits (59), Expect = 3.7
 Identities = 15/54 (27%), Positives = 33/54 (61%), Gaps = 2/54 (3%)

Query: 58  EQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAA 111
           E K K    ++R+++L  E+E+LK   ++++ E+  K    E++++ +Q+  AA
Sbjct: 91  ELKPKQAAEQERLKQL--EKERLKAQEQQKQAEEAEKQAQLEQKQQEEQARKAA 142


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
            biogenesis [Translation, ribosomal structure and
            biogenesis].
          Length = 1077

 Score = 26.6 bits (58), Expect = 3.8
 Identities = 13/65 (20%), Positives = 38/65 (58%), Gaps = 1/65 (1%)

Query: 34   MSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMREL-KEEEEKLKEYRRERRKEKK 92
            +S R+E     + +++ ++  +  +++ KD + ++RM  L + +EE++ +  +ER +  +
Sbjct: 999  VSRRIELPVPPECREKHEIKDRIVKERIKDQEEKERMESLQRAKEEEIGKKEKEREQRIR 1058

Query: 93   RKLDD 97
            + + D
Sbjct: 1059 KTIHD 1063



 Score = 25.9 bits (56), Expect = 8.6
 Identities = 12/82 (14%), Positives = 40/82 (48%), Gaps = 2/82 (2%)

Query: 22   HINGKKHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLK 81
                    R +   + +++ S+  V +R ++      + ++ ++I+ R+ + + ++++ K
Sbjct: 975  AEEDYSLPREIESKLPLDKRSIAVVSRRIELPVP--PECREKHEIKDRIVKERIKDQEEK 1032

Query: 82   EYRRERRKEKKRKLDDGEEEEE 103
            E     ++ K+ ++   E+E E
Sbjct: 1033 ERMESLQRAKEEEIGKKEKERE 1054


>gnl|CDD|189037 cd09867, PIN_FEN1, PIN domain of Flap Endonuclease-1, a
           structure-specific, divalent-metal-ion dependent, 5'
           nuclease and homologs.  Flap endonuclease-1 (FEN1) is
           involved in multiple DNA metabolic pathways, including
           DNA replication processes (5' flap DNA endonuclease
           activity and double stranded DNA 5'-exonuclease
           activity) and DNA repair processes (long-patch base
           excision repair) in eukaryotes and archaea. Interaction
           between FEN1 and PCNA (Proliferating cell nuclear
           antigen) is an essential prerequisite to FEN1's DNA
           replication functionality and stimulates FEN1 nuclease
           activity by 10-50 fold. FEN1 belongs to the
           FEN1-EXO1-like family of structure-specific, 5'
           nucleases. These nucleases contain a PIN (PilT N
           terminus) domain with a helical arch/clamp region (I
           domain) of variable length (approximately 45 residues in
           FEN1 PIN domains) and a H3TH (helix-3-turn-helix)
           domain, an atypical helix-hairpin-helix-2-like region.
           Both the H3TH domain (not included here) and the helical
           arch/clamp region are involved in DNA binding. Nucleases
           within this group also have a carboxylate-rich active
           site that is involved in binding essential divalent
           metal ion cofactors (Mg2+/Mn2+).  FEN1 has a C-terminal
           extension containing residues forming the consensus
           PIP-box - Qxx(M/L/I)xxF(Y/F) which serves to anchor FEN1
           to PCNA.
          Length = 261

 Score = 26.3 bits (59), Expect = 3.9
 Identities = 19/38 (50%), Positives = 24/38 (63%), Gaps = 7/38 (18%)

Query: 72  ELKEEE-EKLKEYRRERRKEKKRKLDDGEEEEEGDQSD 108
           ELK  E EK    RRERR+E + KL+  E +EEGD  +
Sbjct: 86  ELKSGELEK----RRERREEAEEKLE--EAKEEGDAEE 117


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 26.7 bits (59), Expect = 4.0
 Identities = 7/38 (18%), Positives = 20/38 (52%)

Query: 65  DIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEE 102
           +I++     K E +  K+   ++R+  +R+++    +E
Sbjct: 331 EIKKTRTAEKNEAKARKKEIAQKRRAAEREINREARQE 368


>gnl|CDD|172594 PRK14104, PRK14104, chaperonin GroEL; Provisional.
          Length = 546

 Score = 26.5 bits (58), Expect = 4.1
 Identities = 15/38 (39%), Positives = 23/38 (60%), Gaps = 1/38 (2%)

Query: 62  KDYDIEQRMRELKEE-EEKLKEYRRERRKEKKRKLDDG 98
           K  DIE R+ ++K + EE   +Y RE+ +E+  KL  G
Sbjct: 338 KKADIEARVAQIKAQIEETTSDYDREKLQERLAKLAGG 375


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 26.8 bits (59), Expect = 4.2
 Identities = 16/76 (21%), Positives = 33/76 (43%), Gaps = 9/76 (11%)

Query: 47  KKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE--- 103
           KK+ +   K+ E KK     ++   +L+ ++        ++ ++K RK D  +E  E   
Sbjct: 21  KKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKKSRKRDVEDENPEDFI 80

Query: 104 ------GDQSDLAAIM 113
                 G +  L++ M
Sbjct: 81  DPDTPFGQKKRLSSQM 96


>gnl|CDD|237147 PRK12586, PRK12586, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 145

 Score = 25.9 bits (57), Expect = 4.8
 Identities = 7/39 (17%), Positives = 13/39 (33%)

Query: 52  MNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKE 90
           M +K          +    +   E  +   + R E RK+
Sbjct: 102 MYRKNDAHTHASILLSSNEQNSTEALQLRAKKREEHRKK 140


>gnl|CDD|222613 pfam14235, DUF4337, Domain of unknown function (DUF4337).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are typically between 187 and 201 amino
           acids in length. There is a single completely conserved
           residue Q that may be functionally important.
          Length = 158

 Score = 26.0 bits (58), Expect = 4.8
 Identities = 5/54 (9%), Positives = 19/54 (35%), Gaps = 3/54 (5%)

Query: 43  LDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLD 96
               +        +Y+++K  Y      +EL+ + ++ +    +    +  +  
Sbjct: 65  AAAPRAELQAKIARYKKEKARY--RSEAKELEAKAKEAEA-ESDHALHQHHRFA 115


>gnl|CDD|178195 PLN02583, PLN02583, cinnamoyl-CoA reductase.
          Length = 297

 Score = 26.2 bits (58), Expect = 4.9
 Identities = 11/26 (42%), Positives = 17/26 (65%)

Query: 59 QKKKDYDIEQRMRELKEEEEKLKEYR 84
          QK  + +IE+ +R L  EEE+LK + 
Sbjct: 38 QKNGETEIEKEIRGLSCEEERLKVFD 63


>gnl|CDD|219627 pfam07897, DUF1675, Protein of unknown function (DUF1675).  The
           members of this family are sequences derived from
           hypothetical plant proteins of unknown function. One
           member of this family is annotated as a putative
           RNA-binding protein, but no evidence was found to
           support this.
          Length = 283

 Score = 26.1 bits (57), Expect = 5.0
 Identities = 19/54 (35%), Positives = 27/54 (50%)

Query: 71  RELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAAIMGFSGFGGGKKK 124
            E +EE  K KE +  RR E KRK  + E     +  D+ +I   +G G G+ K
Sbjct: 66  VETEEEWRKRKEMQSLRRLEAKRKRSEKEYNGVSNGDDMDSINAANGGGSGRDK 119


>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein.  This
          family of proteins represents the complementary sex
          determiner in the honeybee. In the honeybee, the
          mechanism of sex determination depends on the csd gene
          which produces an SR-type protein. Males are homozygous
          while females are heterozygous for the csd gene.
          Heterozygosity generates an active protein which
          initiates female development.
          Length = 146

 Score = 25.8 bits (56), Expect = 5.0
 Identities = 11/38 (28%), Positives = 20/38 (52%)

Query: 58 EQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKL 95
          E+++K Y  E   RE +E   +    R ER + ++ K+
Sbjct: 11 EREQKSYKNENSYREYRETSRERSRDRTERERSREHKI 48


>gnl|CDD|221756 pfam12757, DUF3812, Protein of unknown function (DUF3812).  This is
           a family of fungal proteins whose function is not known.
          Length = 126

 Score = 25.7 bits (57), Expect = 5.0
 Identities = 10/39 (25%), Positives = 23/39 (58%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK 89
           +++++   Q+ +D + +    E K + E+ KE  RE++K
Sbjct: 88  EIDERAEAQRARDEEKKLDEEEAKRQHEEAKEREREKKK 126


>gnl|CDD|217360 pfam03087, DUF241, Arabidopsis protein of unknown function.  This
           family represents a number of Arabidopsis proteins.
           Their functions are unknown.
          Length = 230

 Score = 26.2 bits (58), Expect = 5.1
 Identities = 18/51 (35%), Positives = 29/51 (56%), Gaps = 4/51 (7%)

Query: 35  SMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRR 85
           S+R E S LD+ K+R +   K     K+  ++E  + EL++E E L  +RR
Sbjct: 173 SIRSELSRLDEEKRRDNEEVKD--ILKRLEELENSIEELEDELESL--FRR 219


>gnl|CDD|235600 PRK05771, PRK05771, V-type ATP synthase subunit I; Validated.
          Length = 646

 Score = 26.4 bits (59), Expect = 5.3
 Identities = 14/51 (27%), Positives = 26/51 (50%), Gaps = 4/51 (7%)

Query: 55  KKYEQKKKDYDIE----QRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEE 101
           KK   ++ + + E    + +RE+KEE E++++ R    +E K       EE
Sbjct: 198 KKLGFERLELEEEGTPSELIREIKEELEEIEKERESLLEELKELAKKYLEE 248



 Score = 26.0 bits (58), Expect = 7.3
 Identities = 14/52 (26%), Positives = 30/52 (57%), Gaps = 5/52 (9%)

Query: 47  KKRFDMNKKKYEQKKKDYD-----IEQRMRELKEEEEKLKEYRRERRKEKKR 93
           +++  ++ K  E+  KD +     IE+ ++EL+EE  +L+   +E  +E +R
Sbjct: 74  EEKKKVSVKSLEELIKDVEEELEKIEKEIKELEEEISELENEIKELEQEIER 125


>gnl|CDD|218735 pfam05760, IER, Immediate early response protein (IER).  This
           family consists of several eukaryotic immediate early
           response (IER) 2 and 5 proteins. The role of IER5 is
           unclear although it plays an important role in mediating
           the cellular response to mitogenic signals. Again,
           little is known about the function of IER2 although it
           is thought to play a role in mediating the cellular
           responses to a variety of extracellular signals.
          Length = 272

 Score = 26.0 bits (57), Expect = 5.3
 Identities = 16/41 (39%), Positives = 25/41 (60%), Gaps = 5/41 (12%)

Query: 84  RRERRKEKKRKL----DDGEEEEEGDQSDLAAIMGFSGFGG 120
           +R RR++ + +     +D EE E G+ S+L +I G SGF G
Sbjct: 195 KRARREDFEPESGGESEDAEEMETGNISNLISIFG-SGFSG 234


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 26.2 bits (58), Expect = 5.8
 Identities = 12/37 (32%), Positives = 25/37 (67%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKE 90
           ++  E+ KK ++  +R ++ ++    ++EYRRERR+E
Sbjct: 616 RRAIEEWKKRFEERERRQKEEDILRIIEEYRRERRRE 652


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 26.2 bits (58), Expect = 5.9
 Identities = 24/82 (29%), Positives = 41/82 (50%), Gaps = 6/82 (7%)

Query: 35  SMRVERSSLDQVKKRFDMNKKKYEQKKKDY-----DIEQRMRELKEEEEKLKEYRRERRK 89
            ++ E SSL    +R +    +  Q+  D      +IE+ + +L++EEEKLKE R E  +
Sbjct: 685 GLKRELSSLQSELRRIENRLDELSQELSDASRKIGEIEKEIEQLEQEEEKLKE-RLEELE 743

Query: 90  EKKRKLDDGEEEEEGDQSDLAA 111
           E    L+   E  + +  +L A
Sbjct: 744 EDLSSLEQEIENVKSELKELEA 765


>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
          Length = 330

 Score = 25.9 bits (58), Expect = 6.1
 Identities = 4/39 (10%), Positives = 25/39 (64%)

Query: 65  DIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           ++++ ++    ++++LK + + ++K++K++    + + +
Sbjct: 51  EMKEELKAALLDKKELKAWHKAQKKKEKQEAKAAKAKSK 89


>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
           complex aNOP56 subunit; Provisional.
          Length = 414

 Score = 26.1 bits (58), Expect = 6.3
 Identities = 14/52 (26%), Positives = 27/52 (51%), Gaps = 11/52 (21%)

Query: 43  LDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
            +++ KR +  K+KY +  K           K+ EEK  + R++++K KK+ 
Sbjct: 365 KEELNKRIEEIKEKYPKPPK-----------KKREEKKPQKRKKKKKRKKKG 405


>gnl|CDD|235040 PRK02463, PRK02463, OxaA-like protein precursor; Provisional.
          Length = 307

 Score = 25.8 bits (57), Expect = 6.4
 Identities = 9/33 (27%), Positives = 16/33 (48%), Gaps = 6/33 (18%)

Query: 54  KKKYEQKKKDY------DIEQRMRELKEEEEKL 80
           K  Y+ +K  Y       I +R++    +EEK+
Sbjct: 85  KATYQSEKMAYLKPVFEPINERLKNATTQEEKM 117


>gnl|CDD|222689 pfam14335, DUF4391, Domain of unknown function (DUF4391).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria and archaea.
           Proteins in this family are typically between 220 and
           257 amino acids in length.
          Length = 221

 Score = 25.7 bits (57), Expect = 6.6
 Identities = 10/39 (25%), Positives = 25/39 (64%), Gaps = 1/39 (2%)

Query: 53  NKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEK 91
           N   YE++     +E R+ +++E E+++ + +++ +KEK
Sbjct: 165 NTGVYEKEDLKERVE-RLEQIEELEKEIAKLKKKLKKEK 202


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 26.0 bits (58), Expect = 6.6
 Identities = 11/56 (19%), Positives = 25/56 (44%), Gaps = 6/56 (10%)

Query: 54  KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDL 109
           +KK  +K++    E++     E+EE  K   +++ +   +K    + E +    D 
Sbjct: 411 RKKLRKKQRK--AEKK----AEKEEAEKAAAKKKAEAAAKKAKGPDGETKKVDPDP 460


>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein.  Function of MutS2 is
           unknown. It should not be considered a DNA mismatch
           repair protein. It is likely a DNA mismatch binding
           protein of unknown cellular function [DNA metabolism,
           Other].
          Length = 771

 Score = 25.9 bits (57), Expect = 6.9
 Identities = 16/52 (30%), Positives = 25/52 (48%)

Query: 44  DQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKL 95
           +  +K   + K   EQ+K   ++EQ M ELKE E   K    +  +E  + L
Sbjct: 526 ELEQKNEHLEKLLKEQEKLKKELEQEMEELKERERNKKLELEKEAQEALKAL 577



 Score = 25.9 bits (57), Expect = 8.2
 Identities = 14/55 (25%), Positives = 29/55 (52%), Gaps = 5/55 (9%)

Query: 54  KKKYEQKKKDYD-----IEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           K  Y + K++ +     +    +EL+++ E L++  +E+ K KK    + EE +E
Sbjct: 503 KTFYGEFKEEINVLIEKLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKE 557


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 25.8 bits (57), Expect = 7.0
 Identities = 18/75 (24%), Positives = 42/75 (56%), Gaps = 1/75 (1%)

Query: 40  RSSLDQVKKRFDMNKKKYEQKKKDYD-IEQRMRELKEEEEKLKEYRRERRKEKKRKLDDG 98
            + L+++++R +  K+K E  K++ +  E  + EL++   +L+E + E  ++    L++ 
Sbjct: 315 ENELEELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLSALLEEL 374

Query: 99  EEEEEGDQSDLAAIM 113
           EE  E  + +LA + 
Sbjct: 375 EELFEALREELAELE 389


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 25.4 bits (56), Expect = 7.1
 Identities = 13/37 (35%), Positives = 23/37 (62%)

Query: 67  EQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           E++  E K E E+ +E   E  +EKK++ ++ +E EE
Sbjct: 30  ERKKLEEKREGERKEEEELEEEREKKKEEEERKEREE 66


>gnl|CDD|132364 TIGR03321, alt_F1F0_F0_B, alternate F1F0 ATPase, F0 subunit B.  A
           small number of taxonomically diverse prokaryotic
           species, including Methanosarcina barkeri, have what
           appears to be a second ATP synthase, in addition to the
           normal F1F0 ATPase in bacteria and A1A0 ATPase in
           archaea. These enzymes use ion gradients to synthesize
           ATP, and in principle may run in either direction.
           This model represents the F0 subunit B of this apparent
           second ATP synthase.
          Length = 246

 Score = 25.8 bits (57), Expect = 7.2
 Identities = 16/52 (30%), Positives = 31/52 (59%), Gaps = 1/52 (1%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKEEEEK-LKEYRRERRKEKKRKLDDGEEE 101
           D + KK E +++  + E++  EL ++ E  L + + E + E++R LD+  EE
Sbjct: 47  DADTKKREAEQERREYEEKNEELDQQREVLLTKAKEEAQAERQRLLDEAREE 98


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 25.7 bits (56), Expect = 7.3
 Identities = 15/72 (20%), Positives = 32/72 (44%), Gaps = 2/72 (2%)

Query: 37  RVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEK--LKEYRRERRKEKKRK 94
           +  +  LD+ +   D  ++K +  + + D ++     K++E K   K       KE K+ 
Sbjct: 216 QQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDKQV 275

Query: 95  LDDGEEEEEGDQ 106
            ++ + E E  Q
Sbjct: 276 AENQKREIEKAQ 287


>gnl|CDD|217701 pfam03732, Retrotrans_gag, Retrotransposon gag protein.  Gag or
           Capsid-like proteins from LTR retrotransposons. There is
           a central motif QGXXEXXXXXFXXLXXH that is common to
           Retroviridae gag-proteins, but is poorly conserved.
          Length = 97

 Score = 25.0 bits (55), Expect = 7.3
 Identities = 17/60 (28%), Positives = 30/60 (50%), Gaps = 4/60 (6%)

Query: 44  DQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEE 103
           D++K  F    K++    +   +E  +R L++  E ++EY  ER K   R+L     +EE
Sbjct: 30  DELKDAF---LKRFFPSIRKDLLENELRSLRQGTESVREY-VERFKRLARQLPHHGFDEE 85


>gnl|CDD|233224 TIGR00993, 3a0901s04IAP86, chloroplast protein import component
           Toc86/159, G and M domains.  The long precursor of the
           86K protein originally described is proposed to have
           three domains. The N-terminal A-domain is acidic,
           repetitive, weakly conserved, readily removed by
           proteolysis during chloroplast isolation, and not
           required for protein translocation. The other domains
           are designated G (GTPase) and M (membrane anchor); this
           family includes most of the G domain and all of M
           [Transport and binding proteins, Amino acids, peptides
           and amines].
          Length = 763

 Score = 26.1 bits (57), Expect = 7.4
 Identities = 15/65 (23%), Positives = 29/65 (44%), Gaps = 6/65 (9%)

Query: 48  KRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKR-----KLDDGEEEE 102
            +  M K   EQ+K   + E   R    ++++ +E  +  +  KK      +L DG  EE
Sbjct: 417 TKAQMAKLSKEQRKAYLE-EYDYRVKLLQKKQWREELKRMKMMKKFGKEIGELPDGYSEE 475

Query: 103 EGDQS 107
             +++
Sbjct: 476 VDEEN 480


>gnl|CDD|119241 pfam10721, DUF2514, Protein of unknown function (DUF2514).  This
           family is conserved in bacteria and some viruses. The
           function is not known.
          Length = 162

 Score = 25.5 bits (56), Expect = 7.6
 Identities = 13/56 (23%), Positives = 24/56 (42%), Gaps = 8/56 (14%)

Query: 57  YEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAAI 112
           ++QK  D D    + E+  E          R +E++R+    E  ++  Q + AA 
Sbjct: 30  WQQKWADRDAADALAEVIAE-------TAARAEEQRRQAAQNEAAKDA-QEEAAAA 77


>gnl|CDD|224415 COG1498, SIK1, Protein implicated in ribosomal biogenesis, Nop56p
           homolog [Translation, ribosomal structure and
           biogenesis].
          Length = 395

 Score = 25.8 bits (57), Expect = 7.7
 Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 8/51 (15%)

Query: 65  DIEQRMRELKEEEEKL--------KEYRRERRKEKKRKLDDGEEEEEGDQS 107
           ++E+R+ +LKE+  K          +  R  R  +K+K    + E  G Q+
Sbjct: 345 ELEKRIEKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRGLQN 395


>gnl|CDD|219788 pfam08314, Sec39, Secretory pathway protein Sec39.  Mnaimneh et al
           identified Sec39p as a protein involved in ER-Golgi
           transport in a large scale promoter shut down analysis
           of essential yeast genes. Kraynack et al. (2005) showed
           that Sec39p (Dsl3p) is required for Golgi-ER retrograde
           transport and is part of a very stable protein complex
           that also includes Dsl1p (in mammals ZW10), Tip20p
           (Rint-1) and the ER localized Q-SNARE proteins Ufe1p
           (syntaxin-18), Sec20p and Use1p. This was confirmed in a
           genome-wide analysis of protein complexes by Gavin et al
           (2006).
          Length = 675

 Score = 25.9 bits (57), Expect = 8.1
 Identities = 17/68 (25%), Positives = 26/68 (38%), Gaps = 4/68 (5%)

Query: 22  HINGKKHQRNLGMSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMREL----KEEE 77
            +  K  +    + +RV    L  + K  + N K Y    K  DI + + E      EEE
Sbjct: 499 SLVLKPGEPFKPVQIRVHDDPLSLISKVLEQNPKAYTDLDKLLDILKNLVEAGQPDSEEE 558

Query: 78  EKLKEYRR 85
           +     RR
Sbjct: 559 QVETAERR 566


>gnl|CDD|238038 cd00085, HNHc, HNH nucleases; HNH endonuclease signature which is
          found in viral, prokaryotic, and eukaryotic proteins.
          The alignment includes members of the large group of
          homing endonucleases, yeast intron 1 protein, MutS, as
          well as bacterial colicins, pyocins, and anaredoxins.
          Length = 57

 Score = 24.4 bits (53), Expect = 8.1
 Identities = 5/26 (19%), Positives = 6/26 (23%)

Query: 3  GYYCNVCDCVVKDSINFLDHINGKKH 28
             C  C          +DHI     
Sbjct: 11 DGLCPYCGKPGGTEGLEVDHIIPLSD 36


>gnl|CDD|215349 PLN02647, PLN02647, acyl-CoA thioesterase.
          Length = 437

 Score = 25.5 bits (56), Expect = 8.1
 Identities = 14/40 (35%), Positives = 22/40 (55%), Gaps = 1/40 (2%)

Query: 75  EEEEKLKEYRRERRKEKKRKLDDGEEE-EEGDQSDLAAIM 113
           EEE+ L E    R K +K+K  + + E E G+   L A++
Sbjct: 227 EEEKLLFEEAEARNKLRKKKRGEQKREFENGEAERLEALL 266


>gnl|CDD|235185 PRK03980, PRK03980, flap endonuclease-1; Provisional.
          Length = 292

 Score = 25.6 bits (57), Expect = 8.2
 Identities = 16/35 (45%), Positives = 23/35 (65%), Gaps = 2/35 (5%)

Query: 62 KDYDIEQRMRELKEE-EEKLKEYRRERRKEKKRKL 95
          K  +IE+R RE++EE EEK +E + E   E+ RK 
Sbjct: 40 KAEEIEER-REVREEAEEKYEEAKEEGDLEEARKY 73


>gnl|CDD|217337 pfam03050, DDE_Tnp_IS66, Transposase IS66 family.  Transposase
           proteins are necessary for efficient DNA transposition.
           This family includes IS66 from Agrobacterium
           tumefaciens.
          Length = 277

 Score = 25.6 bits (57), Expect = 8.2
 Identities = 12/38 (31%), Positives = 17/38 (44%), Gaps = 4/38 (10%)

Query: 68  QRMRELKEEEEKLK----EYRRERRKEKKRKLDDGEEE 101
           +R+ EL   E + +    E R   R+E  R L D  E 
Sbjct: 170 RRIGELYAIEREARGLPPEERLALRQEYSRPLLDALEA 207


>gnl|CDD|163506 TIGR03794, NHLM_micro_HlyD, NHLM bacteriocin system secretion
           protein.  Members of this protein family are homologs of
           the HlyD membrane fusion protein of type I secretion
           systems. Their occurrence in prokaryotic genomes is
           associated with the occurrence of a novel class of
           microcin (small bacteriocins) with a leader peptide
           region related to nitrile hydratase. We designate the
           class of bacteriocin as Nitrile Hydratase Leader
           Microcin, or NHLM. This family, therefore, is designated
           as NHLM bacteriocin system secretion protein. Some but
           not all NHLM-class putative microcins belong to the TOMM
           (thiazole/oxazole modified microcin) class as assessed
           by the presence of the scaffolding protein and/or
           cyclodehydratase in the same gene clusters [Transport
           and binding proteins, Amino acids, peptides and amines,
           Cellular processes, Biosynthesis of natural products].
          Length = 421

 Score = 25.6 bits (56), Expect = 8.3
 Identities = 10/50 (20%), Positives = 20/50 (40%), Gaps = 3/50 (6%)

Query: 49  RFDMNKKKYEQK---KKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKL 95
            +    K+  ++   K    +E+ +  L+EE   L     ++R    R L
Sbjct: 117 NYTGRLKEGRERHFQKSKEALEETIGRLREELAALSREVGKQRGLLSRGL 166


>gnl|CDD|237791 PRK14701, PRK14701, reverse gyrase; Provisional.
          Length = 1638

 Score = 25.6 bits (56), Expect = 8.3
 Identities = 11/64 (17%), Positives = 24/64 (37%)

Query: 34  MSMRVERSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKR 93
             + +E  ++ ++        K  E+ K+   IE  +    E+ E L+   ++    KK 
Sbjct: 415 FRVDLEDPTIYRILGLLSEILKIEEELKEGIPIEGVLDVFPEDVEFLRSILKDEEVIKKV 474

Query: 94  KLDD 97
               
Sbjct: 475 AERP 478


>gnl|CDD|193205 pfam12729, 4HB_MCP_1, Four helix bundle sensory module for signal
           transduction.  This family is a four helix bundle that
           operates as a ubiquitous sensory module in prokaryotic
           signal-transduction. The 4HB_MCP is always found between
           two predicted transmembrane helices indicating that it
           detects only extracellular signals. In many cases the
           domain is associated with a cytoplasmic HAMP domain
           suggesting that most proteins carrying the bundle might
           share the mechanism of transmembrane signalling which is
           well-characterized in E. coli chemoreceptors.
          Length = 181

 Score = 25.3 bits (56), Expect = 8.4
 Identities = 14/49 (28%), Positives = 25/49 (51%), Gaps = 3/49 (6%)

Query: 41  SSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK 89
             +++++   D   KKYE+     + ++   E K   E+LK YR+ R K
Sbjct: 82  KDIEELRAEIDKLLKKYEKTILTEEEKKLFNEFK---EQLKAYRKVRNK 127


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3i.
          Length = 431

 Score = 25.5 bits (56), Expect = 8.6
 Identities = 16/70 (22%), Positives = 37/70 (52%), Gaps = 1/70 (1%)

Query: 40  RSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKE-KKRKLDDG 98
           RS ++  ++R +   +   ++  +  +  ++R    +E K+++ RR R       ++D+ 
Sbjct: 314 RSRVELRRRRVNDVIRPLVREHNNDQLNVKLRNPSTKESKMRDKRRARLDPIDFEEVDED 373

Query: 99  EEEEEGDQSD 108
           E+EEE  +SD
Sbjct: 374 EDEEEEQRSD 383


>gnl|CDD|129242 TIGR00136, gidA, glucose-inhibited division protein A.  GidA, the
           longer of two forms of GidA-related proteins, appears to
           be present in all complete eubacterial genomes so far,
           as well as Saccharomyces cerevisiae. A subset of these
           organisms have a closely related protein. GidA is absent
           in the Archaea. It appears to act with MnmE, in an
           alpha2/beta2 heterotetramer, in the
           5-carboxymethylaminomethyl modification of uridine 34 in
           certain tRNAs. The shorter, related protein, previously
           called gid or gidA(S), is now called TrmFO (see model
           TIGR00137) [Protein synthesis, tRNA and rRNA base
           modification].
          Length = 617

 Score = 25.8 bits (57), Expect = 8.7
 Identities = 10/67 (14%), Positives = 26/67 (38%), Gaps = 6/67 (8%)

Query: 52  MNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGEEEEEGDQSDLAA 111
           ++ ++Y +  K          ++EE ++LK       KE K +L +  +     ++    
Sbjct: 453 IDDERYARFLKK------KENIEEEIQRLKSTWLTPSKEVKEELKNHLQSPLKREASGED 506

Query: 112 IMGFSGF 118
           ++     
Sbjct: 507 LLRRPEM 513


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 25.8 bits (57), Expect = 8.7
 Identities = 19/73 (26%), Positives = 38/73 (52%), Gaps = 10/73 (13%)

Query: 40  RSSLDQVKKRFDMNKKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGE 99
            S LD++ +     ++K E+ K+         EL+  E +L+E   E  +E + +L++ E
Sbjct: 329 ESKLDELAEELAELEEKLEELKE---------ELESLEAELEELEAE-LEELESRLEELE 378

Query: 100 EEEEGDQSDLAAI 112
           E+ E  +S +A +
Sbjct: 379 EQLETLRSKVAQL 391


>gnl|CDD|233229 TIGR00999, 8a0102, Membrane Fusion Protein cluster 2 (function
          with RND porters).  [Transport and binding proteins,
          Other].
          Length = 265

 Score = 25.5 bits (56), Expect = 8.8
 Identities = 11/50 (22%), Positives = 24/50 (48%), Gaps = 8/50 (16%)

Query: 40 RSSLDQVKKRFDMNKKKYEQKKK--------DYDIEQRMRELKEEEEKLK 81
           + L   +KR ++ +K YE++KK          + E     L+E + +++
Sbjct: 22 AAELKVAQKRVELARKTYEREKKLFEQGVIPRQEFESAEYALEEAQAEVQ 71


>gnl|CDD|219547 pfam07741, BRF1, Brf1-like TBP-binding domain.  This region
          covers both the Brf homology II and III regions. This
          region is involved in binding TATA binding protein.
          Length = 95

 Score = 24.9 bits (55), Expect = 8.8
 Identities = 11/33 (33%), Positives = 21/33 (63%)

Query: 62 KDYDIEQRMRELKEEEEKLKEYRRERRKEKKRK 94
          KDY  EQ  +ELK++ ++     ++++K K +K
Sbjct: 26 KDYLEEQEEKELKQKADEGNNSGKKKKKRKAKK 58


>gnl|CDD|227684 COG5397, COG5397, Uncharacterized conserved protein [Function
          unknown].
          Length = 349

 Score = 25.6 bits (56), Expect = 9.3
 Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 4/36 (11%)

Query: 54 KKKYEQKKKDYDIEQRMRELKEEEEKLKEYRRERRK 89
          K++Y     D +I QR+   K     +K+  R RR+
Sbjct: 58 KRRYVGPADDPEIAQRVERHK----AVKDDLRARRR 89


>gnl|CDD|233509 TIGR01652, ATPase-Plipid, phospholipid-translocating P-type ATPase,
           flippase.  This model describes the P-type ATPase
           responsible for transporting phospholipids from one
           leaflet of bilayer membranes to the other. These ATPases
           are found only in eukaryotes.
          Length = 1057

 Score = 25.4 bits (56), Expect = 9.3
 Identities = 8/32 (25%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 51  DMNKKKYEQKKKDYDIEQRMRELKEEEEKLKE 82
           ++++++YE+  ++Y+       L + EEKL  
Sbjct: 582 ELSEEEYEEWNEEYNEASTA--LTDREEKLDV 611


>gnl|CDD|131739 TIGR02692, tRNA_CCA_actino, tRNA adenylyltransferase.  The enzyme
           tRNA adenylyltransferase, also called
           tRNA-nucleotidyltransferase and CCA-adding enzyme, can
           add or repair the required CCA triplet at the 3'-end of
           tRNA molecules. Genes encoding tRNA include the CCA tail
           in some but not all bacteria, and this enzyme may be
           required for viability. Members of this family represent
           a distinct clade within the larger family pfam01743
           (tRNA nucleotidyltransferase/poly(A) polymerase family
           protein). The example from Streptomyces coelicolor was
           shown to act as a CCA-adding enzyme and not as a poly(A)
           polymerase [Protein synthesis, tRNA and rRNA base
           modification].
          Length = 466

 Score = 25.5 bits (56), Expect = 9.4
 Identities = 10/27 (37%), Positives = 18/27 (66%), Gaps = 1/27 (3%)

Query: 53  NKKKYEQKKKDYD-IEQRMRELKEEEE 78
           NK+K  + +  YD +E+R+ EL  +E+
Sbjct: 384 NKRKAARLQAAYDDLEERIAELAAQED 410


>gnl|CDD|223046 PHA03328, PHA03328, nuclear egress lamina protein UL31;
           Provisional.
          Length = 316

 Score = 25.4 bits (56), Expect = 9.7
 Identities = 9/21 (42%), Positives = 13/21 (61%)

Query: 65  DIEQRMRELKEEEEKLKEYRR 85
           DI  +M +L  + E L EY+R
Sbjct: 284 DIYCKMCDLNFDGELLLEYKR 304


>gnl|CDD|218368 pfam04992, RNA_pol_Rpb1_6, RNA polymerase Rpb1, domain 6.  RNA
          polymerases catalyze the DNA dependent polymerisation
          of RNA. Prokaryotes contain a single RNA polymerase
          compared to three in eukaryotes (not including
          mitochondrial and chloroplast polymerases). This
          domain, domain 6, represents a mobile module of the RNA
          polymerase. Domain 6 forms part of the shelf module.
          This family appears to be specific to the largest
          subunit of RNA polymerase II.
          Length = 187

 Score = 25.1 bits (56), Expect = 9.7
 Identities = 15/58 (25%), Positives = 21/58 (36%), Gaps = 9/58 (15%)

Query: 49 RFDMNKKKYEQKKK--DYDIEQRMR-------ELKEEEEKLKEYRRERRKEKKRKLDD 97
          R D+            D D+ + +         L EE E+L E RR  R+E     D 
Sbjct: 26 RVDLMDPDEGFLPGVLDEDVVKELLGDAEVQKLLDEEYEQLLEDRRLLREEIFPDGDS 83


>gnl|CDD|131941 TIGR02895, spore_sigI, RNA polymerase sigma-I factor.  Members of
           this sigma factor protein family are strictly limited to
           endospore-forming species in the Firmicutes lineage of
           bacteria, but are not universally present among such
           species. Sigma-I was shown to be induced by heat shock
           (PMID:11157964) in Bacillus subtilis and is suggested by
           its phylogenetic profile to be connected to the program
           of sporulation (PMID:16311624) [Transcription,
           Transcription factors, Cellular processes, Sporulation
           and germination].
          Length = 218

 Score = 25.1 bits (55), Expect = 9.8
 Identities = 8/32 (25%), Positives = 17/32 (53%)

Query: 52  MNKKKYEQKKKDYDIEQRMRELKEEEEKLKEY 83
              K  E+ + + + E R  E+ E ++ LK++
Sbjct: 102 EFNKSMEEYRNEIENENRRLEILEYKKLLKQF 133


>gnl|CDD|219589 pfam07808, RED_N, RED-like protein N-terminal region.  This
          family contains sequences that are similar to the
          N-terminal region of Red protein. This and related
          proteins contain a RED repeat which consists of a
          number of RE and RD sequence elements. The region in
          question has several conserved NLS sequences and a
          putative trimeric coiled-coil region, suggesting that
          these proteins are expressed in the nucleus. The
          function of Red protein is unknown, but efficient
          sequestration to nuclear bodies suggests that its
          expression may be tightly regulated or that the protein
          self-aggregates extremely efficiently.
          Length = 238

 Score = 25.2 bits (55), Expect = 9.9
 Identities = 13/41 (31%), Positives = 18/41 (43%)

Query: 59 QKKKDYDIEQRMRELKEEEEKLKEYRRERRKEKKRKLDDGE 99
          +KK  Y  +Q     KE   K ++  RERRK   +  D   
Sbjct: 3  KKKYAYLRKQEENAEKEINPKYRDRARERRKGINKDYDPSS 43


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.134    0.375 

Gapped
Lambda     K      H
   0.267   0.0812    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 6,647,961
Number of extensions: 632649
Number of successful extensions: 4359
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3345
Number of HSP's successfully gapped: 1073
Length of query: 124
Length of database: 10,937,602
Length adjustment: 85
Effective length of query: 39
Effective length of database: 7,167,512
Effective search space: 279532968
Effective search space used: 279532968
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 53 (24.0 bits)