RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy15165
         (606 letters)



>gnl|CDD|222150 pfam13465, zf-H2C2_2, Zinc-finger double domain. 
          Length = 26

 Score = 40.5 bits (95), Expect = 3e-05
 Identities = 17/26 (65%), Positives = 19/26 (73%)

Query: 384 YLRRHMRVHTNEKPYKCKDCGAAFNH 409
            LRRHMR HT EKPYKC  CG +F+ 
Sbjct: 1   NLRRHMRTHTGEKPYKCPVCGKSFSS 26



 Score = 34.3 bits (79), Expect = 0.005
 Identities = 18/25 (72%), Positives = 20/25 (80%), Gaps = 1/25 (4%)

Query: 547 YLKRHMRTHTNEKPYKC-VCGLGFN 570
            L+RHMRTHT EKPYKC VCG  F+
Sbjct: 1   NLRRHMRTHTGEKPYKCPVCGKSFS 25


>gnl|CDD|197676 smart00355, ZnF_C2H2, zinc finger. 
          Length = 23

 Score = 30.9 bits (70), Expect = 0.091
 Identities = 11/23 (47%), Positives = 12/23 (52%)

Query: 370 YICEYCHKEFTFYNYLRRHMRVH 392
           Y C  C K F   + LR HMR H
Sbjct: 1   YRCPECGKVFKSKSALREHMRTH 23



 Score = 30.5 bits (69), Expect = 0.14
 Identities = 10/23 (43%), Positives = 12/23 (52%)

Query: 533 FVCEYCNKEFTFLQYLKRHMRTH 555
           + C  C K F     L+ HMRTH
Sbjct: 1   YRCPECGKVFKSKSALREHMRTH 23



 Score = 28.2 bits (63), Expect = 0.72
 Identities = 9/21 (42%), Positives = 14/21 (66%)

Query: 338 HVCPHCGKKFTRKAELQLHIK 358
           + CP CGK F  K+ L+ H++
Sbjct: 1   YRCPECGKVFKSKSALREHMR 21



 Score = 25.9 bits (57), Expect = 4.7
 Identities = 8/21 (38%), Positives = 12/21 (57%)

Query: 501 LQCPHCPKTFPRKTELSNHIK 521
            +CP C K F  K+ L  H++
Sbjct: 1   YRCPECGKVFKSKSALREHMR 21


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 35.2 bits (81), Expect = 0.093
 Identities = 34/193 (17%), Positives = 71/193 (36%), Gaps = 13/193 (6%)

Query: 16  PSRRRSLGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQ 75
                 +   +S             S   KS++ PE+K+K  +    TS ET  +    +
Sbjct: 140 GVGLPPVAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTEKTEGK 199

Query: 76  TSFKTIISKYVG-------YNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDA 128
           TS K    K           +     T +K+ K+ A  ++   ++ ++   R      ++
Sbjct: 200 TSVKAASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVILEDES 259

Query: 129 PVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFAS 188
                 +        E+      ++  + E+  +K  ++E+R+++    +  D D     
Sbjct: 260 AEPTGLDE----DEDEDEPKPSGERSDSEEETEEK--EKEKRKRLKKMMEDEDEDEEMEI 313

Query: 189 LDEMSSEEEEEED 201
           + E   EEEE E+
Sbjct: 314 VPESPVEEEESEE 326


>gnl|CDD|200998 pfam00096, zf-C2H2, Zinc finger, C2H2 type.  The C2H2 zinc finger
           is the classical zinc finger domain. The two conserved
           cysteines and histidines co-ordinate a zinc ion. The
           following pattern describes the zinc finger.
           #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can
           be any amino acid, and numbers in brackets indicate the
           number of residues. The positions marked # are those
           that are important for the stable fold of the zinc
           finger. The final position can be either his or cys. The
           C2H2 zinc finger is composed of two short beta strands
           followed by an alpha helix. The amino terminal part of
           the helix binds the major groove in DNA binding zinc
           fingers. The accepted consensus binding sequence for Sp1
           is usually defined by the asymmetric hexanucleotide core
           GGGCGG but this sequence does not include, among others,
           the GAG (=CTC) repeat that constitutes a high-affinity
           site for Sp1 binding to the wt1 promoter.
          Length = 22

 Score = 30.8 bits (70), Expect = 0.11
 Identities = 11/22 (50%), Positives = 13/22 (59%)

Query: 534 VCEYCNKEFTFLQYLKRHMRTH 555
            C  C K F+    LKRH+RTH
Sbjct: 1   KCPDCGKSFSRKSNLKRHLRTH 22



 Score = 29.2 bits (66), Expect = 0.37
 Identities = 9/22 (40%), Positives = 13/22 (59%)

Query: 371 ICEYCHKEFTFYNYLRRHMRVH 392
            C  C K F+  + L+RH+R H
Sbjct: 1   KCPDCGKSFSRKSNLKRHLRTH 22



 Score = 28.9 bits (65), Expect = 0.48
 Identities = 10/20 (50%), Positives = 15/20 (75%)

Query: 339 VCPHCGKKFTRKAELQLHIK 358
            CP CGK F+RK+ L+ H++
Sbjct: 1   KCPDCGKSFSRKSNLKRHLR 20



 Score = 27.7 bits (62), Expect = 1.1
 Identities = 9/20 (45%), Positives = 14/20 (70%)

Query: 502 QCPHCPKTFPRKTELSNHIK 521
           +CP C K+F RK+ L  H++
Sbjct: 1   KCPDCGKSFSRKSNLKRHLR 20


>gnl|CDD|227561 COG5236, COG5236, Uncharacterized conserved protein, contains RING
           Zn-finger [General function prediction only].
          Length = 493

 Score = 34.6 bits (79), Expect = 0.14
 Identities = 22/97 (22%), Positives = 35/97 (36%), Gaps = 8/97 (8%)

Query: 474 QKQICEICCAEVYHINGHIKDKHSGFF-LQCPHCPKTFP------RKTELSNHIKGIHMK 526
            K  C   C  +  +  H K +H      +C    K F       R + L +H  G   +
Sbjct: 155 PKSKCHRRCGSLKELKKHYKAQHGFVLCSECIGNKKDFWNEIRLFRSSTLRDHKNGGLEE 214

Query: 527 HELRQTFVCEYCNKEFTFLQYLKRHMRTHTNEKPYKC 563
              +   +C +C   F     L+RH R   +E  + C
Sbjct: 215 EGFKGHPLCIFCKIYFYDDDELRRHCRLR-HEACHIC 250


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 34.6 bits (80), Expect = 0.19
 Identities = 29/208 (13%), Positives = 67/208 (32%), Gaps = 10/208 (4%)

Query: 4    DDWNNLDDHVGKPSRRRSLGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATT 63
                +  D   K S   +     S  K K+  + D+K        + + + +  + + ++
Sbjct: 1180 KKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSS 1239

Query: 64   SDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASS-----GSDTPKKRR 118
                +   N    S +            +    +   + +A   S        D      
Sbjct: 1240 VKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGG 1299

Query: 119  GRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEE-----RRRKI 173
             +  S  +     R     A    +++++ K  +++K+  +  +  A +      R RK 
Sbjct: 1300 SKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKK 1359

Query: 174  DLDKKPSDMDNVFASLDEMSSEEEEEED 201
              D    D D+      E   +E++E+D
Sbjct: 1360 KSDSSSEDDDDSEVDDSEDEDDEDDEDD 1387


>gnl|CDD|220759 pfam10446, DUF2457, Protein of unknown function (DUF2457).  This is
           a family of uncharacterized proteins.
          Length = 449

 Score = 33.8 bits (77), Expect = 0.26
 Identities = 31/150 (20%), Positives = 60/150 (40%), Gaps = 6/150 (4%)

Query: 60  QATTSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRG 119
           +A  +D  +E D+  +  F    +     +   L T+ ++++R    +S+ S   +K+R 
Sbjct: 104 EAGFADSDDESDDGSEYVFWAPGTTTAATSPRKLETMRRKSRRRTSDSSADSLNERKQRR 163

Query: 120 RSRSKGRDAPVSRKRNRRAKTPS-QEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKK 178
           + +   R     +    R  TP   + TD       ++   +    +  E RR       
Sbjct: 164 KWKRPRRSPI--KPPKIRPGTPELPDSTDFVCGTLDEDRPLEAAYKSCMEARRLSKQVVI 221

Query: 179 PSDMDNVFASLDEMSSEEEEEEDWDEIHLG 208
           P D+D  F + D    E+EE+E  D   + 
Sbjct: 222 PQDIDPSFPTSD---PEDEEDELDDVEEVI 248



 Score = 29.6 bits (66), Expect = 6.0
 Identities = 15/52 (28%), Positives = 29/52 (55%)

Query: 155 KNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIH 206
           +N  +K+ K A+EE   + D D++  D D+     D+   ++E++ED D+  
Sbjct: 35  ENAIRKLGKEAEEEAMEEEDDDEEDDDDDDDEDEDDDDDDDDEDDEDEDDDD 86


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 34.0 bits (79), Expect = 0.26
 Identities = 17/69 (24%), Positives = 41/69 (59%), Gaps = 6/69 (8%)

Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERR---RKIDLDKKPSDMDNVFASLD 190
           R RR +    E+   ++LQ+++NL++K++ + K E     ++ +L++K  +++     L+
Sbjct: 78  RERRNELQKLEK---RLLQKEENLDRKLELLEKREEELEKKEKELEQKQQELEKKEEELE 134

Query: 191 EMSSEEEEE 199
           E+  E+ +E
Sbjct: 135 ELIEEQLQE 143


>gnl|CDD|227674 COG5384, Mpp10, U3 small nucleolar ribonucleoprotein component
           [Translation, ribosomal structure and biogenesis].
          Length = 569

 Score = 33.5 bits (76), Expect = 0.32
 Identities = 13/56 (23%), Positives = 21/56 (37%)

Query: 159 KKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIHLGQFSPYE 214
           KK   +   +   ++D ++  S MD V   L     +E   E   E      S +E
Sbjct: 223 KKHSDVKDPKEDEELDEEEHDSAMDKVKLDLFADEEDEPNAEGVGEASDKNLSSFE 278


>gnl|CDD|220661 pfam10263, SprT-like, SprT-like family.  This family represents a
           domain found in eukaryotes and prokaryotes. The domain
           contains a characteristic motif of the zinc
           metallopeptidases. This family includes the bacterial
           SprT protein.
          Length = 153

 Score = 32.4 bits (74), Expect = 0.35
 Identities = 18/98 (18%), Positives = 27/98 (27%), Gaps = 24/98 (24%)

Query: 321 EVVHLAIHKKHSHSGQYHVCPHCGKKFTRKAEL--QLHIKGIHLKHQLEKT--------- 369
           E+ H A+       G  H     G +F                  H+ +           
Sbjct: 65  EMCHAALFLLFGGRGYPH-----GDEFKALMAQVGGAGPLEPTTTHRFDIEVVSGRKRYI 119

Query: 370 YICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGAAF 407
           Y C  C + +     +RRH         Y+C  CG   
Sbjct: 120 YRCGSCGQLYPRKRRIRRHK--------YRCGRCGGKL 149


>gnl|CDD|206065 pfam13894, zf-C2H2_4, C2H2-type zinc finger.  This family contains
           a number of divergent C2H2 type zinc fingers.
          Length = 24

 Score = 28.8 bits (64), Expect = 0.47
 Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 533 FVCEYCNKEFTFLQYLKRHMRTH 555
           F C  C K F+    LKRH+R H
Sbjct: 1   FKCPLCGKSFSSKDALKRHLRKH 23



 Score = 28.4 bits (63), Expect = 0.82
 Identities = 9/23 (39%), Positives = 14/23 (60%)

Query: 370 YICEYCHKEFTFYNYLRRHMRVH 392
           + C  C K F+  + L+RH+R H
Sbjct: 1   FKCPLCGKSFSSKDALKRHLRKH 23



 Score = 28.0 bits (62), Expect = 0.96
 Identities = 9/23 (39%), Positives = 13/23 (56%)

Query: 502 QCPHCPKTFPRKTELSNHIKGIH 524
           +CP C K+F  K  L  H++  H
Sbjct: 2   KCPLCGKSFSSKDALKRHLRKHH 24



 Score = 28.0 bits (62), Expect = 1.1
 Identities = 10/24 (41%), Positives = 14/24 (58%)

Query: 338 HVCPHCGKKFTRKAELQLHIKGIH 361
             CP CGK F+ K  L+ H++  H
Sbjct: 1   FKCPLCGKSFSSKDALKRHLRKHH 24



 Score = 26.1 bits (57), Expect = 4.4
 Identities = 10/24 (41%), Positives = 14/24 (58%)

Query: 249 FKCRVCDWKLNSYDKLLRHIKSDH 272
           FKC +C    +S D L RH++  H
Sbjct: 1   FKCPLCGKSFSSKDALKRHLRKHH 24


>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777).  This is
           a family of eukaryotic proteins of unknown function.
           Some of the proteins in this family are putative nucleic
           acid binding proteins.
          Length = 158

 Score = 31.8 bits (72), Expect = 0.48
 Identities = 19/92 (20%), Positives = 47/92 (51%), Gaps = 7/92 (7%)

Query: 99  RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK-------RNRRAKTPSQEETDAKIL 151
           R+ R +R         ++ R R RS+ R+    R+       R+RR+++P +  + ++  
Sbjct: 7   RSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP 66

Query: 152 QQQKNLEKKIKKMAKEERRRKIDLDKKPSDMD 183
            ++++ +++  K A+E ++R+     K  D++
Sbjct: 67  SRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98



 Score = 28.3 bits (63), Expect = 6.6
 Identities = 19/72 (26%), Positives = 37/72 (51%), Gaps = 4/72 (5%)

Query: 97  DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPS----QEETDAKILQ 152
            +R++  +   S  S +P++ R RSRS  R     R+R++ A+ P     Q+    + L+
Sbjct: 39  RRRSRSRSPHRSRRSRSPRRHRSRSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98

Query: 153 QQKNLEKKIKKM 164
            + + E ++ KM
Sbjct: 99  GKSDEEVEMMKM 110


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 33.1 bits (76), Expect = 0.48
 Identities = 23/90 (25%), Positives = 47/90 (52%), Gaps = 7/90 (7%)

Query: 117 RRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE-----ERRR 171
           R  R+R +GR+     +RNRR     Q+  + +  QQ +  EK   +  ++     ER+R
Sbjct: 624 RDNRTRREGRENREENRRNRRQAQ--QQTAETRESQQAEVTEKARTQDEQQQAPRRERQR 681

Query: 172 KIDLDKKPSDMDNVFASLDEMSSEEEEEED 201
           + + +K+ +  +    +++E S +E E+E+
Sbjct: 682 RRNDEKRQAQQEAKALNVEEQSVQETEQEE 711


>gnl|CDD|213729 TIGR02605, CxxC_CxxC_SSSS, putative regulatory protein, FmdB
           family.  This model represents a region of about 50
           amino acids found in a number of small proteins in a
           wide range of bacteria. The region begins usually with
           the initiator Met and contains two CxxC motifs separated
           by 17 amino acids. One member of this family has been
           noted as a putative regulatory protein, designated FmdB
           (SP:Q50229, PMID:8841393 ). Most members of this family
           have a C-terminal region containing highly degenerate
           sequence, such as
           SSTSESTKSSGSSGSSGSSESKASGSTEKSTSSTTAAAAV in
           Mycobacterium tuberculosis and
           VAVGGSAPAPSPAPRAGGGGGGCCGGGCCG in Streptomyces
           avermitilis. These low complexity regions, which are not
           included in the model, resemble low-complexity
           C-terminal regions of some heterocycle-containing
           bacteriocin precursors [Regulatory functions, DNA
           interactions].
          Length = 52

 Score = 29.6 bits (67), Expect = 0.49
 Identities = 7/26 (26%), Positives = 13/26 (50%)

Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
           Y+C  CG  F     + +   ++CP+
Sbjct: 6   YRCTACGHRFEVLQKMSDDPLATCPE 31


>gnl|CDD|112562 pfam03753, HHV6-IE, Human herpesvirus 6 immediate early protein.
           The proteins in this family are poorly characterized,
           but an investigation has indicated that the immediate
           early protein is required for the down-regulation of MHC
           class I expression in dendritic cells. Human herpesvirus
           6 immediate early protein is also referred to as U90.
          Length = 993

 Score = 33.1 bits (75), Expect = 0.57
 Identities = 33/150 (22%), Positives = 54/150 (36%), Gaps = 17/150 (11%)

Query: 37  TDDKSWEDKSLL--EPEIKIKVEQGQATTSDETEEDDNTRQTSFKTII----------SK 84
             D S   +S      EI           ++    ++    T F              + 
Sbjct: 585 ETDHSAPYESESDNNDEIDYIASVDSGNRTNNIHMNNTNENTPFSKSGKSPPEVTPSKTF 644

Query: 85  YVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQE 144
           Y      D++T  K  KRTA+  + G  T K ++ +S S   D  V         + S++
Sbjct: 645 YKRDKKKDISTNRKVKKRTAKRKTVGYKTDKSKKIKSDSLPTDTNVI-----VISSESED 699

Query: 145 ETDAKILQQQKNLEKKIKKMAKEERRRKID 174
           E D   + ++  L+KKIK   K E   + D
Sbjct: 700 EEDGFNIIKKSQLKKKIKSELKSESSSESD 729


>gnl|CDD|217392 pfam03153, TFIIA, Transcription factor IIA, alpha/beta subunit.
           Transcription initiation factor IIA (TFIIA) is a
           heterotrimer, the three subunits being known as alpha,
           beta, and gamma, in order of molecular weight. The N and
           C-terminal domains of the gamma subunit are represented
           in pfam02268 and pfam02751, respectively. This family
           represents the precursor that yields both the alpha and
           beta subunits. The TFIIA heterotrimer is an essential
           general transcription initiation factor for the
           expression of genes transcribed by RNA polymerase II.
           Together with TFIID, TFIIA binds to the promoter region;
           this is the first step in the formation of a
           pre-initiation complex (PIC). Binding of the rest of the
           transcription machinery follows this step. After
           initiation, the PIC does not completely dissociate from
           the promoter. Some components, including TFIIA, remain
           attached and re-initiate a subsequent round of
           transcription.
          Length = 332

 Score = 32.4 bits (74), Expect = 0.60
 Identities = 16/71 (22%), Positives = 25/71 (35%), Gaps = 2/71 (2%)

Query: 136 RRAKTPSQEETDAKILQQQKNLE--KKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMS 193
           R  +     E   K  +    ++  K+ KK AK  +RR I         D    S D+  
Sbjct: 202 RLREADGTLEQRIKGAEGGGAMKVLKQPKKQAKSSKRRTIAQIDGIDSDDEGDGSDDDDD 261

Query: 194 SEEEEEEDWDE 204
            +  E +  D 
Sbjct: 262 EDAIESDLDDS 272


>gnl|CDD|217373 pfam03115, Astro_capsid, Astrovirus capsid protein precursor.  This
           product is encoded by astrovirus ORF2, one of the three
           astrovirus ORFs (1a, 1b, 2). The 87kD precursor protein
           undergoes an intracellular cleavage to form a 79kD
           protein. Subsequently, extracellular trypsin cleavage
           yields the three proteins forming the infectious virion.
          Length = 787

 Score = 32.8 bits (75), Expect = 0.67
 Identities = 14/47 (29%), Positives = 21/47 (44%), Gaps = 9/47 (19%)

Query: 116 KRRGRSRSKGRDAPV---------SRKRNRRAKTPSQEETDAKILQQ 153
           K R RS+S+GR   V          R++N R K  S +     + +Q
Sbjct: 22  KSRARSQSRGRGRSVKITVNSRNKGRRQNGRNKYQSNQRVRNIVNKQ 68


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 31.2 bits (71), Expect = 0.72
 Identities = 21/113 (18%), Positives = 46/113 (40%), Gaps = 14/113 (12%)

Query: 98  KRAKRTARAASSGSDT---------PKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDA 148
              K+  R+ SS   T          +      R +GR    ++++ RR K  +++E   
Sbjct: 28  ASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKK--AKEKLRRDKLKAKKEEAE 85

Query: 149 KILQQQKNLEKKIKKMAKEE---RRRKIDLDKKPSDMDNVFASLDEMSSEEEE 198
           K  ++++   K + +  KE     ++K +      +   +FA    +S E+ +
Sbjct: 86  KEKEKEERFMKALAEAEKERAELEKKKAEAKLMKEEKKIMFADTSSLSPEQRQ 138


>gnl|CDD|227701 COG5414, COG5414, TATA-binding protein-associated factor
           [Transcription].
          Length = 392

 Score = 32.4 bits (73), Expect = 0.75
 Identities = 29/141 (20%), Positives = 53/141 (37%), Gaps = 15/141 (10%)

Query: 72  NTRQTSFKTIISK----YVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSR---SK 124
             R   F+   SK     V   +DDL   D +A+  +       +  ++ R  S     +
Sbjct: 187 YVRARRFRKKSSKIEIEEVEKKVDDLLEKDMKAESVSVVLKDEKELARQERVSSWENFKE 246

Query: 125 GRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDN 184
               P+SR   ++ K  ++EE +  +       E+ +   A E   +++    K    + 
Sbjct: 247 EPGEPLSRPALKKEKQGAEEEGEEGMS------EEDLDVGAAEIENKEVSEGDKEQQQEE 300

Query: 185 VFASLDEMSSEEEEEEDWDEI 205
           V     E   EE + +  DEI
Sbjct: 301 VEN--AEAHKEEVQSDRPDEI 319


>gnl|CDD|227278 COG4942, COG4942, Membrane-bound metallopeptidase [Cell division
           and chromosome partitioning].
          Length = 420

 Score = 32.4 bits (74), Expect = 0.75
 Identities = 16/80 (20%), Positives = 36/80 (45%), Gaps = 3/80 (3%)

Query: 122 RSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSD 181
           + K     ++    +  +   Q++  AK+ +Q K+LE +I  +  +      DL K    
Sbjct: 39  QLKQIQKEIAALEKKIRE---QQDQRAKLEKQLKSLETEIASLEAQLIETADDLKKLRKQ 95

Query: 182 MDNVFASLDEMSSEEEEEED 201
           + ++ A L+ +  +E E+  
Sbjct: 96  IADLNARLNALEVQEREQRR 115


>gnl|CDD|222425 pfam13865, FoP_duplication, C-terminal duplication domain of Friend
           of PRMT1.  Fop, or Friend of Prmt1, proteins are
           conserved from fungi and plants to vertebrates. There is
           little that is actually conserved except for this
           C-terminal LDXXLDAYM region (where X is any amino acid).
           The Fop proteins themselves are nuclear proteins
           localised to regions with low levels of DAPI, with a
           punctate/speckle-like distribution. Fop is a
           chromatin-associated protein and it colocalises with
           facultative heterochromatin. It is critical for
           oestrogen-dependent gene activation.
          Length = 76

 Score = 29.7 bits (67), Expect = 0.80
 Identities = 18/92 (19%), Positives = 34/92 (36%), Gaps = 19/92 (20%)

Query: 103 TARAASSGSDTPKKRRGRSRSKGRDAPVSRKRN--RRAKTPSQEETDAKILQQQKNLEKK 160
             R  S G     + RG  R + R     + +    + K  ++E+ DA+       L++ 
Sbjct: 2   GGRKGSRGGKFRPRGRGARRGRRRGRGGRKGKGGAAKPKPKTREDLDAE-------LDQY 54

Query: 161 IKKMAKEERRRKIDLDKKPSDMDNVFASLDEM 192
           +     +       LD   +D+D   +  DE 
Sbjct: 55  MSTTKSK-------LD---ADLDAYMSKKDEK 76



 Score = 28.6 bits (64), Expect = 2.1
 Identities = 18/86 (20%), Positives = 24/86 (27%), Gaps = 22/86 (25%)

Query: 108 SSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE 167
           S G    +  + R R +G      R R  R                 K    K K   +E
Sbjct: 1   SGGRKGSRGGKFRPRGRGARRGRRRGRGGRKG---------------KGGAAKPKPKTRE 45

Query: 168 ERRRKIDLDKKPSD-MDNVFASLDEM 192
                 DLD +    M    + LD  
Sbjct: 46  ------DLDAELDQYMSTTKSKLDAD 65


>gnl|CDD|221408 pfam12072, DUF3552, Domain of unknown function (DUF3552).  This
           presumed domain is functionally uncharacterized. This
           domain is found in bacteria, archaea and eukaryotes.
           This domain is about 200 amino acids in length. This
           domain is found associated with pfam00013, pfam01966.
           This domain has a single completely conserved residue A
           that may be functionally important.
          Length = 201

 Score = 31.4 bits (72), Expect = 0.88
 Identities = 15/69 (21%), Positives = 39/69 (56%), Gaps = 6/69 (8%)

Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEER---RRKIDLDKKPSDMDNVFASLD 190
           + RR +   QE+   ++LQ+++ L++K + + K+E     ++ +L  +   ++     L+
Sbjct: 74  KERRNELQRQEK---RLLQKEETLDRKDESLEKKEESLEEKEKELAARQQQLEEKEEELE 130

Query: 191 EMSSEEEEE 199
           E+  E+++E
Sbjct: 131 ELIEEQQQE 139


>gnl|CDD|218517 pfam05236, TAF4, Transcription initiation factor TFIID component
           TAF4 family.  This region of similarity is found in
           Transcription initiation factor TFIID component TAF4.
          Length = 255

 Score = 31.6 bits (72), Expect = 0.89
 Identities = 19/72 (26%), Positives = 32/72 (44%), Gaps = 4/72 (5%)

Query: 130 VSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERR--RKIDLDKKPSDMDNVFA 187
           +SR R    K+    E  + + +Q + L +K K+   EERR  R+ +L  +  +   +  
Sbjct: 91  LSRHRRDGIKSDPNYEIRSDVRRQLRFLAQKQKEE--EERRVERRRELGLEDPEQLRLKQ 148

Query: 188 SLDEMSSEEEEE 199
              E    E EE
Sbjct: 149 KAKEEQKAESEE 160


>gnl|CDD|220362 pfam09723, CxxC_CxxC_SSSS, Zinc ribbon domain.  This entry
           represents a region of about 41 amino acids found in a
           number of small proteins in a wide range of bacteria.
           The region usually begins with the initiator Met and
           contains two CxxC motifs separated by 17 amino acids.
           One protein in this entry has been noted as a putative
           regulatory protein, designated FmdB. Most proteins in
           this entry have a C-terminal region containing highly
           degenerate sequence.
          Length = 42

 Score = 28.3 bits (64), Expect = 1.1
 Identities = 8/26 (30%), Positives = 15/26 (57%)

Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
           Y+C+DCG  F     + +   ++CP+
Sbjct: 6   YRCEDCGHTFEVLQKISDDPLATCPE 31


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 32.0 bits (73), Expect = 1.1
 Identities = 17/96 (17%), Positives = 40/96 (41%), Gaps = 8/96 (8%)

Query: 114 PKKRRGRSRSKGRDAPVSRKR--NRRAKTPSQEET----DAKILQQQKNLEKKIKKMA-- 165
           P+++ G    + R+  V  KR           EE       ++ + ++ +EK   ++   
Sbjct: 403 PREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKREIEKLESELERF 462

Query: 166 KEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEED 201
           + E R K+  D++    D     L++   E+++  +
Sbjct: 463 RREVRDKVRKDREIRARDRRIERLEKELEEKKKRVE 498


>gnl|CDD|236394 PRK09169, PRK09169, hypothetical protein; Validated.
          Length = 2316

 Score = 32.0 bits (73), Expect = 1.3
 Identities = 20/116 (17%), Positives = 36/116 (31%), Gaps = 25/116 (21%)

Query: 99  RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRA---------KTPSQEETDAK 149
                       +  P   R R R +  DAP  R     +             +E T  +
Sbjct: 2   GPAHAPHKRRRDAAAPADPRPRRRPRLGDAPAPRTARADSGATPRGRPRAGADREPTSEQ 61

Query: 150 ILQQQKNLEK---------------KIKKMAKEERRRKIDLDKKPS-DMDNVFASL 189
           +   ++ L++               ++  + ++ R RK+D D       DNV A  
Sbjct: 62  LRDYERWLDRAAAGQLDAQREQQCARLWFLVQQARARKVDPDFCLDLARDNVLAQR 117


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 31.4 bits (72), Expect = 1.3
 Identities = 19/68 (27%), Positives = 37/68 (54%), Gaps = 7/68 (10%)

Query: 134 RNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMS 193
           + RR +    E    ++LQ+++ L++K++ + K+E     +L+KK  ++ N   +LDE  
Sbjct: 72  KERRNELQRLER---RLLQREETLDRKMESLDKKEE----NLEKKEKELSNKEKNLDEKE 124

Query: 194 SEEEEEED 201
            E EE   
Sbjct: 125 EELEELIA 132


>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
           This is a family of fungal proteins whose function is
           unknown.
          Length = 130

 Score = 29.9 bits (68), Expect = 1.4
 Identities = 21/72 (29%), Positives = 32/72 (44%), Gaps = 8/72 (11%)

Query: 96  IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
           + K   +  +      +  KKR+ R   K R A     R RR +T  + E + +    +K
Sbjct: 64  LKKHNAKVEKELLREKEKKKKRK-RPGKKRRIA----LRLRRERTKERAEKEKRT---RK 115

Query: 156 NLEKKIKKMAKE 167
           N EKK K+  KE
Sbjct: 116 NREKKFKRRQKE 127


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to be
           determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 31.4 bits (71), Expect = 1.5
 Identities = 21/121 (17%), Positives = 50/121 (41%), Gaps = 5/121 (4%)

Query: 81  IISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKT 140
           +ISKY      D    ++R KR  R     S        ++ S   ++P     ++ ++ 
Sbjct: 387 VISKYA--MKQDDTEEEERRKRQERERQGTSSRSS-DPSKASSTSGESPS--MASQESEE 441

Query: 141 PSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEE 200
               E + +  ++++  E++ ++   E+   + +++      + +  S +     EE EE
Sbjct: 442 EESVEEEEEEEEEEEEEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEE 501

Query: 201 D 201
           D
Sbjct: 502 D 502


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 31.4 bits (71), Expect = 1.6
 Identities = 18/106 (16%), Positives = 44/106 (41%), Gaps = 2/106 (1%)

Query: 96  IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
           ++K   +   A +  +  PK   G+   K ++     K+ ++ K P +E  D K  ++ K
Sbjct: 80  VEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEK-PKEEPKDRKPKEEAK 138

Query: 156 NLEKKIKKMAKEERRRKIDLDK-KPSDMDNVFASLDEMSSEEEEEE 200
                 +K  ++E++ +   D+ +    + V A        +++  
Sbjct: 139 EKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPP 184


>gnl|CDD|227381 COG5048, COG5048, FOG: Zn-finger [General function prediction
           only].
          Length = 467

 Score = 31.2 bits (70), Expect = 1.6
 Identities = 17/50 (34%), Positives = 27/50 (54%), Gaps = 1/50 (2%)

Query: 532 TFVCEYCNKEFTFLQYLKRHMRTHTNEKPYKC-VCGLGFNFNVSLKNHKQ 580
              C  C   F+ L++L RH+R+HT EKP +C   G   +F+  L+  + 
Sbjct: 33  PDSCPNCTDSFSRLEHLTRHIRSHTGEKPSQCSYSGCDKSFSRPLELSRH 82



 Score = 31.2 bits (70), Expect = 1.6
 Identities = 21/93 (22%), Positives = 38/93 (40%), Gaps = 2/93 (2%)

Query: 330 KHSHSGQYHVCPHCGKKFTRKAELQLHIKGIHLKHQLEKTYIC--EYCHKEFTFYNYLRR 387
                        C   F+R + L  H++ ++   +  K + C    C K F+  + L+R
Sbjct: 282 SEKGFSLPIKSKQCNISFSRSSPLTRHLRSVNHSGESLKPFSCPYSLCGKLFSRNDALKR 341

Query: 388 HMRVHTNEKPYKCKDCGAAFNHNVSLKNHKNSS 420
           H+ +HT+  P K K   ++   +  L N    S
Sbjct: 342 HILLHTSISPAKEKLLNSSSKFSPLLNNEPPQS 374



 Score = 30.0 bits (67), Expect = 3.7
 Identities = 22/74 (29%), Positives = 33/74 (44%), Gaps = 4/74 (5%)

Query: 495 KHSGFFLQCPH--CPKTFPRKTELSNHIKGIHMKHELRQTFVC--EYCNKEFTFLQYLKR 550
              GF L      C  +F R + L+ H++ ++   E  + F C    C K F+    LKR
Sbjct: 282 SEKGFSLPIKSKQCNISFSRSSPLTRHLRSVNHSGESLKPFSCPYSLCGKLFSRNDALKR 341

Query: 551 HMRTHTNEKPYKCV 564
           H+  HT+  P K  
Sbjct: 342 HILLHTSISPAKEK 355


>gnl|CDD|220297 pfam09581, Spore_III_AF, Stage III sporulation protein AF
           (Spore_III_AF).  This family represents the stage III
           sporulation protein AF (Spore_III_AF) of the bacterial
           endospore formation program, which exists in some but
           not all members of the Firmicutes (formerly called
           low-GC Gram-positives). The C-terminal region of these
           proteins is poorly conserved.
          Length = 185

 Score = 30.3 bits (69), Expect = 1.7
 Identities = 12/27 (44%), Positives = 17/27 (62%), Gaps = 1/27 (3%)

Query: 143 QEETDAKILQQ-QKNLEKKIKKMAKEE 168
           Q    A IL++  K LEK+++K  KEE
Sbjct: 73  QASQRAYILEEYAKQLEKQVEKKLKEE 99


>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family.  This model
           represents a subfamily of RNA splicing factors including
           the Pad-1 protein (N. crassa), CAPER (M. musculus) and
           CC1.3 (H.sapiens). These proteins are characterized by
           an N-terminal arginine-rich, low complexity domain
           followed by three (or in the case of 4 H. sapiens
           paralogs, two) RNA recognition domains (rrm: pfam00706).
           These splicing factors are closely related to the U2AF
           splicing factor family (TIGR01642). A homologous gene
           from Plasmodium falciparum was identified in the course
           of the analysis of that genome at TIGR and was included
           in the seed.
          Length = 457

 Score = 31.0 bits (70), Expect = 1.9
 Identities = 11/80 (13%), Positives = 29/80 (36%), Gaps = 5/80 (6%)

Query: 97  DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPS----QEETDAKILQ 152
           +   +       S   +  + R R R + RD    R+   R+++P+         +    
Sbjct: 12  NDTRRSDKGRERSRRRSRSRDRSR-RRRDRDYYRGRRGRSRSRSPNRYYRPRGDRSYRRD 70

Query: 153 QQKNLEKKIKKMAKEERRRK 172
            +++     + + + ER  +
Sbjct: 71  DRRSGRNTKEPLTEAERDDR 90


>gnl|CDD|222911 PHA02616, PHA02616, VP2/VP3; Provisional.
          Length = 259

 Score = 30.8 bits (69), Expect = 2.1
 Identities = 16/67 (23%), Positives = 27/67 (40%), Gaps = 2/67 (2%)

Query: 68  EEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRD 127
           E + +  +   + +  K    +          +K++    +  S + KKRRG  RS GR 
Sbjct: 194 ELNKDIYKIPTQAVKRKQDELHPVSPTKKAALSKKSKWTGTKSSQSSKKRRG--RSTGRS 251

Query: 128 APVSRKR 134
             V R R
Sbjct: 252 TTVRRNR 258


>gnl|CDD|109943 pfam00906, Hepatitis_core, Hepatitis core antigen.  The core
           antigen of hepatitis viruses possesses a carboxyl
           terminus rich in arginine. On this basis it was
           predicted that the core antigen would bind DNA. There is
           some experimental evidence to support this.
          Length = 182

 Score = 30.2 bits (68), Expect = 2.1
 Identities = 21/69 (30%), Positives = 28/69 (40%), Gaps = 12/69 (17%)

Query: 77  SFKTIIS---KYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK 133
           SF   I     Y   N   L+T+ +      R  S    TP  RR RS+S  R       
Sbjct: 120 SFGVWIRTPPAYRPPNAPILSTLPETIVVRRRGRSPRRRTPSPRRRRSQSPRR------- 172

Query: 134 RNRRAKTPS 142
             RR+++PS
Sbjct: 173 --RRSQSPS 179


>gnl|CDD|232905 TIGR00284, TIGR00284, dihydropteroate synthase-related protein.
           This protein has been found so far only in the Archaea,
           and in particular in those archaea that lack a
           bacterial-type dihydropteroate synthase. The central
           region of this protein shows considerable homology to
           the amino-terminal half of dihydropteroate synthases,
           while the carboxyl-terminal region shows homology to the
           small, uncharacterized protein slr0651 of Synechocystis
           PCC6803 [Unknown function, General].
          Length = 499

 Score = 31.0 bits (70), Expect = 2.2
 Identities = 15/62 (24%), Positives = 28/62 (45%), Gaps = 4/62 (6%)

Query: 142 SQEETDAKILQQQKNLEKKIKKMAKEE---RRRKIDLDKKPSDMDNVFASLDEMSSEEEE 198
           S EE   +++ + K LE+   K+ + E   R   + +  KP  +  V A +    +E+  
Sbjct: 109 STEEPADEVVLEIKKLEEYTSKIEEREADFRIGSLKIPLKPPPL-RVVAEIPPTVAEDGI 167

Query: 199 EE 200
           E 
Sbjct: 168 EG 169


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 31.0 bits (70), Expect = 2.2
 Identities = 17/106 (16%), Positives = 37/106 (34%), Gaps = 17/106 (16%)

Query: 105 RAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDA--KILQQQKNLEKKIK 162
           R+        + RR R RS   D+    +R   +++P      +  +   + +   + ++
Sbjct: 27  RSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDRPRRRSRSVR 86

Query: 163 KMAKEERR---------------RKIDLDKKPSDMDNVFASLDEMS 193
            + +  RR               ++   D KP   + V A   + S
Sbjct: 87  SIEQHRRRLRDRSPSNQWRKDDKKRSLWDIKPPGYELVTADQAKAS 132


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 30.9 bits (70), Expect = 2.4
 Identities = 34/153 (22%), Positives = 66/153 (43%), Gaps = 14/153 (9%)

Query: 55  KVEQGQATTSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGS--- 111
            +E  +     + ++DD   +   ++I+SKY     D+     K+         SG    
Sbjct: 182 NLELKKKKPDYDPDDDDKFNK---RSILSKY-----DEEIEGKKKKSDNLFTLDSGGSTD 233

Query: 112 DTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRR 171
           D  +K+R   + K +   VS   +   +TP+ +  D   + + K  + K KK  K++RR+
Sbjct: 234 DEAEKKRQEVKKKLKINNVSLDDDS-TETPASDYYDVSEMVKFK--KPKKKKKKKKKRRK 290

Query: 172 KIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
            +D D+   + + + +S      + EEE    E
Sbjct: 291 DLDEDELEPEAEGLGSSDSGSRKDVEEENARLE 323


>gnl|CDD|221755 pfam12756, zf-C2H2_2, C2H2 type zinc-finger (2 copies).  This
           family contains two copies of a C2H2-like zinc finger
           domain.
          Length = 100

 Score = 28.8 bits (65), Expect = 2.4
 Identities = 17/78 (21%), Positives = 33/78 (42%), Gaps = 8/78 (10%)

Query: 281 CYHCGYYSKNRSTLKNHVRVEHGENQAKRKEKKICDICSAEVVHLAIHKKHSHSGQYHVC 340
           C  C + S        H+   HG    +R  + + D+    +++    K H    + + C
Sbjct: 2   CLFCNHTSDTVEENLEHMFKSHGFFIPER--EYLVDL--EGLLNYLREKIH----EGNEC 53

Query: 341 PHCGKKFTRKAELQLHIK 358
            +CGK+F     L+ H++
Sbjct: 54  LYCGKQFKSLEALRQHMR 71



 Score = 28.4 bits (64), Expect = 3.2
 Identities = 11/30 (36%), Positives = 18/30 (60%)

Query: 361 HLKHQLEKTYICEYCHKEFTFYNYLRRHMR 390
           +L+ ++ +   C YC K+F     LR+HMR
Sbjct: 42  YLREKIHEGNECLYCGKQFKSLEALRQHMR 71


>gnl|CDD|234471 TIGR04104, cxxc_20_cxxc, cxxc_20_cxxc protein.  This small,
           uncommon, poorly conserved protein is found primarily in
           the Firmicutes. It features a pair of CxxC motifs
           separated by about 20 amino acids, followed by a highly
           hydrophobic region of about 45 amino acids. It has no
           conserved gene neighborhood, and its function is
           unknown.
          Length = 94

 Score = 28.5 bits (64), Expect = 3.0
 Identities = 10/35 (28%), Positives = 20/35 (57%), Gaps = 3/35 (8%)

Query: 371 ICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGA 405
           IC+ C+++F++   L+    +    +P KC +CG 
Sbjct: 2   ICKNCNEKFSYKELLKSLFSL---YRPIKCPNCGT 33


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 30.3 bits (69), Expect = 3.1
 Identities = 19/136 (13%), Positives = 40/136 (29%), Gaps = 11/136 (8%)

Query: 73  TRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSR 132
            +  S   I         +++    +  K+T                 +          +
Sbjct: 25  AKSKSKGFITK-------EEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATESDIPKK 77

Query: 133 KRNRRAKTPSQEETDAK----ILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFAS 188
           K    AK  + +    K     L   K  EKK      ++     D+D      D+    
Sbjct: 78  KTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQADDDDDDD 137

Query: 189 LDEMSSEEEEEEDWDE 204
            D+   +++ ++D D+
Sbjct: 138 DDDDLDDDDIDDDDDD 153



 Score = 30.3 bits (69), Expect = 3.1
 Identities = 21/108 (19%), Positives = 35/108 (32%), Gaps = 5/108 (4%)

Query: 97  DKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKN 156
            K+ K  A+AA++ +   KK +    S        +K             D  +L Q   
Sbjct: 76  KKKTKTAAKAAAAKAPAKKKLKDELDSS---KKAEKKNALDKDDDLNYVKDIDVLNQAD- 131

Query: 157 LEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
            +        +     ID D    D D      D    +EE++E  + 
Sbjct: 132 -DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKEL 178



 Score = 30.0 bits (68), Expect = 4.5
 Identities = 25/178 (14%), Positives = 54/178 (30%), Gaps = 12/178 (6%)

Query: 34  KTETDDKSWEDKSLLEPE--IKIKVEQGQATTSD--ETEEDDNTRQTSFKTI-ISKYVGY 88
            +   + + E+++  + +        +G  T  +  E  E           + I      
Sbjct: 4   ASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMV 63

Query: 89  NIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEET-D 147
              D  T     K+  + A+  +      + + + +   +  + K+N   K        D
Sbjct: 64  KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123

Query: 148 AKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEI 205
             +L Q    +        +     ID D    D D      D     ++E+E+  E 
Sbjct: 124 IDVLNQAD--DDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDV----DDEDEEKKEA 175


>gnl|CDD|197903 smart00834, CxxC_CXXC_SSSS, Putative regulatory protein.
           CxxC_CXXC_SSSS represents a region of about 41 amino
           acids found in a number of small proteins in a wide
           range of bacteria. The region usually begins with the
           initiator Met and contains two CxxC motifs separated by
           17 amino acids. One protein in this entry has been noted
           as a putative regulatory protein, designated FmdB. Most
           proteins in this entry have a C-terminal region
           containing highly degenerate sequence.
          Length = 41

 Score = 26.7 bits (60), Expect = 3.3
 Identities = 8/26 (30%), Positives = 15/26 (57%)

Query: 398 YKCKDCGAAFNHNVSLKNHKNSSCPK 423
           Y+C+DCG  F     + +   ++CP+
Sbjct: 6   YRCEDCGHTFEVLQKISDDPLTTCPE 31


>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62. 
          Length = 217

 Score = 29.8 bits (67), Expect = 3.4
 Identities = 14/81 (17%), Positives = 31/81 (38%), Gaps = 4/81 (4%)

Query: 99  RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLE 158
           RAKR  RA  S     K +  + +           +++  +            + +K   
Sbjct: 8   RAKRVVRALES----EKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAERVKKLHS 63

Query: 159 KKIKKMAKEERRRKIDLDKKP 179
           ++ K+  K+ +++K+ L   P
Sbjct: 64  QEKKEEKKKPKKKKVPLQVNP 84


>gnl|CDD|237619 PRK14135, recX, recombination regulator RecX; Provisional.
          Length = 263

 Score = 29.8 bits (68), Expect = 3.7
 Identities = 16/64 (25%), Positives = 35/64 (54%), Gaps = 2/64 (3%)

Query: 142 SQEETDAKILQQQKNLEKKIKKMAKEERRRKI--DLDKKPSDMDNVFASLDEMSSEEEEE 199
           ++E+      +  + L KK +K+  +  ++KI   L  K    + + A+L+E+  E++EE
Sbjct: 150 TEEDQIEVAQKLAEKLLKKYQKLPFKALKQKIIQSLLTKGFSYEVIKAALEELDLEQDEE 209

Query: 200 EDWD 203
           E+ +
Sbjct: 210 EEQE 213


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 30.3 bits (68), Expect = 3.8
 Identities = 26/187 (13%), Positives = 74/187 (39%), Gaps = 9/187 (4%)

Query: 22  LGVESSVPKLKIKTETDDKSWEDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQTSFKTI 81
             +++S+ +L  +   + +  E       + +I   Q +    ++  +++  +    K  
Sbjct: 663 SELKASLSELTKELLAEQELQEKAESELAKNEILRRQEEIKKKEQRIKEELKKLKLEKEE 722

Query: 82  ISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRK-RNRRAKT 140
           +        D +     +     +         ++   +SR K  +    +   + + K 
Sbjct: 723 L------LADKVQEAQDKINEELKLLEQKIKEKEEEEEKSRLKKEEEEEEKSELSLKEKE 776

Query: 141 PSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEE 200
            ++EE   + L+ ++  E+K+K   +E R  + +L ++   ++     L     E+ +EE
Sbjct: 777 LAEEEEKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLEE--EQLLIEQEEKIKEE 834

Query: 201 DWDEIHL 207
           + +E+ L
Sbjct: 835 ELEELAL 841


>gnl|CDD|177301 PHA00733, PHA00733, hypothetical protein.
          Length = 128

 Score = 28.7 bits (64), Expect = 3.8
 Identities = 14/43 (32%), Positives = 20/43 (46%), Gaps = 6/43 (13%)

Query: 338 HVCPHCGKKFTRKAELQLHIKGIHLKHQLEKTYICEYCHKEFT 380
           +VCP C   F+    L+ HI+        E + +C  C KEF 
Sbjct: 74  YVCPLCLMPFSSSVSLKQHIR------YTEHSKVCPVCGKEFR 110



 Score = 28.7 bits (64), Expect = 4.4
 Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 2/48 (4%)

Query: 369 TYICEYCHKEFTFYNYLRRHMRVHTNEKPYKCKDCGAAFNHNVSLKNH 416
            Y+C  C   F+    L++H+R   + K   C  CG  F +  S  +H
Sbjct: 73  PYVCPLCLMPFSSSVSLKQHIRYTEHSK--VCPVCGKEFRNTDSTLDH 118


>gnl|CDD|218561 pfam05340, DUF740, Protein of unknown function (DUF740).  This
           family consists of several uncharacterized plant
           proteins of unknown function.
          Length = 565

 Score = 30.0 bits (67), Expect = 3.9
 Identities = 27/105 (25%), Positives = 41/105 (39%), Gaps = 10/105 (9%)

Query: 104 ARAASSGSDTPKKRRGRSRSKGRDAPVSR--KRNRRAKTPSQEETDAKILQQ--QKNLEK 159
             ++S G   P+ RR +S S  R+A  S   +  RR+       T   +     ++NL  
Sbjct: 60  KPSSSGGGFFPELRRTKSFSAKRNAGFSGADEPQRRSCDVRSRSTLWSLFHDDDEENLPS 119

Query: 160 KIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDE 204
            I     +   R      KP   D V    +E+  EE+EE    E
Sbjct: 120 SIAPPEIDPEPR------KPIVPDLVLEEEEEVEMEEDEEYYEKE 158


>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
          Length = 333

 Score = 29.9 bits (67), Expect = 4.1
 Identities = 12/69 (17%), Positives = 24/69 (34%)

Query: 98  KRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNL 157
           K AK   +AA   +    K    +    +    ++K+  +      +        ++   
Sbjct: 262 KAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKATAKAPKRGAK 321

Query: 158 EKKIKKMAK 166
            KK KK+ K
Sbjct: 322 GKKAKKVTK 330



 Score = 29.5 bits (66), Expect = 5.6
 Identities = 25/137 (18%), Positives = 53/137 (38%), Gaps = 23/137 (16%)

Query: 57  EQGQATTSDETEEDDNTRQTSFKTIISKYVGY--------NIDDLNTIDK---------- 98
            +G+   +D+T E    R  S++      V Y         +D + TID+          
Sbjct: 133 ARGEEVRADDTPEVLAKRLASYRAQTEPLVHYYSEKRKLLTVDGMMTIDEVTREIGRVLA 192

Query: 99  --RAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKR---NRRAKTPSQEETDAKILQQ 153
              A    +AA + +     ++  +++K     VS+K+      +   + +       + 
Sbjct: 193 AVGAANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKA 252

Query: 154 QKNLEKKIKKMAKEERR 170
           +K  +K +KK AK  ++
Sbjct: 253 KKTAKKALKKAAKAVKK 269


>gnl|CDD|223430 COG0353, RecR, Recombinational DNA repair protein (RecF pathway)
           [DNA replication, recombination, and repair].
          Length = 198

 Score = 29.5 bits (67), Expect = 4.1
 Identities = 16/43 (37%), Positives = 17/43 (39%), Gaps = 13/43 (30%)

Query: 310 KEKKICDICSAE--------VVH-----LAIHKKHSHSGQYHV 339
            E   CDICS E        VV      LA+ K     G YHV
Sbjct: 64  TESDPCDICSDESRDKSQLCVVEEPKDVLALEKTGEFRGLYHV 106


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 30.3 bits (68), Expect = 4.2
 Identities = 17/65 (26%), Positives = 36/65 (55%), Gaps = 1/65 (1%)

Query: 138 AKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLD-KKPSDMDNVFASLDEMSSEE 196
            K  ++EE + K  +++K  EK++KK+   ++  K  L  ++ SD  NV    ++ S + 
Sbjct: 10  KKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKSEKKSRKR 69

Query: 197 EEEED 201
           + E++
Sbjct: 70  DVEDE 74


>gnl|CDD|234616 PRK00076, recR, recombination protein RecR; Reviewed.
          Length = 196

 Score = 29.3 bits (67), Expect = 4.3
 Identities = 14/42 (33%), Positives = 18/42 (42%), Gaps = 13/42 (30%)

Query: 311 EKKICDICSAE--------VVH-----LAIHKKHSHSGQYHV 339
           E+  C+ICS          VV      LAI +   + G YHV
Sbjct: 64  EQDPCEICSDPRRDQSLICVVESPADVLAIERTGEYRGLYHV 105


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S. cerevisiae and H. sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 29.3 bits (66), Expect = 4.3
 Identities = 20/86 (23%), Positives = 41/86 (47%), Gaps = 15/86 (17%)

Query: 119 GRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKK 178
           G ++  G+   +S+ + +        E +  I ++   LEKK+K++  E+   + + D+ 
Sbjct: 121 GINKKAGKKLALSKFKRKV---GLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDE- 176

Query: 179 PSDMDNVFASLDEMSSEEEEEEDWDE 204
                      +E   EEEE+ED+D+
Sbjct: 177 -----------EEEEEEEEEDEDFDD 191


>gnl|CDD|177753 PLN00149, PLN00149, potassium transporter; Provisional.
          Length = 779

 Score = 30.2 bits (68), Expect = 4.4
 Identities = 24/98 (24%), Positives = 40/98 (40%), Gaps = 7/98 (7%)

Query: 54  IKVEQGQATTSDETEEDDNTRQTSFKTIISKYVGYNI--DDLNTIDKRAKRTARAASSGS 111
           I+ E+ +   + E EE ++ R T   T  +   G  +  DD +  +       R   S  
Sbjct: 626 IRSEKPEPNGAPENEEGEDERMTVVGTCSTHLEGIQLREDDSDKQEPAGTSELREIRS-- 683

Query: 112 DTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAK 149
             P   R + R +    P S K +R A+   QE  +A+
Sbjct: 684 --PPVSRPKKRVRFV-VPESPKIDRGAREELQELMEAR 718


>gnl|CDD|218029 pfam04328, DUF466, Protein of unknown function (DUF466).  Small
           bacterial protein of unknown function. Structural
           modelling suggests this domain may bind nucleic acids.
          Length = 64

 Score = 27.2 bits (61), Expect = 4.5
 Identities = 6/18 (33%), Positives = 11/18 (61%)

Query: 426 SYETYLKHLKTNHHGYEV 443
            YE Y++H++ +H    V
Sbjct: 24  DYEKYVEHMRRHHPDKPV 41


>gnl|CDD|226202 COG3677, COG3677, Transposase and inactivated derivatives [DNA
           replication, recombination, and repair].
          Length = 129

 Score = 28.5 bits (64), Expect = 4.7
 Identities = 10/43 (23%), Positives = 14/43 (32%), Gaps = 2/43 (4%)

Query: 308 KRKEKKICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFTRK 350
            +  K  C  C +  V            Q + C  CG  FT +
Sbjct: 26  MQITKVNCPRCKSSNVV--KIGGIRRGHQRYKCKSCGSTFTVE 66


>gnl|CDD|220284 pfam09538, FYDLN_acid, Protein of unknown function (FYDLN_acid).
           Members of this family are bacterial proteins with a
           conserved motif [KR]FYDLN, sometimes flanked by a pair
           of CXXC motifs, followed by a long region of low
           complexity sequence in which roughly half the residues
           are Asp and Glu, including multiple runs of five or more
           acidic residues. The function of members of this family
           is unknown.
          Length = 104

 Score = 28.0 bits (63), Expect = 4.8
 Identities = 7/13 (53%), Positives = 8/13 (61%)

Query: 335 GQYHVCPHCGKKF 347
           G    CP CGK+F
Sbjct: 7   GTKRTCPTCGKRF 19


>gnl|CDD|227812 COG5525, COG5525, Phage terminase, large subunit GpA [Replication,
           recombination and repair].
          Length = 611

 Score = 29.7 bits (67), Expect = 5.2
 Identities = 11/48 (22%), Positives = 20/48 (41%), Gaps = 1/48 (2%)

Query: 332 SHSGQYHV-CPHCGKKFTRKAELQLHIKGIHLKHQLEKTYICEYCHKE 378
               +++V CPHCG++   K   +   +G+           CE+C   
Sbjct: 221 GDQRRFYVPCPHCGEEQQLKFGEKSGPRGLKDTPAEAAFIQCEHCGCV 268


>gnl|CDD|233208 TIGR00956, 3a01205, Pleiotropic Drug Resistance (PDR) Family
           protein.  [Transport and binding proteins, Other].
          Length = 1394

 Score = 29.7 bits (67), Expect = 5.2
 Identities = 14/79 (17%), Positives = 28/79 (35%), Gaps = 12/79 (15%)

Query: 150 ILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWDEIHLGQ 209
           IL  ++   K+ KK  +     K D++        V  S D     ++  ++ D      
Sbjct: 702 ILVFRRGSLKRAKKAGETSASNKNDIEAGE-----VLGSTDLTDESDDVNDEKDM----- 751

Query: 210 FSPYEFKCRVCDWKLNSYD 228
               E    +  W+  +Y+
Sbjct: 752 --EKESGEDIFHWRNLTYE 768


>gnl|CDD|232888 TIGR00233, trpS, tryptophanyl-tRNA synthetase.  This model
           represents tryptophanyl-tRNA synthetase. Some members of
           the family have a pfam00458 domain amino-terminal to the
           region described by this model [Protein synthesis, tRNA
           aminoacylation].
          Length = 327

 Score = 29.6 bits (67), Expect = 5.3
 Identities = 11/66 (16%), Positives = 30/66 (45%), Gaps = 4/66 (6%)

Query: 155 KNLEKKIKKMAKEERRRKIDLDKKP---SDMDNVF-ASLDEMSSEEEEEEDWDEIHLGQF 210
           K ++KKI+K A +  R  +   ++     ++  ++      +  +++ +E ++    G+ 
Sbjct: 208 KQIKKKIRKAATDGGRVTLFEHREKPGVPNLLVIYQYLSFFLIDDDKLKEIYEAYKSGKL 267

Query: 211 SPYEFK 216
              E K
Sbjct: 268 GYGECK 273


>gnl|CDD|192632 pfam10571, UPF0547, Uncharacterized protein family UPF0547.  This
           domain contains a zinc-ribbon motif.
          Length = 26

 Score = 25.6 bits (57), Expect = 5.9
 Identities = 12/35 (34%), Positives = 14/35 (40%), Gaps = 11/35 (31%)

Query: 313 KICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKF 347
           K C  C AEV                +CPHCG +F
Sbjct: 1   KTCPECGAEVPL-----------AAKICPHCGYEF 24


>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
           Genome duplication is precisely regulated by
           cyclin-dependent kinases CDKs, which bring about the
           onset of S phase by activating replication origins and
           then prevent relicensing of origins until mitosis is
           completed. The optimum sequence motif for CDK
           phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
           to have at least 11 potential phosphorylation sites.
           Drc1 is required for DNA synthesis and S-M replication
           checkpoint control. Drc1 associates with Cdc2 and is
           phosphorylated at the onset of S phase when Cdc2 is
           activated. Thus Cdc2 promotes DNA replication by
           phosphorylating Drc1 and regulating its association with
           Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
           substrates required for DNA replication.
          Length = 397

 Score = 29.4 bits (66), Expect = 5.9
 Identities = 27/145 (18%), Positives = 47/145 (32%), Gaps = 29/145 (20%)

Query: 63  TSDETEEDDNTRQTSFKTIISKYVGYNIDDLNTIDKRAKRTARAASSGSDTPKK-RRGRS 121
             DE   +   R  S +T +S      +D   T               S+TP   RR   
Sbjct: 154 AEDEDRPEYGPR--SERTPLSSGKKVMLDLFFTPTSW--------RYSSETPSFLRRSNQ 203

Query: 122 RSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSD 181
                  P++        +PS        L+ Q+ + K + ++ +EE             
Sbjct: 204 DVSATSNPLNSAEPDFGVSPSP-------LRPQRPVGKGLSELVQEEES----------- 245

Query: 182 MDNVFASLDEMSSEEEEEEDWDEIH 206
           +D+    L E+ +EE      +E  
Sbjct: 246 IDDELDVLREIEAEEAGIGPIEEEV 270


>gnl|CDD|202114 pfam02114, Phosducin, Phosducin. 
          Length = 245

 Score = 29.3 bits (65), Expect = 5.9
 Identities = 7/75 (9%), Positives = 27/75 (36%), Gaps = 7/75 (9%)

Query: 106 AASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMA 165
             +      +   G++   G    ++  R  + ++   +           + ++ +++M+
Sbjct: 2   EKAKSQSLEEDFEGQASHTGPKGVINDWRKFKLESEDSDS-------VAHSKKEILRQMS 54

Query: 166 KEERRRKIDLDKKPS 180
             + R   D  ++ S
Sbjct: 55  SPQSRDDKDSKERFS 69


>gnl|CDD|206083 pfam13912, zf-C2H2_6, C2H2-type zinc finger. 
          Length = 27

 Score = 25.6 bits (57), Expect = 6.5
 Identities = 7/24 (29%), Positives = 10/24 (41%)

Query: 369 TYICEYCHKEFTFYNYLRRHMRVH 392
            + C  C K F+    L  H + H
Sbjct: 1   VHTCGVCGKTFSSLQALGGHKKSH 24


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 29.5 bits (66), Expect = 6.7
 Identities = 36/150 (24%), Positives = 53/150 (35%), Gaps = 19/150 (12%)

Query: 43  EDKSLLEPEIKIKVEQGQATTSDETEEDDNTRQTSF-------KTIISKYVGYNIDD-LN 94
           E +  L PEI  K E  Q   S+E+EE+ N  +          K +  K  G + DD  +
Sbjct: 299 EREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDS 358

Query: 95  TIDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQ 154
             D          S    T KK++   + +  D+  S   N     PS E  D       
Sbjct: 359 GDDSDDSDIDGEDSVSLVTAKKQKEPKKEEPVDSNPSSPGNSGPARPSPESKD------- 411

Query: 155 KNLEKKIKKMAKEERRRKIDLDKKPSDMDN 184
               K  +K A E  +    +  K    +N
Sbjct: 412 ----KGKRKAANEVSKSPASVPAKKLKTEN 437


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 29.2 bits (65), Expect = 7.4
 Identities = 23/103 (22%), Positives = 42/103 (40%), Gaps = 20/103 (19%)

Query: 108 SSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKE 167
           ++     KK   R+R             R+A   S +  + K  +++    +K+KKM ++
Sbjct: 104 AALDKESKKTPRRTR-------------RKAAAASSDVEEEKT-EKKVRKRRKVKKMDED 149

Query: 168 ERRRKIDLDKKPSDMDNVFASLDEMSSEEEEEEDWD-EIHLGQ 209
                 +     S++ +V  S    S E E EE+ D E   G+
Sbjct: 150 V-----EDQGSESEVSDVEESEFVTSLENESEEELDLEKDDGE 187


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be transcription factors.
          Length = 238

 Score = 28.9 bits (65), Expect = 7.7
 Identities = 18/85 (21%), Positives = 37/85 (43%), Gaps = 5/85 (5%)

Query: 98  KRAKRTARAASSGSD---TPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQ 154
           K+ K+   AA S       PKK+  R           R+++ R+ T   +  +A   + +
Sbjct: 101 KKKKKDPTAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSST--VQNKEATHERLK 158

Query: 155 KNLEKKIKKMAKEERRRKIDLDKKP 179
           +   ++ K  AK  +R++   +K+ 
Sbjct: 159 EREIRRKKIQAKARKRKEKKKEKEL 183


>gnl|CDD|219715 pfam08070, DTHCT, DTHCT (NUC029) region.  The DTHCT region is the
           C-terminal part of DNA gyrases B / topoisomerase IV /
           HATPase proteins. This region is composed of quite low
           complexity sequence.
          Length = 95

 Score = 27.2 bits (60), Expect = 7.8
 Identities = 23/95 (24%), Positives = 35/95 (36%), Gaps = 19/95 (20%)

Query: 98  KRAKRTARAASSGSDT--PKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
           K  KR    + S S+   PKK               + +  + + PS  +          
Sbjct: 5   KGKKRETVNSDSDSEAGVPKK-----------PAPPKGKGSKKRKPSSSDES------DS 47

Query: 156 NLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLD 190
           N  KK+ K A  ++ +K D D  PSD D+  A   
Sbjct: 48  NFGKKVSKSATSKKSKKGDDDDFPSDFDSAVAPRA 82


>gnl|CDD|177288 PHA00616, PHA00616, hypothetical protein.
          Length = 44

 Score = 25.9 bits (57), Expect = 8.0
 Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 502 QCPHCPKTFPRKTELSNHIKGIHMKHEL 529
           QC  C   F +K E+  H+  +H +++L
Sbjct: 3   QCLRCGGIFRKKKEVIEHLLSVHKQNKL 30


>gnl|CDD|179886 PRK04860, PRK04860, hypothetical protein; Provisional.
          Length = 160

 Score = 28.3 bits (64), Expect = 8.3
 Identities = 10/21 (47%), Positives = 13/21 (61%)

Query: 385 LRRHMRVHTNEKPYKCKDCGA 405
           +RRH RV   E  Y+C+ CG 
Sbjct: 131 VRRHNRVVRGEAVYRCRRCGE 151


>gnl|CDD|227516 COG5189, SFP1, Putative transcriptional repressor regulating G2/M
           transition [Transcription / Cell division and chromosome
           partitioning].
          Length = 423

 Score = 28.9 bits (64), Expect = 8.3
 Identities = 24/96 (25%), Positives = 38/96 (39%), Gaps = 21/96 (21%)

Query: 346 KFTRKAELQLHIKGIHLKHQLEKTYICEY--CHKEFTFYNYLRRHM-------RVHTN-- 394
           K     E  +      LK +  K Y C    C+K++   N L+ HM       ++H N  
Sbjct: 326 KLAHGGERNIDTPSRMLKVKDGKPYKCPVEGCNKKYKNQNGLKYHMLHGHQNQKLHENPS 385

Query: 395 ----------EKPYKCKDCGAAFNHNVSLKNHKNSS 420
                     +KPY+C+ C   + +   LK H+  S
Sbjct: 386 PEKMNIFSAKDKPYRCEVCDKRYKNLNGLKYHRKHS 421


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely
           on conserved operon structure. The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 29.0 bits (65), Expect = 8.4
 Identities = 14/84 (16%), Positives = 33/84 (39%), Gaps = 6/84 (7%)

Query: 96  IDKRAKRTARAASSGSDTPKKRRGRSRSKGRDAPVSRKRNRRAKTPSQEETDAKILQQQK 155
                    RAA       + R+     +      +++  + AK   +++  A+  + ++
Sbjct: 75  QQAEEAEKQRAAE------QARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQ 128

Query: 156 NLEKKIKKMAKEERRRKIDLDKKP 179
             E K K  A+ E++ K +  K+ 
Sbjct: 129 AAEAKAKAEAEAEKKAKEEAKKQA 152


>gnl|CDD|149172 pfam07948, Nairovirus_M, Nairovirus M polyprotein-like.  The
           sequences in this family are similar to the Dugbe virus
           M polyprotein precursor, which includes glycoproteins G1
           and G2. Both are thought to be inserted in the membrane
           of the Golgi complex of the infected host cell, and G1
           is known to have a role in infection of vertebrate
           hosts.
          Length = 645

 Score = 29.0 bits (65), Expect = 8.5
 Identities = 9/36 (25%), Positives = 17/36 (47%)

Query: 313 KICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFT 348
           + C  C    V+    + H  +  Y++CP+C  + T
Sbjct: 495 QTCTKCEQTPVNAIDAEMHDLNCSYNICPYCASRLT 530


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 29.2 bits (66), Expect = 8.6
 Identities = 14/67 (20%), Positives = 32/67 (47%), Gaps = 8/67 (11%)

Query: 129 PVSRKRNRRAKTPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDM----DN 184
              ++  +R K   +   + K L++ K  E+K KK  ++  + KI     P++     ++
Sbjct: 547 LDDKEELQREKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIP----PAEFFKRQED 602

Query: 185 VFASLDE 191
            +++ DE
Sbjct: 603 KYSAFDE 609


>gnl|CDD|215056 PLN00104, PLN00104, MYST-like histone acetyltransferase;
           Provisional.
          Length = 450

 Score = 29.0 bits (65), Expect = 9.0
 Identities = 12/34 (35%), Positives = 18/34 (52%)

Query: 331 HSHSGQYHVCPHCGKKFTRKAELQLHIKGIHLKH 364
           ++   + + C  C K   RK +LQ H+K   LKH
Sbjct: 192 YNDCSKLYFCEFCLKFMKRKEQLQRHMKKCDLKH 225


>gnl|CDD|153328 cd07644, I-BAR_IMD_BAIAP2L2, Inverse (I)-BAR, also known as the
           IRSp53/MIM homology Domain (IMD), of Brain-specific
           Angiogenesis Inhibitor 1-Associated Protein 2-Like 2.
           The IMD domain, also called Inverse-Bin/Amphiphysin/Rvs
           (I-BAR) domain, is a dimerization and lipid-binding
           module that bends membranes and induces membrane
           protrusions. This group is composed of uncharacterized
           proteins known as BAIAP2L2 (Brain-specific Angiogenesis
           Inhibitor 1-Associated Protein 2-Like 2). They contain
           an N-terminal IMD, an SH3 domain, and a WASP homology 2
           (WH2) actin-binding motif at the C-terminus. The related
           proteins, BAIAP2L1 and IRSp53, function as regulators of
           membrane dynamics and the actin cytoskeleton. The IMD
           domain binds and bundles actin filaments, binds
           membranes and produces membrane protrusions, and
           interacts with the small GTPase Rac.
          Length = 215

 Score = 28.3 bits (63), Expect = 9.4
 Identities = 11/65 (16%), Positives = 31/65 (47%)

Query: 140 TPSQEETDAKILQQQKNLEKKIKKMAKEERRRKIDLDKKPSDMDNVFASLDEMSSEEEEE 199
             S+   + +   +  NLEK + ++ + ER+R  ++ +   +++ +  S+     E +  
Sbjct: 111 EDSRRVYELEYRHRAANLEKCMSELWRMERQRDRNVREMKENVNRLRQSMQAFLKESQRA 170

Query: 200 EDWDE 204
            + +E
Sbjct: 171 AELEE 175


>gnl|CDD|227579 COG5254, ARV1, Predicted membrane protein [Function unknown].
          Length = 239

 Score = 28.3 bits (63), Expect = 9.7
 Identities = 15/49 (30%), Positives = 20/49 (40%), Gaps = 1/49 (2%)

Query: 314 ICDICSAEVVHLAIHKKHSHSGQYHVCPHCGKKFTRKAELQLHIKGIHL 362
           +C  C + V  L      S   Q   CP C +K  +  EL   +K I L
Sbjct: 2   VCIECGSRVDSLYTRYSTSAI-QLSRCPSCNRKMDKYFELDGVLKLIDL 49


>gnl|CDD|218790 pfam05876, Terminase_GpA, Phage terminase large subunit (GpA).
           This family consists of several phage terminase large
           subunit proteins as well as related sequences from
           several bacterial species. The DNA packaging enzyme of
           bacteriophage lambda, terminase, is a heteromultimer
           composed of a small subunit, gpNu1, and a large subunit,
           gpA, products of the Nu1 and A genes, respectively.
           Terminase is involved in the site-specific binding and
           cutting of the DNA in the initial stages of packaging.
           It is now known that gpA is actively involved in late
           stages of packaging, including DNA translocation, and
           that this enzyme contains separate functional domains
           for its early and late packaging activities.
          Length = 552

 Score = 28.7 bits (65), Expect = 9.8
 Identities = 11/48 (22%), Positives = 19/48 (39%), Gaps = 10/48 (20%)

Query: 337 YHV-CPHCGKKFTRKAELQLHIKGIHLKHQLEKT---YICEYCHKEFT 380
           Y+V CPHCG++       +L  + +            Y+C +C     
Sbjct: 199 YYVPCPHCGEEQ------ELRWERLKWDKGEAPETARYVCPHCGCVIE 240


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.318    0.132    0.412 

Gapped
Lambda     K      H
   0.267   0.0804    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 29,500,922
Number of extensions: 2774914
Number of successful extensions: 4674
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4531
Number of HSP's successfully gapped: 246
Length of query: 606
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 503
Effective length of database: 6,369,140
Effective search space: 3203677420
Effective search space used: 3203677420
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 62 (27.5 bits)