RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy10643
         (769 letters)



>gnl|CDD|221250 pfam11831, Myb_Cef, pre-mRNA splicing factor component.  This
           family is a region of the Myb-Related Cdc5p/Cef1
           proteins, in fungi, and is part of the pre-mRNA splicing
           factor complex.
          Length = 363

 Score =  302 bits (775), Expect = 7e-96
 Identities = 155/382 (40%), Positives = 212/382 (55%), Gaps = 38/382 (9%)

Query: 328 DYS-IGTGAAMKTPRTPAPQTDRILQEAQNMMALTHVDTPLKGGLNTPLLAPDFSGVTPS 386
           +YS       ++TPRTP  + D I+ EA+N+ ALT   TPL GG NTPL   DF GVTP 
Sbjct: 2   NYSQTNNNTPIRTPRTP-AEEDAIMNEARNLRALTETQTPLLGGENTPLHETDFDGVTPR 60

Query: 387 KDHLATPNTVLTTPFSQRSVHDGGPGSTPGGFSTPGVRDSVRGGATP--TPIRDRLNINP 444
           K  + TPN + T PF        G G+TP              G  P  TP RD+L+IN 
Sbjct: 61  KQQIQTPNPLATPPFRS----GNGIGATPLRGG---------SGYGPLRTPNRDKLSIND 107

Query: 445 EDNMLLEAGDTPAAFKSFQTE---QLRAGLSSLPLPKNDYEIVVPENEEMEEKASGDVDM 501
           E  M  E G+TP   K  + E    L++GL+SLP PKN++E+ +PE EE E +   +  +
Sbjct: 108 EAAM--EVGETPREEKLREDEAKLSLKSGLASLPKPKNEFELELPEEEEEEPEEMEEE-L 164

Query: 502 LEDQADVDAAAIARMKAQREHEMRLRSQVIQKNLPRPFDINIVLRPSNSDPPLSELQKAE 561
            ED AD DA   A  +A+ + E+R RSQVIQ+NLPRP  +++++   + + PL+EL  AE
Sbjct: 165 EEDAADRDARKRAAEEAKEQEELRRRSQVIQRNLPRPSVLDLIVLRPSVNVPLTELDPAE 224

Query: 562 ELIKQEMITMLHYDALETPLSVDKKAAKQSNILTDEEHYNFLKHRPYRNFSLEELEAADD 621
           +LI +EM  ++ +DAL+ PL   K                  K  PY +F  EELE A  
Sbjct: 225 KLINKEMALLIAHDALKYPLPGGKPK---------------GKAVPYEDFDDEELEEARK 269

Query: 622 LLKREMDLVKTGMGHGDLSLESFTQVWEECLSQVLFLANQNRYTRASLASKKDRADSLAK 681
           L++ E++ +K  MGH + SLE F + W E   QVLFL   N YT    AS++DR ++   
Sbjct: 270 LIEAELEKLKGEMGHQEESLEEFDEAWSELNEQVLFLPGLNAYTDIEDASEEDRIEAYKA 329

Query: 682 RLEQNRKHMSLEAKKATKMENK 703
            LE  RK M  EA+KA K+E K
Sbjct: 330 ALENVRKKMEKEAEKANKLEKK 351


>gnl|CDD|227476 COG5147, REB1, Myb superfamily proteins, including transcription
           factors and mRNA splicing factors [Transcription / RNA
           processing and modification / Cell division and
           chromosome partitioning].
          Length = 512

 Score =  163 bits (414), Expect = 4e-43
 Identities = 91/289 (31%), Positives = 128/289 (44%), Gaps = 23/289 (7%)

Query: 17  DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLA 76
           DE LKA V K G N WS++ASLL   + KQ   RW   L+P +KK  WS EEDE+L+ L 
Sbjct: 28  DEDLKALVKKLGPNNWSKVASLLISSTGKQSSNRWNNHLNPQLKKKNWSEEEDEQLIDLD 87

Query: 77  KLMPTQWRTIAPII-GRTAAQCLERYEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPET 135
           K + TQW TIA     RTA QC+ERY   L+           +   R+ +  +IDP  E 
Sbjct: 88  KELGTQWSTIADYKDRRTAQQCVERYVNTLEDLSSTH----DSKLQRRNEFDKIDPFNEN 143

Query: 136 KPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAA 195
              RPD  + +  E E+  EA  RL   +  KA  K REK  E    +  LQ+ +EL++A
Sbjct: 144 SARRPDIYEDELLEREVNREASYRLRVPRVSKADVKPREKGEENNPDIEDLQEMKELKSA 203

Query: 196 GIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEER---LRQQHLDGE-LRSEKEERE 251
            I        K  I+        K    G     ++E      ++ L  +         +
Sbjct: 204 SITRHLILPSKSEIN--------KAFKKGETLALEQEINEYKEKKGLSRKQFCERIWSTD 255

Query: 252 RKKDK------QKLKQRKENDIPTAMLQNLEPEKKRSKLVLPEPQISDM 294
           R +DK      +KL  R +  I   + +     ++R K    E Q    
Sbjct: 256 RDEDKFWPNIYKKLPYRDKKSIYKHLRRKYNIFEQRGKWTKEEEQELAK 304


>gnl|CDD|212557 cd11659, SANT_CDC5_II, SANT/myb-like DNA-binding domain of Cell
           Division Cycle 5-Like Protein repeat II.  In humans,
           cell division cycle 5-like protein (CDC5) functions in
           pre-mRNA splicing in cell cycle control. The
           DNA-binding, myb-like domain of CDC5 is a member of the
           SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR
           and TFIIIB', several factors that share this domain. The
           SANT domain resembles the 3 alpha-helix bundle of
           DNA-binding Myb domains and is found in a diverse set of
           proteins.
          Length = 53

 Score =  112 bits (281), Expect = 5e-30
 Identities = 46/53 (86%), Positives = 50/53 (94%)

Query: 57  PSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLERYEFLLDQAQ 109
           PSIKKTEW+REEDEKLLHLAKL+PTQWRTIAPI+GRTA QCLERY  LLD+AQ
Sbjct: 1   PSIKKTEWTREEDEKLLHLAKLLPTQWRTIAPIVGRTAQQCLERYNKLLDEAQ 53


>gnl|CDD|215818 pfam00249, Myb_DNA-binding, Myb-like DNA-binding domain.  This
          family contains the DNA binding domains from Myb
          proteins, as well as the SANT domain family.
          Length = 47

 Score = 56.0 bits (136), Expect = 2e-10
 Identities = 18/39 (46%), Positives = 24/39 (61%)

Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWL 55
          DE+L  AV K+G   WS+IA  L  ++  QCK RW  +L
Sbjct: 9  DELLIEAVKKHGNGNWSKIAKHLPGRTDNQCKNRWNNYL 47



 Score = 41.7 bits (99), Expect = 3e-05
 Identities = 16/43 (37%), Positives = 21/43 (48%), Gaps = 2/43 (4%)

Query: 61  KTEWSREEDEKLLHLAKLMPT-QWRTIAPII-GRTAAQCLERY 101
           +  W+ EEDE L+   K      W  IA  + GRT  QC  R+
Sbjct: 1   RGPWTPEEDELLIEAVKKHGNGNWSKIAKHLPGRTDNQCKNRW 43


>gnl|CDD|206092 pfam13921, Myb_DNA-bind_6, Myb-like DNA-binding domain.  This
          family contains the DNA binding domains from Myb
          proteins, as well as the SANT domain family.
          Length = 59

 Score = 56.6 bits (137), Expect = 2e-10
 Identities = 23/57 (40%), Positives = 32/57 (56%), Gaps = 2/57 (3%)

Query: 16 QDEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKL 72
          +DE L   V KYG N W +IA  L R +   C+ RW   L P   +  W++EED++L
Sbjct: 5  EDEKLLKLVEKYG-NDWKQIAEELGR-TPSACRDRWRRKLRPKRSRGPWTKEEDQRL 59



 Score = 42.3 bits (100), Expect = 2e-05
 Identities = 17/39 (43%), Positives = 24/39 (61%)

Query: 64  WSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLERYE 102
           W+ EEDEKLL L +     W+ IA  +GRT + C +R+ 
Sbjct: 1   WTEEEDEKLLKLVEKYGNDWKQIAEELGRTPSACRDRWR 39


>gnl|CDD|197842 smart00717, SANT, SANT 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
          domains. 
          Length = 49

 Score = 52.2 bits (126), Expect = 5e-09
 Identities = 20/41 (48%), Positives = 26/41 (63%)

Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWLDP 57
          DE+L   V KYGKN W +IA  L  ++A+QC+ RW   L P
Sbjct: 9  DELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRNLLKP 49



 Score = 44.1 bits (105), Expect = 4e-06
 Identities = 22/47 (46%), Positives = 26/47 (55%), Gaps = 2/47 (4%)

Query: 61  KTEWSREEDEKLLHLAKLMPT-QWRTIAPIIG-RTAAQCLERYEFLL 105
           K EW+ EEDE L+ L K      W  IA  +  RTA QC ER+  LL
Sbjct: 1   KGEWTEEEDELLIELVKKYGKNNWEKIAKELPGRTAEQCRERWRNLL 47


>gnl|CDD|238096 cd00167, SANT, 'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding
          domains. Tandem copies of the domain bind telomeric DNA
          tandem repeats as part of the capping complex. Binding
          is sequence dependent for repeats which contain the G/C
          rich motif [C2-3 A (CA)1-6]. The domain is also found
          in regulatory transcriptional repressor complexes where
          it also binds DNA.
          Length = 45

 Score = 51.8 bits (125), Expect = 6e-09
 Identities = 20/39 (51%), Positives = 25/39 (64%)

Query: 17 DEILKAAVMKYGKNQWSRIASLLHRKSAKQCKARWFEWL 55
          DE+L  AV KYGKN W +IA  L  ++ KQC+ RW   L
Sbjct: 7  DELLLEAVKKYGKNNWEKIAKELPGRTPKQCRERWRNLL 45



 Score = 41.4 bits (98), Expect = 3e-05
 Identities = 19/45 (42%), Positives = 22/45 (48%), Gaps = 2/45 (4%)

Query: 63  EWSREEDEKLLHL-AKLMPTQWRTIAPIIG-RTAAQCLERYEFLL 105
            W+ EEDE LL    K     W  IA  +  RT  QC ER+  LL
Sbjct: 1   PWTEEEDELLLEAVKKYGKNNWEKIAKELPGRTPKQCRERWRNLL 45


>gnl|CDD|178751 PLN03212, PLN03212, Transcription repressor MYB5; Provisional.
          Length = 249

 Score = 52.4 bits (125), Expect = 2e-07
 Identities = 42/137 (30%), Positives = 69/137 (50%), Gaps = 13/137 (9%)

Query: 5   MILKRWVIFVFQDEILKAAVMKYGKNQWSRI---ASLLHRKSAKQCKARWFEWLDPSIKK 61
           M +KR    V +DEIL + + K G+ +W  +   A LL  +  K C+ RW  +L PS+K+
Sbjct: 21  MGMKRGPWTVEEDEILVSFIKKEGEGRWRSLPKRAGLL--RCGKSCRLRWMNYLRPSVKR 78

Query: 62  TEWSREEDEKLLHLAKLMPTQWRTIA-PIIGRTAAQCLERYEFLLDQAQKKEEGEDVADD 120
              + +E++ +L L +L+  +W  IA  I GRT  +    +   L +   ++       D
Sbjct: 79  GGITSDEEDLILRLHRLLGNRWSLIAGRIPGRTDNEIKNYWNTHLRKKLLRQ-----GID 133

Query: 121 PRKLKPGEIDPNPETKP 137
           P+  KP  +D N   KP
Sbjct: 134 PQTHKP--LDANNIHKP 148


>gnl|CDD|215570 PLN03091, PLN03091, hypothetical protein; Provisional.
          Length = 459

 Score = 44.2 bits (104), Expect = 2e-04
 Identities = 26/81 (32%), Positives = 43/81 (53%), Gaps = 4/81 (4%)

Query: 16  QDEILKAAVMKYGKNQWSRIASL--LHRKSAKQCKARWFEWLDPSIKKTEWSREEDEKLL 73
           +DE L   + KYG   WS +     L R   K C+ RW  +L P +K+  +S++E+  ++
Sbjct: 21  EDEKLLRHITKYGHGCWSSVPKQAGLQR-CGKSCRLRWINYLRPDLKRGTFSQQEENLII 79

Query: 74  HLAKLMPTQWRTIAP-IIGRT 93
            L  ++  +W  IA  + GRT
Sbjct: 80  ELHAVLGNRWSQIAAQLPGRT 100


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 44.4 bits (104), Expect = 3e-04
 Identities = 45/186 (24%), Positives = 76/186 (40%), Gaps = 20/186 (10%)

Query: 106  DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
            D+A+K  E +  AD+ +K +  E     E K A    K  +  + E   +A       + 
Sbjct: 1500 DEAKKAAEAKKKADEAKKAE--EAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEEL 1557

Query: 166  KKA--KRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
            KKA  K+KA E +  E  +  AL+K  E + A        ++ R  +       EK+   
Sbjct: 1558 KKAEEKKKAEEAKKAEEDKNMALRKAEEAKKA--------EEARIEEVMKLYEEEKKMKA 1609

Query: 224  GFYDTSKEERLRQQHL--------DGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNL 275
                 ++E +++ + L          E   +KE  E+KK ++  K  +EN I  A     
Sbjct: 1610 EEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKK 1669

Query: 276  EPEKKR 281
              E K+
Sbjct: 1670 AEEDKK 1675



 Score = 36.7 bits (84), Expect = 0.053
 Identities = 34/204 (16%), Positives = 83/204 (40%), Gaps = 10/204 (4%)

Query: 106  DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
            ++A+K +E +  A++ +K    E     E    + D      +  +   EA+      + 
Sbjct: 1467 EEAKKADEAKKKAEEAKKAD--EAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKA 1524

Query: 166  KKAKRKAREKQLEEARRLAALQKRRELRAA-----GIEVAPRQKKKRGIDYNAEIPFEKR 220
             +AK+    K+ +EA++    +K  EL+ A       E    ++ K+  +       +  
Sbjct: 1525 DEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAE 1584

Query: 221  PAPGFYDTSKEERLRQQHLDGELRSE---KEERERKKDKQKLKQRKENDIPTAMLQNLEP 277
             A    +   EE ++    + ++++E   K E  + K ++  K  +E      + +    
Sbjct: 1585 EAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE 1644

Query: 278  EKKRSKLVLPEPQISDMELEQVVK 301
            EKK+++ +    + + ++  +  K
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAK 1668



 Score = 34.3 bits (78), Expect = 0.34
 Identities = 54/278 (19%), Positives = 108/278 (38%), Gaps = 22/278 (7%)

Query: 41   RKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
            +K+ +  KA   +  D + KK E +++ DE     AK    + +  A    + A +  + 
Sbjct: 1290 KKADEAKKAEEKKKADEAKKKAEEAKKADE-----AKKKAEEAKKKADAAKKKAEEAKKA 1344

Query: 101  YEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDE----DELEMLSEA 156
             E    +A+   +  + A+   K +  E       K A    K  +E    DE +  +E 
Sbjct: 1345 AEAAKAEAEAAADEAEAAE--EKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEE 1402

Query: 157  RARLANTQGKKA--KRKARE--KQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYN 212
              + A+   K A  K+KA E  K+ EE ++    +K+ E      E   + ++ +  +  
Sbjct: 1403 DKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEA 1462

Query: 213  AEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAML 272
             +   E + A      ++E +   +       ++K+  E KK   + K+  E        
Sbjct: 1463 KKKAEEAKKADEAKKKAEEAKKADE-------AKKKAEEAKKKADEAKKAAEAKKKADEA 1515

Query: 273  QNLEPEKKRSKLVLPEPQISDMELEQVVKLGRATEVAR 310
            +  E  KK  +    E      E ++  +  +A E+ +
Sbjct: 1516 KKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKK 1553



 Score = 34.0 bits (77), Expect = 0.46
 Identities = 41/186 (22%), Positives = 73/186 (39%), Gaps = 22/186 (11%)

Query: 109  QKKEEGEDVADDPRK---LKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
            + K+  ED     RK    K  E     E      + K M  +E +   EA+ +    + 
Sbjct: 1568 EAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKK 1627

Query: 166  KKAKRKARE----KQLEEARRLAALQKRRE---LRAAGIEVAPRQKKKRGIDYNAEIPFE 218
             + ++K  E    K+ EE ++   L+K  E   ++AA       + KK+  +       E
Sbjct: 1628 AEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDE 1687

Query: 219  KRPAPGFYDTSKEERLRQQHLDG---ELRSEKEERERKKDKQKLKQRKENDIPTAMLQNL 275
            K+           E L+++  +    E   +KE  E+KK ++  K  +EN I     +  
Sbjct: 1688 KK---------AAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAKKE 1738

Query: 276  EPEKKR 281
              E K+
Sbjct: 1739 AEEDKK 1744



 Score = 33.6 bits (76), Expect = 0.53
 Identities = 49/209 (23%), Positives = 87/209 (41%), Gaps = 19/209 (9%)

Query: 106  DQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQG 165
            D+A+KK E    AD+ +K K  E     E K    + K  DE + +     +A  A  + 
Sbjct: 1434 DEAKKKAEEAKKADEAKK-KAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKA 1492

Query: 166  KKAKRKARE--KQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
            ++AK+KA E  K  E  ++    +K  E + A       + KK      AE   EK+ A 
Sbjct: 1493 EEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAE---EKKKAD 1549

Query: 224  GFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSK 283
               +  K E L++           EE+++ ++ +K ++ K   +  A       E +  +
Sbjct: 1550 ---ELKKAEELKK----------AEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEE 1596

Query: 284  LVLPEPQISDMELEQVVKLGRATEVAREV 312
            ++    +   M+ E+  K   A   A E+
Sbjct: 1597 VMKLYEEEKKMKAEEAKKAEEAKIKAEEL 1625



 Score = 32.8 bits (74), Expect = 0.90
 Identities = 46/267 (17%), Positives = 105/267 (39%), Gaps = 17/267 (6%)

Query: 41   RKSAKQCKARWFEWLDPSIKKTEWSREEDEKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
            +K+ +  KA   +  +   K  E  + E++K + L K         A    +     +E 
Sbjct: 1546 KKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRK---------AEEAKKAEEARIEE 1596

Query: 101  YEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARL 160
               L ++ +K +  E    +  K+K  E+    E K      K  + +E +   E +   
Sbjct: 1597 VMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAE 1656

Query: 161  ANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPR--QKKKRGIDYNAEIPFE 218
               + K A+   + +  E+ ++    +K  E      E   +  ++ K+  +   +   E
Sbjct: 1657 EENKIKAAEEAKKAE--EDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEE 1714

Query: 219  KRPAPGFYDTSKEERLRQQHLDGELRSEK----EERERKKDKQKLKQRKENDIPTAMLQN 274
            K+ A       +E +++ +    E   +K    E ++ +++K+K+   K+ +   A    
Sbjct: 1715 KKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIR 1774

Query: 275  LEPEKKRSKLVLPEPQISDMELEQVVK 301
             E E    + +  E +   ME+++ +K
Sbjct: 1775 KEKEAVIEEELDEEDEKRRMEVDKKIK 1801



 Score = 30.5 bits (68), Expect = 4.4
 Identities = 33/214 (15%), Positives = 91/214 (42%), Gaps = 9/214 (4%)

Query: 105  LDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPA---RPDPKDMDEDELEMLSEARARLA 161
              +A++ ++ ++      K K  E+    E K A   +   +    +E + ++  +A  A
Sbjct: 1527 AKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEA 1586

Query: 162  NTQGKKAKRKAREKQLEEARRLAALQKRRE----LRAAGIEVAPRQKKKRGIDYNAEIPF 217
              + ++A+ +   K  EE +++ A + ++     ++A  ++ A  +KKK       E   
Sbjct: 1587 K-KAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAE- 1644

Query: 218  EKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEP 277
            EK+ A       +E +++      +   +K++ E  K  ++ +++    +     +  + 
Sbjct: 1645 EKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKA 1704

Query: 278  EKKRSKLVLPEPQISDMELEQVVKLGRATEVARE 311
            E+ + K    + +  +++  +     +A E  +E
Sbjct: 1705 EELKKKEAEEKKKAEELKKAEEENKIKAEEAKKE 1738


>gnl|CDD|212558 cd11660, SANT_TRF, Telomere repeat binding factor-like
          DNA-binding domains of the SANT/myb-like family.  Human
          telomere repeat binding factors, TRF1 and TRF2,
          function as part of the 6 component shelterin complex.
          TRF2 binds DNA and recruits RAP1 (via binding to the
          RAP1 protein c-terminal (RCT)) and TIN2 in the
          protection of telomeres from DNA repair machinery.
          Metazoan shelterin consists of 3 DNA binding proteins
          (TRF2, TRF1, and POT1) and 3 recruited proteins that
          bind to one or more of these DNA-binding proteins
          (RAP1, TIN2, TPP1).  Schizosaccharomyces pombe TAZ1 is
          an ortholog and binds RAP1. Human TRF1 and TRF2 bind
          double-stranded DNA. hTRF2 consists of a basic
          N-terminus, a TRF homology domain, the RAP1 binding
          motif (RBM), the TIN2 binding motif (TBM) and a
          myb-like DNA binding domain, SANT, named after 'SWI3,
          ADA2, N-CoR and TFIIIB', several factors that share
          this domain. Tandem copies of the domain bind telomeric
          DNA tandem repeats as part of the capping complex. The
          single myb-like domain of TRF-type proteins is similar
          to the tandem myb_like domains found in yeast RAP1.
          Length = 50

 Score = 35.2 bits (82), Expect = 0.005
 Identities = 11/38 (28%), Positives = 19/38 (50%), Gaps = 3/38 (7%)

Query: 17 DEILKAAVMKYGKNQWSRI---ASLLHRKSAKQCKARW 51
          DE L   V KYG   W++I      ++ +++   K +W
Sbjct: 8  DEALVEGVEKYGVGNWAKILKDYFFVNNRTSVDLKDKW 45


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 39.1 bits (91), Expect = 0.009
 Identities = 39/215 (18%), Positives = 67/215 (31%), Gaps = 18/215 (8%)

Query: 108 AQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQGKK 167
           A++ +      ++  K +  E     + KP         ++E +               K
Sbjct: 95  AKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDRKPKEEAKEKRP----------PK 144

Query: 168 AKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYD 227
            K K +EK++EE R     +KR  +RA      P +KK            ++R A     
Sbjct: 145 EKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAAREAV 204

Query: 228 TSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLVLP 287
             K E            +E+ E+E    K +       +   +   +    +  S L  P
Sbjct: 205 KGKPEE--------PDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKP 256

Query: 288 EPQISDMELEQVVKLGRATEVAREVAIESGSGPTS 322
           +P  S    E      R     R       + P S
Sbjct: 257 DPSPSMASPETRESSKRTETRPRTSLRPPSARPAS 291


>gnl|CDD|222688 pfam14334, DUF4390, Domain of unknown function (DUF4390).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria and eukaryotes.
           Proteins in this family are typically between 192 and
           203 amino acids in length.
          Length = 165

 Score = 34.5 bits (80), Expect = 0.087
 Identities = 17/91 (18%), Positives = 34/91 (37%), Gaps = 3/91 (3%)

Query: 460 KSFQTEQLRAGLSSLPLPKNDYEIVVPENEEMEEKASGD--VDMLEDQADVDAAAIARMK 517
           +   +      LS  PL    Y +    +   +  A+ D  +  L    +   A    ++
Sbjct: 62  EKVASATRTYRLSYDPL-TRRYRVTDGGSGLSQSFATLDEALRALGRIRNWPVADAGDLE 120

Query: 518 AQREHEMRLRSQVIQKNLPRPFDINIVLRPS 548
              ++ +RLR ++    LP+P  IN +    
Sbjct: 121 PGEDYRVRLRVRLDTSQLPKPLQINALFSSD 151


>gnl|CDD|212556 cd11658, SANT_DMAP1_like, SANT/myb-like domain of Human Dna
           Methyltransferase 1 Associated Protein 1-like.  These
           proteins are members of the SANT/myb group. SANT is
           named after 'SWI3, ADA2, N-CoR and TFIIIB', several
           factors that share this domain. The SANT domain
           resembles the 3 alpha-helix bundle of the DNA-binding
           Myb domains and is found in a diverse set of proteins.
          Length = 46

 Score = 29.7 bits (67), Expect = 0.41
 Identities = 12/42 (28%), Positives = 18/42 (42%), Gaps = 4/42 (9%)

Query: 64  WSREEDEKLLHLAKLMPTQWRTIA----PIIGRTAAQCLERY 101
           W++EE + L  L K    +W  I        GR+     E+Y
Sbjct: 1   WTKEETDYLFDLVKRFDLRWNVILDRYPFQKGRSVEDLKEKY 42


>gnl|CDD|130283 TIGR01216, ATP_synt_epsi, ATP synthase, F1 epsilon subunit (delta
           in mitochondria).  This model describes one of the five
           types of subunits in the F1 part of F1/F0 ATP synthases.
           Members of this family are designated epsilon in
           bacterial and chloroplast systems but designated delta
           in mitochondria, where the counterpart of the bacterial
           delta subunit is designated OSCP. In a few cases
           (Propionigenium modestum, Acetobacterium woodii) scoring
           above the trusted cutoff and designated here as
           exceptions, Na+ replaces H+ for translocation [Energy
           metabolism, ATP-proton motive force interconversion].
          Length = 130

 Score = 31.8 bits (73), Expect = 0.49
 Identities = 15/47 (31%), Positives = 25/47 (53%), Gaps = 3/47 (6%)

Query: 143 KDMDEDEL-EMLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQ 187
            D+DE E  + L  A   L + +  K   +A   +L++AR +L AL+
Sbjct: 85  DDIDEAEAEKALEAAEKLLESAEDDKDLAEA-LLKLKKARAQLEALE 130


>gnl|CDD|213402 cd12203, GT1, GT1, myb-like, SANT family.  GT-1, a myb-like
          protein, is one of the GT trihelix transcription
          factors. GT-1 binds the GT cis-element of rbcS-3A, a
          light-induced gene, as a dimer. Arabidopsis GT-1 is a
          trans-activator and acts in the stabilization of
          components of the transcription pre-initiation complex
          comprised of TFIIA-TBP-TATA. The isolated GT-1
          DNA-binding domain is sufficient to bind DNA. This
          region closely resembles the myb domain, but with longer
          helices. It has been proposed that GT-1 may respond to
          light signals via calcium-dependent phosphorylation to
          create a light-modulated molecular switch. These
          proteins are members of the SANT/myb group. SANT is
          named after 'SWI3, ADA2, N-CoR and TFIIIB', several
          factors that share this domain. The SANT domain
          resembles the 3 alpha-helix bundle of the DNA-binding
          Myb domains and is found in a diverse set of proteins.
          Length = 66

 Score = 30.3 bits (69), Expect = 0.52
 Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 5/39 (12%)

Query: 29 KNQWSRIASLLHRK----SAKQCKARWFEWLDPSIKKTE 63
          K  W  IA+ +       SAKQCK +W E L+   KK +
Sbjct: 29 KALWEEIAAKMRELGYNRSAKQCKEKW-ENLNKYYKKVK 66


>gnl|CDD|148065 pfam06234, TmoB, Toluene-4-monooxygenase system protein B (TmoB).
           This family consists of several Toluene-4-monooxygenase
           system protein B (TmoB) sequences. Pseudomonas mendocina
           KR1 metabolises toluene as a carbon source. The initial
           step of the pathway is hydroxylation of toluene to form
           p-cresol by a multicomponent toluene-4-monooxygenase
           (T4MO) system. TmoB adopts a ubiquitin fold. Although
           TmoB is a component of the T4MO system, its precise role
           remains unclear.
          Length = 85

 Score = 30.4 bits (69), Expect = 0.71
 Identities = 14/45 (31%), Positives = 19/45 (42%), Gaps = 7/45 (15%)

Query: 499 VDMLEDQADVDAAAIA------RMKAQREHEMRLRSQVIQKNLPR 537
           VD  ED  D  A   A      R+  +  H +R+R Q   +  PR
Sbjct: 21  VD-TEDTMDQVAEKAAHHSVGRRVPPRPGHVVRVRKQGSTELFPR 64


>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19. 
           Med19 represents a family of conserved proteins which
           are members of the multi-protein co-activator Mediator
           complex. Mediator is required for activation of RNA
           polymerase II transcription by DNA binding
           transactivators.
          Length = 178

 Score = 31.7 bits (72), Expect = 0.86
 Identities = 15/69 (21%), Positives = 30/69 (43%)

Query: 204 KKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRK 263
           KKK    +      +  P     D+   +   ++H   +   +KE +++KK+K+K K+R 
Sbjct: 110 KKKHKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRH 169

Query: 264 ENDIPTAML 272
             + P    
Sbjct: 170 SPEHPGVGF 178


>gnl|CDD|193581 cd09892, NGN_SP_RfaH, N-Utilization Substance G (NusG) N-terminal
           domain in the NusG Specialized Paralog (SP), RfaH.  RfaH
           is an operon-specific virulence regulator, thought to
           have arisen from an early duplication of N-Utilization
           Substance G (NusG). Paralogs of eubacterial NusG, NusG
           SP (Specialized Paralog of NusG), are more diverse and
           often found as the first ORF in operons encoding
           secreted proteins and LPS biosynthesis genes. NusG SP
           family members are operon-specific transcriptional
           antitermination factors. NusG is essential in
           Escherichia coli and is associated with RNA polymerase
           elongation and Rho-termination in bacteria. In contrast,
           RfaH is a non-essential protein that controls expression
           of operons containing an ops (operon polarity
           suppressor) element in their transcribed DNA. RfaH and
           NusG are different in their response to Rho-dependent
           terminators and regulatory targets. The NusG N-terminal
           (NGN) domain is quite similar in all NusG orthologs, but
           its C-terminal domains and the linker that separate
           these two domains are different. The domain organization
           of NusG and its homologs suggest that the common
           properties of NusG and RfaH are due to their similar NGN
           domains.
          Length = 96

 Score = 30.2 bits (69), Expect = 0.86
 Identities = 11/22 (50%), Positives = 12/22 (54%)

Query: 419 STPGVRDSVRGGATPTPIRDRL 440
           ST GV   VR G  P P+ D L
Sbjct: 69  STRGVSRLVRFGGEPAPVPDAL 90


>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
          Length = 330

 Score = 32.1 bits (74), Expect = 1.1
 Identities = 11/39 (28%), Positives = 19/39 (48%)

Query: 231 EERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPT 269
           +E L+   LD +      + ++KK+KQ+ K  K    P 
Sbjct: 53  KEELKAALLDKKELKAWHKAQKKKEKQEAKAAKAKSKPR 91


>gnl|CDD|234796 PRK00571, atpC, F0F1 ATP synthase subunit epsilon; Validated.
          Length = 135

 Score = 30.5 bits (70), Expect = 1.5
 Identities = 14/50 (28%), Positives = 18/50 (36%), Gaps = 3/50 (6%)

Query: 143 KDMDEDELE-MLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQKRR 190
            D+DE   E     A   L N        +A+   L  A  RL   +K R
Sbjct: 87  DDIDEARAEEAKERAEEALENKHDDVDYARAQAA-LARAIARLRVAEKLR 135


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 31.5 bits (72), Expect = 1.5
 Identities = 14/53 (26%), Positives = 27/53 (50%), Gaps = 7/53 (13%)

Query: 143 KDMDEDELEMLSEARARLANTQGKKAKRKAREKQL--EEARRLAALQKRRELR 193
           K  +E+  E   E +        KK +R+A+  +L  EE R+L   +++++ R
Sbjct: 274 KAAEEERQEEAQEKKEEK-----KKEEREAKLAKLSPEEQRKLEEKERKKQAR 321


>gnl|CDD|221313 pfam11917, DUF3435, Protein of unknown function (DUF3435).  This
           family of proteins is functionally uncharacterized.
           This protein is found in eukaryotes. Proteins in this
           family are typically between 435 and 791 amino acids in
           length. This family is related to pfam00589 suggesting
           it may be an integrase enzyme.
          Length = 418

 Score = 31.6 bits (72), Expect = 1.7
 Identities = 26/118 (22%), Positives = 49/118 (41%), Gaps = 4/118 (3%)

Query: 502 LEDQADVDAAAIARMKAQREHEMRLRSQVIQK-NLPRPFDINIVLRPS-NSDPPLSELQK 559
           L  + D D  A+      +E  +R  +++ +  +  RP D+    + S   DP L EL +
Sbjct: 236 LPRRVDRDVQAVVLGLPPQEALIRAATRMSRTRDPRRPRDLTDEQKASVEEDPELQELIR 295

Query: 560 AEELIKQEMITMLH--YDALETPLSVDKKAAKQSNILTDEEHYNFLKHRPYRNFSLEE 615
             + +K+E+I +      A  TPL    +  ++      +     LK +    F  E+
Sbjct: 296 KRDHLKKEIIALYGQVAKAKGTPLYERLEKRRREVRNERQRLRRELKKKIREEFDEEQ 353


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 30.8 bits (70), Expect = 1.7
 Identities = 21/86 (24%), Positives = 36/86 (41%), Gaps = 11/86 (12%)

Query: 179 EARRLAALQKRRELRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQH 238
           +  +L   Q RR+ R A  E    ++KK       E   E+            E  R++ 
Sbjct: 7   KRAKLEEKQARRQQREA-EEEEREERKKLEEKREGERKEEEE----------LEEEREKK 55

Query: 239 LDGELRSEKEERERKKDKQKLKQRKE 264
            + E R E+EE+ RK+ ++  K +  
Sbjct: 56  KEEEERKEREEQARKEQEEYEKLKSS 81


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 31.2 bits (70), Expect = 2.1
 Identities = 42/172 (24%), Positives = 67/172 (38%), Gaps = 17/172 (9%)

Query: 99  ERYEFLLDQAQKKEEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEM------ 152
           ER E    +   K E ++   D  + +  E +P PE +         + +   M      
Sbjct: 122 EREEVEETEGVTKSEQKNDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKH 181

Query: 153 ----LSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRG 208
                S   A  A  +  K   K ++KQ E A  L  L+K+RE R   +E   +++K+  
Sbjct: 182 TENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEE 241

Query: 209 IDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLK 260
            D  +    EKR         KEE  R++    E R +  E    +DK+  K
Sbjct: 242 ADRKSREEEEKR-------RLKEEIERRRAEAAEKRQKVPEDGLSEDKKPFK 286


>gnl|CDD|150406 pfam09727, CortBP2, Cortactin-binding protein-2.  This entry is the
           first approximately 250 residues of cortactin-binding
           protein 2. In addition to being a positional candidate
           for autism this protein is expressed at highest levels
           in the brain in humans. The human protein has six
           associated ankyrin repeat domains pfam00023 towards the
           C-terminus which act as protein-protein interaction
           domains.
          Length = 193

 Score = 30.6 bits (69), Expect = 2.2
 Identities = 17/107 (15%), Positives = 44/107 (41%), Gaps = 6/107 (5%)

Query: 163 TQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQKKKRGI-----DYNAEIPF 217
                 +    EK + E  ++ A QK  + R     +A  +++++ +     +    I +
Sbjct: 70  GAEDPEQEDIYEKPMSELDKVMAKQKETQRRMLAQLLAAEKRQRKTVLELEEEKRKHIRY 129

Query: 218 EKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKE 264
            K+         +E    ++ L+ E +S++ ++E++  K      +E
Sbjct: 130 MKKSDDFTNLLEQERERLKKLLEQE-KSQQAKKEQEHRKLLATLEEE 175


>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal.  This
           domain is found to the N-terminus of bacterial signal
           peptidases of the S49 family (pfam01343).
          Length = 154

 Score = 30.2 bits (69), Expect = 2.3
 Identities = 14/50 (28%), Positives = 19/50 (38%), Gaps = 7/50 (14%)

Query: 232 ERLRQQHLDGELRSEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKR 281
           E L    LD +     E+ E+K +K K K  K+           E  K R
Sbjct: 56  ESLEAALLDKKELKAWEKAEKKAEKAKAKAEKKK-------AKKEEPKPR 98


>gnl|CDD|233254 TIGR01059, gyrB, DNA gyrase, B subunit.  This model describes the
           common type II DNA topoisomerase (DNA gyrase). Two
           apparently independently arising families, one in the
           Proteobacteria and one in Gram-positive lineages, are
           both designated topoisomerase IV. Proteins scoring
           above the noise cutoff for this model and below the
           trusted cutoff for topoisomerase IV models probably
           should be designated GyrB [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 654

 Score = 31.2 bits (71), Expect = 2.4
 Identities = 19/68 (27%), Positives = 29/68 (42%), Gaps = 3/68 (4%)

Query: 245 SEKEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLV--LPEPQI-SDMELEQVVK 301
           SE  E    +D +  +Q  E  IP   L+ +   KK    V   P+P+I    E +  + 
Sbjct: 122 SEWLEVTVFRDGKIYRQEFERGIPLGPLEVVGETKKTGTTVRFWPDPEIFETTEFDFDIL 181

Query: 302 LGRATEVA 309
             R  E+A
Sbjct: 182 AKRLRELA 189


>gnl|CDD|216991 pfam02357, NusG, Transcription termination factor nusG. 
          Length = 90

 Score = 28.8 bits (65), Expect = 2.4
 Identities = 10/21 (47%), Positives = 11/21 (52%)

Query: 419 STPGVRDSVRGGATPTPIRDR 439
           STPGV   V  G  P P+ D 
Sbjct: 70  STPGVTGFVGFGGKPAPVPDE 90


>gnl|CDD|213874 TIGR03857, F420_MSMEG_2249, probable F420-dependent oxidoreductase,
           MSMEG_2249 family.  Coenzyme F420 has a limited
           phylogenetic distribution, including methanogenic
           archaea, Mycobacterium tuberculosis and related species,
           Colwellia psychrerythraea 34H, Rhodopseudomonas
           palustris HaA2, and others. Partial phylogenetic
           profiling identifies protein subfamilies, within the
           larger family called luciferase-like monooxygenases
           (pfam00296), that appear only in F420-positive genomes
           and are likely to be F420-dependent. This model
           describes a distinctive subfamily, found only in
           F420-biosynthesizing members of the Actinobacteria of
           the bacterial luciferase-like monooxygenase (LLM)
           superfamily [Unknown function, Enzymes of unknown
           specificity].
          Length = 329

 Score = 30.9 bits (70), Expect = 2.5
 Identities = 13/31 (41%), Positives = 20/31 (64%), Gaps = 1/31 (3%)

Query: 70  EKLLHLAKLMPTQWRTIAPIIGRTAAQCLER 100
           E+L+ +A L+P +W   +  IG +AAQC  R
Sbjct: 278 EQLVDVADLIPDEWLEASAAIG-SAAQCARR 307


>gnl|CDD|178307 PLN02705, PLN02705, beta-amylase.
          Length = 681

 Score = 30.7 bits (69), Expect = 3.6
 Identities = 17/48 (35%), Positives = 25/48 (52%), Gaps = 5/48 (10%)

Query: 247 KEERERKKDKQKLKQRKENDIPTAMLQNLEPEKKRSKLVLPEPQISDM 294
           K ERE++K++ KL++R    I + ML  L     R     P P  +DM
Sbjct: 78  KREREKEKERTKLRERHRRAITSRMLAGL-----RQYGNFPLPARADM 120


>gnl|CDD|178635 PLN03086, PLN03086, PRLI-interacting factor K; Provisional.
          Length = 567

 Score = 30.6 bits (69), Expect = 3.8
 Identities = 22/74 (29%), Positives = 41/74 (55%), Gaps = 8/74 (10%)

Query: 239 LDGELRS--EKEERERKKDKQKLK-----QRKENDIPTAMLQNLEPEKKRSKLVLPEPQI 291
           +D ELR   EK ERE+++ KQ+ K     +RK  +      + +E  ++  +L   E QI
Sbjct: 1   MDFELRRAREKLEREQRERKQRAKLKLERERKAKEEAAKQREAIEAAQRSRRLDAIEAQI 60

Query: 292 -SDMELEQVVKLGR 304
            +D ++++ ++ GR
Sbjct: 61  KADQQMQESLQAGR 74


>gnl|CDD|167649 PRK03963, PRK03963, V-type ATP synthase subunit E; Provisional.
          Length = 198

 Score = 29.7 bits (67), Expect = 3.9
 Identities = 18/58 (31%), Positives = 34/58 (58%), Gaps = 6/58 (10%)

Query: 151 EMLSEARARLANTQGKKAKRKAREKQ---LEEARRLAALQKRRELRAAGIEVAPRQKK 205
            +L EA+ + A    ++A+++A  K    L +A+  A L+K+R +  A +EV  R+K+
Sbjct: 21  YILEEAQ-KEAEKIKEEARKRAESKAEWILRKAKTQAELEKQRIIANAKLEV--RRKR 75


>gnl|CDD|204055 pfam08764, Coagulase, Staphylococcus aureus coagulase.
           Staphylococcus aureus secretes a cofactor called
           coagulase. Coagulase is an extracellular protein that
           forms a complex with human prothrombin, and activates it
           without the usual proteolytic cleavages. The resulting
           complex directly initiates blood clotting.
          Length = 282

 Score = 30.1 bits (68), Expect = 4.4
 Identities = 16/45 (35%), Positives = 23/45 (51%)

Query: 486 PENEEMEEKASGDVDMLEDQADVDAAAIARMKAQREHEMRLRSQV 530
           P +EE EEKA+ +V  L  + D   AA    K    H   LR+++
Sbjct: 146 PYSEEEEEKATDEVYDLVSEIDTLYAAYYGDKQHGTHAKELRAKL 190


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 30.4 bits (69), Expect = 4.5
 Identities = 17/69 (24%), Positives = 25/69 (36%), Gaps = 5/69 (7%)

Query: 131 PNPETKPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRR 190
           P            + D+   E+  + RA+  +    + K    E   EEA RL  L+  R
Sbjct: 224 PPKPPMTPEEKDDEYDQRVRELTFDRRAQPTD----RTK-TEEELAKEEAERLKKLEAER 278

Query: 191 ELRAAGIEV 199
             R  G E 
Sbjct: 279 LRRMRGEEE 287


>gnl|CDD|236709 PRK10531, PRK10531, acyl-CoA thioesterase; Provisional.
          Length = 422

 Score = 30.1 bits (68), Expect = 4.6
 Identities = 15/54 (27%), Positives = 25/54 (46%), Gaps = 1/54 (1%)

Query: 297 EQVVKLGRATEV-AREVAIESGSGPTSDALLTDYSIGTGAAMKTPRTPAPQTDR 349
           + V +LGRA +V A   A  +G+      ++ D +I  G     P   A  ++R
Sbjct: 338 DGVAQLGRAWDVDAGLSAYVNGANTNGQVVIRDSAINEGFNTAKPWADAVTSNR 391


>gnl|CDD|204335 pfam09905, DUF2132, Uncharacterized conserved protein (DUF2132). 
          This domain, found in various hypothetical prokaryotic
          proteins, has no known function.
          Length = 64

 Score = 27.1 bits (61), Expect = 5.5
 Identities = 11/24 (45%), Positives = 14/24 (58%), Gaps = 7/24 (29%)

Query: 56 DPSIK-------KTEWSREEDEKL 72
          +PSIK       KT W+RE+ E L
Sbjct: 39 NPSIKSSLKFLRKTPWAREKVENL 62


>gnl|CDD|236973 PRK11767, PRK11767, SpoVR family protein; Provisional.
          Length = 498

 Score = 29.8 bits (68), Expect = 5.6
 Identities = 14/28 (50%), Positives = 18/28 (64%), Gaps = 1/28 (3%)

Query: 730 PRRIASLTEDVNRQKEREAVLQERFGAL 757
           P++I SL E+  RQKERE  LQ +   L
Sbjct: 182 PQKI-SLQEEKARQKEREEYLQSQVNDL 208


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 30.3 bits (68), Expect = 5.6
 Identities = 19/73 (26%), Positives = 34/73 (46%), Gaps = 20/73 (27%)

Query: 140 PDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELR-----A 194
            + K + E+ELE            + KK + KA+EK   E ++L A QK  + +     A
Sbjct: 8   AEKKILTEEELE------------RKKKKEEKAKEK---ELKKLKAAQKEAKAKLQAQQA 52

Query: 195 AGIEVAPRQKKKR 207
           +     P++ +K+
Sbjct: 53  SDGTNVPKKSEKK 65


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 29.8 bits (67), Expect = 6.2
 Identities = 28/80 (35%), Positives = 37/80 (46%), Gaps = 9/80 (11%)

Query: 144 DMDEDELE--MLSEARARLANTQGKKAKRKA-REKQLEEARRLAALQKRREL-RA-AGIE 198
            MDE   E  +L E R  L     K AK +A RE+Q EE RR    +   E  RA A  E
Sbjct: 240 GMDEHSFEDFLLEEKRRELE----KLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAE 295

Query: 199 VAPRQKKKRGIDYNAEIPFE 218
           V  R++K + +   A    +
Sbjct: 296 VEKRREKLQNLLKKASRSAD 315


>gnl|CDD|222361 pfam13751, DDE_Tnp_1_6, Transposase DDE domain.  Transposase
           proteins are necessary for efficient DNA transposition.
           This domain is a member of the DDE superfamily, which
           contain three carboxylate residues that are believed to
           be responsible for coordinating metal ions needed for
           catalysis.
          Length = 125

 Score = 28.5 bits (64), Expect = 6.5
 Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 6/51 (11%)

Query: 162 NTQGKKAKRKAREKQLEEARRLAALQKRRELRAAGIEVAPRQ-KKKRGIDY 211
             + ++A+RKARE+   E  +        + R+ G+E    Q K+  G+  
Sbjct: 54  RPELEEARRKARERLKSEEGK-----ALYKKRSIGVEGVFGQIKRNLGLRR 99


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 29.7 bits (67), Expect = 7.1
 Identities = 32/156 (20%), Positives = 69/156 (44%), Gaps = 9/156 (5%)

Query: 147 EDELEMLSEARARLANTQGKKAKRKAREKQLEEARR-LAALQKRRELRAAGIEVAPRQKK 205
           E+++E L E    +   + +    +A  ++LEE    L +L++R E     +E    + +
Sbjct: 287 EEKIERLEELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEKLEEKLEKLESELE 346

Query: 206 KRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHLDGELRSEKEERERKKDKQKLKQRKEN 265
           +   + N      +          KE   R + L+ EL  +  ER ++ ++   + ++E 
Sbjct: 347 ELAEEKNELAKLLEE-------RLKELEERLEELEKELE-KALERLKQLEEAIQELKEEL 398

Query: 266 DIPTAMLQNLEPEKKRSKLVLPEPQISDMELEQVVK 301
              +A L+ ++ E +  +  L E +    ELE+ +K
Sbjct: 399 AELSAALEEIQEELEELEKELEELERELEELEEEIK 434


>gnl|CDD|237546 PRK13889, PRK13889, conjugal transfer relaxase TraA; Provisional.
          Length = 988

 Score = 29.7 bits (67), Expect = 7.1
 Identities = 44/219 (20%), Positives = 78/219 (35%), Gaps = 35/219 (15%)

Query: 133 PETKPARPDPKDM-DEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRE 191
                  P+     + +     ++A AR      + A R+AR + L   R   A+     
Sbjct: 770 LPDPVPGPEAGRRPERESAAATTDAPARTVAADPEAALRQARTRALV--RHARAVDAIFR 827

Query: 192 LRAAGIEVAPRQKK---KRGIDYNAEIPFEKRPAPGFYDTSKEER-----------LRQQ 237
           ++  G  V P Q K   +    +    P+    A   Y  + E             +R  
Sbjct: 828 MQEQGGPVLPHQVKELQEARKAFEEVRPYGSHDAEAAYKKNPELAAEAASGRPARAIRAL 887

Query: 238 HLDGELRSEKE-------ERERKKDKQKLKQRKENDIPTA---------MLQNLEPEKKR 281
            L+ ELR++         ER +K D+   +Q +  D+            M ++LE + + 
Sbjct: 888 QLETELRTDPARRADRFVERWQKLDRASQRQYQAGDMSGYKATRAAMGDMAKSLERDPQL 947

Query: 282 SKLVLPEPQISDMELEQVVKLGRATEVAREVAIESGSGP 320
             L+    +   +  E   +LGR  E+A    I+ G G 
Sbjct: 948 ESLLAGRKRELGIGFESGRRLGR--ELAFSHGIDLGRGR 984


>gnl|CDD|238570 cd01164, FruK_PfkB_like, 1-phosphofructokinase (FruK), minor
           6-phosphofructokinase (pfkB) and related sugar kinases.
           FruK plays an important role in the predominant pathway
           for fructose utilisation. This group also contains
           tagatose-6-phosphate kinase, an enzyme of the tagatose
           6-phosphate pathway, which is responsible for breakdown of
           the galactose moiety during lactose metabolism by
           bacteria such as L. lactis.
          Length = 289

 Score = 29.4 bits (67), Expect = 7.5
 Identities = 14/47 (29%), Positives = 25/47 (53%), Gaps = 2/47 (4%)

Query: 275 LEPEKKRSKLVLPEPQISDMELEQVVK-LGRATEVAREVAIESGSGP 320
            E +   +++  P P+IS+ ELE +++ L    +    V + SGS P
Sbjct: 94  KEEDGTETEINEPGPEISEEELEALLEKLKALLKKGDIVVL-SGSLP 139


>gnl|CDD|198428 cd10030, UDG_F4_TTUDGA_like, Family 4 Uracil-DNA glycosylase (UDG),
           found exclusively in thermophilic organisms.  The
           enzymes of Family 4 Uracil-DNA glycosylase (UDG), found
           only in thermophilic organisms, are thermostable
           enzymes. Uracil-DNA glycosylases (UDGs) are DNA repair
           enzymes that catalyze the removal of mismatched uracil
           from DNA to initiate DNA base excision repair pathway.
           The Thermus thermophilus enzyme TTUDGA removes uracil
           from both ssDNA and dsDNA, but not thymine from a G:T
           mismatch. These details suggest that the mechanism by
           which Family 4 UDGs remove uracils from DNA is similar
           to that of Family 1 enzymes. The thermostability of the
           enzyme may be linked to the presence of an iron-sulfur
           cluster, salt-bridges and ion pairs on the molecular
           surface as well as prolines on loops and turns, as
           commonly found in the Family 4 enzymes. Uracil in DNA
           can arise as a result of mis-incorporation of dUMP
           residues by DNA polymerase or deamination of cytosine.
           Uracil mispaired with guanine in DNA is one of the major
           pro-mutagenic events, causing G:C->A:T mutations.
          Length = 164

 Score = 28.6 bits (65), Expect = 7.6
 Identities = 11/33 (33%), Positives = 20/33 (60%), Gaps = 2/33 (6%)

Query: 601 NFLKHRPY--RNFSLEELEAADDLLKREMDLVK 631
           N +K RP   R  + EE+ A    L+R+++L++
Sbjct: 68  NVVKCRPPGNRTPTPEEIAACRPFLERQIELIR 100


>gnl|CDD|222409 pfam13837, Myb_DNA-bind_4, Myb/SANT-like DNA-binding domain.
          This presumed domain appears to be related to other
          Myb/SANT-like DNA binding domains. In particular
          pfam10545 seems most related. This family is greatly
          expanded in plants and appears in several proteins
          annotated as transposon proteins.
          Length = 84

 Score = 27.2 bits (61), Expect = 7.7
 Identities = 10/28 (35%), Positives = 15/28 (53%), Gaps = 4/28 (14%)

Query: 28 GKNQWSRIASLLHR----KSAKQCKARW 51
           K+ W  IA  +      +SA+QCK +W
Sbjct: 31 NKHVWEEIAEKMAERGYNRSAEQCKEKW 58


>gnl|CDD|215901 pfam00401, ATP-synt_DE, ATP synthase, Delta/Epsilon chain, long
           alpha-helix domain.  Part of the ATP synthase CF(1).
           These subunits are part of the head unit of the ATP
           synthase. This subunit is called epsilon in bacteria and
           delta in mitochondria. In bacteria the delta (D) subunit
           is equivalent to the mitochondrial Oligomycin sensitive
           subunit, OSCP (pfam00213).
          Length = 48

 Score = 26.3 bits (59), Expect = 8.0
 Identities = 16/48 (33%), Positives = 23/48 (47%), Gaps = 3/48 (6%)

Query: 143 KDMDEDE-LEMLSEARARLANTQGKKAKRKAREKQLEEAR-RLAALQK 188
           +D+D +   E    A   LA  +G K   +A E  L+ AR RL A + 
Sbjct: 1   EDIDLERAEEAKERAEEALAKAEGDKEYIRA-EAALKRARARLRAAKL 47


>gnl|CDD|235766 PRK06276, PRK06276, acetolactate synthase catalytic subunit;
           Reviewed.
          Length = 586

 Score = 29.3 bits (66), Expect = 8.1
 Identities = 15/55 (27%), Positives = 24/55 (43%)

Query: 142 PKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEARRLAALQKRRELRAAG 196
           PKD+ E EL++         +  G K        Q+++A  L A  +R  + A G
Sbjct: 158 PKDVQEGELDLEKYPIPAKIDLPGYKPTTFGHPLQIKKAAELIAEAERPVILAGG 212


>gnl|CDD|182521 PRK10528, PRK10528, multifunctional acyl-CoA thioesterase I and
           protease I and lysophospholipase L1; Provisional.
          Length = 191

 Score = 28.6 bits (64), Expect = 9.0
 Identities = 11/39 (28%), Positives = 19/39 (48%), Gaps = 5/39 (12%)

Query: 78  LMPTQWRTIAPII-----GRTAAQCLERYEFLLDQAQKK 111
           L+  +W++   ++     G T+ Q L R   LL Q Q +
Sbjct: 35  LLNDKWQSKTSVVNASISGDTSQQGLARLPALLKQHQPR 73


>gnl|CDD|217512 pfam03359, GKAP, Guanylate-kinase-associated protein (GKAP)
           protein. 
          Length = 342

 Score = 29.0 bits (65), Expect = 9.0
 Identities = 18/62 (29%), Positives = 22/62 (35%), Gaps = 11/62 (17%)

Query: 121 PRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEARARLANTQGKKAKRKAREKQLEEA 180
           P+K   G I     ++    D  D          EAR+RLA      AKR A  KQ    
Sbjct: 277 PKKPAKGPIVKPAISREKSLDSSDRQR------QEARSRLA-----AAKRAASFKQNSAT 325

Query: 181 RR 182
             
Sbjct: 326 ES 327


>gnl|CDD|191489 pfam06297, PET, PET Domain.  This domain is suggested to be
           involved in protein-protein interactions. The family is
           found in conjunction with pfam00412.
          Length = 106

 Score = 27.7 bits (62), Expect = 9.9
 Identities = 22/76 (28%), Positives = 32/76 (42%), Gaps = 13/76 (17%)

Query: 199 VAPRQKKKRGIDYNAEIPFEKRPAPGFYDTSKEERLRQQHL-------DGELR--SEKEE 249
           V P    +    Y   +P EK P  G    S+ E+ R++ L       D + R      E
Sbjct: 24  VPPGLTPELVHRYMELLPEEKVPVVG----SEGEKYRRRQLLHQLPPHDQDPRYCHGLSE 79

Query: 250 RERKKDKQKLKQRKEN 265
            E K+ +  +KQRKE 
Sbjct: 80  EEVKELEDFVKQRKEE 95


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 28.9 bits (65), Expect = 9.9
 Identities = 38/168 (22%), Positives = 67/168 (39%), Gaps = 36/168 (21%)

Query: 112 EEGEDVADDPRKLKPGEIDPNPETKPARPDPKDMDEDELEMLSEAR-------ARLANTQ 164
           EE ++ A   ++    E+    E + A  +  + ++++   L+  R       A++   +
Sbjct: 136 EEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASELATTRILTPADFAKIQELR 195

Query: 165 GKKAKRKAREKQLEEARRLAALQKRRE-LRAAGIEVAPRQKKKRGIDYNAEIPFEKRPAP 223
            +K   KA   +L+   + A  +   E + A  IE  P +KKK                 
Sbjct: 196 LEKGVDKALGGKLKRRDKDAPERHSDELVDADDIE-GPAKKKK----------------- 237

Query: 224 GFYDTSKEERLRQQHLDGELRSEKEERERKKDKQ------KLKQRKEN 265
                +KEER+       E R +   R+ KKDK+      K K RK+N
Sbjct: 238 ----QTKEERIATAKEGREDREKFGSRKGKKDKEGKSTTNKEKARKKN 281


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.312    0.130    0.364 

Gapped
Lambda     K      H
   0.267   0.0696    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 40,259,013
Number of extensions: 4088732
Number of successful extensions: 5264
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4831
Number of HSP's successfully gapped: 307
Length of query: 769
Length of database: 10,937,602
Length adjustment: 104
Effective length of query: 665
Effective length of database: 6,324,786
Effective search space: 4205982690
Effective search space used: 4205982690
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 63 (28.1 bits)