RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy3224
         (1278 letters)



>gnl|CDD|203444 pfam06424, PRP1_N, PRP1 splicing factor, N-terminal.  This domain
           is specific to the N-terminal part of the prp1 splicing
           factor, which is involved in mRNA splicing (and possibly
           also poly(A)+ RNA nuclear export and cell cycle
           progression). This domain is specific to the N terminus
           of the RNA splicing factor encoded by prp1. It is
           involved in mRNA splicing and possibly also poly(A)+
           RNA nuclear export and cell cycle progression.
          Length = 131

 Score =  192 bits (489), Expect = 8e-57
 Identities = 83/151 (54%), Positives = 101/151 (66%), Gaps = 20/151 (13%)

Query: 239 APLGYVAGVGRGATGFTTRSDIGPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNF 298
           AP GYVAG+GRGATGFTTRSDIGPARD  D               DEE++D +   D   
Sbjct: 1   APPGYVAGLGRGATGFTTRSDIGPARDGVD-------------IDDEEDEDPKRYQD--- 44

Query: 299 DEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKI 358
               G    LF+   YD +DEEAD IYE ID+RMDE+RK  RE++ +EE+E+YR+E PKI
Sbjct: 45  ----GDNEGLFSDGKYDDEDEEADRIYESIDERMDERRKKRREQKEKEEIEKYREENPKI 100

Query: 359 QQQFSDLKRGLVTVSMDEWKNVPEVGDARNR 389
           QQQF+DLKR L TV+ DEW N+PEVGD   +
Sbjct: 101 QQQFADLKRNLATVTEDEWANIPEVGDYTRK 131



 Score =  178 bits (453), Expect = 6e-52
 Identities = 78/141 (55%), Positives = 94/141 (66%), Gaps = 20/141 (14%)

Query: 70  APLGYVAGVGRGATGFTTRSDIGPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNF 129
           AP GYVAG+GRGATGFTTRSDIGPARD  D               DEE++D +   D   
Sbjct: 1   APPGYVAGLGRGATGFTTRSDIGPARDGVD-------------IDDEEDEDPKRYQD--- 44

Query: 130 DEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKI 189
               G    LF+   YD +DEEAD IYE ID+RMDE+RK  RE++ +EE+E+YR+E PKI
Sbjct: 45  ----GDNEGLFSDGKYDDEDEEADRIYESIDERMDERRKKRREQKEKEEIEKYREENPKI 100

Query: 190 QQQFSDLKRGLVTVSMDEWKN 210
           QQQF+DLKR L TV+ DEW N
Sbjct: 101 QQQFADLKRNLATVTEDEWAN 121


>gnl|CDD|234059 TIGR02917, PEP_TPR_lipo, putative PEP-CTERM system TPR-repeat
            lipoprotein.  This protein family occurs strictly
            within a subset of Gram-negative bacterial species with
            the proposed PEP-CTERM/exosortase system, analogous to
            the LPXTG/sortase system common in Gram-positive
            bacteria. This protein occurs in a species if and only if
            a transmembrane histidine kinase (TIGR02916) and a
            DNA-binding response regulator (TIGR02915) also occur.
            The presence of tetratricopeptide repeats (TPR) suggests
            protein-protein interaction, possibly for the regulation
            of PEP-CTERM protein expression, since many PEP-CTERM
            proteins in these genomes are preceded by a proposed DNA
            binding site for the response regulator.
          Length = 899

 Score = 65.5 bits (160), Expect = 1e-10
 Identities = 122/609 (20%), Positives = 209/609 (34%), Gaps = 106/609 (17%)

Query: 495  NDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCEENQTSEDLWLE 554
            N   +AR L+  V   +P +  A +    L    G ++ A     K              
Sbjct: 173  NRFDEARALIDEVLTADPGNVDALLLKGDLLLSLGNIELALAAYRK-------------- 218

Query: 555  AARLQPVDTARAVIAQAVRHIPTSVRIWIKAADLE-TETKAKRRVYRKALEHIPNS--VR 611
            A  L+P + A  ++A A         I I+A + E  E  A        L+  PNS    
Sbjct: 219  AIALRPNNIA-VLLALAT--------ILIEAGEFEEAEKHAD-----ALLKKAPNSPLAH 264

Query: 612  LWKAAVELE--DPEDARILLSRAVECCPTSVELWL----ALARLETYENARKVLNKAREN 665
              KA V+ +  + EDAR  L  A++  P  +   L    +  +L   E A + LN+  + 
Sbjct: 265  YLKALVDFQKKNYEDARETLQDALKSAPEYLPALLLAGASEYQLGNLEQAYQYLNQILKY 324

Query: 666  IPTDRQIWTTAAKLEEAHGNNAMVDKIIDRALSSLSANGVEINREHWFKEAIEAEKAGSV 725
             P   Q     A ++       +    +D A+++LS                 A      
Sbjct: 325  APNSHQARRLLASIQ-------LRLGRVDEAIATLSP----------------ALGLDPD 361

Query: 726  HTCQALIRAIIGYGVEQEDRKHTWMEDAESCANQGAYECARAIYAQALATFPSKKSIWLR 785
                    +++G           ++         G +E A    A+A    P   +   +
Sbjct: 362  ---DPAALSLLGE---------AYLA-------LGDFEKAAEYLAKATELDPENAAARTQ 402

Query: 786  AAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKNHGTRESL 845
                + + G        L+ A    P+     L+   S     +LR+  F+K     + L
Sbjct: 403  LGISKLSQGDPSEAIADLETAAQLDPELGRADLLLILS-----YLRSGQFDKALAAAKKL 457

Query: 846  ETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPN--SEEIWLAAVKL 903
            E          P +  L  +        GD+  AR     A    P+       LA + +
Sbjct: 458  E-------KKQPDNASLHNLLGAIYLGKGDLAKAREAFEKALSIEPDFFPAAANLARIDI 510

Query: 904  ESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAS 963
            +  N +        A  R +      +P +    LA   L       E A   L KA   
Sbjct: 511  QEGNPD-------DAIQRFEK-VLTIDPKNLRAILALAGLYLRTGNEEEAVAWLEKAAEL 562

Query: 964  APT---PRVMIQSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKA 1020
             P    P + +    L      L++AL +L+EA    PD  + W+M G+ +     L+KA
Sbjct: 563  NPQEIEPALALAQYYLG--KGQLKKALAILNEAADAAPDSPEAWLMLGRAQLAAGDLNKA 620

Query: 1021 HDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKGRLRNPNCAELWLAAIRVEI 1080
              +F + +   P S    ++LA+     K   KA + L++     P+  E  +   ++ +
Sbjct: 621  VSSFKKLLALQPDSALALLLLADAYAVMKNYAKAITSLKRALELKPDNTEAQIGLAQLLL 680

Query: 1081 RAGLKDIAN 1089
             A   + A 
Sbjct: 681  AAKRTESAK 689



 Score = 65.1 bits (159), Expect = 2e-10
 Identities = 127/619 (20%), Positives = 210/619 (33%), Gaps = 99/619 (15%)

Query: 495  NDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCEENQTSEDLWLE 554
            N  K A + LK+  + +PN   A     ++    G   AA        +E + +  L   
Sbjct: 36   NKYKAAIIQLKNALQKDPNDAEARFLLGKIYLALGDYAAAE-------KELRKALSLGYP 88

Query: 555  AARLQPVDTARAVIAQAVRHIPTSVRIWIKAADLETETKAKRRVYRKALEHIPNSVRLWK 614
              ++ P   ARA + Q        +        L+ E  A+    R              
Sbjct: 89   KNQVLP-LLARAYLLQ--GKFQQVLDELPGKTLLDDEGAAELLALRGL------------ 133

Query: 615  AAVELEDPEDARILLSRAVECCPTSVELWLALARLET----YENARKVLNKARENIPTDR 670
            A + L   E A+    +A+   P S+   L LA+L      ++ AR ++++     P + 
Sbjct: 134  AYLGLGQLELAQKSYEQALAIDPRSLYAKLGLAQLALAENRFDEARALIDEVLTADPGNV 193

Query: 671  QIWTTAAKLEEAHGNNAMVDKIIDRALSSLSANGVEINREHWFKEAIEAEK-AGSVHTCQ 729
                    L  + GN       I+ AL++             +++AI       +V    
Sbjct: 194  DALLLKGDLLLSLGN-------IELALAA-------------YRKAIALRPNNIAVLLAL 233

Query: 730  ALIRAIIGYGVEQEDRKHTWMEDAESCAN---QGAYECARAIYAQALATFPSKKSIWLRA 786
            A I       +E  +      E+AE  A+   + A     A Y +AL  F  K       
Sbjct: 234  ATIL------IEAGE-----FEEAEKHADALLKKAPNSPLAHYLKALVDFQKKN------ 276

Query: 787  AYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKNHGTRESLE 846
              +E     RE+L+  L+ A  + P    L L GA          + Y   N    E   
Sbjct: 277  --YED---ARETLQDALKSAPEYLP---ALLLAGA----------SEYQLGNL---EQAY 315

Query: 847  TLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESE 906
              L + + + P S     + A  +   G V  A   LS A   +P+         +    
Sbjct: 316  QYLNQILKYAPNSHQARRLLASIQLRLGRVDEAIATLSPALGLDPDDPAALSLLGEAYLA 375

Query: 907  NNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAPT 966
              ++E+A   LAKA          P +          +    +   A   L  A    P 
Sbjct: 376  LGDFEKAAEYLAKATELD------PENAAARTQLGISKLSQGDPSEAIADLETAAQLDPE 429

Query: 967  P---RVMIQSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDT 1023
                 +++  + L       ++AL    +  K  PD A L  + G I   K  L KA + 
Sbjct: 430  LGRADLLLILSYLR--SGQFDKALAAAKKLEKKQPDNASLHNLLGAIYLGKGDLAKAREA 487

Query: 1024 FSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKGRLRNPNCAELWLAAIRVEIRAG 1083
            F +A+   P   P    LA ++ +      A    EK    +P      LA   + +R G
Sbjct: 488  FEKALSIEPDFFPAAANLARIDIQEGNPDDAIQRFEKVLTIDPKNLRAILALAGLYLRTG 547

Query: 1084 LKDIANTMMAKALQECPNA 1102
             ++ A   + KA +  P  
Sbjct: 548  NEEEAVAWLEKAAELNPQE 566



 Score = 60.1 bits (146), Expect = 7e-09
 Identities = 76/385 (19%), Positives = 127/385 (32%), Gaps = 51/385 (13%)

Query: 867  AKSKWLAGDVPAARGILSLAFQANPNSEE-------IWLAAVKLESENNEYERARRL--- 916
            AKS        AA   L  A Q +PN  E       I+LA     +   E  +A  L   
Sbjct: 29   AKSYLQKNKYKAAIIQLKNALQKDPNDAEARFLLGKIYLALGDYAAAEKELRKALSLGYP 88

Query: 917  -------LAKARAQAGAFQ------------ANPNSEEIWLAAVKLESENN-EYERARRL 956
                   LA+A    G FQ             +  + E+ LA   L      + E A++ 
Sbjct: 89   KNQVLPLLARAYLLQGKFQQVLDELPGKTLLDDEGAAEL-LALRGLAYLGLGQLELAQKS 147

Query: 957  LAKARASAPT-PRVMIQSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKN 1015
              +A A  P      +  A+L    +  + A  L+DE +   P      ++KG +     
Sbjct: 148  YEQALAIDPRSLYAKLGLAQLALAENRFDEARALIDEVLTADPGNVDALLLKGDLLLSLG 207

Query: 1016 LLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKGRLRNPNCAELWLAA 1075
             ++ A   + +AI   P+++ + + LA +        +A    +    + PN        
Sbjct: 208  NIELALAAYRKAIALRPNNIAVLLALATILIEAGEFEEAEKHADALLKKAPNSPLAHYLK 267

Query: 1076 IRVEIRAGLKDIANTMMAKALQECPN--AGILWAEAIF--LEPRPQRKTKSVDALKKCEH 1131
              V+ +    + A   +  AL+  P     +L A A    L    Q        LK   +
Sbjct: 268  ALVDFQKKNYEDARETLQDALKSAPEYLPALLLAGASEYQLGNLEQAYQYLNQILKYAPN 327

Query: 1132 DPHV--LLAVSKLFWCENKNQKCHRSGSRRCMGVKTKSVDALKKCEHDPHVLLAVSKLFW 1189
                  LLA  +L                           AL     DP  L  + + + 
Sbjct: 328  SHQARRLLASIQL---RLGRV----------DEAIATLSPALGLDPDDPAALSLLGEAYL 374

Query: 1190 CENKNQKCREWFNRTVKIDPDLGDA 1214
                 +K  E+  +  ++DP+   A
Sbjct: 375  ALGDFEKAAEYLAKATELDPENAAA 399



 Score = 45.1 bits (107), Expect = 3e-04
 Identities = 51/243 (20%), Positives = 93/243 (38%), Gaps = 7/243 (2%)

Query: 867  AKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQAGA 926
            A+          AR ++     A+P + +  L    L       E A     KA      
Sbjct: 166  AQLALAENRFDEARALIDEVLTADPGNVDALLLKGDLLLSLGNIELALAAYRKAI----- 220

Query: 927  FQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAP-TPRVMIQSAKLEWCLDNLER 985
                PN+  + LA   +  E  E+E A +        AP +P      A +++   N E 
Sbjct: 221  -ALRPNNIAVLLALATILIEAGEFEEAEKHADALLKKAPNSPLAHYLKALVDFQKKNYED 279

Query: 986  ALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLE 1045
            A + L +A+K  P++    ++ G  E Q   L++A+   +Q +K  P+S     +LA+++
Sbjct: 280  ARETLQDALKSAPEYLPALLLAGASEYQLGNLEQAYQYLNQILKYAPNSHQARRLLASIQ 339

Query: 1046 ERRKMLIKARSVLEKGRLRNPNCAELWLAAIRVEIRAGLKDIANTMMAKALQECPNAGIL 1105
             R   + +A + L      +P+            +  G  + A   +AKA +  P     
Sbjct: 340  LRLGRVDEAIATLSPALGLDPDDPAALSLLGEAYLALGDFEKAAEYLAKATELDPENAAA 399

Query: 1106 WAE 1108
              +
Sbjct: 400  RTQ 402



 Score = 42.0 bits (99), Expect = 0.002
 Identities = 93/476 (19%), Positives = 143/476 (30%), Gaps = 67/476 (14%)

Query: 495 NDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCEENQTSEDLWLE 554
            D+ KAR   +      P+  PA    AR++   G    A     K    +  +    L 
Sbjct: 479 GDLAKAREAFEKALSIEPDFFPAAANLARIDIQEGNPDDAIQRFEKVLTIDPKNLRAILA 538

Query: 555 AARLQPVDTARAVIAQAVRHIPTSVRIWI-KAADLETETKAKRRVYRKALEHIPNSVRLW 613
            A L                       W+ KAA+L  +                  + L 
Sbjct: 539 LAGL-----------YLRTGNEEEAVAWLEKAAELNPQEIEPA-------------LALA 574

Query: 614 KAAVELEDPEDARILLSRAVECCPTSVELWLALARLE----TYENARKVLNKARENIPTD 669
           +  +     + A  +L+ A +  P S E WL L R +        A     K     P  
Sbjct: 575 QYYLGKGQLKKALAILNEAADAAPDSPEAWLMLGRAQLAAGDLNKAVSSFKKLLALQPDS 634

Query: 670 RQIWTTAAKLEEAHGNNAMVDKIIDRALSSLSANGVEINREHWF---KEAIEAEKAGSVH 726
                  A       N A     + RAL     N              +  E+ K  +  
Sbjct: 635 ALALLLLADAYAVMKNYAKAITSLKRALELKPDNTEAQIGLAQLLLAAKRTESAKKIAKS 694

Query: 727 TCQALIRAIIGYGVEQEDRKHTWMEDAESCANQGAYECARAIYAQALATFPSKKSIWLRA 786
             +   +A +G+ +E            +    Q  Y  A   Y +AL   PS ++     
Sbjct: 695 LQKQHPKAALGFELE-----------GDLYLRQKDYPAAIQAYRKALKRAPSSQNAIKLH 743

Query: 787 AYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKS-NKKSIWLRAAYFEKNHGTRESL 845
                +  T E+++TL      H P   VL    A+    +  + +A          +  
Sbjct: 744 RALLASGNTAEAVKTLEAWLKTH-PNDAVLRTALAELYLAQKDYDKAI---------KHY 793

Query: 846 ETLLQKAVAHCPKSEVL-WLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLE 904
           +T+++KA  +      L WL      +L    P A      A +  PN   I      L 
Sbjct: 794 QTVVKKAPDNAVVLNNLAWL------YLELKDPRALEYAERALKLAPNIPAILDTLGWLL 847

Query: 905 SENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKA 960
            E  E +RA  LL KA   A      P +  I                AR+ L K 
Sbjct: 848 VEKGEADRALPLLRKAVNIA------PEAAAIRYHLALALLATGRKAEARKELDKL 897



 Score = 32.4 bits (74), Expect = 2.0
 Identities = 96/490 (19%), Positives = 158/490 (32%), Gaps = 83/490 (16%)

Query: 551  LWLEAARLQPVDTARAVIAQAVRHIPTSVRIW--IKAADLETETKAK-RRVYRKALEHIP 607
            L L   R    D A A   +  +  P +  +   + A  L     AK R  + KAL   P
Sbjct: 437  LILSYLRSGQFDKALAAAKKLEKKQPDNASLHNLLGAIYLGKGDLAKAREAFEKALSIEP 496

Query: 608  NSVRLWKAAVEL-------EDPEDARILLSRAVECCPTSVELWLALARLETYENARKVLN 660
            +      AA  L        +P+DA     + +   P ++   LALA L           
Sbjct: 497  DFF---PAAANLARIDIQEGNPDDAIQRFEKVLTIDPKNLRAILALAGLY---------- 543

Query: 661  KARENIPTDRQIWTTAAKLEEAHGNNAMVDKIIDRALSSLSANGVEINREHWFKEAIEAE 720
              R     +   W     LE+A   N    + I+ AL+ L+   +   +    K+A+   
Sbjct: 544  -LRTGNEEEAVAW-----LEKAAELNP---QEIEPALA-LAQYYLGKGQ---LKKALAIL 590

Query: 721  KAGSVHTCQALIRAIIGYGVEQEDRKHTWMEDAESCANQGAYECARAIYAQALATFPSKK 780
                                   D    W+    +    G    A + + + LA  P   
Sbjct: 591  N-----EAADAAP----------DSPEAWLMLGRAQLAAGDLNKAVSSFKKLLALQPDSA 635

Query: 781  SIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKNHG 840
               L  A             T L++A+   P +    +  A+                  
Sbjct: 636  LALLLLADAYAVMKNYAKAITSLKRALELKPDNTEAQIGLAQ------------LLLAAK 683

Query: 841  TRESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAA 900
              ES + + +      PK+ + + +         D PAA      A +  P+S      A
Sbjct: 684  RTESAKKIAKSLQKQHPKAALGFELEGDLYLRQKDYPAAIQAYRKALKRAPSS----QNA 739

Query: 901  VKLESENNEYERARRLLAKARAQAGAFQA-NPNSEEIWLAAVKLESENNEYERARRLLAK 959
            +KL    +    A    A+A     A+   +PN   +  A  +L     +Y++A +    
Sbjct: 740  IKL----HRALLASGNTAEAVKTLEAWLKTHPNDAVLRTALAELYLAQKDYDKAIKHYQT 795

Query: 960  ARASAPTPRVMIQSAKLEWCLDNL---------ERALQLLDEAIKVFPDFAKLWMMKGQI 1010
                AP   V++ +  L W    L         ERAL+L      +      L + KG+ 
Sbjct: 796  VVKKAPDNAVVLNN--LAWLYLELKDPRALEYAERALKLAPNIPAILDTLGWLLVEKGEA 853

Query: 1011 EEQKNLLDKA 1020
            +    LL KA
Sbjct: 854  DRALPLLRKA 863


>gnl|CDD|223533 COG0457, NrfG, FOG: TPR repeat [General function prediction only].
          Length = 291

 Score = 55.6 bits (132), Expect = 5e-08
 Identities = 55/237 (23%), Positives = 95/237 (40%), Gaps = 8/237 (3%)

Query: 871  WLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQAGAFQAN 930
             L G++  A  +L  A +  PNS+   L  +   +          L  +   +A   +  
Sbjct: 34   ELLGELAEALELLEEALELLPNSDLAGLLLLLALALLKLGRLEEAL--ELLEKALELELL 91

Query: 931  PNSEEIWLAAVKLESENNEYERARRLLAKARASAPTPRVMIQSAKLE--WCLDNLERALQ 988
            PN  E  L    L     +YE A  LL KA A  P P +      L   + L + E AL+
Sbjct: 92   PNLAEALLNLGLLLEALGKYEEALELLEKALALDPDPDLAEALLALGALYELGDYEEALE 151

Query: 989  LLDEAIKVFP---DFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCP-HSVPLWIMLANL 1044
            L ++A+++ P   + A+  +  G + E     ++A +   +A+K  P       + L  L
Sbjct: 152  LYEKALELDPELNELAEALLALGALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLL 211

Query: 1045 EERRKMLIKARSVLEKGRLRNPNCAELWLAAIRVEIRAGLKDIANTMMAKALQECPN 1101
              +     +A    EK    +P+ AE       + +  G  + A   + KAL+  P+
Sbjct: 212  YLKLGKYEEALEYYEKALELDPDNAEALYNLALLLLELGRYEEALEALEKALELDPD 268



 Score = 46.8 bits (109), Expect = 4e-05
 Identities = 57/247 (23%), Positives = 87/247 (35%), Gaps = 8/247 (3%)

Query: 813  SEVLWLMGAKSNKKSIWLRAAYFEKNHGTRESLETLLQKAVAHC-PKSEVLWLMGAKSKW 871
             E L L+        + L A    K     E+LE L +       P      L       
Sbjct: 47   EEALELLPNSDLAGLLLLLALALLKLGRLEEALELLEKALELELLPNLAEALLNLGLLLE 106

Query: 872  LAGDVPAARGILSLAFQANPNSEEIW-LAAVKLESENNEYERARRLLAKARAQAGAFQAN 930
              G    A  +L  A   +P+ +    L A+    E  +YE A  L  KA          
Sbjct: 107  ALGKYEEALELLEKALALDPDPDLAEALLALGALYELGDYEEALELYEKALE---LDPEL 163

Query: 931  PNSEEIWLAAVKLESENNEYERARRLLAKARASAPT--PRVMIQSAKLEWCLDNLERALQ 988
                E  LA   L      YE A  LL KA    P      ++    L   L   E AL+
Sbjct: 164  NELAEALLALGALLEALGRYEEALELLEKALKLNPDDDAEALLNLGLLYLKLGKYEEALE 223

Query: 989  LLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPL-WIMLANLEER 1047
              ++A+++ PD A+       +  +    ++A +   +A++  P    L   +L  L E 
Sbjct: 224  YYEKALELDPDNAEALYNLALLLLELGRYEEALEALEKALELDPDLYNLGLALLLLLAEA 283

Query: 1048 RKMLIKA 1054
             ++L KA
Sbjct: 284  LELLEKA 290


>gnl|CDD|238112 cd00189, TPR, Tetratricopeptide repeat domain; typically contains 34
            amino acids
            [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-
            X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in
            a variety of organisms including bacteria, cyanobacteria,
            yeast, fungi, plants, and humans in various subcellular
            locations; involved in a variety of functions including
            protein-protein interactions, but common features in the
            interaction partners have not been defined; involved in
            chaperone, cell-cycle, transcription, and protein
            transport complexes; the number of TPR motifs varies
            among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats
            generate a right-handed helical structure with an
            amphipathic channel that is thought to accommodate an
            alpha-helix of a target protein; it has been proposed
            that TPR proteins preferably interact with WD-40 repeat
            proteins, but in many instances several TPR-proteins seem
            to aggregate to multi-protein complexes; examples of
            TPR-proteins include, Cdc16p, Cdc23p and Cdc27p
            components of the cyclosome/APC, the Pex5p/Pas10p
            receptor for peroxisomal targeting signals, the Tom70p
            co-receptor for mitochondrial targeting signals, Ser/Thr
            phosphatase 5C and the p110 subunit of O-GlcNAc
            transferase; three copies of the repeat are present here.
          Length = 100

 Score = 48.9 bits (117), Expect = 4e-07
 Identities = 15/87 (17%), Positives = 37/87 (42%)

Query: 974  AKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPH 1033
              L + L + + AL+  ++A+++ PD A  +        +    ++A + + +A++  P 
Sbjct: 7    GNLYYKLGDYDEALEYYEKALELDPDNADAYYNLAAAYYKLGKYEEALEDYEKALELDPD 66

Query: 1034 SVPLWIMLANLEERRKMLIKARSVLEK 1060
            +   +  L     +     +A    EK
Sbjct: 67   NAKAYYNLGLAYYKLGKYEEALEAYEK 93



 Score = 43.1 bits (102), Expect = 5e-05
 Identities = 13/54 (24%), Positives = 29/54 (53%)

Query: 980  LDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPH 1033
            L   E AL+  ++A+++ PD AK +   G    +    ++A + + +A++  P+
Sbjct: 47   LGKYEEALEDYEKALELDPDNAKAYYNLGLAYYKLGKYEEALEAYEKALELDPN 100



 Score = 39.3 bits (92), Expect = 0.001
 Identities = 21/72 (29%), Positives = 33/72 (45%), Gaps = 8/72 (11%)

Query: 599 YRKALEHIPNSVRLW--KAAV--ELEDPEDARILLSRAVECCPTSVELWLALA----RLE 650
           Y KALE  P++   +   AA   +L   E+A     +A+E  P + + +  L     +L 
Sbjct: 23  YEKALELDPDNADAYYNLAAAYYKLGKYEEALEDYEKALELDPDNAKAYYNLGLAYYKLG 82

Query: 651 TYENARKVLNKA 662
            YE A +   KA
Sbjct: 83  KYEEALEAYEKA 94



 Score = 38.1 bits (89), Expect = 0.003
 Identities = 19/100 (19%), Positives = 40/100 (40%)

Query: 1002 KLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKG 1061
            +  +  G +  +    D+A + + +A++  P +   +  LA    +     +A    EK 
Sbjct: 1    EALLNLGNLYYKLGDYDEALEYYEKALELDPDNADAYYNLAAAYYKLGKYEEALEDYEKA 60

Query: 1062 RLRNPNCAELWLAAIRVEIRAGLKDIANTMMAKALQECPN 1101
               +P+ A+ +        + G  + A     KAL+  PN
Sbjct: 61   LELDPDNAKAYYNLGLAYYKLGKYEEALEAYEKALELDPN 100



 Score = 34.3 bits (79), Expect = 0.068
 Identities = 18/105 (17%), Positives = 35/105 (33%), Gaps = 6/105 (5%)

Query: 862 LWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKAR 921
             L      +  GD   A      A + +P++ + +        +  +YE A     K  
Sbjct: 2   ALLNLGNLYYKLGDYDEALEYYEKALELDPDNADAYYNLAAAYYKLGKYEEALEDYEK-- 59

Query: 922 AQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAPT 966
               A + +P++ + +        +  +YE A     KA    P 
Sbjct: 60  ----ALELDPDNAKAYYNLGLAYYKLGKYEEALEAYEKALELDPN 100


>gnl|CDD|233013 TIGR00540, TPR_hemY_coli, heme biosynthesis-associated TPR protein.
            Members of this protein family are uncharacterized
            tetratricopeptide repeat (TPR) proteins invariably found
            in heme biosynthesis gene clusters. The absence of any
            invariant residues other than Ala argues against this
            protein serving as an enzyme per se. The gene symbol hemY
            assigned in E. coli is unfortunate in that an unrelated
            protein, protoporphyrinogen oxidase (HemG in E. coli) is
            designated HemY in Bacillus subtilis [Unknown function,
            General].
          Length = 367

 Score = 46.9 bits (112), Expect = 5e-05
 Identities = 41/243 (16%), Positives = 83/243 (34%), Gaps = 39/243 (16%)

Query: 840  GTRESLETLLQKAVAHCPKSEVLWLM-GAKSKWLAGDVPAARGILSLAFQANPNSEEIWL 898
            G  E  +  L +A    P SE+  L+  A+      D  AA   L       P    +  
Sbjct: 113  GDEERRDRYLAEAAELAPNSELAVLLTRAELLLDQRDYEAALAALDSLQAQAPRHTAVLR 172

Query: 899  AAVKLESENNEYERARRLLAKAR-------AQAGAFQ--------ANPNSE------EIW 937
             A++    +  ++   +LL   R        +A   +             E        W
Sbjct: 173  LALRAYQRSGNWDALLKLLPALRKAKALSPEEAARLEQQAYIGLLDEAREEDADALKTWW 232

Query: 938  --------------LAAVKLESENNEYERARRLLAKARASAPTPRVMIQSAKLEWCLDNL 983
                          +AA +   +  +++ A +L+ +A      P ++    +L+    + 
Sbjct: 233  KQLPRAERQEPELAVAAAEALIQLGDHDEAEKLIEEALKKEWDPELLRLYGRLQ--PGDP 290

Query: 984  ERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLAN 1043
               ++  ++ +K  PD A L +  G++  ++ L  KA      ++     +    + LA 
Sbjct: 291  SPLIKRAEKWLKKHPDDALLLLALGRLCLRQQLWGKAQSYLEASL-SLAPTEEAHLELAQ 349

Query: 1044 LEE 1046
            L E
Sbjct: 350  LFE 352


>gnl|CDD|225605 COG3063, PilF, Tfp pilus assembly protein PilF [Cell motility and
            secretion / Intracellular trafficking and secretion].
          Length = 250

 Score = 45.9 bits (109), Expect = 6e-05
 Identities = 42/210 (20%), Positives = 77/210 (36%), Gaps = 29/210 (13%)

Query: 906  ENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAP 965
            +  +Y +A++ L KA       + +P+     L       +  E + A     KA + AP
Sbjct: 47   QQGDYAQAKKNLEKA------LEHDPSYYLAHLVRAHYYQKLGENDLADESYRKALSLAP 100

Query: 966  TPRVMIQSAKLEWCLDNL----------ERALQLLDEAIK--VFPDFAKLWMMKGQIEEQ 1013
                   +  +   L+N           E A+Q  + A+    + + +      G    +
Sbjct: 101  ------NNGDV---LNNYGAFLCAQGRPEEAMQQFERALADPAYGEPSDTLENLGLCALK 151

Query: 1014 KNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKGRLR-NPNCAELW 1072
                D+A +   +A++  P   P  + LA L  +      AR  LE+ + R       L 
Sbjct: 152  AGQFDQAEEYLKRALELDPQFPPALLELARLHYKAGDYAPARLYLERYQQRGGAQAESLL 211

Query: 1073 LAAIRVEIRAGLKDIANTMMAKALQECPNA 1102
            L  IR+  R G +  A    A+  +  P +
Sbjct: 212  LG-IRIAKRLGDRAAAQRYQAQLQRLFPYS 240



 Score = 35.1 bits (81), Expect = 0.17
 Identities = 26/89 (29%), Positives = 34/89 (38%), Gaps = 4/89 (4%)

Query: 495 NDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCEE-NQTSEDLWL 553
               +A   LK   E +P  PPA +  ARL    G    AR  + +  +     +E L L
Sbjct: 153 GQFDQAEEYLKRALELDPQFPPALLELARLHYKAGDYAPARLYLERYQQRGGAQAESLLL 212

Query: 554 E---AARLQPVDTARAVIAQAVRHIPTSV 579
               A RL     A+   AQ  R  P S 
Sbjct: 213 GIRIAKRLGDRAAAQRYQAQLQRLFPYSE 241


>gnl|CDD|227122 COG4783, COG4783, Putative Zn-dependent protease, contains TPR
            repeats [General function prediction only].
          Length = 484

 Score = 45.9 bits (109), Expect = 1e-04
 Identities = 27/159 (16%), Positives = 55/159 (34%), Gaps = 7/159 (4%)

Query: 864  LMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQ 923
            L  A+ +     +P  +    LA ++                   +Y+ A +LL    A 
Sbjct: 276  LARARIRAKYEALPNQQAADLLAKRSKRGGLAAQYGRALQTYLAGQYDEALKLLQPLIAA 335

Query: 924  AGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAPTPRVMIQS-AKLEWCLDN 982
                   P++      A  +  E N+ + A   L KA A  P   ++  + A+       
Sbjct: 336  Q------PDNPYYLELAGDILLEANKAKEAIERLKKALALDPNSPLLQLNLAQALLKGGK 389

Query: 983  LERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAH 1021
             + A+++L+  +   P+    W +  Q   +     +A 
Sbjct: 390  PQEAIRILNRYLFNDPEDPNGWDLLAQAYAELGNRAEAL 428



 Score = 45.9 bits (109), Expect = 1e-04
 Identities = 28/117 (23%), Positives = 48/117 (41%), Gaps = 6/117 (5%)

Query: 983  LERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLA 1042
             + AL+LL   I   PD      + G I  + N   +A +   +A+   P+S  L + LA
Sbjct: 322  YDEALKLLQPLIAAQPDNPYYLELAGDILLEANKAKEAIERLKKALALDPNSPLLQLNLA 381

Query: 1043 NLEERRKMLIKARSVLEKGRLRNPNCAELW--LAAIRVEIRAGLKDIANTMMAKALQ 1097
                +     +A  +L +    +P     W  LA    +  A L + A  ++A+A  
Sbjct: 382  QALLKGGKPQEAIRILNRYLFNDPEDPNGWDLLA----QAYAELGNRAEALLARAEG 434



 Score = 33.5 bits (77), Expect = 0.74
 Identities = 26/127 (20%), Positives = 43/127 (33%), Gaps = 7/127 (5%)

Query: 839 HGTRESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWL 898
               +     L+KA+A  P S +L L  A++    G    A  IL+     +P     W 
Sbjct: 353 ANKAKEAIERLKKALALDPNSPLLQLNLAQALLKGGKPQEAIRILNRYLFNDPEDPNGWD 412

Query: 899 AAVKLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERAR---R 955
              +  +E      A      ARA+  A         I+L     + +    + AR   R
Sbjct: 413 LLAQAYAELGNRAEALL----ARAEGYALAGRLEQAIIFLMRASQQVKLGFPDWARADAR 468

Query: 956 LLAKARA 962
           +    + 
Sbjct: 469 IDQLRQQ 475



 Score = 33.1 bits (76), Expect = 0.99
 Identities = 23/114 (20%), Positives = 43/114 (37%), Gaps = 8/114 (7%)

Query: 582 WIKAADLETETKAKRRVYRKALEHIPNSVRLWKAAVELE----DPEDARILLSRAVECCP 637
                 L  +     ++ +  +   P++    + A ++       ++A   L +A+   P
Sbjct: 312 RALQTYLAGQYDEALKLLQPLIAAQPDNPYYLELAGDILLEANKAKEAIERLKKALALDP 371

Query: 638 TSVELWLALAR----LETYENARKVLNKARENIPTDRQIWTTAAKLEEAHGNNA 687
            S  L L LA+        + A ++LN+   N P D   W   A+     GN A
Sbjct: 372 NSPLLQLNLAQALLKGGKPQEAIRILNRYLFNDPEDPNGWDLLAQAYAELGNRA 425



 Score = 30.1 bits (68), Expect = 9.0
 Identities = 30/170 (17%), Positives = 55/170 (32%), Gaps = 19/170 (11%)

Query: 760 GAYECARAIYAQALATFPSKKSIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLM 819
           G Y+ A  +    +A  P        A          +     L+KA+A  P S +L L 
Sbjct: 320 GQYDEALKLLQPLIAAQPDNPYYLELAGDILLEANKAKEAIERLKKALALDPNSPLLQLN 379

Query: 820 GAKSNKKSIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVP-- 877
            A+          A  +   G  +    +L + + + P+    W + A++    G+    
Sbjct: 380 LAQ----------ALLKG--GKPQEAIRILNRYLFNDPEDPNGWDLLAQAYAELGNRAEA 427

Query: 878 -AARGILSLAFQANPNSEEIWLAAVKLESENNEYERAR---RLLAKARAQ 923
             AR     A         I+L     + +    + AR   R+    +  
Sbjct: 428 LLARA-EGYALAGRLEQAIIFLMRASQQVKLGFPDWARADARIDQLRQQN 476


>gnl|CDD|214642 smart00386, HAT, HAT (Half-A-TPR) repeats.  Present in several
            RNA-binding proteins. Structurally and sequentially
            thought to be similar to TPRs.
          Length = 33

 Score = 38.7 bits (91), Expect = 3e-04
 Identities = 12/30 (40%), Positives = 19/30 (63%)

Query: 1018 DKAHDTFSQAIKKCPHSVPLWIMLANLEER 1047
            ++A   + +A++K P SV LW+  A  EER
Sbjct: 4    ERARKIYERALEKFPKSVELWLKYAEFEER 33



 Score = 38.7 bits (91), Expect = 4e-04
 Identities = 15/33 (45%), Positives = 19/33 (57%)

Query: 760 GAYECARAIYAQALATFPSKKSIWLRAAYFEKN 792
           G  E AR IY +AL  FP    +WL+ A FE+ 
Sbjct: 1   GDIERARKIYERALEKFPKSVELWLKYAEFEER 33



 Score = 36.4 bits (85), Expect = 0.002
 Identities = 11/33 (33%), Positives = 16/33 (48%)

Query: 650 ETYENARKVLNKARENIPTDRQIWTTAAKLEEA 682
              E ARK+  +A E  P   ++W   A+ EE 
Sbjct: 1   GDIERARKIYERALEKFPKSVELWLKYAEFEER 33



 Score = 35.2 bits (82), Expect = 0.006
 Identities = 12/25 (48%), Positives = 15/25 (60%)

Query: 596 RRVYRKALEHIPNSVRLWKAAVELE 620
           R++Y +ALE  P SV LW    E E
Sbjct: 7   RKIYERALEKFPKSVELWLKYAEFE 31



 Score = 33.7 bits (78), Expect = 0.020
 Identities = 16/31 (51%), Positives = 18/31 (58%)

Query: 620 EDPEDARILLSRAVECCPTSVELWLALARLE 650
            D E AR +  RA+E  P SVELWL  A  E
Sbjct: 1   GDIERARKIYERALEKFPKSVELWLKYAEFE 31



 Score = 32.9 bits (76), Expect = 0.042
 Identities = 10/29 (34%), Positives = 14/29 (48%)

Query: 1053 KARSVLEKGRLRNPNCAELWLAAIRVEIR 1081
            +AR + E+   + P   ELWL     E R
Sbjct: 5    RARKIYERALEKFPKSVELWLKYAEFEER 33



 Score = 30.6 bits (70), Expect = 0.23
 Identities = 10/31 (32%), Positives = 18/31 (58%)

Query: 559 QPVDTARAVIAQAVRHIPTSVRIWIKAADLE 589
             ++ AR +  +A+   P SV +W+K A+ E
Sbjct: 1   GDIERARKIYERALEKFPKSVELWLKYAEFE 31



 Score = 29.8 bits (68), Expect = 0.49
 Identities = 11/39 (28%), Positives = 18/39 (46%), Gaps = 6/39 (15%)

Query: 908 NEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESE 946
            + ERAR++  +A  +       P S E+WL   + E  
Sbjct: 1   GDIERARKIYERALEK------FPKSVELWLKYAEFEER 33



 Score = 29.4 bits (67), Expect = 0.76
 Identities = 10/32 (31%), Positives = 15/32 (46%)

Query: 495 NDIKKARLLLKSVRETNPNHPPAWIASARLEE 526
            DI++AR + +   E  P     W+  A  EE
Sbjct: 1   GDIERARKIYERALEKFPKSVELWLKYAEFEE 32



 Score = 26.7 bits (60), Expect = 7.2
 Identities = 11/45 (24%), Positives = 18/45 (40%), Gaps = 12/45 (26%)

Query: 794 GTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKN 838
           G  E    + ++A+   PKS              +WL+ A FE+ 
Sbjct: 1   GDIERARKIYERALEKFPKSV------------ELWLKYAEFEER 33


>gnl|CDD|219834 pfam08424, NRDE-2, NRDE-2, necessary for RNA interference.  This is a
            family of eukaryotic proteins. Eukaryotic cells express a
            wide variety of endogenous small regulatory RNAs that
            regulate heterochromatin formation, developmental timing,
            defence against parasitic nucleic acids, and genome
            rearrangement. Many small regulatory RNAs are thought to
            function in nuclei, and in plants and fungi small
            interfering (si)RNAs associate with nascent transcripts
            and direct chromatin and/or DNA modifications. This
            family protein, NRDE-2, is required for small interfering
            (si)RNA-mediated silencing in nuclei. NRDE-2 associates
            with the Argonaute protein NRDE-3 within nuclei and is
            recruited by NRDE-3/siRNA complexes to nascent
            transcripts that have been targeted by RNA interference,
            RNAi, the process whereby double-stranded RNA (dsRNA)
            directs the sequence-specific degradation of mRNA.
          Length = 324

 Score = 42.4 bits (100), Expect = 0.001
 Identities = 23/126 (18%), Positives = 48/126 (38%), Gaps = 27/126 (21%)

Query: 927  FQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAPTPRVMIQSAKLEWCLDNLERA 986
             + NP   + W+  ++ + E     R R   A+ +  A                   E+ 
Sbjct: 12   VRENPEDIDAWIELIRFQEELLRLSRRRSTKAERKQLA-------------------EKK 52

Query: 987  LQLLDEAIKVFPDFAKLWMMK----GQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLA 1042
            L +L++A+K  PD  +L +       ++ +   LL +    + + +K+ P S  LW    
Sbjct: 53   LSILEKALKHNPDSERLLLGLLEEGEKVWDTDELLKR----WEKVLKENPGSPKLWRKYL 108

Query: 1043 NLEERR 1048
            +  +  
Sbjct: 109  DFRQGD 114



 Score = 36.6 bits (85), Expect = 0.083
 Identities = 29/131 (22%), Positives = 46/131 (35%), Gaps = 24/131 (18%)

Query: 543 EENQTSEDLWLEAARLQPVDTARAVIAQAVRHIPTSVRIWIKAADLETETKAKRRVYRKA 602
            EN    D W+E  R Q               +  S R   KA   +   + K  +  KA
Sbjct: 13  RENPEDIDAWIELIRFQE------------ELLRLSRRRSTKAERKQL-AEKKLSILEKA 59

Query: 603 LEHIPNSVRLW----KAAVELEDPEDARILLSRAVECCPTSVELWLALARLE-------T 651
           L+H P+S RL     +   ++ D ++      + ++  P S +LW              +
Sbjct: 60  LKHNPDSERLLLGLLEEGEKVWDTDELLKRWEKVLKENPGSPKLWRKYLDFRQGDFSTFS 119

Query: 652 YENARKVLNKA 662
           Y   RK   K 
Sbjct: 120 YSKVRKTYEKC 130


>gnl|CDD|227518 COG5191, COG5191, Uncharacterized conserved protein, contains HAT
            (Half-A-TPR) repeat [General function prediction only].
          Length = 435

 Score = 42.3 bits (99), Expect = 0.002
 Identities = 21/98 (21%), Positives = 43/98 (43%), Gaps = 1/98 (1%)

Query: 984  ERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLAN 1043
            ++ +  L  +   F +  K+W        +K +  +  + F++ + K P +V LWI    
Sbjct: 90   QKKIFELYRSTNKFFNDPKIWSQYAAYVIKKKMYGEMKNIFAECLTKHPLNVDLWIYCCA 149

Query: 1044 LEERRKMLIKA-RSVLEKGRLRNPNCAELWLAAIRVEI 1080
             E      I++ R++  KG   N     +W+   R+E+
Sbjct: 150  FELFEIANIESSRAMFLKGLRMNSRSPRIWIEYFRMEL 187



 Score = 38.0 bits (88), Expect = 0.031
 Identities = 16/98 (16%), Positives = 32/98 (32%), Gaps = 7/98 (7%)

Query: 848 LLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLE-SE 906
            L ++         +W   A            + I +     +P + ++W+     E  E
Sbjct: 95  ELYRSTNKFFNDPKIWSQYAAYVIKKKMYGEMKNIFAECLTKHPLNVDLWIYCCAFELFE 154

Query: 907 NNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLE 944
               E +R +  K        + N  S  IW+   ++E
Sbjct: 155 IANIESSRAMFLK------GLRMNSRSPRIWIEYFRME 186



 Score = 36.1 bits (83), Expect = 0.13
 Identities = 12/50 (24%), Positives = 24/50 (48%), Gaps = 1/50 (2%)

Query: 511 NPNHPPAWIASARLE-EVTGKVQAARNLIMKGCEENQTSEDLWLEAARLQ 559
           +P +   WI     E      ++++R + +KG   N  S  +W+E  R++
Sbjct: 137 HPLNVDLWIYCCAFELFEIANIESSRAMFLKGLRMNSRSPRIWIEYFRME 186



 Score = 34.9 bits (80), Expect = 0.24
 Identities = 25/119 (21%), Positives = 48/119 (40%), Gaps = 7/119 (5%)

Query: 501 RLLLKSVRETN--PNHPPAWIASARLEEVTGKVQAARNLIMKGCEENQTSEDLWLEAARL 558
           + + +  R TN   N P  W   A            +N+  +   ++  + DLW+     
Sbjct: 91  KKIFELYRSTNKFFNDPKIWSQYAAYVIKKKMYGEMKNIFAECLTKHPLNVDLWIYCCAF 150

Query: 559 Q-----PVDTARAVIAQAVRHIPTSVRIWIKAADLETETKAKRRVYRKALEHIPNSVRL 612
           +      ++++RA+  + +R    S RIWI+   +E     K    R+  E + N + L
Sbjct: 151 ELFEIANIESSRAMFLKGLRMNSRSPRIWIEYFRMELMYITKLINRREKTEILSNEIGL 209



 Score = 33.4 bits (76), Expect = 0.75
 Identities = 13/82 (15%), Positives = 32/82 (39%), Gaps = 1/82 (1%)

Query: 824 NKKSIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSKW-LAGDVPAARGI 882
           N   IW + A +         ++ +  + +   P +  LW+     +     ++ ++R +
Sbjct: 105 NDPKIWSQYAAYVIKKKMYGEMKNIFAECLTKHPLNVDLWIYCCAFELFEIANIESSRAM 164

Query: 883 LSLAFQANPNSEEIWLAAVKLE 904
                + N  S  IW+   ++E
Sbjct: 165 FLKGLRMNSRSPRIWIEYFRME 186



 Score = 33.4 bits (76), Expect = 0.75
 Identities = 15/70 (21%), Positives = 26/70 (37%), Gaps = 1/70 (1%)

Query: 749 WMEDAESCANQGAYECARAIYAQALATFPSKKSIWLRAAYFE-KNHGTRESLETLLQKAV 807
           W + A     +  Y   + I+A+ L   P    +W+    FE       ES   +  K +
Sbjct: 110 WSQYAAYVIKKKMYGEMKNIFAECLTKHPLNVDLWIYCCAFELFEIANIESSRAMFLKGL 169

Query: 808 AHCPKSEVLW 817
               +S  +W
Sbjct: 170 RMNSRSPRIW 179



 Score = 31.1 bits (70), Expect = 4.2
 Identities = 13/56 (23%), Positives = 27/56 (48%), Gaps = 2/56 (3%)

Query: 924 AGAFQANPNSEEIWLAAVKLE-SENNEYERARRLLAKA-RASAPTPRVMIQSAKLE 977
           A     +P + ++W+     E  E    E +R +  K  R ++ +PR+ I+  ++E
Sbjct: 131 AECLTKHPLNVDLWIYCCAFELFEIANIESSRAMFLKGLRMNSRSPRIWIEYFRME 186


>gnl|CDD|236983 PRK11788, PRK11788, tetratricopeptide repeat protein; Provisional.
          Length = 389

 Score = 41.3 bits (98), Expect = 0.003
 Identities = 39/147 (26%), Positives = 59/147 (40%), Gaps = 17/147 (11%)

Query: 553 LEAARLQPVDTARAVIAQAVRHIPTSVRIWIKAADLETE----TKAK---RRVYRKALEH 605
            +A     +D ARA++ +A+   P  VR  I   DL         A     RV  +  E+
Sbjct: 188 QQALARGDLDAARALLKKALAADPQCVRASILLGDLALAQGDYAAAIEALERVEEQDPEY 247

Query: 606 IPNSV-RLWKAAVELEDPEDARILLSRAVECCPTSVELWLALAR-LETYENARKVLNKAR 663
           +   + +L +    L D  +    L RA+E  P   +L LALA+ LE  E         R
Sbjct: 248 LSEVLPKLMECYQALGDEAEGLEFLRRALEEYP-GADLLLALAQLLEEQEGPEAAQALLR 306

Query: 664 ENI---PTDRQIWTTAAKLEEAHGNNA 687
           E +   P+ R       +L + H   A
Sbjct: 307 EQLRRHPSLR----GFHRLLDYHLAEA 329



 Score = 39.0 bits (92), Expect = 0.013
 Identities = 37/174 (21%), Positives = 59/174 (33%), Gaps = 21/174 (12%)

Query: 871  WLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQAGAFQAN 930
               GD+ AAR +L  A  A+P      +    L     +Y  A   L +   Q   +   
Sbjct: 191  LARGDLDAARALLKKALAADPQCVRASILLGDLALAQGDYAAAIEALERVEEQDPEYL-- 248

Query: 931  PNSE--EIWL-AAVKLESENNEYERARRLLAKARASAPTPRVMIQSAKLEWCLDNLERAL 987
              SE     +     L  E          L +A    P   +++  A+L    +  E A 
Sbjct: 249  --SEVLPKLMECYQALGDE----AEGLEFLRRALEEYPGADLLLALAQLLEEQEGPEAAQ 302

Query: 988  QLLDEAIKVFPD---FAKLWMMKGQIEE-----QKNLLDKAHDTFSQAIKKCPH 1033
             LL E ++  P    F +L  +   + E      K  L    D   + +K+ P 
Sbjct: 303  ALLREQLRRHPSLRGFHRL--LDYHLAEAEEGRAKESLLLLRDLVGEQLKRKPR 354



 Score = 30.2 bits (69), Expect = 7.9
 Identities = 34/155 (21%), Positives = 55/155 (35%), Gaps = 17/155 (10%)

Query: 495 NDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCEENQ----TSE- 549
            D+  AR LLK     +P    A I    L    G   AA   + +   E Q     SE 
Sbjct: 194 GDLDAARALLKKALAADPQCVRASILLGDLALAQGDYAAAIEALER--VEEQDPEYLSEV 251

Query: 550 -DLWLEAAR-LQPVDTARAVIAQAVRHIPTSVRIWIKAADLETE--TKAKRRVYRKALEH 605
               +E  + L         + +A+   P +  +   A  LE +   +A + + R+ L  
Sbjct: 252 LPKLMECYQALGDEAEGLEFLRRALEEYPGADLLLALAQLLEEQEGPEAAQALLREQLRR 311

Query: 606 IPNSV---RLWKAAVELEDPEDAR---ILLSRAVE 634
            P+     RL    +   +   A+   +LL   V 
Sbjct: 312 HPSLRGFHRLLDYHLAEAEEGRAKESLLLLRDLVG 346


>gnl|CDD|214818 smart00784, SPT2, SPT2 chromatin protein.  This entry includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 106

 Score = 37.0 bits (86), Expect = 0.008
 Identities = 25/104 (24%), Positives = 44/104 (42%), Gaps = 11/104 (10%)

Query: 92  GPARDANDVSDDRHAAPVKRKKKD--EEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDD 149
            P  + +  S D +         D  E++D+E+D +          G   +     D DD
Sbjct: 2   SPRLERSRRSRDDYDEEEDEDMDDFIEDDDEEDDYDRDEIWAMFNKGRKRYAY--RDDDD 59

Query: 150 EEADMIYEEIDKRMDEKR-------KDYREKRLREELERYRQER 186
           ++ DM     D + +E+R       +D  E+RL +E ER ++ R
Sbjct: 60  DDDDMEAGGADIQEEERRSARLARLEDREEERLEKEEEREKRAR 103



 Score = 37.0 bits (86), Expect = 0.008
 Identities = 25/104 (24%), Positives = 44/104 (42%), Gaps = 11/104 (10%)

Query: 261 GPARDANDVSDDRHAAPVKRKKKD--EEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDD 318
            P  + +  S D +         D  E++D+E+D +          G   +     D DD
Sbjct: 2   SPRLERSRRSRDDYDEEEDEDMDDFIEDDDEEDDYDRDEIWAMFNKGRKRYAY--RDDDD 59

Query: 319 EEADMIYEEIDKRMDEKR-------KDYREKRLREELERYRQER 355
           ++ DM     D + +E+R       +D  E+RL +E ER ++ R
Sbjct: 60  DDDDMEAGGADIQEEERRSARLARLEDREEERLEKEEEREKRAR 103


>gnl|CDD|222112 pfam13414, TPR_11, TPR repeat. 
          Length = 69

 Score = 36.1 bits (84), Expect = 0.008
 Identities = 8/52 (15%), Positives = 26/52 (50%), Gaps = 1/52 (1%)

Query: 982  NLERALQLLDEAIKVFPDFAKLWMMKGQI-EEQKNLLDKAHDTFSQAIKKCP 1032
            + + A++  ++A+++ PD A+ +        +     ++A +   +A++  P
Sbjct: 18   DYDEAIEAYEKALELDPDNAEAYYNLALAYLKLGKDYEEALEDLEKALELDP 69



 Score = 35.0 bits (81), Expect = 0.019
 Identities = 14/67 (20%), Positives = 27/67 (40%), Gaps = 1/67 (1%)

Query: 1001 AKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLAN-LEERRKMLIKARSVLE 1059
            A+     G    +    D+A + + +A++  P +   +  LA    +  K   +A   LE
Sbjct: 3    AEALKNLGNALFKLGDYDEAIEAYEKALELDPDNAEAYYNLALAYLKLGKDYEEALEDLE 62

Query: 1060 KGRLRNP 1066
            K    +P
Sbjct: 63   KALELDP 69



 Score = 33.1 bits (76), Expect = 0.077
 Identities = 18/58 (31%), Positives = 27/58 (46%), Gaps = 5/58 (8%)

Query: 615 AAVELEDPEDARILLSRAVECCPTSVELWLALARL-----ETYENARKVLNKARENIP 667
           A  +L D ++A     +A+E  P + E +  LA       + YE A + L KA E  P
Sbjct: 12  ALFKLGDYDEAIEAYEKALELDPDNAEAYYNLALAYLKLGKDYEEALEDLEKALELDP 69



 Score = 30.4 bits (69), Expect = 0.84
 Identities = 16/66 (24%), Positives = 24/66 (36%), Gaps = 3/66 (4%)

Query: 748 TWMEDAESCANQGAYECARAIYAQALATFPSKKSIWLR--AAYFEKNHGTRESLETLLQK 805
                  +    G Y+ A   Y +AL   P     +     AY +      E+LE  L+K
Sbjct: 5   ALKNLGNALFKLGDYDEAIEAYEKALELDPDNAEAYYNLALAYLKLGKDYEEALE-DLEK 63

Query: 806 AVAHCP 811
           A+   P
Sbjct: 64  ALELDP 69


>gnl|CDD|131573 TIGR02521, type_IV_pilW, type IV pilus biogenesis/stability protein
           PilW.  Members of this family are designated PilF in ref
           (PMID:8973346) and PilW in ref (PMID:15612916). This
           outer membrane protein is required both for pilus
           stability and for pilus function such as adherence to
           human cells. Members of this family contain copies of
           the TPR (tetratricopeptide repeat) domain.
          Length = 234

 Score = 38.5 bits (90), Expect = 0.014
 Identities = 21/75 (28%), Positives = 33/75 (44%), Gaps = 7/75 (9%)

Query: 493 DINDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLIMKGCE-ENQTSEDL 551
              D  KA   L    + +P  P + +  A L  + G+ + AR  + +  +  NQT+E L
Sbjct: 147 KAGDFDKAEKYLTRALQIDPQRPESLLELAELYYLRGQYKDARAYLERYQQTYNQTAESL 206

Query: 552 WLEAARLQPVDTARA 566
           WL       +  ARA
Sbjct: 207 WLG------IRIARA 215



 Score = 37.3 bits (87), Expect = 0.031
 Identities = 47/193 (24%), Positives = 76/193 (39%), Gaps = 33/193 (17%)

Query: 906  ENNEYERARRLLAKARAQAGAFQANPNSEEIWLA-AVKLESENNEYERARRLLAKARASA 964
            E  + E A+  L KA       + +P+    +LA A+  +    E E+A     +A    
Sbjct: 43   EQGDLEVAKENLDKA------LEHDPDDYLAYLALALYYQQLG-ELEKAEDSFRRALTLN 95

Query: 965  P-TPRVMIQSAKLEWCLDN----------LERALQLLDEAIK--VFPDFAKLWMMKGQIE 1011
            P    V          L+N           E+A+Q  ++AI+  ++P  A+     G   
Sbjct: 96   PNNGDV----------LNNYGTFLCQQGKYEQAMQQFEQAIEDPLYPQPARSLENAGLCA 145

Query: 1012 EQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEKGRLRNPNCAE- 1070
             +    DKA    ++A++  P      + LA L   R     AR+ LE+ +      AE 
Sbjct: 146  LKAGDFDKAEKYLTRALQIDPQRPESLLELAELYYLRGQYKDARAYLERYQQTYNQTAES 205

Query: 1071 LWLAAIRVEIRAG 1083
            LWL  IR+    G
Sbjct: 206  LWL-GIRIARALG 217



 Score = 36.5 bits (85), Expect = 0.060
 Identities = 24/95 (25%), Positives = 40/95 (42%), Gaps = 10/95 (10%)

Query: 873 AGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQAGAFQ--AN 930
           AGD   A   L+ A Q +P   E  L   +L     +Y+ AR  L +       +Q   N
Sbjct: 148 AGDFDKAEKYLTRALQIDPQRPESLLELAELYYLRGQYKDARAYLER-------YQQTYN 200

Query: 931 PNSEEIWLAAVKLESENNEYERARRLLAKARASAP 965
             +E +WL  +++     +   A+R  A+ +   P
Sbjct: 201 QTAESLWL-GIRIARALGDVAAAQRYGAQLQKLFP 234


>gnl|CDD|222121 pfam13428, TPR_14, Tetratricopeptide repeat. 
          Length = 44

 Score = 34.1 bits (78), Expect = 0.020
 Identities = 15/35 (42%), Positives = 19/35 (54%)

Query: 615 AAVELEDPEDARILLSRAVECCPTSVELWLALARL 649
           A + L D ++A  LL RA+   P   E  L LARL
Sbjct: 10  ALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 32.1 bits (73), Expect = 0.089
 Identities = 8/44 (18%), Positives = 15/44 (34%)

Query: 1001 AKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANL 1044
                +   +       LD+A     +A+   P      ++LA L
Sbjct: 1    PAALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 32.1 bits (73), Expect = 0.094
 Identities = 9/42 (21%), Positives = 17/42 (40%)

Query: 1037 LWIMLANLEERRKMLIKARSVLEKGRLRNPNCAELWLAAIRV 1078
              + LA        L +A ++L +    +P+  E  L   R+
Sbjct: 3    ALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 31.0 bits (70), Expect = 0.24
 Identities = 11/44 (25%), Positives = 23/44 (52%)

Query: 967  PRVMIQSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQI 1010
            P  ++  A+    L +L+ AL LL  A+ + PD  +  ++  ++
Sbjct: 1    PAALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 30.6 bits (69), Expect = 0.40
 Identities = 12/44 (27%), Positives = 17/44 (38%), Gaps = 4/44 (9%)

Query: 640 VELWLALA----RLETYENARKVLNKARENIPTDRQIWTTAAKL 679
               LALA     L   + A  +L +A    P D +     A+L
Sbjct: 1   PAALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 29.1 bits (65), Expect = 1.4
 Identities = 11/49 (22%), Positives = 18/49 (36%), Gaps = 6/49 (12%)

Query: 895 EIWLAAVKLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKL 943
              LA  +      + + A  LL +A A       +P+  E  L   +L
Sbjct: 2   AALLALARALLALGDLDEALALLRRALAL------DPDDPEALLLLARL 44



 Score = 28.3 bits (63), Expect = 2.3
 Identities = 12/44 (27%), Positives = 18/44 (40%)

Query: 515 PPAWIASARLEEVTGKVQAARNLIMKGCEENQTSEDLWLEAARL 558
           P A +A AR     G +  A  L+ +    +    +  L  ARL
Sbjct: 1   PAALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 28.3 bits (63), Expect = 2.7
 Identities = 10/41 (24%), Positives = 15/41 (36%)

Query: 749 WMEDAESCANQGAYECARAIYAQALATFPSKKSIWLRAAYF 789
            +  A +    G  + A A+  +ALA  P      L  A  
Sbjct: 4   LLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44



 Score = 26.7 bits (59), Expect = 8.1
 Identities = 10/44 (22%), Positives = 15/44 (34%)

Query: 1069 AELWLAAIRVEIRAGLKDIANTMMAKALQECPNAGILWAEAIFL 1112
                LA  R  +  G  D A  ++ +AL   P+          L
Sbjct: 1    PAALLALARALLALGDLDEALALLRRALALDPDDPEALLLLARL 44


>gnl|CDD|222123 pfam13432, TPR_16, Tetratricopeptide repeat. 
          Length = 65

 Score = 34.2 bits (79), Expect = 0.027
 Identities = 11/53 (20%), Positives = 24/53 (45%)

Query: 982  NLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHS 1034
            + + AL  L+ A+  +P  A+  ++ G+   ++  L +A      A+   P  
Sbjct: 12   DYDEALAALEAALARYPLAAEALLLLGEALLRQGRLAEAAALLRAALAADPDD 64



 Score = 32.3 bits (74), Expect = 0.13
 Identities = 16/61 (26%), Positives = 20/61 (32%)

Query: 864 LMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQ 923
           L  A++   AGD   A   L  A    P + E  L   +          A  LL  A A 
Sbjct: 1   LALARAALRAGDYDEALAALEAALARYPLAAEALLLLGEALLRQGRLAEAAALLRAALAA 60

Query: 924 A 924
            
Sbjct: 61  D 61



 Score = 32.3 bits (74), Expect = 0.16
 Identities = 16/68 (23%), Positives = 23/68 (33%), Gaps = 6/68 (8%)

Query: 898 LAAVKLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLL 957
           LA  +      +Y+ A   L  A A+       P + E  L   +          A  LL
Sbjct: 1   LALARAALRAGDYDEALAALEAALAR------YPLAAEALLLLGEALLRQGRLAEAAALL 54

Query: 958 AKARASAP 965
             A A+ P
Sbjct: 55  RAALAADP 62



 Score = 30.7 bits (70), Expect = 0.46
 Identities = 15/63 (23%), Positives = 22/63 (34%), Gaps = 1/63 (1%)

Query: 938 LAAVKLESENNEYERARRLLAKARASAP-TPRVMIQSAKLEWCLDNLERALQLLDEAIKV 996
           LA  +      +Y+ A   L  A A  P     ++   +       L  A  LL  A+  
Sbjct: 1   LALARAALRAGDYDEALAALEAALARYPLAAEALLLLGEALLRQGRLAEAAALLRAALAA 60

Query: 997 FPD 999
            PD
Sbjct: 61  DPD 63



 Score = 27.6 bits (62), Expect = 7.1
 Identities = 8/46 (17%), Positives = 15/46 (32%)

Query: 493 DINDIKKARLLLKSVRETNPNHPPAWIASARLEEVTGKVQAARNLI 538
              D  +A   L++     P    A +         G++  A  L+
Sbjct: 9   RAGDYDEALAALEAALARYPLAAEALLLLGEALLRQGRLAEAAALL 54


>gnl|CDD|222698 pfam14346, DUF4398, Domain of unknown function (DUF4398).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria and archaea.
           Proteins in this family are typically between 127 and
           269 amino acids in length.
          Length = 105

 Score = 34.6 bits (80), Expect = 0.051
 Identities = 24/61 (39%), Positives = 29/61 (47%), Gaps = 6/61 (9%)

Query: 909 EYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLES-----ENNEYERARRLLAKARAS 963
           E   A   L +A A AGA Q  P   E+ LA  KL       +  +YE ARRL  +A A 
Sbjct: 18  ELADAEAALERAEA-AGAEQYAPPYVELKLAREKLAQAKAALDEGKYEEARRLAEQAEAD 76

Query: 964 A 964
           A
Sbjct: 77  A 77


>gnl|CDD|216950 pfam02259, FAT, FAT domain.  The FAT domain is named after FRAP, ATM
            and TRRAP.
          Length = 350

 Score = 36.6 bits (85), Expect = 0.087
 Identities = 40/219 (18%), Positives = 64/219 (29%), Gaps = 47/219 (21%)

Query: 935  EIWLAAVKLESENNEYERARRLLAKAR---ASAPTPRVMIQSAKLEWCLDNLERALQLLD 991
            E+WL    L  ++  +  A + L K          P V+I  AK  W     + A+Q L 
Sbjct: 150  EMWLKFANLARKSGRFSLAEKALLKLLTYDTREDLPNVVIAYAKYLWAKGQQQEAIQKLR 209

Query: 992  EAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKML 1051
            E       F     +   +    +       T+                  NLE     L
Sbjct: 210  E-------FVSC-YLSKPVGSSSDSELLLGLTYEVIS------------STNLEYFEAKL 249

Query: 1052 IKARSVLEKGRLRNPNCAELWLAAIRVEIRAGLKDIANTMMAKALQECPNAGILWAE-AI 1110
              AR  L+ G          WL  +++    G KD        A+Q        W   A+
Sbjct: 250  -LARCFLKLGE---------WLDKLQMNWGQGKKDEILQAYRTAVQFDDQWYKAWHSWAL 299

Query: 1111 FLEPRPQRKTKSVDALKKCEHDP------HVLLAVSKLF 1143
                      + +   ++    P      +V+ AV    
Sbjct: 300  ANF-------EVLQLDEQEPLAPSELRSEYVVPAVEGYL 331



 Score = 31.2 bits (71), Expect = 3.8
 Identities = 28/179 (15%), Positives = 47/179 (26%), Gaps = 31/179 (17%)

Query: 696 ALSSLSANGVEINREHWFKEAIEAEKAGSVHTCQALIRAIIGYG-------VEQEDRKHT 748
            LS ++  G     E W K A  A K+G     +  +  ++ Y        V     K+ 
Sbjct: 136 VLSLINYRGPHELAEMWLKFANLARKSGRFSLAEKALLKLLTYDTREDLPNVVIAYAKYL 195

Query: 749 W----MEDAESCANQGAYECARAIYAQALATFPSKKSIWLRAAYFEKNHGTRESLETLLQ 804
           W     ++A     +       + Y        S   + L    +E    T         
Sbjct: 196 WAKGQQQEAIQKLREFV-----SCYLSKPVGSSSDSEL-LLGLTYEVISSTNLEYFEAKL 249

Query: 805 KAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLW 863
            A                  K   WL         G ++ +    + AV    +    W
Sbjct: 250 LARCFL--------------KLGEWLDKLQMNWGQGKKDEILQAYRTAVQFDDQWYKAW 294


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 36.1 bits (83), Expect = 0.14
 Identities = 46/266 (17%), Positives = 92/266 (34%), Gaps = 49/266 (18%)

Query: 115 DEEEDDEEDLNDSNFDEFNGYGGS--LFNKDPYDKDDEEADMIYEEIDKRMDE------- 165
           D+EE++EED  +S  DE         LFN+     +D   D   ++ +K+M+E       
Sbjct: 116 DDEEEEEED--ESLEDEMIDDEDEADLFNESESSLEDLSDDETEDDEEKKMEEEEAGEEK 173

Query: 166 --KRKDYREKRLR-----------EELERYRQ--ERPKIQQQFSDLKRGLVTVSMDEWKN 210
               +  REK+             +E+  + +  E  +      +           E   
Sbjct: 174 ESVEQATREKKFDKSGVDDKFFKLDEMNEFLEATEAEEEAALGDEDDFEDYFQDDSEDGK 233

Query: 211 EGQVVGQAIPPPPIPLVNRNKKHFMGVPAPLGYVAGVGRGATGFTTRSDIGPARDANDVS 270
           + +  G           N   + F                      + D G   +  D  
Sbjct: 234 DDEDFGSGEDEEDDEEGNIEYEDFFDPKEK--------------DKKKDAGDDAELEDDE 279

Query: 271 DDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDK 330
            D+ A   +   K EEED+E+D  + + DE           +      +  + + E +D 
Sbjct: 280 PDKEAVKKEADSKPEEEDEEDDEQEDDQDEEE-------PPEAAMDKVKLDEPVLEGVDL 332

Query: 331 RMDEKRKDY--REKRLREELERYRQE 354
              ++   +  R+ +L++++E+  +E
Sbjct: 333 ESPKELSSFEKRQAKLKQQIEQLEKE 358



 Score = 31.9 bits (72), Expect = 2.6
 Identities = 20/98 (20%), Positives = 43/98 (43%), Gaps = 9/98 (9%)

Query: 90  DIGPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDD 149
           D G   +  D   D+ A   +   K EEED+E+D  + + DE           +      
Sbjct: 268 DAGDDAELEDDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEE-------PPEAAMDKV 320

Query: 150 EEADMIYEEIDKRMDEKRKDY--REKRLREELERYRQE 185
           +  + + E +D    ++   +  R+ +L++++E+  +E
Sbjct: 321 KLDEPVLEGVDLESPKELSSFEKRQAKLKQQIEQLEKE 358


>gnl|CDD|227343 COG5010, TadD, Flp pilus assembly protein TadD, contains TPR
           repeats [Intracellular trafficking and secretion].
          Length = 257

 Score = 35.5 bits (82), Expect = 0.15
 Identities = 32/137 (23%), Positives = 52/137 (37%), Gaps = 9/137 (6%)

Query: 843 ESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLA-AV 901
            SL  L + A+A+    E+L   G K++   G+   A  +L  A +  P   E W     
Sbjct: 84  SSLAVLQKSAIAYPKDRELLAAQG-KNQIRNGNFGEAVSVLRKAARLAPTDWEAWNLLGA 142

Query: 902 KLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKAR 961
            L+ +   ++ ARR   +A   A    +  N+  + L          + E A  LL  A 
Sbjct: 143 ALD-QLGRFDEARRAYRQALELAPNEPSIANNLGMSLLL------RGDLEDAETLLLPAY 195

Query: 962 ASAPTPRVMIQSAKLEW 978
            S      + Q+  L  
Sbjct: 196 LSPAADSRVRQNLALVV 212



 Score = 30.1 bits (68), Expect = 6.0
 Identities = 18/61 (29%), Positives = 23/61 (37%)

Query: 748 TWMEDAESCANQGAYECARAIYAQALATFPSKKSIWLRAAYFEKNHGTRESLETLLQKAV 807
            W     +    G ++ AR  Y QAL   P++ SI           G  E  ETLL  A 
Sbjct: 136 AWNLLGAALDQLGRFDEARRAYRQALELAPNEPSIANNLGMSLLLRGDLEDAETLLLPAY 195

Query: 808 A 808
            
Sbjct: 196 L 196



 Score = 29.7 bits (67), Expect = 9.1
 Identities = 25/118 (21%), Positives = 42/118 (35%)

Query: 912  RARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAPTPRVMI 971
              R+    A A   A   NP    I   A  L    +       L   A A      ++ 
Sbjct: 45   AMRQTQGAAAALGAAVLRNPEDLSIAKLATALYLRGDADSSLAVLQKSAIAYPKDRELLA 104

Query: 972  QSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIK 1029
               K +    N   A+ +L +A ++ P   + W + G   +Q    D+A   + QA++
Sbjct: 105  AQGKNQIRNGNFGEAVSVLRKAARLAPTDWEAWNLLGAALDQLGRFDEARRAYRQALE 162


>gnl|CDD|149048 pfam07768, PVL_ORF50, PVL ORF-50-like family.  This is a family of
           sequences found in both bacteria and bacteriophages.
           This region is approximately 130 residues long and in
           some cases is found as part of the PVL (Panton-Valentine
           leukocidin) group of genes, which encode a member of the
           leukocidin group of bacterial toxins that kill
           leukocytes by creation of pores in the cell membrane.
           PVL appears to be a virulence factor associated with a
           number of human diseases.
          Length = 118

 Score = 33.6 bits (77), Expect = 0.16
 Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 157 EEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQ 192
           E +D     + K+YRE +  E +E+ R+ER   +++
Sbjct: 41  EALDAPYGMRLKEYREIKKSENIEQERKERELERKR 76



 Score = 33.6 bits (77), Expect = 0.16
 Identities = 11/36 (30%), Positives = 21/36 (58%)

Query: 326 EEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQ 361
           E +D     + K+YRE +  E +E+ R+ER   +++
Sbjct: 41  EALDAPYGMRLKEYREIKKSENIEQERKERELERKR 76


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 35.2 bits (82), Expect = 0.26
 Identities = 19/105 (18%), Positives = 43/105 (40%), Gaps = 15/105 (14%)

Query: 96  DANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMI 155
           D N   D  H      +  D+E+++EE+                       +  E+    
Sbjct: 175 DPNAEEDPAHVGSELEELDDDEDEEEEE-----------DENDDSLAADESELPEKVLEK 223

Query: 156 YEEIDKRMDEKRKDYREKRLREEL---ERYRQERPKIQQQFSDLK 197
           ++ + K+  + RK  +EK++   L   ++Y + R K++++   L+
Sbjct: 224 FKALAKQYKKLRK-AQEKKVEGRLAQHKKYAKLREKLKEELKSLR 267



 Score = 35.2 bits (82), Expect = 0.26
 Identities = 19/105 (18%), Positives = 43/105 (40%), Gaps = 15/105 (14%)

Query: 265 DANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMI 324
           D N   D  H      +  D+E+++EE+                       +  E+    
Sbjct: 175 DPNAEEDPAHVGSELEELDDDEDEEEEE-----------DENDDSLAADESELPEKVLEK 223

Query: 325 YEEIDKRMDEKRKDYREKRLREEL---ERYRQERPKIQQQFSDLK 366
           ++ + K+  + RK  +EK++   L   ++Y + R K++++   L+
Sbjct: 224 FKALAKQYKKLRK-AQEKKVEGRLAQHKKYAKLREKLKEELKSLR 267


>gnl|CDD|184235 PRK13678, PRK13678, hypothetical protein; Provisional.
          Length = 95

 Score = 32.2 bits (74), Expect = 0.32
 Identities = 20/52 (38%), Positives = 30/52 (57%), Gaps = 5/52 (9%)

Query: 115 DEEEDDEEDLNDSNFDE-FNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDE 165
           +E+EDDE ++   +F E  +G  G L    P +  DEE DMI E ++  +DE
Sbjct: 47  EEDEDDEIEIQAFSFTEDEDGDEGDLQ---PIE-TDEEWDMIEEVLNTFLDE 94



 Score = 32.2 bits (74), Expect = 0.32
 Identities = 20/52 (38%), Positives = 30/52 (57%), Gaps = 5/52 (9%)

Query: 284 DEEEDDEEDLNDSNFDE-FNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDE 334
           +E+EDDE ++   +F E  +G  G L    P +  DEE DMI E ++  +DE
Sbjct: 47  EEDEDDEIEIQAFSFTEDEDGDEGDLQ---PIE-TDEEWDMIEEVLNTFLDE 94


>gnl|CDD|225504 COG2956, COG2956, Predicted N-acetylglucosaminyl transferase
           [Carbohydrate transport and metabolism].
          Length = 389

 Score = 34.7 bits (80), Expect = 0.34
 Identities = 23/98 (23%), Positives = 39/98 (39%), Gaps = 9/98 (9%)

Query: 561 VDTARAVIAQAVRHIPTSVRIWIKAADLET---ETKAKRRVYRKALEHIPNSV-----RL 612
           VD AR ++ +A++     VR  I    +E    + +       + LE  P  +      L
Sbjct: 196 VDRARELLKKALQADKKCVRASIILGRVELAKGDYQKAVEALERVLEQNPEYLSEVLEML 255

Query: 613 WKAAVELEDPEDARILLSRAVECCPTSVELWLALARLE 650
           ++   +L  P +    L RA+E      +  L LA L 
Sbjct: 256 YECYAQLGKPAEGLNFLRRAMETNT-GADAELMLADLI 292


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 33.9 bits (78), Expect = 0.38
 Identities = 22/94 (23%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 95  RDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADM 154
             +N+  +D       ++K+ EE+    D +DS+    +       + D   +D+  A +
Sbjct: 90  DASNEGDEDDDEEDEIKRKRIEEDARNSDADDSDSSSDSDSSDDDSD-DDDSEDETAALL 148

Query: 155 I-YEEIDK-RMDEKRKDYREKRLREELERYRQER 186
              E+I K R +EK +   E+    E E+ R+E 
Sbjct: 149 RELEKIKKERAEEKER--EEEEKAAEEEKAREEE 180



 Score = 33.9 bits (78), Expect = 0.38
 Identities = 22/94 (23%), Positives = 42/94 (44%), Gaps = 5/94 (5%)

Query: 264 RDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADM 323
             +N+  +D       ++K+ EE+    D +DS+    +       + D   +D+  A +
Sbjct: 90  DASNEGDEDDDEEDEIKRKRIEEDARNSDADDSDSSSDSDSSDDDSD-DDDSEDETAALL 148

Query: 324 I-YEEIDK-RMDEKRKDYREKRLREELERYRQER 355
              E+I K R +EK +   E+    E E+ R+E 
Sbjct: 149 RELEKIKKERAEEKER--EEEEKAAEEEKAREEE 180


>gnl|CDD|219444 pfam07516, SecA_SW, SecA Wing and Scaffold domain.  SecA protein
           binds to the plasma membrane where it interacts with
           proOmpA to support translocation of proOmpA through the
           membrane. SecA protein achieves this translocation, in
           association with SecY protein, in an ATP dependent
           manner. This family is composed of two C-terminal alpha
           helical subdomains: the wing and scaffold subdomains.
          Length = 213

 Score = 33.7 bits (78), Expect = 0.39
 Identities = 18/72 (25%), Positives = 32/72 (44%), Gaps = 3/72 (4%)

Query: 112 KKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE-ADMIYEEIDKRMDEKRKDY 170
           +K   EE D E L +    E  G    +  ++     +EE  + + E   +  +EK  + 
Sbjct: 78  EKSYPEEWDLEGLEEE-LRELLGLDLDIDEEELEGLTEEELKERLIEAAKEAYEEKEAEL 136

Query: 171 REKRLREELERY 182
            E+ +R E+ER 
Sbjct: 137 GEELMR-EIERS 147



 Score = 33.7 bits (78), Expect = 0.39
 Identities = 18/72 (25%), Positives = 32/72 (44%), Gaps = 3/72 (4%)

Query: 281 KKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE-ADMIYEEIDKRMDEKRKDY 339
           +K   EE D E L +    E  G    +  ++     +EE  + + E   +  +EK  + 
Sbjct: 78  EKSYPEEWDLEGLEEE-LRELLGLDLDIDEEELEGLTEEELKERLIEAAKEAYEEKEAEL 136

Query: 340 REKRLREELERY 351
            E+ +R E+ER 
Sbjct: 137 GEELMR-EIERS 147


>gnl|CDD|223465 COG0388, COG0388, Predicted amidohydrolase [General function
            prediction only].
          Length = 274

 Score = 34.0 bits (78), Expect = 0.43
 Identities = 18/89 (20%), Positives = 32/89 (35%), Gaps = 6/89 (6%)

Query: 981  DNLERALQLLDEAIK------VFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHS 1034
            +NL R L+L+ EA        VFP+              +    +A +   + +      
Sbjct: 19   ENLARILRLIREAAARGADLVVFPELFLTGYPCEDDLFLEEAAAEAGEETLEFLAALAEE 78

Query: 1035 VPLWIMLANLEERRKMLIKARSVLEKGRL 1063
              + I+   L ER K+   A  +   G +
Sbjct: 79   GGVIIVGGPLPEREKLYNNAALIDPDGEI 107


>gnl|CDD|233223 TIGR00990, 3a0801s09, mitochondrial precursor proteins import
            receptor (72 kDa mitochondrial outermembrane protein)
            (mitochondrial import receptor for the ADP/ATP carrier)
            (translocase of outermembrane tom70).  [Transport and
            binding proteins, Amino acids, peptides and amines].
          Length = 615

 Score = 34.2 bits (78), Expect = 0.58
 Identities = 39/207 (18%), Positives = 82/207 (39%), Gaps = 14/207 (6%)

Query: 870  KWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARAQAGAFQA 929
            K L G    A   LS + + +P   + ++    +  E  + ++A     KA         
Sbjct: 341  KCLKGKHLEALADLSKSIELDPRVTQSYIKRASMNLELGDPDKAEEDFDKALKL------ 394

Query: 930  NPNSEEIWLAAVKLESENNEYERARRLLAKARASAPTPRV-MIQSAKLEWCLDNLERALQ 988
            N    +I+    +L     E+ +A +   K+    P      IQ    ++   ++  ++ 
Sbjct: 395  NSEDPDIYYHRAQLHFIKGEFAQAGKDYQKSIDLDPDFIFSHIQLGVTQYKEGSIASSMA 454

Query: 989  LLDEAIKVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERR 1048
                  K FP+   ++   G++   +N  D+A + F  AI+    + P+++ +  L  + 
Sbjct: 455  TFRRCKKNFPEAPDVYNYYGELLLDQNKFDEAIEKFDTAIELEKETKPMYMNVLPLINKA 514

Query: 1049 KML-------IKARSVLEKGRLRNPNC 1068
              L       I+A ++ EK  + +P C
Sbjct: 515  LALFQWKQDFIEAENLCEKALIIDPEC 541


>gnl|CDD|148679 pfam07218, RAP1, Rhoptry-associated protein 1 (RAP-1).  This family
           consists of several rhoptry-associated protein 1 (RAP-1)
           sequences which appear to be specific to Plasmodium
           falciparum.
          Length = 790

 Score = 33.9 bits (77), Expect = 0.60
 Identities = 28/121 (23%), Positives = 44/121 (36%), Gaps = 16/121 (13%)

Query: 86  TTRSDIGPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFN---- 141
            T S    A  A  V  D  A P  +      E+    L ++N + F      L      
Sbjct: 175 ATDSGKASASVAGIVGADEEAPPAPKNTLTPLEE----LYETNVNLFA-LKHPLEKLEEE 229

Query: 142 ----KDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLRE---ELERYRQERPKIQQQFS 194
               K+  DK  EE +   +E  +  +E +K+  EK   E   E  ++ +E   I+    
Sbjct: 230 IDILKNDGDKVAEEEEFELDEEHEEAEEDKKEALEKIGAEGDEEKFKFDEEIKFIEHDVK 289

Query: 195 D 195
           D
Sbjct: 290 D 290



 Score = 33.9 bits (77), Expect = 0.60
 Identities = 28/121 (23%), Positives = 44/121 (36%), Gaps = 16/121 (13%)

Query: 255 TTRSDIGPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFN---- 310
            T S    A  A  V  D  A P  +      E+    L ++N + F      L      
Sbjct: 175 ATDSGKASASVAGIVGADEEAPPAPKNTLTPLEE----LYETNVNLFA-LKHPLEKLEEE 229

Query: 311 ----KDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLRE---ELERYRQERPKIQQQFS 363
               K+  DK  EE +   +E  +  +E +K+  EK   E   E  ++ +E   I+    
Sbjct: 230 IDILKNDGDKVAEEEEFELDEEHEEAEEDKKEALEKIGAEGDEEKFKFDEEIKFIEHDVK 289

Query: 364 D 364
           D
Sbjct: 290 D 290


>gnl|CDD|240433 PTZ00482, PTZ00482, membrane-attack complex/perforin (MACPF)
           Superfamily; Provisional.
          Length = 844

 Score = 33.7 bits (77), Expect = 0.90
 Identities = 24/137 (17%), Positives = 54/137 (39%), Gaps = 11/137 (8%)

Query: 101 SDDRHAAPVKRKK-----KDEE-----EDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDE 150
           +  +HA+ + ++K      D+E     EDDE+D  ++   E +    SL      D+D +
Sbjct: 72  NQKKHASFLNQRKSLDDDDDDEFDFLYEDDEDDAGNATSGESSTDDDSLLELPDRDEDAD 131

Query: 151 EADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQF-SDLKRGLVTVSMDEWK 209
                 +  D   D+      ++ L++ +     E+   +++  ++          D  +
Sbjct: 132 TQANNDQTNDFDQDDSSNSQTDQGLKQSVNLSSAEKLIEEKKGQTENTFKFYNFGNDGEE 191

Query: 210 NEGQVVGQAIPPPPIPL 226
              +  G++    P PL
Sbjct: 192 AAAKDGGKSKSSDPGPL 208



 Score = 30.2 bits (68), Expect = 8.0
 Identities = 19/103 (18%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 270 SDDRHAAPVKRKK-----KDEE-----EDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDE 319
           +  +HA+ + ++K      D+E     EDDE+D  ++   E +    SL      D+D +
Sbjct: 72  NQKKHASFLNQRKSLDDDDDDEFDFLYEDDEDDAGNATSGESSTDDDSLLELPDRDEDAD 131

Query: 320 EADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQF 362
                 +  D   D+      ++ L++ +     E+   +++ 
Sbjct: 132 TQANNDQTNDFDQDDSSNSQTDQGLKQSVNLSSAEKLIEEKKG 174


>gnl|CDD|219925 pfam08598, Sds3, Sds3-like.  Repression of gene transcription is
           mediated by histone deacetylases containing
           repressor-co-repressor complexes, which are recruited to
           promoters of target genes via interactions with
           sequence-specific transcription factors. The
           co-repressor complex contains a core of at least seven
           proteins. This family represents the conserved region
           found in Sds3, Dep1 and BRMS1-homologue p40 proteins.
          Length = 184

 Score = 32.3 bits (74), Expect = 0.94
 Identities = 13/62 (20%), Positives = 27/62 (43%), Gaps = 7/62 (11%)

Query: 155 IYEEIDKRMDEKRK---DYREKRLREELERYRQERPKIQQQF----SDLKRGLVTVSMDE 207
             +++++R D++ K     RE +L      Y  ER   +Q+F      L+  L+    ++
Sbjct: 47  PLKDLEERRDDRLKVAELRREYKLECIEREYEAERQAAKQEFEKEKRLLRERLLEELEEK 106

Query: 208 WK 209
             
Sbjct: 107 IY 108



 Score = 32.3 bits (74), Expect = 0.94
 Identities = 13/62 (20%), Positives = 27/62 (43%), Gaps = 7/62 (11%)

Query: 324 IYEEIDKRMDEKRK---DYREKRLREELERYRQERPKIQQQF----SDLKRGLVTVSMDE 376
             +++++R D++ K     RE +L      Y  ER   +Q+F      L+  L+    ++
Sbjct: 47  PLKDLEERRDDRLKVAELRREYKLECIEREYEAERQAAKQEFEKEKRLLRERLLEELEEK 106

Query: 377 WK 378
             
Sbjct: 107 IY 108


>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
          Length = 663

 Score = 33.0 bits (75), Expect = 1.2
 Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 1/69 (1%)

Query: 144 PYDKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTV 203
           PY      A+M+Y    +    +RK  R   L+EEL    +   K++++   L + L   
Sbjct: 443 PYVMPISMANMVYPGHPQEHGHERKRKRGGELKEELIETLKLVKKLKEEQESLAKELEAT 502

Query: 204 S-MDEWKNE 211
           +   E K  
Sbjct: 503 AHKSEIKKI 511



 Score = 32.3 bits (73), Expect = 2.2
 Identities = 13/49 (26%), Positives = 23/49 (46%)

Query: 313 PYDKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQ 361
           PY      A+M+Y    +    +RK  R   L+EEL    +   K++++
Sbjct: 443 PYVMPISMANMVYPGHPQEHGHERKRKRGGELKEELIETLKLVKKLKEE 491


>gnl|CDD|218489 pfam05192, MutS_III, MutS domain III.  This domain is found in
           proteins of the MutS family (DNA mismatch repair
           proteins) and is found associated with pfam00488,
           pfam05188, pfam01624 and pfam05190. The MutS family of
           proteins is named after the Salmonella typhimurium MutS
           protein involved in mismatch repair; other members of
           the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6
           proteins. These have various roles in DNA repair and
           recombination. Human MSH has been implicated in
           non-polyposis colorectal carcinoma (HNPCC) and is a
           mismatch binding protein. The aligned region corresponds
           with domain III, which is central to the structure of
           Thermus aquaticus MutS as characterized in.
          Length = 290

 Score = 32.4 bits (74), Expect = 1.4
 Identities = 18/85 (21%), Positives = 31/85 (36%), Gaps = 2/85 (2%)

Query: 319 EEADMIYEEIDKRMDEKRKDYRE--KRLREELERYRQERPKIQQQFSDLKRGLVTVSMDE 376
            +  +I +  D  +DE R    E  ++L E LER R+       +    +     V   +
Sbjct: 148 RDGGVIKDGYDPELDELRALLDELREKLAELLERERERTGIKSLKVGYNRVFGYYVIEVK 207

Query: 377 WKNVPEVGDARNRKQRNPRAEKFTP 401
                +V     R+     A +FT 
Sbjct: 208 ASKADKVPGDYIRRSTTKNAVRFTT 232


>gnl|CDD|224340 COG1422, COG1422, Predicted membrane protein [Function unknown].
          Length = 201

 Score = 31.9 bits (73), Expect = 1.4
 Identities = 12/52 (23%), Positives = 29/52 (55%), Gaps = 7/52 (13%)

Query: 157 EEIDKRMDEKRKDYREKRLR---EELERYRQERPKIQQQFSDLKRGLVTVSM 205
           +E+ K M E +K++RE +     ++L++ +++    Q +  D +R L+ +  
Sbjct: 75  KELQKMMKEFQKEFREAQESGDMKKLKKLQEK----QMEMMDDQRELMKMQF 122



 Score = 31.9 bits (73), Expect = 1.4
 Identities = 12/52 (23%), Positives = 29/52 (55%), Gaps = 7/52 (13%)

Query: 326 EEIDKRMDEKRKDYREKRLR---EELERYRQERPKIQQQFSDLKRGLVTVSM 374
           +E+ K M E +K++RE +     ++L++ +++    Q +  D +R L+ +  
Sbjct: 75  KELQKMMKEFQKEFREAQESGDMKKLKKLQEK----QMEMMDDQRELMKMQF 122


>gnl|CDD|220680 pfam10300, DUF3808, Protein of unknown function (DUF3808).  This is a
            family of proteins conserved from fungi to humans.
            Members of this family also carry a TPR_2 domain
            pfam07719 at their C-terminus.
          Length = 446

 Score = 32.7 bits (75), Expect = 1.5
 Identities = 19/49 (38%), Positives = 29/49 (59%), Gaps = 2/49 (4%)

Query: 982  NLERALQLLDEAIKVFPDFAKLWM-MKGQIEEQKNLLDKAHDTFSQAIK 1029
             LE A  LL+ + K FP+ A LW+  + +IE  K  LD+A + F + I+
Sbjct: 231  PLEEAEALLEPSRKRFPNSA-LWLFFEARIESLKGNLDEALELFEECIE 278


>gnl|CDD|233924 TIGR02552, LcrH_SycD, type III secretion low calcium response
           chaperone LcrH/SycD.  Genes in this family are found in
           type III secretion operons. LcrH, from Yersinia, is
           believed to have a regulatory function in the
           low-calcium response of the secretion system. The same
           protein is also known as SycD (SYC = Specific Yop
           Chaperone) for its chaperone role. In Pseudomonas, where
           the homolog is known as PcrH, the chaperone role has
           been demonstrated and the regulatory role appears to be
           absent. SycD/LcrH contains three central
           tetratricopeptide-like repeats that are predicted to
           fold into an all-alpha-helical array.
          Length = 135

 Score = 30.7 bits (70), Expect = 1.9
 Identities = 27/91 (29%), Positives = 35/91 (38%), Gaps = 13/91 (14%)

Query: 909 EYERAR---RLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEYERARRLLAKARASAP 965
            Y+ A    +LLA           +P +   WL          EYE A    A A A  P
Sbjct: 32  RYDEALKLFQLLA---------AYDPYNSRYWLGLAACCQMLKEYEEAIDAYALAAALDP 82

Query: 966 -TPRVMIQSAKLEWCLDNLERALQLLDEAIK 995
             PR    +A+    L   E AL+ LD AI+
Sbjct: 83  DDPRPYFHAAECLLALGEPESALKALDLAIE 113


>gnl|CDD|237629 PRK14160, PRK14160, heat shock protein GrpE; Provisional.
          Length = 211

 Score = 31.6 bits (72), Expect = 2.0
 Identities = 17/88 (19%), Positives = 38/88 (43%)

Query: 117 EEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLR 176
            E + +D    N +E          +D   ++D E + I +E      E+  + + + L+
Sbjct: 1   MEKECKDAKHENMEEDCCKENENKEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEELK 60

Query: 177 EELERYRQERPKIQQQFSDLKRGLVTVS 204
           +E  + ++E  K++ +   LK  L+   
Sbjct: 61  DENNKLKEENKKLENELEALKDRLLRTV 88



 Score = 31.6 bits (72), Expect = 2.0
 Identities = 17/88 (19%), Positives = 38/88 (43%)

Query: 286 EEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKRLR 345
            E + +D    N +E          +D   ++D E + I +E      E+  + + + L+
Sbjct: 1   MEKECKDAKHENMEEDCCKENENKEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEELK 60

Query: 346 EELERYRQERPKIQQQFSDLKRGLVTVS 373
           +E  + ++E  K++ +   LK  L+   
Sbjct: 61  DENNKLKEENKKLENELEALKDRLLRTV 88


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 32.1 bits (74), Expect = 2.0
 Identities = 16/56 (28%), Positives = 26/56 (46%), Gaps = 7/56 (12%)

Query: 150 EEADMIYEEIDKRMDEKRKD------YREKRLREELER-YRQERPKIQQQFSDLKR 198
           EEA  I EE  K  +  +K+          +LR E E+  R+ R ++Q+    L +
Sbjct: 38  EEAKRILEEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNELQKLEKRLLQ 93



 Score = 32.1 bits (74), Expect = 2.0
 Identities = 16/56 (28%), Positives = 26/56 (46%), Gaps = 7/56 (12%)

Query: 319 EEADMIYEEIDKRMDEKRKD------YREKRLREELER-YRQERPKIQQQFSDLKR 367
           EEA  I EE  K  +  +K+          +LR E E+  R+ R ++Q+    L +
Sbjct: 38  EEAKRILEEAKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNELQKLEKRLLQ 93


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 31.9 bits (73), Expect = 2.7
 Identities = 22/89 (24%), Positives = 43/89 (48%), Gaps = 1/89 (1%)

Query: 115 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKR 174
           D+++ D++DL D   D+   + G   + +  +++  E     +E+ K +  K K Y+ +R
Sbjct: 134 DDDDFDDDDLGDLASDDRAAHFGGGEDDEEDEEEQPERKKSKKEVMKEVIAKSKFYKAER 193

Query: 175 LREELERYRQERPKIQQQFSDLKRGLVTV 203
            + + E     R ++   F DL   L TV
Sbjct: 194 QKAKEEDE-DLREELDDDFKDLMSLLRTV 221



 Score = 31.9 bits (73), Expect = 2.7
 Identities = 22/89 (24%), Positives = 43/89 (48%), Gaps = 1/89 (1%)

Query: 284 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYREKR 343
           D+++ D++DL D   D+   + G   + +  +++  E     +E+ K +  K K Y+ +R
Sbjct: 134 DDDDFDDDDLGDLASDDRAAHFGGGEDDEEDEEEQPERKKSKKEVMKEVIAKSKFYKAER 193

Query: 344 LREELERYRQERPKIQQQFSDLKRGLVTV 372
            + + E     R ++   F DL   L TV
Sbjct: 194 QKAKEEDE-DLREELDDDFKDLMSLLRTV 221


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
            bacterial type.  SMC (structural maintenance of
            chromosomes) proteins bind DNA and act in organizing and
            segregating chromosomes for partition. SMC proteins are
            found in bacteria, archaea, and eukaryotes. This family
            represents the SMC protein of most bacteria. The smc gene
            is often associated with scpB (TIGR00281) and scpA genes,
            where scp stands for segregation and condensation
            protein. SMC was shown (in Caulobacter crescentus) to be
            induced early in S phase but present and bound to DNA
            throughout the cell cycle [Cellular processes, Cell
            division, DNA metabolism, Chromosome-associated
            proteins].
          Length = 1179

 Score = 32.0 bits (73), Expect = 2.7
 Identities = 29/145 (20%), Positives = 50/145 (34%), Gaps = 12/145 (8%)

Query: 895  EIWLAAVKLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESE--NNEYER 952
            E+ L  ++LE    E E  +  L +A  +     A     E  L  ++LE      E E 
Sbjct: 226  ELALLVLRLEELREELEELQEELKEAEEELEELTAELQELEEKLEELRLEVSELEEEIEE 285

Query: 953  ARRLL--AKARASAPTPRVMIQSAKLEWCLDNLERA-------LQLLDEAIKVFPDF-AK 1002
             ++ L       S    +  I   +L      LE            LDE  +   +   K
Sbjct: 286  LQKELYALANEISRLEQQKQILRERLANLERQLEELEAQLEELESKLDELAEELAELEEK 345

Query: 1003 LWMMKGQIEEQKNLLDKAHDTFSQA 1027
            L  +K ++E  +  L++      + 
Sbjct: 346  LEELKEELESLEAELEELEAELEEL 370


>gnl|CDD|219761 pfam08243, SPT2, SPT2 chromatin protein.  This family includes the
           Saccharomyces cerevisiae protein SPT2 which is a
           chromatin protein involved in transcriptional
           regulation.
          Length = 116

 Score = 29.9 bits (67), Expect = 3.2
 Identities = 26/110 (23%), Positives = 44/110 (40%), Gaps = 13/110 (11%)

Query: 87  TRSDIGPARDANDVSDDRHAAPVKRKKKD-EEEDDEEDLNDSNFDEFNGYGGSLFNKDPY 145
           T S+           DD H   +    +D +EE DE   +        G G      D Y
Sbjct: 11  TSSNKASRSY-----DDEHDEDMDDFIEDDDEEQDEIPYDSDEIWAIFGKGRKRSYYDRY 65

Query: 146 DKDDEEADMIYEEIDKRMDEKR-------KDYREKRLREELERYRQERPK 188
           D+DD   +M    ++ + +E+R       +D RE    EE E+ ++++  
Sbjct: 66  DEDDALDNMEATFMEIQKEERRSARMARLEDERELAREEEEEKRKKKKKN 115



 Score = 29.9 bits (67), Expect = 3.2
 Identities = 26/110 (23%), Positives = 44/110 (40%), Gaps = 13/110 (11%)

Query: 256 TRSDIGPARDANDVSDDRHAAPVKRKKKD-EEEDDEEDLNDSNFDEFNGYGGSLFNKDPY 314
           T S+           DD H   +    +D +EE DE   +        G G      D Y
Sbjct: 11  TSSNKASRSY-----DDEHDEDMDDFIEDDDEEQDEIPYDSDEIWAIFGKGRKRSYYDRY 65

Query: 315 DKDDEEADMIYEEIDKRMDEKR-------KDYREKRLREELERYRQERPK 357
           D+DD   +M    ++ + +E+R       +D RE    EE E+ ++++  
Sbjct: 66  DEDDALDNMEATFMEIQKEERRSARMARLEDERELAREEEEEKRKKKKKN 115


>gnl|CDD|193205 pfam12729, 4HB_MCP_1, Four helix bundle sensory module for signal
           transduction.  This family is a four helix bundle that
           operates as a ubiquitous sensory module in prokaryotic
           signal transduction. The 4HB_MCP is always found between
           two predicted transmembrane helices indicating that it
           detects only extracellular signals. In many cases the
           domain is associated with a cytoplasmic HAMP domain
           suggesting that most proteins carrying the bundle might
           share the mechanism of transmembrane signalling which is
           well-characterized in E. coli chemoreceptors.
          Length = 181

 Score = 30.7 bits (70), Expect = 3.5
 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 13/64 (20%)

Query: 139 LFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYR-------EKRL----REELERYRQERP 187
           +   DP ++D+   D+  EE+   +D+  K Y        EK+L    +E+L+ YR+ R 
Sbjct: 69  ILTTDPAERDELLKDI--EELRAEIDKLLKKYEKTILTEEEKKLFNEFKEQLKAYRKVRN 126

Query: 188 KIQQ 191
           K+  
Sbjct: 127 KVLD 130



 Score = 30.7 bits (70), Expect = 3.5
 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 13/64 (20%)

Query: 308 LFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYR-------EKRL----REELERYRQERP 356
           +   DP ++D+   D+  EE+   +D+  K Y        EK+L    +E+L+ YR+ R 
Sbjct: 69  ILTTDPAERDELLKDI--EELRAEIDKLLKKYEKTILTEEEKKLFNEFKEQLKAYRKVRN 126

Query: 357 KIQQ 360
           K+  
Sbjct: 127 KVLD 130


>gnl|CDD|187811 cd09680, Cas10_III, CRISPR/Cas system-associated protein Cas10.
           CRISPR (Clustered Regularly Interspaced Short
           Palindromic Repeats) and associated Cas proteins
           comprise a system for heritable host defense by
           prokaryotic cells against phage and other foreign DNA;
           Multidomain protein with permuted HD nuclease domain,
           palm domain and Zn-ribbon; signature gene for type III;
           also known as Csm1 family.
          Length = 650

 Score = 31.5 bits (72), Expect = 3.5
 Identities = 20/83 (24%), Positives = 38/83 (45%), Gaps = 4/83 (4%)

Query: 103 DRHAAPVKRKKKDEEED--DEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEID 160
           D  AA V+R++ +EE +  D+E    S F+  N        K        +         
Sbjct: 75  DNIAASVERREGEEEGEGFDKERPLHSIFNVININEKKKNYKPKKLAYSLKPLNPEIPPS 134

Query: 161 KRMDEKRKDYRE--KRLREELER 181
           +++   + DY+E  ++L+EEL++
Sbjct: 135 EKIKASQSDYKELYEKLKEELKK 157



 Score = 31.5 bits (72), Expect = 3.5
 Identities = 20/83 (24%), Positives = 38/83 (45%), Gaps = 4/83 (4%)

Query: 272 DRHAAPVKRKKKDEEED--DEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEID 329
           D  AA V+R++ +EE +  D+E    S F+  N        K        +         
Sbjct: 75  DNIAASVERREGEEEGEGFDKERPLHSIFNVININEKKKNYKPKKLAYSLKPLNPEIPPS 134

Query: 330 KRMDEKRKDYRE--KRLREELER 350
           +++   + DY+E  ++L+EEL++
Sbjct: 135 EKIKASQSDYKELYEKLKEELKK 157


>gnl|CDD|225613 COG3071, HemY, Uncharacterized enzyme of heme biosynthesis [Coenzyme
            metabolism].
          Length = 400

 Score = 31.2 bits (71), Expect = 3.5
 Identities = 56/307 (18%), Positives = 105/307 (34%), Gaps = 52/307 (16%)

Query: 792  NHGTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKSIWLRAAYFEKNHGTRESLETLLQK 851
              G  +  E LL++   H  +  + +L+ A          AA   +  G  +     L +
Sbjct: 96   FEGDFQQAEKLLRRNAEHGEQPVLAYLLAA---------EAA---QQRGDEDRANRYLAE 143

Query: 852  AVAHCPKSEVL-WLMGAKSKWLAGDVPAAR-GILSLAFQANPNSEEI------------W 897
            A        +   L  A+      D PAAR  +  L      + E +            W
Sbjct: 144  AAELAGDDTLAVELTRARLLLNRRDYPAARENVDQLLEMTPRHPEVLRLALRAYIRLGAW 203

Query: 898  LAAV----KLESEN----NEYERARR-----LLAKARAQAGAF----------QANPNSE 934
             A +    KL         E  R  +     LL +AR   G+           +   N  
Sbjct: 204  QALLAILPKLRKAGLLSDEEAARLEQQAWEGLLQQARDDNGSEGLKTWWKNQPRKLRNDP 263

Query: 935  EIWLAAVKLESENNEYERARRLLAKARASAPTPRVMIQSAKLEWCLDNLERALQLLDEAI 994
            E+ +A  +      +++ A+ ++  A      PR+     +L     + E  ++  ++ +
Sbjct: 264  ELVVAYAERLIRLGDHDEAQEIIEDALKRQWDPRLCRLIPRLR--PGDPEPLIKAAEKWL 321

Query: 995  KVFPDFAKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKA 1054
            K  P+   L    G++  +  L  KA +    A+K  P S   +  LA+  ++     +A
Sbjct: 322  KQHPEDPLLLSTLGRLALKNKLWGKASEALEAALKLRP-SASDYAELADALDQLGEPEEA 380

Query: 1055 RSVLEKG 1061
              V  + 
Sbjct: 381  EQVRREA 387


>gnl|CDD|225660 COG3118, COG3118, Thioredoxin domain-containing protein
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 304

 Score = 31.2 bits (71), Expect = 3.6
 Identities = 23/87 (26%), Positives = 33/87 (37%), Gaps = 6/87 (6%)

Query: 891 PNSEEIWLAAVKLESENNEYERARRLLAKARAQAGAFQANPNSEEIWLAAVKLESENNEY 950
           P  EE  LA  K   E  ++  A  LL +A       QA P + E  L   +      + 
Sbjct: 131 PAEEEEALAEAKELIEAEDFGEAAPLLKQAL------QAAPENSEAKLLLAECLLAAGDV 184

Query: 951 ERARRLLAKARASAPTPRVMIQSAKLE 977
           E A+ +LA     A         A++E
Sbjct: 185 EAAQAILAALPLQAQDKAAHGLQAQIE 211



 Score = 31.2 bits (71), Expect = 3.8
 Identities = 22/77 (28%), Positives = 29/77 (37%), Gaps = 2/77 (2%)

Query: 855 HCPKSEVLWLMGAKSKWLAGDVPAARGILSLAFQANPNSEEIWLAAVKLESENNEYERAR 914
             P  E   L  AK    A D   A  +L  A QA P + E  L   +      + E A+
Sbjct: 129 VLPAEEEEALAEAKELIEAEDFGEAAPLLKQALQAAPENSEAKLLLAECLLAAGDVEAAQ 188

Query: 915 RLLA--KARAQAGAFQA 929
            +LA    +AQ  A   
Sbjct: 189 AILAALPLQAQDKAAHG 205


>gnl|CDD|205602 pfam13424, TPR_12, Tetratricopeptide repeat. 
          Length = 78

 Score = 28.9 bits (65), Expect = 3.8
 Identities = 12/71 (16%), Positives = 30/71 (42%), Gaps = 7/71 (9%)

Query: 966  TPRVMIQSAKLEWCLDNLERALQLLDEAIKVF-------PDFAKLWMMKGQIEEQKNLLD 1018
                +   A +   L + + AL+LL++A+++        P+ A+      ++       D
Sbjct: 4    LAAALNNLALVLRRLGDYDEALELLEKALELARELGEDHPETARALNNLARLYLALGDYD 63

Query: 1019 KAHDTFSQAIK 1029
            +A +   +A+ 
Sbjct: 64   EALEYLEKALA 74



 Score = 28.5 bits (64), Expect = 4.3
 Identities = 19/70 (27%), Positives = 30/70 (42%), Gaps = 10/70 (14%)

Query: 938 LAAVKLESENNEYERARRLLAKARASAP--------TPRVMIQSAKLEWCLDNLERALQL 989
           LA V     +  Y+ A  LL KA   A         T R +   A+L   L + + AL+ 
Sbjct: 11  LALVLRRLGD--YDEALELLEKALELARELGEDHPETARALNNLARLYLALGDYDEALEY 68

Query: 990 LDEAIKVFPD 999
           L++A+ +   
Sbjct: 69  LEKALALREA 78


>gnl|CDD|184428 PRK13971, PRK13971, hydroxyproline-2-epimerase; Provisional.
          Length = 333

 Score = 31.1 bits (71), Expect = 3.8
 Identities = 17/50 (34%), Positives = 22/50 (44%), Gaps = 10/50 (20%)

Query: 656 RKVLNKARENI-PTDRQI-------WTTAAKLEEAHGNNAMV--DKIIDR 695
           R+ LN+  E + P D +I       WT       A   NA+   DK IDR
Sbjct: 201 RQALNEKYEFVHPEDPRIRGVSHVLWTGKPISPGADARNAVFYGDKAIDR 250


>gnl|CDD|224494 COG1578, COG1578, Uncharacterized conserved protein [Function
           unknown].
          Length = 285

 Score = 30.8 bits (70), Expect = 4.2
 Identities = 25/102 (24%), Positives = 37/102 (36%), Gaps = 14/102 (13%)

Query: 653 ENARKVLNKARENIPTDRQIWTTAAKLEEAHGNN------AMVDKIIDRALSSLSANGVE 706
           E A KVL K RENI    +   TA KL    GN             ++  +  L    + 
Sbjct: 80  EIALKVLPKVRENIEDTPEDLKTAVKLAIV-GNVIDFGVLGFSPFDLEEEVEKLLDAELY 138

Query: 707 INREHWFKEAIE-------AEKAGSVHTCQALIRAIIGYGVE 741
           I+      E ++        + AG +   + LI  I   G +
Sbjct: 139 IDDSPKLLELLKNASVLYLTDNAGEIVFDKVLIEVIKELGKK 180


>gnl|CDD|217203 pfam02724, CDC45, CDC45-like protein.  CDC45 is an essential gene
           required for initiation of DNA replication in S.
           cerevisiae, forming a complex with MCM5/CDC46.
           Homologues of CDC45 have been identified in human, mouse
           and smut fungus among others.
          Length = 583

 Score = 31.1 bits (71), Expect = 4.7
 Identities = 14/90 (15%), Positives = 36/90 (40%), Gaps = 8/90 (8%)

Query: 96  DANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMI 155
           D  D+ ++    P       + E+D++D  +S+ ++         +K   + D+++ D  
Sbjct: 102 DDGDIEEELQDEPRYDDAYRDLEEDDDDDEESDEEDEES------SKS--EDDEDDDDDD 153

Query: 156 YEEIDKRMDEKRKDYREKRLREELERYRQE 185
            ++     +   +  R +R  EE     + 
Sbjct: 154 DDDDIATRERSLERRRRRREWEEKRAELEF 183



 Score = 31.1 bits (71), Expect = 4.7
 Identities = 14/90 (15%), Positives = 36/90 (40%), Gaps = 8/90 (8%)

Query: 265 DANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMI 324
           D  D+ ++    P       + E+D++D  +S+ ++         +K   + D+++ D  
Sbjct: 102 DDGDIEEELQDEPRYDDAYRDLEEDDDDDEESDEEDEES------SKS--EDDEDDDDDD 153

Query: 325 YEEIDKRMDEKRKDYREKRLREELERYRQE 354
            ++     +   +  R +R  EE     + 
Sbjct: 154 DDDDIATRERSLERRRRRREWEEKRAELEF 183


>gnl|CDD|191825 pfam07719, TPR_2, Tetratricopeptide repeat.  This Pfam entry includes
            outlying Tetratricopeptide-like repeats (TPR) that are
            not matched by pfam00515.
          Length = 34

 Score = 27.1 bits (61), Expect = 5.0
 Identities = 5/33 (15%), Positives = 15/33 (45%)

Query: 1001 AKLWMMKGQIEEQKNLLDKAHDTFSQAIKKCPH 1033
            A+     G    +    ++A + + +A++  P+
Sbjct: 1    AEALYNLGLAYYKLGDYEEALEAYEKALELDPN 33


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 31.0 bits (70), Expect = 5.1
 Identities = 22/136 (16%), Positives = 40/136 (29%), Gaps = 25/136 (18%)

Query: 315 DKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFS------DLKRG 368
           D+     +  Y E  +  D +R D R  R        R      ++  S        +R 
Sbjct: 36  DRHRRSRERSYREDSRPRDRRRYDSRSPRSLRYSSVRRSRDRPRRRSRSVRSIEQHRRRL 95

Query: 369 LVTVSMDEWKNVPEVGDARNRKQRNPRAEKFTPLPDSVLRGNLGGESTGAIDPNSGLMSQ 428
                 ++W+      D + R   + +   +              E   A    +  +  
Sbjct: 96  RDRSPSNQWRK-----DDKKRSLWDIKPPGY--------------ELVTADQAKASQVFS 136

Query: 429 IPGTATPGMLTPSGDL 444
           +PGTA    +T    L
Sbjct: 137 VPGTAPRPAMTDPEKL 152


>gnl|CDD|112485 pfam03670, UPF0184, Uncharacterized protein family (UPF0184). 
          Length = 83

 Score = 28.4 bits (63), Expect = 5.2
 Identities = 17/58 (29%), Positives = 27/58 (46%), Gaps = 10/58 (17%)

Query: 148 DDEEADMI-YEEIDKRMD---------EKRKDYREKRLREELERYRQERPKIQQQFSD 195
           + +E  +  YE I+  +D         E+R D    +LRE LE  RQ R +  +Q  +
Sbjct: 19  NVDEFGLQEYEAINSMLDQINSALDALEERNDDLMSQLRELLESNRQIRLEFAEQLDN 76



 Score = 28.4 bits (63), Expect = 5.2
 Identities = 17/58 (29%), Positives = 27/58 (46%), Gaps = 10/58 (17%)

Query: 317 DDEEADMI-YEEIDKRMD---------EKRKDYREKRLREELERYRQERPKIQQQFSD 364
           + +E  +  YE I+  +D         E+R D    +LRE LE  RQ R +  +Q  +
Sbjct: 19  NVDEFGLQEYEAINSMLDQINSALDALEERNDDLMSQLRELLESNRQIRLEFAEQLDN 76


>gnl|CDD|185582 PTZ00373, PTZ00373, 60S Acidic ribosomal protein P2; Provisional.
          Length = 112

 Score = 29.1 bits (65), Expect = 5.2
 Identities = 14/39 (35%), Positives = 20/39 (51%)

Query: 92  GPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFD 130
             A  A   +    A   K +KK+EEE++E+DL  S F 
Sbjct: 74  AAAPAAGAATAGAKAEAKKEEKKEEEEEEEDDLGFSLFG 112



 Score = 29.1 bits (65), Expect = 5.2
 Identities = 14/39 (35%), Positives = 20/39 (51%)

Query: 261 GPARDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFD 299
             A  A   +    A   K +KK+EEE++E+DL  S F 
Sbjct: 74  AAAPAAGAATAGAKAEAKKEEKKEEEEEEEDDLGFSLFG 112


>gnl|CDD|222648 pfam14283, DUF4366, Domain of unknown function (DUF4366).  This
           family of proteins is found in bacteria and eukaryotes.
           Proteins in this family are typically between 227 and
           387 amino acids in length.
          Length = 213

 Score = 30.4 bits (69), Expect = 5.4
 Identities = 21/112 (18%), Positives = 37/112 (33%), Gaps = 32/112 (28%)

Query: 209 KNEGQVVGQAIPPPPIPLVNRNKKHFMGVPAPLGYVAGVGRGATGFTTRSDIGPARDAND 268
            N  +  G    P P P     KK  MG    +  VA +G GA  +              
Sbjct: 132 VNMTECTGPEPEPEPEPEEEPEKKSGMGPLLLVLAVALIGGGAYYY-------------- 177

Query: 269 VSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 320
                      + K+ E+   ++DL++ ++ +              ++DDE 
Sbjct: 178 -------FKFYKPKQQEKGAPDDDLDEYDYGDE-----------DEEEDDEP 211



 Score = 29.6 bits (67), Expect = 8.8
 Identities = 7/43 (16%), Positives = 19/43 (44%), Gaps = 11/43 (25%)

Query: 109 VKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 151
             + K+ E+   ++DL++ ++ +              ++DDE 
Sbjct: 180 FYKPKQQEKGAPDDDLDEYDYGDE-----------DEEEDDEP 211


>gnl|CDD|198111 smart01043, BTAD, Bacterial transcriptional activator domain.
           Found in the DNRI/REDD/AFSR family of regulators. This
           region of AFSR along with the C terminal region is
           capable of independently directing actinorhodin
           production. This family contains TPR repeats.
          Length = 145

 Score = 29.6 bits (67), Expect = 5.5
 Identities = 21/75 (28%), Positives = 29/75 (38%), Gaps = 17/75 (22%)

Query: 596 RRVYRKALEHIPNSVRLWKAAVELEDPEDARILLSRAVECCPTSVELW----LALAR--- 648
           R +  +ALE       L +A + L   E+A  LL R +   P    L      AL R   
Sbjct: 57  RELRLEALE------ALAEALLALGRHEEALALLERLLALDPLRERLHRLLMRALYRAGR 110

Query: 649 ----LETYENARKVL 659
               L  Y   R++L
Sbjct: 111 RAEALRAYRRLRRLL 125


>gnl|CDD|223726 COG0653, SecA, Preprotein translocase subunit SecA (ATPase, RNA
           helicase) [Intracellular trafficking and secretion].
          Length = 822

 Score = 31.0 bits (71), Expect = 5.7
 Identities = 20/89 (22%), Positives = 30/89 (33%), Gaps = 8/89 (8%)

Query: 95  RDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE-AD 153
            D        +  P     +  E  D E L D            +   D  D+ +EE A+
Sbjct: 635 EDVIKALVGEYIPP----PQQAELWDLEGLID-ELKGTVHPDLPINKSDLEDEAEEELAE 689

Query: 154 MIYEEIDKRMDEKRKDYREKRLREELERY 182
            I +  D+  D+K +         E ERY
Sbjct: 690 RILKAADEAYDKKEE--VGPEAMREFERY 716



 Score = 31.0 bits (71), Expect = 5.7
 Identities = 20/89 (22%), Positives = 30/89 (33%), Gaps = 8/89 (8%)

Query: 264 RDANDVSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE-AD 322
            D        +  P     +  E  D E L D            +   D  D+ +EE A+
Sbjct: 635 EDVIKALVGEYIPP----PQQAELWDLEGLID-ELKGTVHPDLPINKSDLEDEAEEELAE 689

Query: 323 MIYEEIDKRMDEKRKDYREKRLREELERY 351
            I +  D+  D+K +         E ERY
Sbjct: 690 RILKAADEAYDKKEE--VGPEAMREFERY 716


>gnl|CDD|217861 pfam04050, Upf2, Up-frameshift suppressor 2.  Transcripts
           harbouring premature signals for translation termination
           are recognised and rapidly degraded by eukaryotic cells
           through a pathway known as nonsense-mediated mRNA decay.
           In Saccharomyces cerevisiae, three trans-acting factors
           (Upf1 to Upf3) are required for nonsense-mediated mRNA
           decay.
          Length = 171

 Score = 29.7 bits (67), Expect = 5.8
 Identities = 21/72 (29%), Positives = 33/72 (45%), Gaps = 9/72 (12%)

Query: 115 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEID-KRMDEKRKDYREK 173
           DEE  +E++ ++S+ DE            P D+ DEE+D   E+I   R +E+     E 
Sbjct: 13  DEELPEEDEDDESS-DEEEVD-------LPDDEQDEESDSEEEQIFVTRQEEEVDPEAEA 64

Query: 174 RLREELERYRQE 185
               E E+   E
Sbjct: 65  EFDREFEKMMAE 76



 Score = 29.7 bits (67), Expect = 5.8
 Identities = 21/72 (29%), Positives = 33/72 (45%), Gaps = 9/72 (12%)

Query: 284 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEID-KRMDEKRKDYREK 342
           DEE  +E++ ++S+ DE            P D+ DEE+D   E+I   R +E+     E 
Sbjct: 13  DEELPEEDEDDESS-DEEEVD-------LPDDEQDEESDSEEEQIFVTRQEEEVDPEAEA 64

Query: 343 RLREELERYRQE 354
               E E+   E
Sbjct: 65  EFDREFEKMMAE 76


>gnl|CDD|220261 pfam09484, Cas_TM1802, CRISPR-associated protein TM1802
           (cas_TM1802).  Clusters of short DNA repeats with
           non-homologous spacers, which are found at regular
           intervals in the genomes of phylogenetically distinct
           prokaryotic species, comprise a family with recognisable
           features. This family is known as CRISPR (short for
           Clustered, Regularly Interspaced Short Palindromic
           Repeats). A number of protein families appear only in
           association with these repeats and are designated Cas
           (CRISPR-Associated) proteins. This minor cas protein is
           found in at least five prokaryotic genomes:
           Methanosarcina mazei, Sulfurihydrogenibium azorense,
           Thermotoga maritima, Carboxydothermus hydrogenoformans,
           and Dictyoglomus thermophilum, the first of which is
           archaeal while the rest are bacterial.
          Length = 584

 Score = 30.8 bits (70), Expect = 5.9
 Identities = 20/143 (13%), Positives = 50/143 (34%), Gaps = 16/143 (11%)

Query: 259 DIGPARDANDVSDDRHAAPVK--RKKKDEEEDDEEDLNDSNFDEFNGYGG---------- 306
           D+    + N   + +H   +    ++ + E    E+ +     ++   G           
Sbjct: 23  DLLLQENPNKGGNYQHVLKIDFDTEENEIEIVIGEEFDKKTARKYLYKGQAGNGNSSQWY 82

Query: 307 ---SLFNKDPYDKDDEEADMIYEEIDKRMDEKRKD-YREKRLREELERYRQERPKIQQQF 362
              +    DP +  +++   I+++  K   E  K  Y  K +++ LE+  ++  K     
Sbjct: 83  SPTNKITYDPEETLNKKLKSIFKKFYKDKGEINKYRYFLKDIKKVLEKNFEKIIKDLIDL 142

Query: 363 SDLKRGLVTVSMDEWKNVPEVGD 385
              +  L T+   +      + D
Sbjct: 143 KKNEGVLYTIYFLKNDGKKYLSD 165


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 31.1 bits (70), Expect = 6.0
 Identities = 28/127 (22%), Positives = 52/127 (40%), Gaps = 8/127 (6%)

Query: 96   DANDVSDDRHAAPVKRKKKDEEEDD-EEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADM 154
            D N V+ D   +  +    +EE  D +E++ND   D  N     L++ +P ++D  E + 
Sbjct: 3863 DMNGVTKDSVVSENENSDSEEENQDLDEEVNDIPEDLSNSLNEKLWD-EPNEEDLLETEQ 3921

Query: 155  IYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMDEWKNEGQV 214
               E     +E     +E   +   ++ RQE+   ++   D+         DE + + Q 
Sbjct: 3922 KSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSDDVGID------DEIQPDIQE 3975

Query: 215  VGQAIPP 221
                 PP
Sbjct: 3976 NNSQPPP 3982


>gnl|CDD|233224 TIGR00993, 3a0901s04IAP86, chloroplast protein import component
           Toc86/159, G and M domains.  The long precursor of the
           86K protein originally described is proposed to have
           three domains. The N-terminal A-domain is acidic,
           repetitive, weakly conserved, readily removed by
           proteolysis during chloroplast isolation, and not
           required for protein translocation. The other domains
           are designated G (GTPase) and M (membrane anchor); this
           family includes most of the G domain and all of M
           [Transport and binding proteins, Amino acids, peptides
           and amines].
          Length = 763

 Score = 30.7 bits (69), Expect = 6.3
 Identities = 56/227 (24%), Positives = 87/227 (38%), Gaps = 49/227 (21%)

Query: 284 DEEEDDEE--DLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRK---- 337
           DEE+ D E  D +DS+ +           +D YD+      +   ++ K   E+RK    
Sbjct: 384 DEEDSDIELEDSSDSDEES---------GEDEYDQLPPFKPLTKAQMAKLSKEQRKAYLE 434

Query: 338 --DYREKRL-----REELERYRQERPKIQQQFSDLKRGLVTVSMDEWKNVPEVGDARNRK 390
             DYR K L     REEL+R  +   K  ++  +L  G  +  +DE    P         
Sbjct: 435 EYDYRVKLLQKKQWREELKR-MKMMKKFGKEIGELPDGY-SEEVDEENGGP--------- 483

Query: 391 QRNPRAEKFTPLPDSVLRGNLGGESTGA----IDPNSGLMSQIPGTATPGMLTPSGDLDL 446
                A    PLPD VL  +   ++       ++P+S L+++      P + T   D D 
Sbjct: 484 -----AAVPVPLPDMVLPASFDSDNPAYRYRYLEPSSQLLTR------PVLDTHGWDHDC 532

Query: 447 RKMGQARNTLMNVKLNQISDSVVGQTVVDPKGYLTDLQSMIPTYGGD 493
              G        VK  +   SV  Q   D K +   L S +    G+
Sbjct: 533 GYDGVNAERSFAVK-EKFPASVTVQVTKDKKDFNIHLDSSVSAKHGE 578


>gnl|CDD|221333 pfam11942, Spt5_N, Spt5 transcription elongation factor, acidic
           N-terminal.  This is the very acidic N-terminal region
           of the early transcription elongation factor Spt5. The
           Spt5-Spt4 complex regulates early transcription
           elongation by RNA polymerase II and has an imputed role
           in pre-mRNA processing via its physical association with
           mRNA capping enzymes. The actual function of this
           N-terminal domain is not known although it is
           dispensable for binding to Spt4.
          Length = 92

 Score = 28.6 bits (64), Expect = 6.5
 Identities = 12/70 (17%), Positives = 29/70 (41%), Gaps = 12/70 (17%)

Query: 115 DEEEDDEEDLNDSNFDEFNGYGG------------SLFNKDPYDKDDEEADMIYEEIDKR 162
           D+EE++EE+  D   D  +                   ++    +++E+A+ + E + KR
Sbjct: 9   DDEEEEEEEEEDDLEDLSDEDEFIDEAEAEDDRRHRRLDRRREKEEEEDAEELAEYLRKR 68

Query: 163 MDEKRKDYRE 172
             ++     +
Sbjct: 69  YGDEADADAD 78



 Score = 28.6 bits (64), Expect = 6.5
 Identities = 12/70 (17%), Positives = 29/70 (41%), Gaps = 12/70 (17%)

Query: 284 DEEEDDEEDLNDSNFDEFNGYGG------------SLFNKDPYDKDDEEADMIYEEIDKR 331
           D+EE++EE+  D   D  +                   ++    +++E+A+ + E + KR
Sbjct: 9   DDEEEEEEEEEDDLEDLSDEDEFIDEAEAEDDRRHRRLDRRREKEEEEDAEELAEYLRKR 68

Query: 332 MDEKRKDYRE 341
             ++     +
Sbjct: 69  YGDEADADAD 78



 Score = 28.2 bits (63), Expect = 8.2
 Identities = 17/73 (23%), Positives = 28/73 (38%), Gaps = 14/73 (19%)

Query: 115 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRM---DEKRKDYR 171
            E +D+EE+  +   D            +    +DE  D    E D+R    D +R+   
Sbjct: 6   AEVDDEEEEEEEEEDDL-----------EDLSDEDEFIDEAEAEDDRRHRRLDRRREKEE 54

Query: 172 EKRLREELERYRQ 184
           E+   E  E  R+
Sbjct: 55  EEDAEELAEYLRK 67



 Score = 28.2 bits (63), Expect = 8.2
 Identities = 17/73 (23%), Positives = 28/73 (38%), Gaps = 14/73 (19%)

Query: 284 DEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMIYEEIDKRM---DEKRKDYR 340
            E +D+EE+  +   D            +    +DE  D    E D+R    D +R+   
Sbjct: 6   AEVDDEEEEEEEEEDDL-----------EDLSDEDEFIDEAEAEDDRRHRRLDRRREKEE 54

Query: 341 EKRLREELERYRQ 353
           E+   E  E  R+
Sbjct: 55  EEDAEELAEYLRK 67


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 30.1 bits (68), Expect = 6.6
 Identities = 15/62 (24%), Positives = 31/62 (50%), Gaps = 4/62 (6%)

Query: 147 KDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMD 206
           K + +A   YE        K+   ++K  +   E+ +QER K +++ ++L++ L     +
Sbjct: 80  KLERQAQEAYENWLSA---KQAQRQKKLQKLLEEKQKQEREK-EREEAELRQRLAKEKYE 135

Query: 207 EW 208
           EW
Sbjct: 136 EW 137



 Score = 30.1 bits (68), Expect = 6.6
 Identities = 15/62 (24%), Positives = 31/62 (50%), Gaps = 4/62 (6%)

Query: 316 KDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMD 375
           K + +A   YE        K+   ++K  +   E+ +QER K +++ ++L++ L     +
Sbjct: 80  KLERQAQEAYENWLSA---KQAQRQKKLQKLLEEKQKQEREK-EREEAELRQRLAKEKYE 135

Query: 376 EW 377
           EW
Sbjct: 136 EW 137


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 30.8 bits (70), Expect = 6.7
 Identities = 17/65 (26%), Positives = 31/65 (47%), Gaps = 4/65 (6%)

Query: 156 YEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMDEWKNEGQVV 215
           YEE+ +   E  ++     LR ELE   + R +I++    LK  L     ++ K E + +
Sbjct: 661 YEELREEYLELSREL--AGLRAELEELEKRREEIKKTLEKLKEELEE--REKAKKELEKL 716

Query: 216 GQAIP 220
            +A+ 
Sbjct: 717 EKALE 721


>gnl|CDD|236361 PRK08999, PRK08999, hypothetical protein; Provisional.
          Length = 312

 Score = 30.2 bits (69), Expect = 7.0
 Identities = 17/79 (21%), Positives = 29/79 (36%), Gaps = 14/79 (17%)

Query: 828 IWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPA------ARG 881
           I LRA           +   L + A+  C ++    L+    +  A D+ A      +  
Sbjct: 161 IQLRAP-----QLPPAAYRALARAALGLCRRAGAQLLLNGDPEL-AEDLGADGVHLTSAQ 214

Query: 882 ILSLAFQANPNSEEIWLAA 900
           + +L   A P     W+AA
Sbjct: 215 LAAL--AARPLPAGRWVAA 231


>gnl|CDD|234306 TIGR03674, fen_arch, flap structure-specific endonuclease.
           Endonuclease that cleaves the 5'-overhanging flap
           structure that is generated by displacement synthesis
           when DNA polymerase encounters the 5'-end of a
           downstream Okazaki fragment. Has 5'-endo-/exonuclease
           and 5'-pseudo-Y-endonuclease activities. Cleaves the
           junction between single and double-stranded regions of
           flap DNA.
          Length = 338

 Score = 30.3 bits (69), Expect = 7.2
 Identities = 18/58 (31%), Positives = 26/58 (44%), Gaps = 6/58 (10%)

Query: 584 KAADLETETKAKRRVYRKALEHIPNSVRLWKAAVELEDPEDARILLSRAVECCPTSVE 641
           K  +L+ ET  +RR  R+  E        W+ A+E  D E+AR    R+       VE
Sbjct: 82  KPPELKAETLEERREIREEAE------EKWEEALEKGDLEEARKYAQRSSRLTSEIVE 133


>gnl|CDD|181091 PRK07720, fliJ, flagellar biosynthesis chaperone; Validated.
          Length = 146

 Score = 29.3 bits (66), Expect = 7.4
 Identities = 13/47 (27%), Positives = 28/47 (59%), Gaps = 2/47 (4%)

Query: 161 KRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMDE 207
           ++M+ K++D  EK +  E+++Y + + K Q+ F+  ++      MDE
Sbjct: 92  EQMNRKQQDLTEKNI--EVKKYEKMKEKKQEMFALEEKAAEMKEMDE 136



 Score = 29.3 bits (66), Expect = 7.4
 Identities = 13/47 (27%), Positives = 28/47 (59%), Gaps = 2/47 (4%)

Query: 330 KRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMDE 376
           ++M+ K++D  EK +  E+++Y + + K Q+ F+  ++      MDE
Sbjct: 92  EQMNRKQQDLTEKNI--EVKKYEKMKEKKQEMFALEEKAAEMKEMDE 136


>gnl|CDD|226812 COG4377, COG4377, Predicted membrane protein [Function unknown].
          Length = 258

 Score = 29.8 bits (67), Expect = 7.7
 Identities = 26/121 (21%), Positives = 42/121 (34%), Gaps = 13/121 (10%)

Query: 768 IYAQALATFPSKKSIWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSNKKS 827
           IY   +A F  +    L   + EK    +      L   + H        L+G  S    
Sbjct: 81  IYGLLMAGFFEETGRLLFFRFLEKRSLEK---ADALAYGLGHGGLE--AILLGLTS---L 132

Query: 828 IWLRAAYFEKNHGTRESLETLLQKAVAHCPKSEVLWLMGAKSKWLAGDVPAARGILSLAF 887
           + L       N G  + L     +A++      +L L+ + S W    +   R IL+L  
Sbjct: 133 LNLYIVLSAVNTGNPQVLMQSGAEALS----ENMLKLLQSLSVWQIYGLGFER-ILALGV 187

Query: 888 Q 888
           Q
Sbjct: 188 Q 188


>gnl|CDD|219900 pfam08553, VID27, VID27 cytoplasmic protein.  This is a family of
           fungal and plant proteins and contains many hypothetical
           proteins. VID27 is a cytoplasmic protein that plays a
           potential role in vacuolar protein degradation.
          Length = 794

 Score = 30.5 bits (69), Expect = 8.0
 Identities = 13/52 (25%), Positives = 25/52 (48%), Gaps = 7/52 (13%)

Query: 100 VSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 151
             +D +      +++DEEE++EE       DE  G      + + +++DD E
Sbjct: 380 EIEDANTERDDEEEEDEEEEEEE-------DEDEGPSKEHSDDEEFEEDDVE 424



 Score = 30.5 bits (69), Expect = 8.0
 Identities = 13/52 (25%), Positives = 25/52 (48%), Gaps = 7/52 (13%)

Query: 269 VSDDRHAAPVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 320
             +D +      +++DEEE++EE       DE  G      + + +++DD E
Sbjct: 380 EIEDANTERDDEEEEDEEEEEEE-------DEDEGPSKEHSDDEEFEEDDVE 424


>gnl|CDD|222425 pfam13865, FoP_duplication, C-terminal duplication domain of Friend
           of PRMT1.  Fop, or Friend of Prmt1, proteins are
           conserved from fungi and plants to vertebrates. There is
           little that is actually conserved except for this
           C-terminal LDXXLDAYM region (where X is any amino acid).
           The Fop proteins themselves are nuclear proteins
           localised to regions with low levels of DAPI, with a
           punctate/speckle-like distribution. Fop is a
           chromatin-associated protein and it colocalises with
           facultative heterochromatin. It is critical for
           oestrogen-dependent gene activation.
          Length = 76

 Score = 27.8 bits (62), Expect = 8.1
 Identities = 8/59 (13%), Positives = 19/59 (32%), Gaps = 7/59 (11%)

Query: 132 FNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYRE---KRLREELERYRQERP 187
             G       +       +      E++D  +D+    Y      +L  +L+ Y  ++ 
Sbjct: 20  RRGRRRGRGGRKGKGGAAKPKPKTREDLDAELDQ----YMSTTKSKLDADLDAYMSKKD 74



 Score = 27.8 bits (62), Expect = 8.1
 Identities = 8/59 (13%), Positives = 19/59 (32%), Gaps = 7/59 (11%)

Query: 301 FNGYGGSLFNKDPYDKDDEEADMIYEEIDKRMDEKRKDYRE---KRLREELERYRQERP 356
             G       +       +      E++D  +D+    Y      +L  +L+ Y  ++ 
Sbjct: 20  RRGRRRGRGGRKGKGGAAKPKPKTREDLDAELDQ----YMSTTKSKLDADLDAYMSKKD 74


>gnl|CDD|236402 PRK09191, PRK09191, two-component response regulator; Provisional.
          Length = 261

 Score = 29.8 bits (68), Expect = 8.5
 Identities = 29/93 (31%), Positives = 39/93 (41%), Gaps = 13/93 (13%)

Query: 506 SVRETNPNHPPAWIASARLEEVTGKVQAARNLI-MKGCEENQTSEDLWL---EAARLQPV 561
              +  P  P    A  RL  +T   + A  L  ++G    + +E L +   EA  L  +
Sbjct: 68  GANDPEPGSPFEARAERRLAGLTPLPRQAFLLTALEGFSVEEAAEILGVDPAEAEAL--L 125

Query: 562 DTARAVIAQAVRHIPTSVRI----WIKAADLET 590
           D ARA IA+ V    T V I     I A DLE 
Sbjct: 126 DDARAEIARQVA---TRVLIIEDEPIIAMDLEQ 155


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 30.0 bits (67), Expect = 9.2
 Identities = 30/133 (22%), Positives = 54/133 (40%), Gaps = 12/133 (9%)

Query: 269 VSDDRHAAPVKRKKKDEE----EDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEEADMI 324
           V D+  AA  +R+++ EE    + +E  L             S+ +++     D+EA ++
Sbjct: 1   VDDEEEAARERRRREREEQLRQKQEEGSLGQVTTQVEVNSQNSVPDEESKTSTDDEAALL 60

Query: 325 YEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGLVTVSMDEWKNVPEVG 384
             E   R +E+R    ++R  E LER ++ +P    Q   L      +  D       V 
Sbjct: 61  --ERLARREERR----DERFSEALERQKEFKPTSTDQ--SLSEPSRRMQEDSGAENETVE 112

Query: 385 DARNRKQRNPRAE 397
           +    + R  R E
Sbjct: 113 EEEKEESREEREE 125


>gnl|CDD|235033 PRK02363, PRK02363, DNA-directed RNA polymerase subunit delta;
           Reviewed.
          Length = 129

 Score = 28.4 bits (64), Expect = 9.7
 Identities = 11/44 (25%), Positives = 22/44 (50%)

Query: 108 PVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 151
              +KKK   + D++ ++D    + +     L  +D  D++DEE
Sbjct: 86  KFDKKKKKFMDGDDDIIDDDILPDDDFDEEDLDEEDDEDEEDEE 129



 Score = 28.4 bits (64), Expect = 9.7
 Identities = 11/44 (25%), Positives = 22/44 (50%)

Query: 277 PVKRKKKDEEEDDEEDLNDSNFDEFNGYGGSLFNKDPYDKDDEE 320
              +KKK   + D++ ++D    + +     L  +D  D++DEE
Sbjct: 86  KFDKKKKKFMDGDDDIIDDDILPDDDFDEEDLDEEDDEDEEDEE 129


>gnl|CDD|221250 pfam11831, Myb_Cef, pre-mRNA splicing factor component.  This
           family is a region of the Myb-Related Cdc5p/Cef1
           proteins, in fungi, and is part of the pre-mRNA splicing
           factor complex.
          Length = 363

 Score = 29.6 bits (67), Expect = 9.8
 Identities = 15/79 (18%), Positives = 31/79 (39%), Gaps = 14/79 (17%)

Query: 315 DKDDEEADMIYEEIDKRMDEKRKDYREKRLREELERYRQERPKIQQQFSDLKRGL----- 369
           ++++EE + + EE+++   +     R+ R R   E   QE  + + Q   ++R L     
Sbjct: 151 EEEEEEPEEMEEELEEDAAD-----RDARKRAAEEAKEQEELRRRSQV--IQRNLPRPSV 203

Query: 370 --VTVSMDEWKNVPEVGDA 386
             + V            D 
Sbjct: 204 LDLIVLRPSVNVPLTELDP 222


>gnl|CDD|185101 PRK15179, PRK15179, Vi polysaccharide biosynthesis protein TviE;
            Provisional.
          Length = 694

 Score = 30.0 bits (67), Expect = 9.9
 Identities = 19/110 (17%), Positives = 44/110 (40%), Gaps = 10/110 (9%)

Query: 951  ERARRLLAKARASAPTPRVMIQSAKLEWCLDNLERALQLLDEAIKVFPDFAKLWMMKGQI 1010
            E  R LL +AR      +V+ + A +      L   L      ++ +P      ++  + 
Sbjct: 46   EAGRELLQQAR------QVLERHAAVHKPAAALPELLDY----VRRYPHTELFQVLVARA 95

Query: 1011 EEQKNLLDKAHDTFSQAIKKCPHSVPLWIMLANLEERRKMLIKARSVLEK 1060
             E  +  D+    +    ++ P S   +I++    +R++ +   R+ +E 
Sbjct: 96   LEAAHRSDEGLAVWRGIHQRFPDSSEAFILMLRGVKRQQGIEAGRAEIEL 145


>gnl|CDD|221157 pfam11651, P22_CoatProtein, P22 coat protein - gene protein 5.
           This family of proteins represents gene product 5 from
           bacteriophage P22. This protein is involved in the
           formation of the pro-capsid shells in the bacteriophage.
           In total, there are 415 molecules of the coat protein
           which are arranged in an icosahedral shell.
          Length = 413

 Score = 29.8 bits (67), Expect = 9.9
 Identities = 13/56 (23%), Positives = 20/56 (35%)

Query: 343 RLREELERYRQERPKIQQQFSDLKRGLVTVSMDEWKNVPEVGDARNRKQRNPRAEK 398
           R   + E        +      L  G V V +D+ KNVP    A+ +  R     +
Sbjct: 49  RRPPDFEVRSGAGADLSGNAQGLTEGKVPVVVDKQKNVPFTLTAKEKTLRLEDFSE 104


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.132    0.392 

Gapped
Lambda     K      H
   0.267   0.0733    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 66,931,502
Number of extensions: 6845912
Number of successful extensions: 11467
Number of sequences better than 10.0: 1
Number of HSP's gapped: 10980
Number of HSP's successfully gapped: 401
Length of query: 1278
Length of database: 10,937,602
Length adjustment: 108
Effective length of query: 1170
Effective length of database: 6,147,370
Effective search space: 7192422900
Effective search space used: 7192422900
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 65 (28.8 bits)
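
The figures in the statistics footer above can be cross-checked with a little
arithmetic. The sketch below is not part of the RPS-BLAST output. It assumes
the standard Karlin-Altschul conversions (drop-off bits = lambda * X / ln 2,
score bits = (lambda * S - ln K) / ln 2), and it assumes that X1 and S1 use
the ungapped constants while X2, X3 and S2 use the gapped ones. That pairing
is a guess that happens to reproduce the printed values. All function and
variable names are illustrative.

```python
import math

# Values copied from the statistics footer of the report.
QUERY_LENGTH = 1278
LENGTH_ADJUSTMENT = 108
EFFECTIVE_DB_LENGTH = 6_147_370
UNGAPPED = {"lambda": 0.315, "K": 0.132}
GAPPED = {"lambda": 0.267, "K": 0.0733}


def dropoff_bits(raw, params):
    """Convert a raw X drop-off value to bits (no K term)."""
    return raw * params["lambda"] / math.log(2)


def score_bits(raw, params):
    """Convert a raw score threshold to a bit score."""
    return (raw * params["lambda"] - math.log(params["K"])) / math.log(2)


# Effective query length is the query length minus the length adjustment.
effective_query_length = QUERY_LENGTH - LENGTH_ADJUSTMENT

# Effective search space is the product of the two effective lengths.
effective_search_space = effective_query_length * EFFECTIVE_DB_LENGTH

print("Effective length of query:", effective_query_length)
print("Effective search space:", effective_search_space)

# Assumed pairing of each parameter with ungapped or gapped constants.
print("X1 bits:", round(dropoff_bits(16, UNGAPPED), 1))
print("X2 bits:", round(dropoff_bits(38, GAPPED), 1))
print("X3 bits:", round(dropoff_bits(64, GAPPED), 1))
print("S1 bits:", round(score_bits(41, UNGAPPED), 1))
print("S2 bits:", round(score_bits(65, GAPPED), 1))
```

The per-hit Expect values are deliberately left out of this check: a naive
E = search space * 2^(-bits) does not reproduce them for this report, so the
program evidently applies further adjustments that the footer does not expose.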