RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy12410
         (615 letters)



>gnl|CDD|240273 PTZ00110, PTZ00110, helicase; Provisional.
          Length = 545

 Score =  341 bits (877), Expect = e-110
 Identities = 144/328 (43%), Positives = 216/328 (65%), Gaps = 6/328 (1%)

Query: 265 ASKQKKELSKVDHSTIEYLPFRKDFYVEVPEIARMTPEEVEKYKEELEGIRVKGKGCPRP 324
           +S   K L  +D  +I  +PF K+FY E PE++ ++ +EV++ ++E E   + G+  P+P
Sbjct: 69  SSTLGKRLQPIDWKSINLVPFEKNFYKEHPEVSALSSKEVDEIRKEKEITIIAGENVPKP 128

Query: 325 IKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPL 384
           + ++        IL +LK   + +PTPIQ Q  P  +SGRD+IGIA+TGSGKT+AF+LP 
Sbjct: 129 VVSFEYTSFPDYILKSLKNAGFTEPTPIQVQGWPIALSGRDMIGIAETGSGKTLAFLLPA 188

Query: 385 LRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQIS 444
           + HI  QP L   DGP+ ++++PTREL  QI ++  KF  S  +R    YGG     QI 
Sbjct: 189 IVHINAQPLLRYGDGPIVLVLAPTRELAEQIREQCNKFGASSKIRNTVAYGGVPKRGQIY 248

Query: 445 ELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV 504
            L+RG EI++  PGR+ID L +N   VTNLRRVTY+VLDEADRM DMGFEPQ+ +I+  +
Sbjct: 249 ALRRGVEILIACPGRLIDFLESN---VTNLRRVTYLVLDEADRMLDMGFEPQIRKIVSQI 305

Query: 505 RPDRQTVMFSATFPRQMEALARRIL-NKPIEIQVGGRSV-VCKEVEQHVIVLDEEQKMLK 562
           RPDRQT+M+SAT+P+++++LAR +   +P+ + VG   +  C  ++Q V V++E +K  K
Sbjct: 306 RPDRQTLMWSATWPKEVQSLARDLCKEEPVHVNVGSLDLTACHNIKQEVFVVEEHEKRGK 365

Query: 563 LLELLG-IYQDQGSVIVFVDKQENADSL 589
           L  LL  I +D   +++FV+ ++ AD L
Sbjct: 366 LKMLLQRIMRDGDKILIFVETKKGADFL 393


>gnl|CDD|238167 cd00268, DEADc, DEAD-box helicases. A diverse family of proteins
           involved in ATP-dependent RNA unwinding, needed in a
           variety of cellular processes including splicing,
           ribosome biogenesis and RNA degradation. The name
           derives from the sequence of the Walker B motif (motif
           II). This domain contains the ATP-binding region.
          Length = 203

 Score =  295 bits (757), Expect = 4e-97
 Identities = 103/204 (50%), Positives = 146/204 (71%), Gaps = 6/204 (2%)

Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
           G+S ++L  +    +EKPTPIQA+AIP ++SGRD+IG A+TGSGKT AF++P+L  +   
Sbjct: 5   GLSPELLRGIYALGFEKPTPIQARAIPPLLSGRDVIGQAQTGSGKTAAFLIPILEKLDPS 64

Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
           P     DGP A+I++PTREL +QI + A+K  K   L+VV +YGGT I +QI +LKRG  
Sbjct: 65  PKK---DGPQALILAPTRELALQIAEVARKLGKHTNLKVVVIYGGTSIDKQIRKLKRGPH 121

Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTV 511
           I+V TPGR++D+L    G++ +L +V Y+VLDEADRM DMGFE Q+  I+  +  DRQT+
Sbjct: 122 IVVATPGRLLDLL--ERGKL-DLSKVKYLVLDEADRMLDMGFEDQIREILKLLPKDRQTL 178

Query: 512 MFSATFPRQMEALARRILNKPIEI 535
           +FSAT P+++  LAR+ L  P+ I
Sbjct: 179 LFSATMPKEVRDLARKFLRNPVRI 202


>gnl|CDD|223587 COG0513, SrmB, Superfamily II DNA and RNA helicases [DNA
           replication, recombination, and repair / Transcription /
           Translation, ribosomal structure and biogenesis].
          Length = 513

 Score =  275 bits (706), Expect = 2e-85
 Identities = 126/296 (42%), Positives = 180/296 (60%), Gaps = 11/296 (3%)

Query: 298 RMTPEEVEKYKEELEGIRVKGKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAI 357
               +     K +        +G  +    +A  G+S ++L ALK   +E+PTPIQ  AI
Sbjct: 1   LAREDYDRFVKLKSAHNVALSRGEEKTPPEFASLGLSPELLQALKDLGFEEPTPIQLAAI 60

Query: 358 PAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGK 417
           P I++GRD++G A+TG+GKT AF+LPLL+ IL      E     A+I++PTREL +QI +
Sbjct: 61  PLILAGRDVLGQAQTGTGKTAAFLLPLLQKILKSV---ERKYVSALILAPTRELAVQIAE 117

Query: 418 EAKKFTKSL-GLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRR 476
           E +K  K+L GLRV  VYGG  I +QI  LKRG +I+V TPGR++D++        +L  
Sbjct: 118 ELRKLGKNLGGLRVAVVYGGVSIRKQIEALKRGVDIVVATPGRLLDLIKRGKL---DLSG 174

Query: 477 VTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQ 536
           V  +VLDEADRM DMGF   + +I+  + PDRQT++FSAT P  +  LARR LN P+EI+
Sbjct: 175 VETLVLDEADRMLDMGFIDDIEKILKALPPDRQTLLFSATMPDDIRELARRYLNDPVEIE 234

Query: 537 VGGRSVVC--KEVEQHVI-VLDEEQKMLKLLELLGIYQDQGSVIVFVDKQENADSL 589
           V    +    K+++Q  + V  EE+K+  LL+LL    D+G VIVFV  +   + L
Sbjct: 235 VSVEKLERTLKKIKQFYLEVESEEEKLELLLKLLKDE-DEGRVIVFVRTKRLVEEL 289


>gnl|CDD|215832 pfam00270, DEAD, DEAD/DEAH box helicase.  Members of this family
           include the DEAD and DEAH box helicases. Helicases are
           involved in unwinding nucleic acids. The DEAD box
           helicases are involved in various aspects of RNA
           metabolism, including nuclear transcription, pre mRNA
           splicing, ribosome biogenesis, nucleocytoplasmic
           transport, translation, RNA decay and organellar gene
           expression.
          Length = 169

 Score =  212 bits (542), Expect = 7e-66
 Identities = 87/176 (49%), Positives = 123/176 (69%), Gaps = 8/176 (4%)

Query: 350 TPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTR 409
           TPIQAQAIPAI+SG+D++  A TGSGKT+AF+LP+L+ +L +       GP A++++PTR
Sbjct: 1   TPIQAQAIPAILSGKDVLVQAPTGSGKTLAFLLPILQALLPKK-----GGPQALVLAPTR 55

Query: 410 ELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRG-AEIIVCTPGRMIDMLAANS 468
           EL  QI +E KK  K LGLRV  + GGT + EQ  +LK+G A+I+V TPGR++D+L    
Sbjct: 56  ELAEQIYEELKKLFKILGLRVALLTGGTSLKEQARKLKKGKADILVGTPGRLLDLL--RR 113

Query: 469 GRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEAL 524
           G++  L+ +  +VLDEA R+ DMGF   +  I+  + PDRQ ++ SAT PR +E L
Sbjct: 114 GKLKLLKNLKLLVLDEAHRLLDMGFGDDLEEILSRLPPDRQILLLSATLPRNLEDL 169


>gnl|CDD|236977 PRK11776, PRK11776, ATP-dependent RNA helicase DbpA; Provisional.
          Length = 460

 Score =  208 bits (533), Expect = 8e-61
 Identities = 97/261 (37%), Positives = 151/261 (57%), Gaps = 23/261 (8%)

Query: 338 LDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEET 397
           L  L +  Y + TPIQAQ++PAI++G+D+I  AKTGSGKT AF L LL+ +         
Sbjct: 16  LANLNELGYTEMTPIQAQSLPAILAGKDVIAQAKTGSGKTAAFGLGLLQKL--------- 66

Query: 398 D----GPMAIIMSPTRELCMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISELKRGAEI 452
           D       A+++ PTREL  Q+ KE ++  + +  ++V+ + GG  +  QI  L+ GA I
Sbjct: 67  DVKRFRVQALVLCPTRELADQVAKEIRRLARFIPNIKVLTLCGGVPMGPQIDSLEHGAHI 126

Query: 453 IVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVM 512
           IV TPGR++D L   +    +L  +  +VLDEADRM DMGF+  +  II      RQT++
Sbjct: 127 IVGTPGRILDHLRKGT---LDLDALNTLVLDEADRMLDMGFQDAIDAIIRQAPARRQTLL 183

Query: 513 FSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQD 572
           FSAT+P  + A+++R    P+E++V     +   +EQ    +  ++++  L  LL  +Q 
Sbjct: 184 FSATYPEGIAAISQRFQRDPVEVKVESTHDLPA-IEQRFYEVSPDERLPALQRLLLHHQP 242

Query: 573 QGSVIVF----VDKQENADSL 589
           + S +VF     + QE AD+L
Sbjct: 243 E-SCVVFCNTKKECQEVADAL 262


>gnl|CDD|236722 PRK10590, PRK10590, ATP-dependent RNA helicase RhlE; Provisional.
          Length = 456

 Score =  208 bits (530), Expect = 1e-60
 Identities = 107/259 (41%), Positives = 171/259 (66%), Gaps = 5/259 (1%)

Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
           G+S  IL A+ +Q Y +PTPIQ QAIPA++ GRDL+  A+TG+GKT  F LPLL+H++ +
Sbjct: 7   GLSPDILRAVAEQGYREPTPIQQQAIPAVLEGRDLMASAQTGTGKTAGFTLPLLQHLITR 66

Query: 392 PPLEETDGPM-AIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGA 450
            P  +   P+ A+I++PTREL  QIG+  + ++K L +R + V+GG  I+ Q+ +L+ G 
Sbjct: 67  QPHAKGRRPVRALILTPTRELAAQIGENVRDYSKYLNIRSLVVFGGVSINPQMMKLRGGV 126

Query: 451 EIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQT 510
           +++V TPGR++D+   N+     L +V  +VLDEADRM DMGF   + R++  +   RQ 
Sbjct: 127 DVLVATPGRLLDLEHQNA---VKLDQVEILVLDEADRMLDMGFIHDIRRVLAKLPAKRQN 183

Query: 511 VMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIY 570
           ++FSATF   ++ALA ++L+ P+EI+V  R+   ++V QHV  +D+++K   L +++G  
Sbjct: 184 LLFSATFSDDIKALAEKLLHNPLEIEVARRNTASEQVTQHVHFVDKKRKRELLSQMIGKG 243

Query: 571 QDQGSVIVFVDKQENADSL 589
             Q  V+VF   +  A+ L
Sbjct: 244 NWQ-QVLVFTRTKHGANHL 261


>gnl|CDD|215103 PLN00206, PLN00206, DEAD-box ATP-dependent RNA helicase;
           Provisional.
          Length = 518

 Score =  209 bits (533), Expect = 2e-60
 Identities = 125/353 (35%), Positives = 193/353 (54%), Gaps = 18/353 (5%)

Query: 247 EYSSEEEQEDLTS---TAANLASKQKKELSKVDHSTIEYLPFRKD-FYVEVPEI-ARMTP 301
           EY  +E  +D+ S     A L    K  ++ V     + LP   + FYV  P   + ++ 
Sbjct: 39  EYICDETDDDICSLECKQALLRRVAKSRVA-VGAPKPKRLPATDECFYVRDPGSTSGLSS 97

Query: 302 EEVEKYKEELEGIRVKGKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIM 361
            + E  + +LE I VKG+  P PI +++ CG+  K+L  L+   YE PTPIQ QAIPA +
Sbjct: 98  SQAELLRRKLE-IHVKGEAVPPPILSFSSCGLPPKLLLNLETAGYEFPTPIQMQAIPAAL 156

Query: 362 SGRDLIGIAKTGSGKTVAFVLPLLRHI----LDQPPLEETDGPMAIIMSPTRELCMQIGK 417
           SGR L+  A TGSGKT +F++P++          P   E   P+A++++PTRELC+Q+  
Sbjct: 157 SGRSLLVSADTGSGKTASFLVPIISRCCTIRSGHPS--EQRNPLAMVLTPTRELCVQVED 214

Query: 418 EAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRV 477
           +AK   K L  +   V GG  + +Q+  +++G E+IV TPGR+ID+L+ +      L  V
Sbjct: 215 QAKVLGKGLPFKTALVVGGDAMPQQLYRIQQGVELIVGTPGRLIDLLSKHD---IELDNV 271

Query: 478 TYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQV 537
           + +VLDE D M + GF  QVM+I   +    Q ++FSAT   ++E  A  +    I I +
Sbjct: 272 SVLVLDEVDCMLERGFRDQVMQIFQAL-SQPQVLLFSATVSPEVEKFASSLAKDIILISI 330

Query: 538 GGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQD-QGSVIVFVDKQENADSL 589
           G  +   K V+Q  I ++ +QK  KL ++L   Q  +   +VFV  +  AD L
Sbjct: 331 GNPNRPNKAVKQLAIWVETKQKKQKLFDILKSKQHFKPPAVVFVSSRLGADLL 383


>gnl|CDD|214692 smart00487, DEXDc, DEAD-like helicases superfamily. 
          Length = 201

 Score =  191 bits (488), Expect = 1e-57
 Identities = 80/212 (37%), Positives = 122/212 (57%), Gaps = 13/212 (6%)

Query: 341 LKKQNYEKPTPIQAQAIPAIMSG-RDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDG 399
           ++K  +E   P Q +AI A++SG RD+I  A TGSGKT+A +LP L  +          G
Sbjct: 1   IEKFGFEPLRPYQKEAIEALLSGLRDVILAAPTGSGKTLAALLPALEALKRGK------G 54

Query: 400 PMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRG-AEIIVCTPG 458
              +++ PTREL  Q  +E KK   SLGL+VV +YGG    EQ+ +L+ G  +I+V TPG
Sbjct: 55  GRVLVLVPTRELAEQWAEELKKLGPSLGLKVVGLYGGDSKREQLRKLESGKTDILVTTPG 114

Query: 459 RMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATFP 518
           R++D+L  +     +L  V  ++LDEA R+ D GF  Q+ +++  +  + Q ++ SAT P
Sbjct: 115 RLLDLLENDK---LSLSNVDLVILDEAHRLLDGGFGDQLEKLLKLLPKNVQLLLLSATPP 171

Query: 519 RQMEALARRILNKPIEIQVGGRSVVCKEVEQH 550
            ++E L    LN P+ I VG      + +EQ 
Sbjct: 172 EEIENLLELFLNDPVFIDVGFT--PLEPIEQF 201


>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
          Length = 434

 Score =  172 bits (439), Expect = 9e-48
 Identities = 93/266 (34%), Positives = 144/266 (54%), Gaps = 7/266 (2%)

Query: 326 KTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
            T+++  + + +L+AL+ + Y +PT IQA+AIP  + GRD++G A TG+GKT AF+LP L
Sbjct: 1   TTFSELELDESLLEALQDKGYTRPTAIQAEAIPPALDGRDVLGSAPTGTGKTAAFLLPAL 60

Query: 386 RHILDQPPLEETDGPMAI-IMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQIS 444
           +H+LD P      GP  I I++PTREL MQ+  +A++  K   L +  + GG        
Sbjct: 61  QHLLDFP--RRKSGPPRILILTPTRELAMQVADQARELAKHTHLDIATITGGVAYMNHAE 118

Query: 445 ELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV 504
                 +I+V TPGR++  +   +    + R V  ++LDEADRM DMGF   +  I    
Sbjct: 119 VFSENQDIVVATPGRLLQYIKEEN---FDCRAVETLILDEADRMLDMGFAQDIETIAAET 175

Query: 505 RPDRQTVMFSATFP-RQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKL 563
           R  +QT++FSAT     ++  A R+LN P+E++        K++ Q     D+ +    L
Sbjct: 176 RWRKQTLLFSATLEGDAVQDFAERLLNDPVEVEAEPSRRERKKIHQWYYRADDLEHKTAL 235

Query: 564 LELLGIYQDQGSVIVFVDKQENADSL 589
           L  L    +    IVFV  +E    L
Sbjct: 236 LCHLLKQPEVTRSIVFVRTRERVHEL 261


>gnl|CDD|235314 PRK04837, PRK04837, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 423

 Score =  171 bits (435), Expect = 3e-47
 Identities = 85/253 (33%), Positives = 142/253 (56%), Gaps = 21/253 (8%)

Query: 326 KTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
           + ++   +  ++++AL+K+ +   TPIQA A+P  ++GRD+ G A+TG+GKT+AF+    
Sbjct: 8   QKFSDFALHPQVVEALEKKGFHNCTPIQALALPLTLAGRDVAGQAQTGTGKTMAFLTATF 67

Query: 386 RHILDQPPLE--ETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQI 443
            ++L  P  E  + + P A+IM+PTREL +QI  +A+   ++ GL++   YGG G  +Q+
Sbjct: 68  HYLLSHPAPEDRKVNQPRALIMAPTRELAVQIHADAEPLAQATGLKLGLAYGGDGYDKQL 127

Query: 444 SELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDN 503
             L+ G +I++ T GR+ID    N     NL  +  +VLDEADRMFD+GF       I +
Sbjct: 128 KVLESGVDILIGTTGRLIDYAKQN---HINLGAIQVVVLDEADRMFDLGF-------IKD 177

Query: 504 VR------PD---RQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVL 554
           +R      P    R  ++FSAT   ++  LA   +N P  ++V         +++ +   
Sbjct: 178 IRWLFRRMPPANQRLNMLFSATLSYRVRELAFEHMNNPEYVEVEPEQKTGHRIKEELFYP 237

Query: 555 DEEQKMLKLLELL 567
             E+KM  L  L+
Sbjct: 238 SNEEKMRLLQTLI 250


>gnl|CDD|236941 PRK11634, PRK11634, ATP-dependent RNA helicase DeaD; Provisional.
          Length = 629

 Score =  166 bits (423), Expect = 3e-44
 Identities = 85/212 (40%), Positives = 131/212 (61%), Gaps = 9/212 (4%)

Query: 327 TWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLR 386
           T+A  G+   IL+AL    YEKP+PIQA+ IP +++GRD++G+A+TGSGKT AF LPLL 
Sbjct: 7   TFADLGLKAPILEALNDLGYEKPSPIQAECIPHLLNGRDVLGMAQTGSGKTAAFSLPLLH 66

Query: 387 HILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISE 445
           ++       E   P  ++++PTREL +Q+ +    F+K + G+ VV +YGG     Q+  
Sbjct: 67  NL-----DPELKAPQILVLAPTRELAVQVAEAMTDFSKHMRGVNVVALYGGQRYDVQLRA 121

Query: 446 LKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR 505
           L++G +I+V TPGR++D L   +  ++ L     +VLDEAD M  MGF   V  I+  + 
Sbjct: 122 LRQGPQIVVGTPGRLLDHLKRGTLDLSKLSG---LVLDEADEMLRMGFIEDVETIMAQIP 178

Query: 506 PDRQTVMFSATFPRQMEALARRILNKPIEIQV 537
              QT +FSAT P  +  + RR + +P E+++
Sbjct: 179 EGHQTALFSATMPEAIRRITRRFMKEPQEVRI 210


>gnl|CDD|234938 PRK01297, PRK01297, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 475

 Score =  162 bits (412), Expect = 8e-44
 Identities = 96/240 (40%), Positives = 140/240 (58%), Gaps = 9/240 (3%)

Query: 350 TPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEE--TDGPMAIIMSP 407
           TPIQAQ +   ++G D IG A+TG+GKT AF++ ++  +L  PP +E     P A+I++P
Sbjct: 111 TPIQAQVLGYTLAGHDAIGRAQTGTGKTAAFLISIINQLLQTPPPKERYMGEPRALIIAP 170

Query: 408 TRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELK-RGAEIIVCTPGRMIDMLAA 466
           TREL +QI K+A   TK  GL V+   GG    +Q+ +L+ R  +I+V TPGR++D    
Sbjct: 171 TRELVVQIAKDAAALTKYTGLNVMTFVGGMDFDKQLKQLEARFCDILVATPGRLLDF--- 227

Query: 467 NSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRP--DRQTVMFSATFPRQMEAL 524
           N     +L  V  +VLDEADRM DMGF PQV +II       +RQT++FSATF   +  L
Sbjct: 228 NQRGEVHLDMVEVMVLDEADRMLDMGFIPQVRQIIRQTPRKEERQTLLFSATFTDDVMNL 287

Query: 525 ARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQDQGSVIVFVDKQE 584
           A++    P  +++   +V    VEQHV  +    K  KLL  L        V+VF ++++
Sbjct: 288 AKQWTTDPAIVEIEPENVASDTVEQHVYAVAGSDKY-KLLYNLVTQNPWERVMVFANRKD 346


>gnl|CDD|235307 PRK04537, PRK04537, ATP-dependent RNA helicase RhlB; Provisional.
          Length = 572

 Score =  158 bits (401), Expect = 1e-41
 Identities = 89/235 (37%), Positives = 141/235 (60%), Gaps = 6/235 (2%)

Query: 337 ILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPL-- 394
           +L  L+   + + TPIQA  +P  + G D+ G A+TG+GKT+AF++ ++  +L +P L  
Sbjct: 20  LLAGLESAGFTRCTPIQALTLPVALPGGDVAGQAQTGTGKTLAFLVAVMNRLLSRPALAD 79

Query: 395 EETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIV 454
            + + P A+I++PTREL +QI K+A KF   LGLR   VYGG    +Q   L++G ++I+
Sbjct: 80  RKPEDPRALILAPTRELAIQIHKDAVKFGADLGLRFALVYGGVDYDKQRELLQQGVDVII 139

Query: 455 CTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNV--RPDRQTVM 512
            TPGR+ID +  +  +V +L      VLDEADRMFD+GF   +  ++  +  R  RQT++
Sbjct: 140 ATPGRLIDYVKQH--KVVSLHACEICVLDEADRMFDLGFIKDIRFLLRRMPERGTRQTLL 197

Query: 513 FSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLLELL 567
           FSAT   ++  LA   +N+P ++ V   ++    V Q +    +E+K   LL LL
Sbjct: 198 FSATLSHRVLELAYEHMNEPEKLVVETETITAARVRQRIYFPADEEKQTLLLGLL 252


>gnl|CDD|238005 cd00046, DEXDc, DEAD-like helicases superfamily. A diverse family
           of proteins involved in ATP-dependent RNA or DNA
           unwinding. This domain contains the ATP-binding region.
          Length = 144

 Score =  146 bits (371), Expect = 2e-41
 Identities = 52/154 (33%), Positives = 87/154 (56%), Gaps = 10/154 (6%)

Query: 364 RDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFT 423
           RD++  A TGSGKT+A +LP+L  +          G   ++++PTREL  Q+ +  K+  
Sbjct: 1   RDVLLAAPTGSGKTLAALLPILELLDSL------KGGQVLVLAPTRELANQVAERLKELF 54

Query: 424 KSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLD 483
              G++V  + GGT I +Q   L    +I+V TPGR++D L        +L+++  ++LD
Sbjct: 55  G-EGIKVGYLIGGTSIKQQEKLLSGKTDIVVGTPGRLLDELERLK---LSLKKLDLLILD 110

Query: 484 EADRMFDMGFEPQVMRIIDNVRPDRQTVMFSATF 517
           EA R+ + GF    ++I+  +  DRQ ++ SAT 
Sbjct: 111 EAHRLLNQGFGLLGLKILLKLPKDRQVLLLSATP 144


>gnl|CDD|185609 PTZ00424, PTZ00424, helicase 45; Provisional.
          Length = 401

 Score =  131 bits (332), Expect = 2e-33
 Identities = 78/236 (33%), Positives = 123/236 (52%), Gaps = 9/236 (3%)

Query: 332 GVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
            +++ +L  +    +EKP+ IQ + I  I+ G D IG A++G+GKT  FV+  L+ I   
Sbjct: 34  KLNEDLLRGIYSYGFEKPSAIQQRGIKPILDGYDTIGQAQSGTGKTATFVIAALQLI--D 91

Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
             L       A+I++PTREL  QI K        L +R     GGT + + I++LK G  
Sbjct: 92  YDLNACQ---ALILAPTRELAQQIQKVVLALGDYLKVRCHACVGGTVVRDDINKLKAGVH 148

Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTV 511
           ++V TPGR+ DM+     RV +L+     +LDEAD M   GF+ Q+  +   + PD Q  
Sbjct: 149 MVVGTPGRVYDMIDKRHLRVDDLK---LFILDEADEMLSRGFKGQIYDVFKKLPPDVQVA 205

Query: 512 MFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLD-EEQKMLKLLEL 566
           +FSAT P ++  L  + +  P  I V    +  + + Q  + ++ EE K   L +L
Sbjct: 206 LFSATMPNEILELTTKFMRDPKRILVKKDELTLEGIRQFYVAVEKEEWKFDTLCDL 261


>gnl|CDD|224122 COG1201, Lhr, Lhr-like helicases [General function prediction
           only].
          Length = 814

 Score = 76.1 bits (188), Expect = 2e-14
 Identities = 49/142 (34%), Positives = 76/142 (53%), Gaps = 1/142 (0%)

Query: 343 KQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMA 402
           K+ +   TP Q  AIP I SG +++ IA TGSGKT A  LP++  +L     +  DG  A
Sbjct: 17  KRKFTSLTPPQRYAIPEIHSGENVLIIAPTGSGKTEAAFLPVINELLSLGKGKLEDGIYA 76

Query: 403 IIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMID 462
           + +SP + L   I +  ++  + LG+ V   +G T  SE+   LK    I++ TP  +  
Sbjct: 77  LYISPLKALNNDIRRRLEEPLRELGIEVAVRHGDTPQSEKQKMLKNPPHILITTPESLAI 136

Query: 463 MLAANSGRVTNLRRVTYIVLDE 484
           +L +   R   LR V Y+++DE
Sbjct: 137 LLNSPKFR-ELLRDVRYVIVDE 157


>gnl|CDD|224126 COG1205, COG1205, Distinct helicase family with a unique C-terminal
           domain including a metal-binding cysteine cluster
           [General function prediction only].
          Length = 851

 Score = 66.7 bits (163), Expect = 2e-11
 Identities = 43/172 (25%), Positives = 68/172 (39%), Gaps = 17/172 (9%)

Query: 319 KGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTV 378
            G              + +  AL K   E+    Q  A+  I  GR+++    TGSGKT 
Sbjct: 45  PGKTSEFPELRD----ESLKSALVKAGIERLYSHQVDALRLIREGRNVVVTTGTGSGKTE 100

Query: 379 AFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVC-VYGGT 437
           +F+LP+L H+L  P         A+++ PT  L     +  ++    L  +V    Y G 
Sbjct: 101 SFLLPILDHLLRDP------SARALLLYPTNALANDQAERLRELISDLPGKVTFGRYTGD 154

Query: 438 GISEQISELKRGAEIIVCTPGRMIDMLAANSGRVTN----LRRVTYIVLDEA 485
              E+   + R    I+ T   M+  L             LR + Y+V+DE 
Sbjct: 155 TPPEERRAIIRNPPDILLTNPDMLHYLLLR--NHDAWLWLLRNLKYLVVDEL 204


>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family.  This model
           represents a subfamily of RNA splicing factors including
           the Pad-1 protein (N. crassa), CAPER (M. musculus) and
           CC1.3 (H.sapiens). These proteins are characterized by
           an N-terminal arginine-rich, low complexity domain
           followed by three (or in the case of 4 H. sapiens
           paralogs, two) RNA recognition domains (rrm: pfam00706).
           These splicing factors are closely related to the U2AF
           splicing factor family (TIGR01642). A homologous gene
           from Plasmodium falciparum was identified in the course
           of the analysis of that genome at TIGR and was included
           in the seed.
          Length = 457

 Score = 64.9 bits (158), Expect = 5e-11
 Identities = 42/145 (28%), Positives = 61/145 (42%), Gaps = 14/145 (9%)

Query: 6   RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLER---RKEKSRGSK 62
           R R R R       R    R DK R+R RR      RS R RDRD  R    + +SR   
Sbjct: 1   RYRDRERGRL----RNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPN 56

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
           R  R R     +   ++     R  +E     +  + D+ V   +L L+ ++ RD  E +
Sbjct: 57  RYYRPRGDRSYRRDDRRS---GRNTKEPLT--EAERDDRTVFVLQLALKARE-RDLYEFF 110

Query: 123 RAERKKKDIETIKKDIKSNLSSGLG 147
               K +D++ I KD  S  S G+ 
Sbjct: 111 SKVGKVRDVQCI-KDRNSRRSKGVA 134



 Score = 37.2 bits (86), Expect = 0.025
 Identities = 20/47 (42%), Positives = 27/47 (57%), Gaps = 1/47 (2%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
          R RR RSRSRSP+  + RP+  R  +  DRR    + E  +E +RD 
Sbjct: 44 RGRRGRSRSRSPN-RYYRPRGDRSYRRDDRRSGRNTKEPLTEAERDD 89


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 58.8 bits (142), Expect = 5e-09
 Identities = 42/152 (27%), Positives = 64/152 (42%), Gaps = 18/152 (11%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERR-SERDRDRDLERRKEKSRGS 61
              R++SR R    S +RP+   RD+ R R R  RS ER   E  R RD  R   +S  S
Sbjct: 6   DREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRSPRS 65

Query: 62  KRRSRSREAERSKDHSKKEEKD-KREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
            R S  R   RS+D  ++  +  +  ++      D S  +            Q R+D  +
Sbjct: 66  LRYSSVR---RSRDRPRRRSRSVRSIEQHRRRLRDRSPSN------------QWRKDDKK 110

Query: 121 RWRAERKKKDIETIKKD-IKSNLSSGLGGSAP 151
           R   + K    E +  D  K++    + G+AP
Sbjct: 111 RSLWDIKPPGYELVTADQAKASQVFSVPGTAP 142



 Score = 50.3 bits (120), Expect = 2e-06
 Identities = 30/118 (25%), Positives = 49/118 (41%), Gaps = 5/118 (4%)

Query: 26  RDKDRDR-RRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR---EAERSKDHSKKEE 81
           RD++ DR R +SR  +R    +R R   R + + R   RRSR R   E  R +D  + + 
Sbjct: 1   RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDS 60

Query: 82  KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
           +  R            +  +    +   +E  +RR R      + +K D +    DIK
Sbjct: 61  RSPRSLRYSSVRRSRDRPRRR-SRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSLWDIK 117


>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777).  This is
           a family of eukaryotic proteins of unknown function.
           Some of the proteins in this family are putative nucleic
           acid binding proteins.
          Length = 158

 Score = 54.5 bits (131), Expect = 1e-08
 Identities = 39/109 (35%), Positives = 53/109 (48%), Gaps = 4/109 (3%)

Query: 9   SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHER---RSERDRDRDLERRKEKSRGSKRRS 65
            RSRS SP   R +   R +DR  RRR RS  R   R  R R R   R +      + RS
Sbjct: 2   GRSRSRSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRS 61

Query: 66  RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
           RSR   R +D  ++ +KD RE ++ E      + D E ++   E+EM K
Sbjct: 62  RSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLEGKSDE-EVEMMK 109



 Score = 48.3 bits (115), Expect = 1e-06
 Identities = 28/89 (31%), Positives = 41/89 (46%), Gaps = 3/89 (3%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R RR+R R RS S    R    RR + R   R  RS   R  R R R   RR+++ R   
Sbjct: 21  RDRRERRRERSRSRERDR---RRRSRSRSPHRSRRSRSPRRHRSRSRSPSRRRDRKRERD 77

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEE 91
           + +R  +    +   K+E+ + +  EE E
Sbjct: 78  KDAREPKKRERQKLIKEEDLEGKSDEEVE 106



 Score = 45.6 bits (108), Expect = 1e-05
 Identities = 37/88 (42%), Positives = 48/88 (54%), Gaps = 3/88 (3%)

Query: 5  RRKRSRSRSPSPSHKRPKESR-RDKDRDRRRRSRSH-ERRSERDRDRDLERRKEKSRGSK 62
          RR R R RS S   +  +  R R ++RDRRRRSRS    RS R R     R + +S  S+
Sbjct: 10 RRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP-SR 68

Query: 63 RRSRSREAERSKDHSKKEEKDKREKEEE 90
          RR R RE ++     KK E+ K  KEE+
Sbjct: 69 RRDRKRERDKDAREPKKRERQKLIKEED 96


>gnl|CDD|237171 PRK12678, PRK12678, transcription termination factor Rho;
           Provisional.
          Length = 672

 Score = 48.7 bits (117), Expect = 6e-06
 Identities = 24/126 (19%), Positives = 47/126 (37%), Gaps = 2/126 (1%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
            +   R+ +   +   +R +  RR    DR+  +   ER    +R RD + R  + R  +
Sbjct: 152 PATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGDDRDRRDRREQ 211

Query: 63  RRSRSRE--AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
              R      +      ++  +D+R+   ++   D    D +    R     ++ RDR  
Sbjct: 212 GDRREERGRRDGGDRRGRRRRRDRRDARGDDNREDRGDRDGDDGEGRGGRRGRRFRDRDR 271

Query: 121 RWRAER 126
           R R   
Sbjct: 272 RGRRGG 277



 Score = 48.0 bits (115), Expect = 1e-05
 Identities = 18/89 (20%), Positives = 33/89 (37%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R  R  +  R  +       E    + R         E R ER R  D E R+ ++   +
Sbjct: 130 RRERGEAARRGAARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGE 189

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEE 91
           R  R        D  +++ +++ ++ EE 
Sbjct: 190 RGRREERGRDGDDRDRRDRREQGDRREER 218



 Score = 44.5 bits (106), Expect = 2e-04
 Identities = 15/90 (16%), Positives = 33/90 (36%), Gaps = 1/90 (1%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRD-LERRKEKSRGS 61
            + R+ +  ++     +   E+R D         R   RR     DR     R E+ R  
Sbjct: 135 EAARRGAARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRRE 194

Query: 62  KRRSRSREAERSKDHSKKEEKDKREKEEEE 91
           +R     + +R     + + +++R + +  
Sbjct: 195 ERGRDGDDRDRRDRREQGDRREERGRRDGG 224



 Score = 40.7 bits (96), Expect = 0.002
 Identities = 18/88 (20%), Positives = 35/88 (39%), Gaps = 3/88 (3%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
             +      +  + +     E   +++RD RRR    E R       +  RR+E+ R   
Sbjct: 142 ARKAGEGGEQPATEARADAAERTEEEERDERRRRGDREDRQAEAERGERGRREERGRDGD 201

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEE 90
            R R    E+     ++EE+ +R+  + 
Sbjct: 202 DRDRRDRREQG---DRREERGRRDGGDR 226



 Score = 40.3 bits (95), Expect = 0.003
 Identities = 22/87 (25%), Positives = 32/87 (36%), Gaps = 2/87 (2%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R  +   R           +  RR +DR   R   + E R +RD D    R   + R  +
Sbjct: 208 RREQGDRREERGRRDGGDRRGRRRRRDRRDARGDDNREDRGDRDGDDGEGRGGRRGR--R 265

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEE 89
            R R R   R  D   + E + RE + 
Sbjct: 266 FRDRDRRGRRGGDGGNEREPELREDDV 292



 Score = 38.7 bits (91), Expect = 0.010
 Identities = 17/82 (20%), Positives = 34/82 (41%), Gaps = 1/82 (1%)

Query: 5   RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
             ++  + + + + +R +E  RD+ R R  R        ER      E R        RR
Sbjct: 148 GGEQPATEARADAAERTEEEERDERRRRGDREDRQAEA-ERGERGRREERGRDGDDRDRR 206

Query: 65  SRSREAERSKDHSKKEEKDKRE 86
            R  + +R ++  +++  D+R 
Sbjct: 207 DRREQGDRREERGRRDGGDRRG 228



 Score = 34.9 bits (81), Expect = 0.13
 Identities = 14/92 (15%), Positives = 36/92 (39%), Gaps = 1/92 (1%)

Query: 2   VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
            ++    +   + + + +        + R+RR R  +  R + R      E+   ++R  
Sbjct: 100 AKAEAAPAARAAAAAAAEAASAPEAAQARERRERGEAARRGAARKAGEGGEQPATEARAD 159

Query: 62  KRRSRSREAERSKDHSKKEEKDKREKEEEEAA 93
               R+ E ER +   + + +D++ + E    
Sbjct: 160 -AAERTEEEERDERRRRGDREDRQAEAERGER 190


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 48.2 bits (114), Expect = 1e-05
 Identities = 57/299 (19%), Positives = 124/299 (41%), Gaps = 32/299 (10%)

Query: 4    SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
            +++K   ++  + + K+  E+++ ++     +     +++E  +  D  ++ E+ + +  
Sbjct: 1495 AKKKADEAKKAAEAKKKADEAKKAEEA----KKADEAKKAEEAKKADEAKKAEEKKKADE 1550

Query: 64   RSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEM----------- 112
              ++ E +++++  K EE  K+ +E++  A   ++  K+ E  R+E  M           
Sbjct: 1551 LKKAEELKKAEEKKKAEEA-KKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKA 1609

Query: 113  -QKRRDRIERWRAERKKKDIETIKK--DIKSNLSSGLGGSAPMKKWNLEDD-------SD 162
             + ++    + +AE  KK  E  KK   +K   +     +  +KK   E+          
Sbjct: 1610 EEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKK 1669

Query: 163  EDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVI 222
             +E+  K E  K AEED       ++   EE +K  +       + K A+   K      
Sbjct: 1670 AEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEEN- 1728

Query: 223  VTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVDHSTIE 281
                 K   E+AK E  E+ +   E   +EE++   +       K+ +E+ K   + IE
Sbjct: 1729 -----KIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIE 1782



 Score = 45.1 bits (106), Expect = 1e-04
 Identities = 57/295 (19%), Positives = 125/295 (42%), Gaps = 8/295 (2%)

Query: 19   KRPKESRRDKDRDRRRRSRSHE--RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
            ++ +E+R+ +D  R   +R  E  R++E  R  +  ++ E +R ++   ++ E  +++D 
Sbjct: 1140 RKAEEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDA 1199

Query: 77   SKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKK 136
             K E   K E+E +      ++  K+ EA +   E +K  +  ++   ER  ++I   ++
Sbjct: 1200 RKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEE 1259

Query: 137  DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRK 196
               ++ +         +    ++    +E    DE  K   E+    D   +   EE +K
Sbjct: 1260 ARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKKA--EEKKKADEAKKKA-EEAKK 1316

Query: 197  VNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQED 256
             ++         K AD+  K A          K+  +A  +  E  ++  E + ++++E 
Sbjct: 1317 ADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEA 1376

Query: 257  LTSTAANLASKQKKELSKVDHSTIEYLPFRKDF-YVEVPEIARMTPEEVEKYKEE 310
                 A+ A K+ +E  K D +  +    +K    ++    A+   +E +K  EE
Sbjct: 1377 --KKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEE 1429



 Score = 45.1 bits (106), Expect = 1e-04
 Identities = 39/196 (19%), Positives = 89/196 (45%), Gaps = 7/196 (3%)

Query: 5    RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
            ++K  + +      K+  E  +  + + + ++    +++E D+ +  E +K +    +++
Sbjct: 1632 KKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEE--DEKK 1689

Query: 65   SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRA 124
            +     + +++  K EE  K+E EE++ A    +L K  E  +++ E  K+    ++ +A
Sbjct: 1690 AAEALKKEAEEAKKAEELKKKEAEEKKKA---EELKKAEEENKIKAEEAKKEAEEDKKKA 1746

Query: 125  ERKKKDIETIKK--DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDP 182
            E  KKD E  KK   +K             K+  +E++ DE++   + E  K  ++  D 
Sbjct: 1747 EEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIFDN 1806

Query: 183  LDAFMQGVHEEMRKVN 198
                ++G  E    +N
Sbjct: 1807 FANIIEGGKEGNLVIN 1822



 Score = 44.4 bits (104), Expect = 2e-04
 Identities = 55/280 (19%), Positives = 109/280 (38%), Gaps = 17/280 (6%)

Query: 6    RKRSRSRSPSPSHKRPKESRR-DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
            +K    +    + K  +E+++ +++R+     +  E R      R    + E++R +   
Sbjct: 1224 KKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADEL 1283

Query: 65   SRSREAERSKDHSKKEEKDKRE---------KEEEEAAFDPSKLDKEVEATRLELEMQKR 115
             ++ E +++ +  K EEK K +         K+ +EA     +  K+ +A + + E  K+
Sbjct: 1284 KKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKK 1343

Query: 116  RDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKT 175
                 +  AE    + E  ++  ++            KK         +E    DE  K 
Sbjct: 1344 AAEAAKAEAEAAADEAEAAEEKAEAAEKK----KEEAKKKADAAKKKAEEKKKADEAKKK 1399

Query: 176  AEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKA- 234
            AEED    D   +      +K    A     + K AD   K A         KK  E+A 
Sbjct: 1400 AEEDKKKADELKKA--AAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAK 1457

Query: 235  KGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSK 274
            K E  ++  +  + + E +++   +  A+ A K+ +E  K
Sbjct: 1458 KAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKK 1497



 Score = 43.2 bits (101), Expect = 5e-04
 Identities = 65/316 (20%), Positives = 120/316 (37%), Gaps = 20/316 (6%)

Query: 4    SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
            +++K   ++  +   K+  E+ + +       + + E ++E    +  E +K+     K+
Sbjct: 1327 AKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKK 1386

Query: 64   RSRSREAERSKDHSKKEEKDKREKEE----EEAAFDPSKLDKEVEATRLELEMQKRRDRI 119
                ++A+ +K   KK E+DK++ +E      A     +  K+ E  +   E +K+ +  
Sbjct: 1387 AEEKKKADEAK---KKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEE- 1442

Query: 120  ERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEED 179
             +   E KKK  E  K       +      A   K   E     +E    DE  K AEE 
Sbjct: 1443 AKKADEAKKKAEEAKK-------AEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEA 1495

Query: 180  IDPLDAFMQGVHEEMR--KVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGE 237
                D   +    + +  +  K      AD       +K A         KK+ E  K E
Sbjct: 1496 KKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555

Query: 238  LM---EENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVDHSTIEYLPFRKDFYVEVP 294
             +   EE +   E    EE +++    A  A K ++   +      E     K    +  
Sbjct: 1556 ELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKA 1615

Query: 295  EIARMTPEEVEKYKEE 310
            E A++  EE++K +EE
Sbjct: 1616 EEAKIKAEELKKAEEE 1631



 Score = 42.1 bits (98), Expect = 9e-04
 Identities = 51/258 (19%), Positives = 100/258 (38%), Gaps = 7/258 (2%)

Query: 19   KRPKESRRDKDRDRRRRSRSHE--RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
            K+ + +R+ ++  +    R  E  R++E  R  + ER+ E++R ++   ++   +++++ 
Sbjct: 1176 KKAEAARKAEEVRKAEELRKAEDARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEA 1235

Query: 77   SKKEEKDKREKEEEEAAFDPSKLDKEVEA----TRLELEMQKRRDRIERWRAERKKKDIE 132
             K  E + ++ EEE    +  K ++   A     +  ++ ++ R   E  +AE KKK  E
Sbjct: 1236 KKDAE-EAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADE 1294

Query: 133  TIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHE 192
              K + K         +   KK +      E+     D   K AEE     +A       
Sbjct: 1295 AKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEA 1354

Query: 193  EMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEE 252
               +         A  K  +   K A         KK  ++AK +  E+ +   E     
Sbjct: 1355 AADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAA 1414

Query: 253  EQEDLTSTAANLASKQKK 270
              +     A   A ++KK
Sbjct: 1415 AAKKKADEAKKKAEEKKK 1432



 Score = 42.1 bits (98), Expect = 0.001
 Identities = 56/276 (20%), Positives = 102/276 (36%), Gaps = 7/276 (2%)

Query: 3    RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
            ++  K+    +     K+  +  + K  + ++   + ++  E  +  D  ++K +     
Sbjct: 1285 KAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKA 1344

Query: 63   RRSRSREAERSKDHSKKEEKDKR--EKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
              +   EAE + D ++  E+     EK++EEA        K+ E  +   E +K+ +  +
Sbjct: 1345 AEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDK 1404

Query: 121  RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDI 180
            +   E KK      K D     +     +   KK   E+    DE   K E  K AEE  
Sbjct: 1405 KKADELKKAAAAKKKADEAKKKAEEKKKADEAKK-KAEEAKKADEAKKKAEEAKKAEEAK 1463

Query: 181  DPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELME 240
               +   +   +E +K  + A       K A+   K A         KK  ++AK    E
Sbjct: 1464 KKAEEAKKA--DEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKA--E 1519

Query: 241  ENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVD 276
            E +   E    EE +           K+  EL K +
Sbjct: 1520 EAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAE 1555



 Score = 37.4 bits (86), Expect = 0.024
 Identities = 44/219 (20%), Positives = 92/219 (42%), Gaps = 14/219 (6%)

Query: 3    RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
            ++   R          ++  ++   K  +  +      +++E ++ +  + +K+++   K
Sbjct: 1588 KAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKK 1647

Query: 63   RRSRSREAE-----RSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRD 117
            +    ++AE     ++ + +KK E+DK++ EE + A +  K  K  EA + E E  K+ +
Sbjct: 1648 KAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEK--KAAEALKKEAEEAKKAE 1705

Query: 118  RIERWRAERKKKDIETIK-KDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNK-----DE 171
             +++  AE KKK  E  K ++     +      A   K   E+   ++E   K      E
Sbjct: 1706 ELKKKEAEEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKE 1765

Query: 172  NGKTAEEDIDPLDAFM-QGVHEEMRKVNKPAVPTTADVK 209
              K AEE     +A + + + EE  K          D+ 
Sbjct: 1766 EEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKKIKDIF 1804


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 47.2 bits (112), Expect = 2e-05
 Identities = 39/268 (14%), Positives = 90/268 (33%), Gaps = 22/268 (8%)

Query: 7   KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR 66
           K   ++ P     + +E  +++ ++ +++ +   +   +DR    + ++E       + +
Sbjct: 91  KTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEEPKDR----KPKEEAKEKRPPKEK 146

Query: 67  SREAERSKDHSKKEEKDK---REKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWR 123
            +E E+  +  +  E++K   R + +      P K     +    E E Q++  R     
Sbjct: 147 EKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEPPEEEKQRQAARE---- 202

Query: 124 AERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPL 183
              K K  E    + +                  ED+S +    ++  +    + D  P 
Sbjct: 203 -AVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSSSSLKKPDPSPS 261

Query: 184 DAFMQ------GVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGE 237
            A  +            R   +P     A  +PA    K   +V V    +   +     
Sbjct: 262 MASPETRESSKRTETRPRTSLRPPSARPASARPAPPRVKRKEIVTVLQDAQGVGKIVSNV 321

Query: 238 LME----ENQDGLEYSSEEEQEDLTSTA 261
           ++E    E++D   +  E   +     A
Sbjct: 322 ILEGKKSEDEDDENFVVEAAAQAPDIVA 349



 Score = 37.2 bits (86), Expect = 0.027
 Identities = 27/170 (15%), Positives = 51/170 (30%), Gaps = 5/170 (2%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
             ++  ++ + P    K+ + +R            + ER  E D  +D E         +
Sbjct: 179 PKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDE 238

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
            R  S   E S+  S   +K              S    E            R       
Sbjct: 239 SRQSS---EISRRSSSSLKKPDPSPSMASPETRESSKRTETRPRTSLRPPSARPASARPA 295

Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDEN 172
               K+K+I T+ +D  +     +  +  ++    ED+ DE+        
Sbjct: 296 PPRVKRKEIVTVLQD--AQGVGKIVSNVILEGKKSEDEDDENFVVEAAAQ 343



 Score = 33.3 bits (76), Expect = 0.44
 Identities = 21/150 (14%), Positives = 52/150 (34%), Gaps = 3/150 (2%)

Query: 5   RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
              + R        KRP + +  +   +    R  E   +R+R R   R K+     K++
Sbjct: 126 EEPKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKP---PKKK 182

Query: 65  SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRA 124
             +++ E  ++  +++   +  K + E      + +KE +  +         +  E  ++
Sbjct: 183 PPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQS 242

Query: 125 ERKKKDIETIKKDIKSNLSSGLGGSAPMKK 154
               +   +  K    + S     +    K
Sbjct: 243 SEISRRSSSSLKKPDPSPSMASPETRESSK 272


>gnl|CDD|234365 TIGR03817, DECH_helic, helicase/secretion neighborhood putative
           DEAH-box helicase.  A conserved gene neighborhood widely
           spread in the Actinobacteria contains this
           uncharacterized DEAH-box family helicase encoded
           convergently towards an operon of genes for proteins
           homologous to type II secretion and pilus formation
           proteins. The context suggests that this helicase may
           play a role in conjugal transfer of DNA.
          Length = 742

 Score = 46.6 bits (111), Expect = 3e-05
 Identities = 47/175 (26%), Positives = 77/175 (44%), Gaps = 27/175 (15%)

Query: 318 GKGCPRPIKTWAQCGVSKKILDALKKQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKT 377
           G+  P P   WA       ++ AL+     +P   QA+A     +GR ++    T SGK+
Sbjct: 12  GRTAPWP--AWAH----PDVVAALEAAGIHRPWQHQARAAELAHAGRHVVVATGTASGKS 65

Query: 378 VAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV--CVYG 435
           +A+ LP+L  + D P         A+ ++PT+ L      +  +  + L LR V    Y 
Sbjct: 66  LAYQLPVLSALADDP------RATALYLAPTKAL----AADQLRAVRELTLRGVRPATYD 115

Query: 436 GTGISEQISELKRGAEIIVCTPGRMIDML----AANSGRVTN-LRRVTYIVLDEA 485
           G   +E+    +  A  ++  P    DML      +  R    LRR+ Y+V+DE 
Sbjct: 116 GDTPTEERRWAREHARYVLTNP----DMLHRGILPSHARWARFLRRLRYVVIDEC 166


>gnl|CDD|224125 COG1204, COG1204, Superfamily II helicase [General function
           prediction only].
          Length = 766

 Score = 46.2 bits (110), Expect = 4e-05
 Identities = 34/135 (25%), Positives = 60/135 (44%), Gaps = 14/135 (10%)

Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRE 410
           P Q      ++S  +++  A TGSGKT+  +L +L  +L+        G   + + P + 
Sbjct: 35  PQQEAVEKGLLSDENVLISAPTGSGKTLIALLAILSTLLE-------GGGKVVYIVPLKA 87

Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGR 470
           L  +  +E  +  + LG+RV      TG  +   E     ++IV TP ++ D L      
Sbjct: 88  LAEEKYEEFSRL-EELGIRVGIS---TGDYDLDDERLARYDVIVTTPEKL-DSLTRKRPS 142

Query: 471 VTNLRRVTYIVLDEA 485
              +  V  +V+DE 
Sbjct: 143 W--IEEVDLVVIDEI 155


>gnl|CDD|234478 TIGR04121, DEXH_lig_assoc, DEXH box helicase, DNA
           ligase-associated.  Members of this protein family are
           DEAD/DEAH box helicases found associated with a
           bacterial ATP-dependent DNA ligase, part of a four-gene
           system that occurs in about 12% of prokaryotic
           reference genomes. The actual motif in this family is
           DE[VILW]H.
          Length = 803

 Score = 45.6 bits (109), Expect = 6e-05
 Identities = 45/142 (31%), Positives = 72/142 (50%), Gaps = 11/142 (7%)

Query: 348 KPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLP-LLRHILDQPPLEETDGPMAIIMS 406
            P P Q +   A + GR  + IA TGSGKT+A  LP L+     + P     G   + ++
Sbjct: 13  TPRPFQLEMWAAALEGRSGLLIAPTGSGKTLAGFLPSLIDLAGPEKP---KKGLHTLYIT 69

Query: 407 PTRELCMQIGKEAKKFTKSLGL--RVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDML 464
           P R L + I +  +   + LGL  RV    G T  SE+  + K+  +I++ TP  +  +L
Sbjct: 70  PLRALAVDIARNLQAPIEELGLPIRVETRTGDTSSSERARQRKKPPDILLTTPESLALLL 129

Query: 465 A-ANSGRV-TNLRRVTYIVLDE 484
           +  ++ R+  +LR V   V+DE
Sbjct: 130 SYPDAARLFKDLRCV---VVDE 148


>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
          hydrophilic C-term.  This domain is a hydrophilic
          region found at the C-terminus of plant and metazoan
          pre-mRNA-splicing factor 38 proteins. The function is
          not known.
          Length = 97

 Score = 41.3 bits (97), Expect = 1e-04
 Identities = 24/79 (30%), Positives = 34/79 (43%), Gaps = 3/79 (3%)

Query: 4  SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK---SRG 60
                  R       R + S R + R R RR +   +R  R RDRD  R +++    R 
Sbjct: 19 EEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRD 78

Query: 61 SKRRSRSREAERSKDHSKK 79
             RSRSR   RS+D  ++
Sbjct: 79 RYDRSRSRSRSRSRDRRRR 97



 Score = 40.5 bits (95), Expect = 2e-04
 Identities = 25/73 (34%), Positives = 35/73 (47%), Gaps = 1/73 (1%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
          R + +R   R      +R +   R + R R+RR R  +R   R RDRD   R    R S+
Sbjct: 26 RRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRDRYDR-SR 84

Query: 63 RRSRSREAERSKD 75
           RSRSR  +R + 
Sbjct: 85 SRSRSRSRDRRRR 97



 Score = 37.1 bits (86), Expect = 0.003
 Identities = 20/84 (23%), Positives = 34/84 (40%), Gaps = 1/84 (1%)

Query: 4  SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
             +           +R  +  R   R R RR     +RS + R R  +R + + R    
Sbjct: 15 ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74

Query: 64 RSRSREAERSKDHSKKEEKDKREK 87
          R R R  +RS+  S+   +D+R +
Sbjct: 75 RDRDRY-DRSRSRSRSRSRDRRRR 97



 Score = 29.8 bits (67), Expect = 1.1
 Identities = 17/71 (23%), Positives = 28/71 (39%)

Query: 21 PKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
            E   D +  RR+  R  +R     R R   R + + R  KRR R R+ +R++   + +
Sbjct: 15 ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74

Query: 81 EKDKREKEEEE 91
              R      
Sbjct: 75 RDRDRYDRSRS 85


>gnl|CDD|223989 COG1061, SSL2, DNA or RNA helicases of superfamily II
           [Transcription / DNA replication, recombination, and
           repair].
          Length = 442

 Score = 41.7 bits (98), Expect = 9e-04
 Identities = 37/178 (20%), Positives = 62/178 (34%), Gaps = 34/178 (19%)

Query: 348 KPTPIQAQAIPAIMSGRDLIG----IAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAI 403
           +  P Q +A+ A++  R        +  TG+GKTV      +  +              +
Sbjct: 36  ELRPYQEEALDALVKNRRTERRGVIVLPTGAGKTVVAAE-AIAEL----------KRSTL 84

Query: 404 IMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDM 463
           ++ PT+EL  Q  +  KKF   L    + +YGG             A++ V T    +  
Sbjct: 85  VLVPTKELLDQWAEALKKFL--LLNDEIGIYGGGEKEL------EPAKVTVAT----VQT 132

Query: 464 LAANSGRVTNL-RRVTYIVLDEADRMFDMGFEPQVMRIIDNVRPDRQTVM-FSATFPR 519
           LA        L      I+ DE   +    +     R I  +       +  +AT  R
Sbjct: 133 LARRQLLDEFLGNEFGLIIFDEVHHLPAPSY-----RRILELLSAAYPRLGLTATPER 185


>gnl|CDD|236766 PRK10811, rne, ribonuclease E; Reviewed.
          Length = 1068

 Score = 40.4 bits (95), Expect = 0.003
 Identities = 26/117 (22%), Positives = 55/117 (47%), Gaps = 10/117 (8%)

Query: 5   RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRR 64
           RR R+  R    +  R +E R +++ +RR R ++ ++ +E  R+       EK+R    +
Sbjct: 614 RRDRNERRDTRDNRTR-REGRENREENRRNRRQAQQQTAE-TRESQQAEVTEKARTQDEQ 671

Query: 65  SRSREAERSKDHSKKEEKDKREKEEEEAAF---DPSKLDKEVEATRLELEMQKRRDR 118
            ++   ER     ++   +KR+ ++E  A    + S  + E E    ++   +R+ R
Sbjct: 672 QQAPRRER----QRRRNDEKRQAQQEAKALNVEEQSVQETEQEERVQQV-QPRRKQR 723



 Score = 38.9 bits (91), Expect = 0.008
 Identities = 25/110 (22%), Positives = 46/110 (41%), Gaps = 13/110 (11%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
             PK   + + +  RR+ R + RR +R+  RD    + +  G + R  +R   R      
Sbjct: 592 PAPKAEAKPERQQDRRKPRQNNRR-DRNERRDTRDNRTRREGRENREENRRNRRQAQQQT 650

Query: 79  KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKK 128
            E ++ ++ E  E              T+ E +   RR+R +R R + K+
Sbjct: 651 AETRESQQAEVTEK-----------ARTQDEQQQAPRRER-QRRRNDEKR 688


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 39.8 bits (93), Expect = 0.003
 Identities = 27/160 (16%), Positives = 59/160 (36%), Gaps = 16/160 (10%)

Query: 18  HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
            K     +R + R++++  ++      R+R R +       R  +  + + E ++++   
Sbjct: 283 FKEHHAEQRAQYREKQKEKKAFLWTLRRNRLRMIIT---PWRAPELHAENAEIKKTRTAE 339

Query: 78  KKEEKD--KREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
           K E K   K   ++  AA    ++++E    R    M + R R    +A +KK  I+   
Sbjct: 340 KNEAKARKKEIAQKRRAAER--EINREARQERAA-AMARARARRAAVKA-KKKGLIDASP 395

Query: 136 KDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKT 175
            +   + +    GS P               +   E  + 
Sbjct: 396 NEDTPSENEESKGSPP-------QVEATTTAEPNREPSQE 428



 Score = 38.3 bits (89), Expect = 0.010
 Identities = 19/94 (20%), Positives = 38/94 (40%), Gaps = 8/94 (8%)

Query: 5   RRKRSRS-----RSPS---PSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKE 56
           RR R R      R+P     + +  K    +K+  + R+    ++R   +R+ + E R+E
Sbjct: 309 RRNRLRMIITPWRAPELHAENAEIKKTRTAEKNEAKARKKEIAQKRRAAEREINREARQE 368

Query: 57  KSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
           ++    R    R A ++K     +     +   E
Sbjct: 369 RAAAMARARARRAAVKAKKKGLIDASPNEDTPSE 402


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 40.0 bits (94), Expect = 0.004
 Identities = 31/240 (12%), Positives = 85/240 (35%), Gaps = 7/240 (2%)

Query: 23   ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
            E   +K+  + +R +S  +       +   ++KEK +      +S++A    +  + +  
Sbjct: 1145 EEVEEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSD 1204

Query: 83   DKREKEEEEAAFDP--SKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
            +KR+ +++        S  D+E +  +     +    R++  +    K   +  +     
Sbjct: 1205 EKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDD 1264

Query: 141  NLSSGLGGSAPMKKWNLEDDSDEDENDN----KDENGKTAEEDIDPLDAFMQGVHEEMRK 196
                G   +AP +  +    S    +       +   K +      +   ++G    ++K
Sbjct: 1265 LSKEGKPKNAPKRV-SAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKK 1323

Query: 197  VNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQED 256
              K    T    K      + +       + +   +K+     +++   ++ S +E+ ED
Sbjct: 1324 KKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKKKSDSSSEDDDDSEVDDSEDEDDED 1383



 Score = 38.5 bits (90), Expect = 0.013
 Identities = 27/181 (14%), Positives = 62/181 (34%), Gaps = 8/181 (4%)

Query: 2    VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
                +K + S S     +  K   +     R +  +++  +S  D D        K    
Sbjct: 1212 KPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKP 1271

Query: 62   KRR-SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE 120
            K    R    + S     K    +     + ++    K+ K +E +   L+ +K+ ++  
Sbjct: 1272 KNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKK- 1330

Query: 121  RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDI 180
               A +KK      +     +           +K   +  S++D++   D++    +ED 
Sbjct: 1331 --TARKKKSKTRVKQASASQSSRLL----RRPRKKKSDSSSEDDDDSEVDDSEDEDDEDD 1384

Query: 181  D 181
            +
Sbjct: 1385 E 1385



 Score = 33.1 bits (76), Expect = 0.56
 Identities = 34/235 (14%), Positives = 73/235 (31%), Gaps = 11/235 (4%)

Query: 44   ERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEV 103
            E++   + E  KE+   SK + ++ +  + K   KKE+K K+   ++          K V
Sbjct: 1143 EQEEVEEKEIAKEQRLKSKTKGKASKLRKPKL-KKKEKKKKKSSADKSKKASVVGNSKRV 1201

Query: 104  EATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE 163
            ++        K  ++        ++ D E   K  KS++           K + ++D   
Sbjct: 1202 DSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFS 1261

Query: 164  DENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIV 223
             ++ +K+   K A + +  +            K         +             +   
Sbjct: 1262 SDDLSKEGKPKNAPKRVSAVQYSPP----PPSKRPDGESNGGSKPSSPTKKKVKKRLEGS 1317

Query: 224  TGVVKKSV--EKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQKKELSKVD 276
               +KK    EK      +      + S        +        K+    S+ D
Sbjct: 1318 LAALKKKKKSEKKTARKKKSKTRVKQAS----ASQSSRLLRRPRKKKSDSSSEDD 1368


>gnl|CDD|234702 PRK00254, PRK00254, ski2-like helicase; Provisional.
          Length = 720

 Score = 39.0 bits (91), Expect = 0.007
 Identities = 41/153 (26%), Positives = 74/153 (48%), Gaps = 15/153 (9%)

Query: 333 VSKKILDALKKQNYEKPTPIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
           V ++I   LK++  E+  P QA+A+ + ++ G++L+    T SGKT+   + ++  +L  
Sbjct: 8   VDERIKRVLKERGIEELYPPQAEALKSGVLEGKNLVLAIPTASGKTLVAEIVMVNKLL-- 65

Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE 451
                 +G  A+ + P + L  +  +E K + K LGLRV      TG  +   E     +
Sbjct: 66  -----REGGKAVYLVPLKALAEEKYREFKDWEK-LGLRVAMT---TGDYDSTDEWLGKYD 116

Query: 452 IIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDE 484
           II+ T  +   +L   S  +   + V  +V DE
Sbjct: 117 IIIATAEKFDSLLRHGSSWI---KDVKLVVADE 146


>gnl|CDD|236779 PRK10864, PRK10864, putative methyltransferase; Provisional.
          Length = 346

 Score = 38.2 bits (89), Expect = 0.010
 Identities = 28/92 (30%), Positives = 36/92 (39%), Gaps = 14/92 (15%)

Query: 12  RSPSPSHKRPKESR--RDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR- 68
           RS   S KR    R  +   R   +      RR  RD DR+ + R  K   S  R+ SR 
Sbjct: 18  RSDDDSDKRTHNPRTGKGGGRPSGKSRADGGRRPARD-DRNSQSRDRKWEDSPWRTVSRA 76

Query: 69  ---EAERSKDH---SKKEEKD----KREKEEE 90
              E     DH   S K   D    +R++ EE
Sbjct: 77  PGDETPEKADHGGISGKSFIDPEVLRRQRAEE 108



 Score = 31.7 bits (72), Expect = 1.1
 Identities = 23/97 (23%), Positives = 31/97 (31%), Gaps = 10/97 (10%)

Query: 27  DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKRE 86
           DK     R  +   R S + R     R     R S+ R R  E    +  S+    +  E
Sbjct: 24  DKRTHNPRTGKGGGRPSGKSRADGGRRPARDDRNSQSRDRKWEDSPWRTVSRAPGDETPE 83

Query: 87  KEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWR 123
           K +       S +D E            RR R E  R
Sbjct: 84  KADHGGISGKSFIDPE----------VLRRQRAEETR 110


>gnl|CDD|219061 pfam06495, Transformer, Fruit fly transformer protein.  This family
           consists of transformer proteins from several Drosophila
           species and also from Ceratitis capitata (Mediterranean
           fruit fly). The transformer locus (tra) produces an RNA
           processing protein that alternatively splices the
           doublesex pre-mRNA in the sex determination hierarchy of
           Drosophila melanogaster.
          Length = 182

 Score = 37.3 bits (86), Expect = 0.011
 Identities = 21/65 (32%), Positives = 27/65 (41%), Gaps = 1/65 (1%)

Query: 5   RRKRSRSRSPSPSHKRPKESR-RDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
           +RK   +R  +    R   SR R +  +R    R H  RS      D   R   S   +R
Sbjct: 41  QRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRR 100

Query: 64  RSRSR 68
           RSRSR
Sbjct: 101 RSRSR 105



 Score = 36.6 bits (84), Expect = 0.014
 Identities = 26/65 (40%), Positives = 29/65 (44%), Gaps = 7/65 (10%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R RR RSRSRS S         R    R  R RSRS  R   R R      R+ +SR   
Sbjct: 54  RGRRTRSRSRSQS-------AERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRRRSRSRS 106

Query: 63  RRSRS 67
           R SR+
Sbjct: 107 RYSRT 111



 Score = 34.3 bits (78), Expect = 0.10
 Identities = 26/96 (27%), Positives = 37/96 (38%), Gaps = 10/96 (10%)

Query: 4   SRRKRSRSRSPSPSHKRP--KESRRDKDRDRRRRSRSHE-------RRSERDRDRDLERR 54
           SR  R   R      K P   +  R++DR R  R R  +        R  R R R   + 
Sbjct: 7   SRSPRDTRRDSRKKEKIPYFADEVRERDRVRNLRQRKTQSTRPTTSHRGRRTRSRSRSQS 66

Query: 55  KEKSRGSKR-RSRSREAERSKDHSKKEEKDKREKEE 89
            E++   +R RSRSR   RS    +     +R +  
Sbjct: 67  AERNSCQRRHRSRSRSRNRSDSRHRSTSSTERRRRS 102



 Score = 33.1 bits (75), Expect = 0.22
 Identities = 26/78 (33%), Positives = 32/78 (41%), Gaps = 7/78 (8%)

Query: 2   VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHER--RSERDRDRDLERRKEKSR 59
           VR R +    R       RP  S R +    R RS+S ER     R R R   R +  SR
Sbjct: 30  VRERDRVRNLRQRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSRSRNRSDSR 89

Query: 60  -----GSKRRSRSREAER 72
                 ++RR RSR   R
Sbjct: 90  HRSTSSTERRRRSRSRSR 107



 Score = 33.1 bits (75), Expect = 0.25
 Identities = 23/79 (29%), Positives = 37/79 (46%), Gaps = 4/79 (5%)

Query: 6   RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
           R+R R R+     +R  +S R     R RR+RS  R    +R+    R + +SR S+ RS
Sbjct: 31  RERDRVRN---LRQRKTQSTRPTTSHRGRRTRSRSRSQSAERNSCQRRHRSRSR-SRNRS 86

Query: 66  RSREAERSKDHSKKEEKDK 84
            SR    S    ++  + +
Sbjct: 87  DSRHRSTSSTERRRRSRSR 105



 Score = 31.9 bits (72), Expect = 0.54
 Identities = 21/45 (46%), Positives = 23/45 (51%), Gaps = 3/45 (6%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDR 47
             RR RSRSRS + S  R    R     +RRRRSRS  R S   R
Sbjct: 72  CQRRHRSRSRSRNRSDSR---HRSTSSTERRRRSRSRSRYSRTPR 113


>gnl|CDD|237497 PRK13767, PRK13767, ATP-dependent helicase; Provisional.
          Length = 876

 Score = 38.7 bits (91), Expect = 0.011
 Identities = 44/158 (27%), Positives = 73/158 (46%), Gaps = 20/158 (12%)

Query: 343 KQNYEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEET----D 398
           K+ +   TP Q  AIP I  G++++  + TGSGKT+A  L ++  +     L       D
Sbjct: 27  KEKFGTFTPPQRYAIPLIHEGKNVLISSPTGSGKTLAAFLAIIDELFR---LGREGELED 83

Query: 399 GPMAIIMSPTREL-----------CMQIGKEAKKFTKSL-GLRVVCVYGGTGISEQISEL 446
               + +SP R L             +I + AK+  + L  +RV    G T   E+   L
Sbjct: 84  KVYCLYVSPLRALNNDIHRNLEEPLTEIREIAKERGEELPEIRVAIRTGDTSSYEKQKML 143

Query: 447 KRGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDE 484
           K+   I++ TP  +  +L +   R   LR V ++++DE
Sbjct: 144 KKPPHILITTPESLAILLNSPKFR-EKLRTVKWVIVDE 180


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 38.2 bits (89), Expect = 0.014
 Identities = 28/167 (16%), Positives = 67/167 (40%), Gaps = 1/167 (0%)

Query: 28  KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKD-HSKKEEKDKRE 86
           K      RS        R +  +LER+ E+ +           +        +EE ++ E
Sbjct: 691 KSLKNELRSLEDLLEELRRQLEELERQLEELKRELAALEEELEQLQSRLEELEEELEELE 750

Query: 87  KEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGL 146
           +E EE      +L++E+E+    L   K        + +  ++++E ++++++       
Sbjct: 751 EELEELQERLEELEEELESLEEALAKLKEEIEELEEKRQALQEELEELEEELEEAERRLD 810

Query: 147 GGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
                ++      +  E E +  +E  +  EE +D L+  ++ + +E
Sbjct: 811 ALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKE 857



 Score = 35.8 bits (83), Expect = 0.085
 Identities = 41/255 (16%), Positives = 91/255 (35%), Gaps = 17/255 (6%)

Query: 28  KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
           ++          E    ++   +LE      R       +   E  +   + +EK +  K
Sbjct: 277 EELREELEELQEELLELKEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALK 336

Query: 88  EEEEAAFDPSKLDKEVEATRLELEMQ-KRRDRIERWRAERKKKDIETIKKDIKSNLSSGL 146
           EE E       L +E+E    ELE   +  +       E  ++  E +++++    +   
Sbjct: 337 EELEER---ETLLEELEQLLAELEEAKEELEEKLSALLEELEELFEALREELAELEAELA 393

Query: 147 GGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTA 206
                +++   E +S E+  +   E  +  +E++  L+A ++ +  E+ ++N+       
Sbjct: 394 EIRNELEELKREIESLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEE 453

Query: 207 DVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLAS 266
            ++                   K +E+   EL EE Q   +  S  E       A   AS
Sbjct: 454 QLEELRD-------------RLKELERELAELQEELQRLEKELSSLEARLDRLEAEQRAS 500

Query: 267 KQKKELSKVDHSTIE 281
           +  + + +   S + 
Sbjct: 501 QGVRAVLEALESGLP 515



 Score = 33.9 bits (78), Expect = 0.28
 Identities = 22/113 (19%), Positives = 42/113 (37%), Gaps = 5/113 (4%)

Query: 23  ESRRDKDRDRRRRSRSHERRSERDRDR--DLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
           E    +  + +R   + E   E+ + R  +LE   E+        + R  E  ++    E
Sbjct: 712 EELERQLEELKRELAALEEELEQLQSRLEELEEELEELEEELEELQERLEELEEELESLE 771

Query: 81  EKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIET 133
           E   + KEE E      +  + ++    ELE +           ER+ + +E 
Sbjct: 772 EALAKLKEEIEEL---EEKRQALQEELEELEEELEEAERRLDALERELESLEQ 821



 Score = 33.5 bits (77), Expect = 0.45
 Identities = 30/180 (16%), Positives = 66/180 (36%), Gaps = 8/180 (4%)

Query: 22  KESRRDKDRDRR--RRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
           KE   + + +    R           + +  LE  KEK    K     RE    +     
Sbjct: 294 KEEIEELEGEISLLRERLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLL 353

Query: 80  EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
            E ++ ++E EE        + E     L  E+ +    +   R E     +E +K++I+
Sbjct: 354 AELEEAKEELEEKL-SALLEELEELFEALREELAELEAELAEIRNE-----LEELKREIE 407

Query: 140 SNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNK 199
           S        S  ++    E    E E +      +   E+++ L+  ++ + + ++++ +
Sbjct: 408 SLEERLERLSERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLKELER 467



 Score = 32.8 bits (75), Expect = 0.74
 Identities = 31/170 (18%), Positives = 74/170 (43%), Gaps = 8/170 (4%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
           ++ +    + + +        E   ER  + + E    +   +K +    E E  +  + 
Sbjct: 733 EQLQSRLEELEEELEELEEELEELQERLEELEEELESLEEALAKLKEEIEELEEKRQ-AL 791

Query: 79  KEEKDKREKEEEEAAFDPSKLDKEVEAT-----RLELEMQKRRDRIERWRAERK--KKDI 131
           +EE ++ E+E EEA      L++E+E+      RLE E+++  + IE    +    ++++
Sbjct: 792 QEELEELEEELEEAERRLDALERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEEL 851

Query: 132 ETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDID 181
           E ++K+++          A  ++   E    E+E +  +E  +  E ++ 
Sbjct: 852 EELEKELEELKEELEELEAEKEELEDELKELEEEKEELEEELRELESELA 901



 Score = 30.5 bits (69), Expect = 3.4
 Identities = 23/130 (17%), Positives = 48/130 (36%), Gaps = 4/130 (3%)

Query: 17  SHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDH 76
             K   E +     +         R    + + +L   + +    KR   S E    +  
Sbjct: 358 EAKEELEEKLSALLEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLS 417

Query: 77  SKKEEKDKR--EKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETI 134
            + E+  +   E E E         +   E   LE ++++ RDR++    ER+  +++  
Sbjct: 418 ERLEDLKEELKELEAELEELQTELEELNEELEELEEQLEELRDRLK--ELERELAELQEE 475

Query: 135 KKDIKSNLSS 144
            + ++  LSS
Sbjct: 476 LQRLEKELSS 485



 Score = 30.1 bits (68), Expect = 4.5
 Identities = 22/134 (16%), Positives = 53/134 (39%), Gaps = 6/134 (4%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           + R +       S      K     ++ + +R++   E     +   + ERR +      
Sbjct: 757 QERLEELEEELESLEEALAKLKEEIEELEEKRQALQEELEELEEELEEAERRLDALEREL 816

Query: 63  RRSRSREAERSKD-HSKKEEKDKREKEEEEAAFDPSKLDKEVEA-----TRLELEMQKRR 116
                R     ++    +EE ++ E++ +E   +  +L+KE+E        LE E ++  
Sbjct: 817 ESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELEELEAEKEELE 876

Query: 117 DRIERWRAERKKKD 130
           D ++    E+++ +
Sbjct: 877 DELKELEEEKEELE 890



 Score = 29.3 bits (66), Expect = 7.2
 Identities = 22/118 (18%), Positives = 45/118 (38%)

Query: 22  KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
            E   +    RR R        E + +   E+  E     +   +  E  + +    + E
Sbjct: 812 LERELESLEQRRERLEQEIEELEEEIEELEEKLDELEEELEELEKELEELKEELEELEAE 871

Query: 82  KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
           K++ E E +E   +  +L++E+     EL   K      R R E  +  +E ++ ++ 
Sbjct: 872 KEELEDELKELEEEKEELEEELRELESELAELKEEIEKLRERLEELEAKLERLEVELP 929


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 37.8 bits (88), Expect = 0.020
 Identities = 24/124 (19%), Positives = 55/124 (44%), Gaps = 9/124 (7%)

Query: 20  RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
           RP+E    ++ +RR  +     +  +  +  +ER +E++   KR     + E  K  S+ 
Sbjct: 402 RPREKEGTEEEERREITVY--EKRIKKLEETVERLEEENSELKRELEELKREIEKLESEL 459

Query: 80  EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
           E   +  +++        + D+E+ A    +E  ++    ++ R E  ++ +  ++K  K
Sbjct: 460 ERFRREVRDKV-------RKDREIRARDRRIERLEKELEEKKKRVEELERKLAELRKMRK 512

Query: 140 SNLS 143
             LS
Sbjct: 513 LELS 516


>gnl|CDD|179385 PRK02224, PRK02224, chromosome segregation protein; Provisional.
          Length = 880

 Score = 37.7 bits (88), Expect = 0.021
 Identities = 42/175 (24%), Positives = 70/175 (40%), Gaps = 33/175 (18%)

Query: 20  RPKESRRDKDRDRRRR-----SRSHERRSERDRDR-----------DLERRKEKSRGSKR 63
           R +E   ++ RD R R         +  +E   D            +LE R E+ R    
Sbjct: 272 REREELAEEVRDLRERLEELEEERDDLLAEAGLDDADAEAVEARREELEDRDEELRDRLE 331

Query: 64  RSR-SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
             R + +A   +  S +E+ D  E+  EE   + ++L+ E+E  R  +E   RR+ IE  
Sbjct: 332 ECRVAAQAHNEEAESLREDADDLEERAEELREEAAELESELEEAREAVE--DRREEIEEL 389

Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAE 177
             E     IE +++           G AP+   N ED  +E   +  +   + AE
Sbjct: 390 EEE-----IEELRERF---------GDAPVDLGNAEDFLEELREERDELREREAE 430



 Score = 29.2 bits (66), Expect = 7.2
 Identities = 39/138 (28%), Positives = 63/138 (45%), Gaps = 21/138 (15%)

Query: 22  KESRRDKDRDRRRRSRSHERRSERDR--DRDLERRKE----KSRGSKRRSRSREAERSKD 75
            E   + +R   +R ++ E R E D   +   ERR+E    ++     R    E ER ++
Sbjct: 216 AELDEEIERYEEQREQARETRDEADEVLEEHEERREELETLEAEIEDLRETIAETERERE 275

Query: 76  HSKKEEKDKREKEEE----------EAAFDPSKLDKEVEATRLELEMQKR--RDRIE--R 121
              +E +D RE+ EE          EA  D     + VEA R ELE +    RDR+E  R
Sbjct: 276 ELAEEVRDLRERLEELEEERDDLLAEAGLD-DADAEAVEARREELEDRDEELRDRLEECR 334

Query: 122 WRAERKKKDIETIKKDIK 139
             A+   ++ E++++D  
Sbjct: 335 VAAQAHNEEAESLREDAD 352



 Score = 29.2 bits (66), Expect = 8.2
 Identities = 25/112 (22%), Positives = 47/112 (41%), Gaps = 12/112 (10%)

Query: 20  RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
             +    ++  +R       E R ER  +R   R   +   ++RR    E +R +    +
Sbjct: 488 EEEVEEVEERLERAEDLVEAEDRIERLEER---REDLEELIAERRETI-EEKRERAEELR 543

Query: 80  EEKDKREKEEEEAAFDPSKLDKEVEATRLEL--------EMQKRRDRIERWR 123
           E   + E E EE     ++ ++E E  R E+        E+++R + +ER R
Sbjct: 544 ERAAELEAEAEEKREAAAEAEEEAEEAREEVAELNSKLAELKERIESLERIR 595


>gnl|CDD|223588 COG0514, RecQ, Superfamily II DNA helicase [DNA replication,
           recombination, and repair].
          Length = 590

 Score = 37.3 bits (87), Expect = 0.023
 Identities = 41/198 (20%), Positives = 80/198 (40%), Gaps = 36/198 (18%)

Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLP-LLRHILDQPPLEETDGPMAII 404
           Y    P Q + I A++SG+D + +  TG GK++ + +P LL            +G   ++
Sbjct: 15  YASFRPGQQEIIDALLSGKDTLVVMPTGGGKSLCYQIPALLL-----------EGL-TLV 62

Query: 405 MSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELKRGAE-----IIVCTPGR 459
           +SP   L M+   +     ++ G+R   +   T   E+  ++    +     ++  +P R
Sbjct: 63  VSPLISL-MKDQVDQ---LEAAGIRAAYL-NSTLSREERQQVLNQLKSGQLKLLYISPER 117

Query: 460 MIDMLAANSGRVTNLR---RVTYIVLDEADRMFDMG--FEPQVMRIIDNVR--PDRQTVM 512
           +       S R   L     ++ + +DEA  +   G  F P   R+       P+   + 
Sbjct: 118 L------MSPRFLELLKRLPISLVAIDEAHCISQWGHDFRPDYRRLGRLRAGLPNPPVLA 171

Query: 513 FSATFPRQMEALARRILN 530
            +AT   ++    R  L 
Sbjct: 172 LTATATPRVRDDIREQLG 189


>gnl|CDD|234468 TIGR04095, dnd_restrict_1, DNA phosphorothioation system
           restriction enzyme.  The DNA phosphorothioate
           modification system dnd (DNA instability during
           electrophoresis) recently has been shown to provide a
           modification essential to a restriction system. This
           protein family was detected by Partial Phylogenetic
           Profiling as linked to dnd, and its members usually are
           clustered with the dndABCDE genes.
          Length = 451

 Score = 36.9 bits (86), Expect = 0.027
 Identities = 37/152 (24%), Positives = 61/152 (40%), Gaps = 31/152 (20%)

Query: 348 KPTPIQAQAIPAIMS--GRDLIGIAKTGSGKTV-AFVLPLLRHILDQPPLEETDGPMAII 404
           +    Q +AI A     GR ++ +A TG+GKT+ A       +       E+    + ++
Sbjct: 8   ELRDYQKEAIRAWFKNNGRGILKMA-TGTGKTLTALAAASKLY-------EKIGLLVLLV 59

Query: 405 MSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTG-----ISEQISELKRGAE---IIVCT 456
           + P + L  Q  +EA+KF    GL  +  Y         +S  +  L  G +    I+ T
Sbjct: 60  VCPYQHLVDQWAREAEKF----GLNPILCYESVSNWQSELSTGLYNLNSGNQKFLAIITT 115

Query: 457 PGRMIDMLAANSGRVTNLRRV---TYIVLDEA 485
               I          + LRR    T ++ DEA
Sbjct: 116 NATFI-----GKNFQSQLRRFPGKTLLIGDEA 142


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
          approximately 300 residues, found in plants and
          vertebrates. They contain a highly conserved DDRGK
          motif.
          Length = 189

 Score = 35.8 bits (83), Expect = 0.031
 Identities = 15/69 (21%), Positives = 37/69 (53%)

Query: 22 KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
          K+  + +++  RR+ R  E     +R +  E+R+ + +  +     RE ++ ++  K+ E
Sbjct: 6  KKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKERE 65

Query: 82 KDKREKEEE 90
          +  R+++EE
Sbjct: 66 EQARKEQEE 74


>gnl|CDD|219293 pfam07093, SGT1, SGT1 protein.  This family consists of several
           eukaryotic SGT1 proteins. Human SGT1 or hSGT1 is known
           to suppress GCR2 and is highly expressed in the muscle
           and heart. The function of this family is unknown
           although it has been speculated that SGT1 may be
           functionally analogous to the Gcr2p protein of
           Saccharomyces cerevisiae which is known to be a
           regulatory factor of glycolytic gene expression.
          Length = 557

 Score = 36.6 bits (85), Expect = 0.035
 Identities = 26/165 (15%), Positives = 54/165 (32%), Gaps = 17/165 (10%)

Query: 74  KDHSKKEEKDKREKEEEEAAFDPSKLDKEVEA--------TRLELEMQKRRDRIERWRAE 125
           ++    ++  K  KE+     D  ++   +E            E    +  D  E   +E
Sbjct: 392 QERQGDKKDLKSNKEDANEVDDLEEVVSSMEEFLNKVSSFEGAEFADDEDEDDDEPDDSE 451

Query: 126 RKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDA 185
            K    +    +    L + LG        +L DDSD+ + D+ +++ +  +   D    
Sbjct: 452 DKDVSFDE--DEFFEFLKNMLGLKDDEIDNDLPDDSDDADEDDDEDDDEDEDSSSDSTLE 509

Query: 186 FMQGVHEEM-------RKVNKPAVPTTADVKPADSGSKPAGVVIV 223
            ++   ++M          N   +  +      D      GV  V
Sbjct: 510 ELEEYMDQMDAELKQTDSSNNADISNSGSSGAEDDDDDIEGVEPV 554


>gnl|CDD|224123 COG1202, COG1202, Superfamily II helicase, archaea-specific
           [General function prediction only].
          Length = 830

 Score = 36.7 bits (85), Expect = 0.036
 Identities = 59/243 (24%), Positives = 109/243 (44%), Gaps = 25/243 (10%)

Query: 333 VSKKILDALKKQNYEKPTPIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQ 391
           + +K    LK++  E+  P+Q  A+ A ++ G +L+ ++ T SGKT      L+  +   
Sbjct: 201 IPEKFKRMLKREGIEELLPVQVLAVEAGLLEGENLLVVSATASGKT------LIGELAGI 254

Query: 392 PPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISELK---- 447
           P L      M + + P   L  Q  ++ K+    LGL+V    G + I  +   +     
Sbjct: 255 PRLLSGGKKM-LFLVPLVALANQKYEDFKERYSKLGLKVAIRVGMSRIKTREEPVVVDTS 313

Query: 448 RGAEIIVCTPGRMIDMLAANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR-- 505
             A+IIV T   +  +L   +G+  +L  +  +V+DE   + D    P++  +I  +R  
Sbjct: 314 PDADIIVGTYEGIDYLL--RTGK--DLGDIGTVVIDEIHTLEDEERGPRLDGLIGRLRYL 369

Query: 506 -PDRQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVVCKEVEQHVIVLDEEQKMLKLL 564
            P  Q +  SAT     E LA+++  K +      R V    +E+H++    E +   ++
Sbjct: 370 FPGAQFIYLSATVGNPEE-LAKKLGAKLVLYD--ERPV---PLERHLVFARNESEKWDII 423

Query: 565 ELL 567
             L
Sbjct: 424 ARL 426


>gnl|CDD|219256 pfam06991, Prp19_bind, Splicing factor, Prp19-binding domain.  This
           family represents the C-terminus (approximately 300
           residues) of proteins that are involved as binding
           partners for Prp19 as part of the nuclear pore complex.
           The family in Drosophila is necessary for pre-mRNA
           splicing, and the human protein has been found in
           purifications of the spliceosome. In the past this
           family was thought, erroneously, to be associated with
           microfibrillin.
          Length = 277

 Score = 35.3 bits (81), Expect = 0.063
 Identities = 32/107 (29%), Positives = 53/107 (49%), Gaps = 8/107 (7%)

Query: 37  RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---DHSKKEEKDKREKEEEEAA 93
           R  +R + ++R+R+  + K     +KR++  R+ E  K   +  KKE + K+     EA 
Sbjct: 43  RKKDRITIQEREREAAKEKALEEEAKRKAEERKRETLKIVEEEVKKELELKKRNTLLEAN 102

Query: 94  FDPSKLDKEVEATRLEL----EM-QKRRDRIERWRAERKKKDIETIK 135
            D    D E E    E     E+ + +RDR ER   ER+K +IE ++
Sbjct: 103 IDDVDTDDENEEEEYEAWKLRELKRIKRDREEREEMEREKAEIEKMR 149


>gnl|CDD|109943 pfam00906, Hepatitis_core, Hepatitis core antigen.  The core
           antigen of hepatitis viruses possesses a carboxyl
           terminus rich in arginine. On this basis it was
           predicted that the core antigen would bind DNA. There is
           some experimental evidence to support this.
          Length = 182

 Score = 34.8 bits (80), Expect = 0.073
 Identities = 16/47 (34%), Positives = 20/47 (42%), Gaps = 15/47 (31%)

Query: 2   VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRD 48
           VR R +  R R+PSP               RRRRS+S  RR  +   
Sbjct: 148 VRRRGRSPRRRTPSP---------------RRRRSQSPRRRRSQSPS 179



 Score = 32.8 bits (75), Expect = 0.32
 Identities = 15/39 (38%), Positives = 22/39 (56%), Gaps = 6/39 (15%)

Query: 33  RRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
           RRR RS  RR+   R     RR+ +S   +RRS+S  ++
Sbjct: 149 RRRGRSPRRRTPSPR-----RRRSQSPR-RRRSQSPSSQ 181


>gnl|CDD|219939 pfam08619, Nha1_C, Alkali metal cation/H+ antiporter Nha1 C
           terminus.  The C terminus of the plasma membrane Nha1
           antiporter plays an important role in the immediate cell
           response to hypo-osmotic shock which prevents an
           excessive loss of ions and water. This domain is found
           with pfam00999.
          Length = 430

 Score = 35.2 bits (81), Expect = 0.087
 Identities = 25/107 (23%), Positives = 33/107 (30%), Gaps = 14/107 (13%)

Query: 19  KRPKESRRDKDRDRRRRSRSHE---------RRSERD-----RDRDLERRKEKSRGSKRR 64
           K  +  RR     RRRR R  E           S+       R    E   E    S   
Sbjct: 63  KGSRAGRRASSLRRRRRQRRKEPQAGTGALGPISQSAISPQRRSSTGENSAESDNTSYGL 122

Query: 65  SRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
           S+  E   + D     E D+R    EE +      ++E   T    E
Sbjct: 123 SKLAEDSENIDVRPVYESDERSGISEEGSRPSKLREQEQRPTEAYQE 169


>gnl|CDD|219406 pfam07420, DUF1509, Protein of unknown function (DUF1509).  This
           family consists of several uncharacterized viral
           proteins from the Marek's disease-like viruses. Members
           of this family are typically around 400 residues in
           length. The function of this family is unknown.
          Length = 377

 Score = 35.0 bits (80), Expect = 0.10
 Identities = 22/67 (32%), Positives = 28/67 (41%), Gaps = 4/67 (5%)

Query: 9   SRSRSPSPSHK---RPKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRR 64
           S      P H      +  RR++   R R RSRS  RR  R R R +  R+ +SR     
Sbjct: 310 SHHTMRRPPHSTSGERRGRRRNRSESRSRSRSRSGSRRYRRRRGRGVPGRRSESRQDTVL 369

Query: 65  SRSREAE 71
             S EA 
Sbjct: 370 VSSSEAS 376



 Score = 31.9 bits (72), Expect = 1.0
 Identities = 18/54 (33%), Positives = 25/54 (46%), Gaps = 1/54 (1%)

Query: 29  DRDRRRRSRSHERRSERDRDRDLERRKEKSRGS-KRRSRSREAERSKDHSKKEE 81
           +R  RRR+RS  R   R R      R+ + RG   RRS SR+       S+  +
Sbjct: 324 ERRGRRRNRSESRSRSRSRSGSRRYRRRRGRGVPGRRSESRQDTVLVSSSEASD 377


>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34.  This family represents
           herpes virus protein U79 and cytomegalovirus early
           phosphoprotein P34 (UL112).
          Length = 238

 Score = 34.5 bits (79), Expect = 0.11
 Identities = 15/69 (21%), Positives = 27/69 (39%)

Query: 22  KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
           KE RR +D  + +  R  ++  +R  D D            +   S + E  K+  +K  
Sbjct: 167 KEKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSSGGQSGLSTKDEPPKEKRQKHH 226

Query: 82  KDKREKEEE 90
             +R  E +
Sbjct: 227 DPERRLEPQ 235



 Score = 29.8 bits (67), Expect = 3.2
 Identities = 16/85 (18%), Positives = 35/85 (41%), Gaps = 4/85 (4%)

Query: 49  RDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRL 108
           R L R+K      KR  + +E  R +D  K +E  ++++EE+    +  +      ++  
Sbjct: 148 RALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSS-- 205

Query: 109 ELEMQKRRDRIERWRAERKKKDIET 133
               Q      +    E+++K  + 
Sbjct: 206 --GGQSGLSTKDEPPKEKRQKHHDP 228


>gnl|CDD|223046 PHA03328, PHA03328, nuclear egress lamina protein UL31;
          Provisional.
          Length = 316

 Score = 34.7 bits (80), Expect = 0.11
 Identities = 18/55 (32%), Positives = 23/55 (41%), Gaps = 1/55 (1%)

Query: 20 RPKESRRDK-DRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERS 73
            + S   +  R   RRSR   R   R R R   RR+   R S RR+   + ER 
Sbjct: 6  LRRSSSSLRRSRRAARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERD 60



 Score = 33.5 bits (77), Expect = 0.28
 Identities = 13/45 (28%), Positives = 18/45 (40%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDR 47
           +RR R   R  S    R +  RR   R   RR+   +   +R R
Sbjct: 19 AARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERDRYR 63



 Score = 32.8 bits (75), Expect = 0.49
 Identities = 18/53 (33%), Positives = 22/53 (41%), Gaps = 6/53 (11%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSE-RDRDRDLERR 54
          R   +RSR      S  R +   R     RR   RS  RR+E  D +RD  R 
Sbjct: 17 RRAARRSRRDGRVGSRGRSRYRSR-----RRSSRRSSTRRAELADTERDRYRA 64



 Score = 32.0 bits (73), Expect = 0.74
 Identities = 16/47 (34%), Positives = 21/47 (44%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
          RSRR   RSR       R +   R + R  RR S      ++ +RDR
Sbjct: 15 RSRRAARRSRRDGRVGSRGRSRYRSRRRSSRRSSTRRAELADTERDR 61



 Score = 32.0 bits (73), Expect = 0.91
 Identities = 22/61 (36%), Positives = 25/61 (40%), Gaps = 4/61 (6%)

Query: 8  RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
          R  S S   S +  + SRRD     R RSR   RR  R   R   RR E +     R R 
Sbjct: 7  RRSSSSLRRSRRAARRSRRDGRVGSRGRSRYRSRR--RSSRRSSTRRAELAD--TERDRY 62

Query: 68 R 68
          R
Sbjct: 63 R 63



 Score = 30.5 bits (69), Expect = 2.4
 Identities = 19/71 (26%), Positives = 28/71 (39%), Gaps = 8/71 (11%)

Query: 25 RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDK 84
             +     RRSR   RRS RD  R   R + + R  +R SR          +++ E   
Sbjct: 5  ALRRSSSSLRRSRRAARRSRRD-GRVGSRGRSRYRSRRRSSRRS-------STRRAELAD 56

Query: 85 REKEEEEAAFD 95
           E++   A F 
Sbjct: 57 TERDRYRAYFA 67


>gnl|CDD|129701 TIGR00614, recQ_fam, ATP-dependent DNA helicase, RecQ family.  All
           proteins in this family for which functions are known
           are 3'-5' DNA-DNA helicases. These proteins are used for
           recombination, recombinational repair, and possibly
           maintenance of chromosome stability. This family is
           based on the phylogenomic analysis of JA Eisen (1999,
           Ph.D. Thesis, Stanford University) [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 470

 Score = 35.1 bits (81), Expect = 0.12
 Identities = 34/151 (22%), Positives = 67/151 (44%), Gaps = 34/151 (22%)

Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIM 405
                P+Q + I A++ GRD   +  TG GK++ + LP L           +DG + +++
Sbjct: 9   LSSFRPVQLEVINAVLLGRDCFVVMPTGGGKSLCYQLPALC----------SDG-ITLVI 57

Query: 406 SPTREL----CMQIGKEAKKFTKSLGLRVVCVYGGTGISEQ---ISELKRGA-EIIVCTP 457
           SP   L     +Q+        K+ G+    +       +Q   +++LK G  +++  TP
Sbjct: 58  SPLISLMEDQVLQL--------KASGIPATFLNSSQSKEQQKNVLTDLKDGKIKLLYVTP 109

Query: 458 GRMIDMLAANSG---RVTNLRRVTYIVLDEA 485
            +     +A++     +   + +T I +DEA
Sbjct: 110 EK----CSASNRLLQTLEERKGITLIAVDEA 136


>gnl|CDD|221931 pfam13136, DUF3984, Protein of unknown function (DUF3984).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in eukaryotes. Proteins in
           this family are typically between 393 and 442 amino
           acids in length.
          Length = 301

 Score = 34.7 bits (80), Expect = 0.12
 Identities = 29/118 (24%), Positives = 39/118 (33%), Gaps = 26/118 (22%)

Query: 2   VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGS 61
             SRR RS   +PS +               RR SRS  RR  R    DL          
Sbjct: 179 RASRRGRSGYSTPSAAL-------------SRRGSRSASRRGSRA---DLSM-----TPL 217

Query: 62  KRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP-----SKLDKEVEATRLELEMQK 114
           + R    E  R       +  D+  + E  +  D      S  + E E    E E+Q+
Sbjct: 218 EARRADAEDSRDTVLLGPDFVDEDIRAEMASIDDESFSSLSDSESESEDEIDEAEVQR 275



 Score = 30.8 bits (70), Expect = 1.9
 Identities = 24/71 (33%), Positives = 31/71 (43%), Gaps = 2/71 (2%)

Query: 8   RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
            SRS S S  HKR K SRR    D   +S+S          R    R+ KS  +  R  S
Sbjct: 58  HSRSPSRSRLHKRKKSSRRSPMSDTLLKSKSSAHLLHHQSTR--SHRRSKSGTTSPRKPS 115

Query: 68  REAERSKDHSK 78
             A R ++ S+
Sbjct: 116 SSAHRRRNDSE 126


>gnl|CDD|218684 pfam05672, MAP7, MAP7 (E-MAP-115) family.  The organisation of
           microtubules varies with the cell type and is presumably
           controlled by tissue-specific microtubule-associated
           proteins (MAPs). The 115-kDa epithelial MAP
           (E-MAP-115/MAP7) has been identified as a
           microtubule-stabilising protein predominantly expressed
           in cell lines of epithelial origin. The binding of this
           microtubule associated protein is nucleotide
           independent.
          Length = 171

 Score = 33.5 bits (76), Expect = 0.14
 Identities = 35/125 (28%), Positives = 66/125 (52%), Gaps = 4/125 (3%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
           +  +E R  +++DR  R    E    R  +  L R +E  R  + R+R +E +  +   +
Sbjct: 41  QEEQERREQEEQDRLER----EELKRRAAEERLRREEEARRQEEERAREKEEKAKRKAEE 96

Query: 79  KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDI 138
           +E++++ E+E  +   + ++     EA R+ LE +K   +IE+ R ERKK+  E +K+  
Sbjct: 97  EEKQEQEEQERIQKQKEEAEARAREEAERMRLEREKHFQQIEQERLERKKRLEEIMKRTR 156

Query: 139 KSNLS 143
           KS +S
Sbjct: 157 KSEVS 161


>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
           (DUF874).  This family consists of several hypothetical
           proteins specific to Helicobacter pylori. The function
           of this family is unknown.
          Length = 417

 Score = 34.5 bits (78), Expect = 0.15
 Identities = 26/110 (23%), Positives = 54/110 (49%), Gaps = 7/110 (6%)

Query: 29  DRDRRRRSRSHERRSERDRDR------DLERRKEKSRGSKRRSRSREAERSKDHSKKE-E 81
           D+D++      ++ +E  RDR      +LE+ ++K+   K+++     E +    K E E
Sbjct: 123 DQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKAEQE 182

Query: 82  KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
           K K E+E+++   +  K         +ELE +K++   E+    +++KD 
Sbjct: 183 KQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDF 232


>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
          Length = 1021

 Score = 34.7 bits (79), Expect = 0.16
 Identities = 38/172 (22%), Positives = 74/172 (43%), Gaps = 12/172 (6%)

Query: 25  RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDK 84
           R DKD   R R    E+ +   +  +++  ++K      R      ER +    + E+ +
Sbjct: 429 RVDKDHAERARI---EKENAHRKALEMKILEKKRIERLEREERERLERERMERIERERLE 485

Query: 85  REKEEEEAAFDPSKLDKE-VEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLS 143
           RE+ E E      +L+++ +E  RL+   ++R DR+ER R E+ +++   +K     N  
Sbjct: 486 RERLERE------RLERDRLERDRLDRLERERVDRLERDRLEKARRNSYFLKG--MENGL 537

Query: 144 SGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMR 195
           S  GG            +    +D ++ +G  +        +   GVH+ +R
Sbjct: 538 SAGGGPGDGPGVGAGVGAGVGTSDGRNHSGVRSGIHCSIQSSARGGVHDSVR 589



 Score = 30.9 bits (69), Expect = 2.8
 Identities = 25/90 (27%), Positives = 42/90 (46%), Gaps = 2/90 (2%)

Query: 17  SHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLER-RKEKSRGSKRRSRSREAERSK- 74
           +H++  E +  + +   R  R    R ER+R   +ER R E+ R  + R      ER + 
Sbjct: 445 AHRKALEMKILEKKRIERLEREERERLERERMERIERERLERERLERERLERDRLERDRL 504

Query: 75  DHSKKEEKDKREKEEEEAAFDPSKLDKEVE 104
           D  ++E  D+ E++  E A   S   K +E
Sbjct: 505 DRLERERVDRLERDRLEKARRNSYFLKGME 534


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 34.8 bits (80), Expect = 0.18
 Identities = 17/65 (26%), Positives = 20/65 (30%), Gaps = 2/65 (3%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
                S SRSPSPS  RP          +R R             R   RR   +   + 
Sbjct: 341 VSPGPSPSRSPSPS--RPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRA 398

Query: 64  RSRSR 68
           R R  
Sbjct: 399 RRRDA 403



 Score = 31.7 bits (72), Expect = 1.6
 Identities = 11/47 (23%), Positives = 14/47 (29%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
                 S  R      + P        R  RRR+R+      R RD 
Sbjct: 357 PPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARRRDA 403



 Score = 30.9 bits (70), Expect = 2.5
 Identities = 16/71 (22%), Positives = 23/71 (32%), Gaps = 4/71 (5%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
            S   R  + SP PS  R     R              R+  R             R ++
Sbjct: 332 SSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSP----RKRPRPSRAPSSPAASAGRPTR 387

Query: 63  RRSRSREAERS 73
           RR+R+  A R+
Sbjct: 388 RRARAAVAGRA 398


>gnl|CDD|235032 PRK02362, PRK02362, ski2-like helicase; Provisional.
          Length = 737

 Score = 34.5 bits (80), Expect = 0.20
 Identities = 66/255 (25%), Positives = 118/255 (46%), Gaps = 39/255 (15%)

Query: 351 PIQAQAIPA-IMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTR 409
           P QA+A+ A ++ G++L+    T SGKT+   L +L+ I          G  A+ + P R
Sbjct: 26  PPQAEAVEAGLLDGKNLLAAIPTASGKTLIAELAMLKAIA--------RGGKALYIVPLR 77

Query: 410 ELCMQIGKEAKKFTKSLGLRVVCVYGGTGIS----EQISELKRGAEIIVCTPGRMIDMLA 465
            L  +  +E ++F + LG+RV       GIS    +   E     +IIV T  + +D L 
Sbjct: 78  ALASEKFEEFERFEE-LGVRV-------GISTGDYDSRDEWLGDNDIIVATSEK-VDSLL 128

Query: 466 ANSGRVTNLRRVTYIVLDEADRMFDMGFEPQVMRIIDNVR---PDRQTVMFSATF--PRQ 520
            N      L  +T +V+DE   +      P +   +  +R   PD Q V  SAT     +
Sbjct: 129 RNGAPW--LDDITCVVVDEVHLIDSANRGPTLEVTLAKLRRLNPDLQVVALSATIGNADE 186

Query: 521 MEAL--ARRILN--KPIEIQVG---GRSVVCKEVEQHVIVLDEEQKMLKLLELLGIYQDQ 573
           +     A  + +  +PI+++ G   G ++   + ++ V V  ++  +  +L+ L   ++ 
Sbjct: 187 LADWLDAELVDSEWRPIDLREGVFYGGAIHFDDSQREVEVPSKDDTLNLVLDTL---EEG 243

Query: 574 GSVIVFVDKQENADS 588
           G  +VFV  + NA+ 
Sbjct: 244 GQCLVFVSSRRNAEG 258


>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein.  This
          family of proteins represents the complementary sex
          determiner in the honeybee. In the honeybee, the
          mechanism of sex determination depends on the csd gene
          which produces an SR-type protein. Males are homozygous
          while females are heterozygous for the csd gene.
          Heterozygosity generates an active protein which
          initiates female development.
          Length = 146

 Score = 32.8 bits (74), Expect = 0.24
 Identities = 14/40 (35%), Positives = 23/40 (57%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERR 42
          RSR +  +S     S++  +E+ R++ RDR  R RS E +
Sbjct: 8  RSREREQKSYKNENSYREYRETSRERSRDRTERERSREHK 47



 Score = 30.8 bits (69), Expect = 1.0
 Identities = 16/46 (34%), Positives = 23/46 (50%), Gaps = 2/46 (4%)

Query: 24 SRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSRE 69
          SR+   R R R  +S+  ++E       E  +E+SR    R RSRE
Sbjct: 2  SRKRYSRSREREQKSY--KNENSYREYRETSRERSRDRTERERSRE 45



 Score = 29.3 bits (65), Expect = 3.0
 Identities = 18/46 (39%), Positives = 28/46 (60%), Gaps = 3/46 (6%)

Query: 32 RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR-EAERSKDH 76
          R+R SRS ER  +  ++ +  R   ++  S+ RSR R E ERS++H
Sbjct: 3  RKRYSRSREREQKSYKNENSYREYRET--SRERSRDRTERERSREH 46


>gnl|CDD|216205 pfam00937, Corona_nucleoca, Coronavirus nucleocapsid protein. 
          Length = 346

 Score = 33.5 bits (77), Expect = 0.28
 Identities = 24/83 (28%), Positives = 33/83 (39%), Gaps = 8/83 (9%)

Query: 3   RSR---RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSR 59
           RSR   R  SRS S  PS        R+  R+R   S      +       L   K+KS 
Sbjct: 155 RSRSSSRSSSRSNSRGPSRGSS----RNNSRNRNSSSPDDLVAAVLAALAKLGFGKQKSS 210

Query: 60  GSKRRSRSREAERSKDHSKKEEK 82
            SK+ SR  +   ++   K+  K
Sbjct: 211 -SKKPSRVTKKSAAEAAKKQLNK 232


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 33.9 bits (78), Expect = 0.29
 Identities = 40/179 (22%), Positives = 74/179 (41%), Gaps = 16/179 (8%)

Query: 32  RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEE 91
           R++  R    R + +R + L + K +  G +        ER K+  +++     E+E E+
Sbjct: 197 RQQLERLRREREKAERYQALLKEKREYEGYELLKEKEALERQKEAIERQLASL-EEELEK 255

Query: 92  AAFDPSKLDKEVEATRLEL-EMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSA 150
              + S+L+K +E     L E+ K+   +      R K+ I  ++ +I S     L  S 
Sbjct: 256 LTEEISELEKRLEEIEQLLEELNKKIKDLGEEEQLRVKEKIGELEAEIAS-----LERSI 310

Query: 151 PMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVK 209
             K+  LED          +E     E +ID L A ++ +  E+ +  K     T +  
Sbjct: 311 AEKERELED---------AEERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYA 360


>gnl|CDD|137505 PRK09751, PRK09751, putative ATP-dependent helicase Lhr;
           Provisional.
          Length = 1490

 Score = 33.7 bits (77), Expect = 0.30
 Identities = 34/133 (25%), Positives = 55/133 (41%), Gaps = 19/133 (14%)

Query: 369 IAKTGSGKTV-AFVLPLLRHILDQPPLEETDGPMA----IIMSPTRELCMQIGKEAKKFT 423
           IA TGSGKT+ AF+  L R   +                + +SP + L   + +  +   
Sbjct: 2   IAPTGSGKTLAAFLYALDRLFREGGEDTREAHKRKTSRILYISPIKALGTDVQRNLQIPL 61

Query: 424 KSLG------------LRVVCVYGGTGISEQISELKRGAEIIVCTPGRMIDMLAANSGRV 471
           K +             LRV    G T   E+    +   +I++ TP  +  ML + + R 
Sbjct: 62  KGIADERRRRGETEVNLRVGIRTGDTPAQERSKLTRNPPDILITTPESLYLMLTSRA-RE 120

Query: 472 TNLRRVTYIVLDE 484
           T LR V  +++DE
Sbjct: 121 T-LRGVETVIIDE 132


>gnl|CDD|153337 cd07653, F-BAR_CIP4-like, The F-BAR (FES-CIP4 Homology and
           Bin/Amphiphysin/Rvs) domain of Cdc42-Interacting Protein
           4 and similar proteins.  F-BAR domains are dimerization
           modules that bind and bend membranes and are found in
           proteins involved in membrane dynamics and actin
           reorganization. This subfamily is composed of
           Cdc42-Interacting Protein 4 (CIP4), Formin Binding
           Protein 17 (FBP17), FormiN Binding Protein 1-Like
           (FNBP1L), and similar proteins. CIP4 and FNBP1L are
           Cdc42 effectors that bind Wiskott-Aldrich syndrome
           protein (WASP) and function in endocytosis. CIP4 and
           FBP17 bind to the Fas ligand and may be implicated in
           the inflammatory response. CIP4 may also play a role in
           phagocytosis. Members of this subfamily typically
           contain an N-terminal F-BAR domain and a C-terminal SH3
           domain. In addition, some members such as FNBP1L contain
           a central Cdc42-binding HR1 domain. F-BAR domains form
           banana-shaped dimers with a positively-charged concave
           surface that binds to negatively-charged lipid
           membranes. They can induce membrane deformation in the
           form of long tubules.
          Length = 251

 Score = 33.0 bits (76), Expect = 0.32
 Identities = 19/65 (29%), Positives = 35/65 (53%), Gaps = 3/65 (4%)

Query: 52  ERRKEKSRGSKRRSRSREAERSKDHSKKE-EKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
           ER+K  S GSK + +   + +  + SKK  EK  +E E+ +  ++  K D ++  T+ ++
Sbjct: 106 ERKKHLSEGSKLQQKLESSIKQLEKSKKAYEKAFKEAEKAKQKYE--KADADMNLTKADV 163

Query: 111 EMQKR 115
           E  K 
Sbjct: 164 EKAKA 168


>gnl|CDD|217301 pfam02956, TT_ORF1, TT viral orf 1.  TT virus (TTV), isolated
          initially from a Japanese patient with hepatitis of
          unknown aetiology, has since been found to infect both
          healthy and diseased individuals and numerous
          prevalence studies have raised questions about its role
          in unexplained hepatitis. ORF1 is a large 750 residue
          protein. The N-terminal half of this protein
          corresponds to the capsid protein.
          Length = 525

 Score = 33.4 bits (77), Expect = 0.34
 Identities = 26/66 (39%), Positives = 32/66 (48%), Gaps = 12/66 (18%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
          R RR+R R R      +R    RR   R RRRR R   RR  R R R   RR+ + R  +
Sbjct: 7  RRRRRRWRGRR----RRRR---RR---RARRRRRRRRVRR--RRRGRRRRRRRRRRRRRR 54

Query: 63 RRSRSR 68
          RR R +
Sbjct: 55 RRKRKK 60


>gnl|CDD|222914 PHA02666, PHA02666, hypothetical protein; Provisional.
          Length = 287

 Score = 33.4 bits (75), Expect = 0.34
 Identities = 21/113 (18%), Positives = 34/113 (30%), Gaps = 5/113 (4%)

Query: 7   KRSRSRSPSPSHKRPKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
           +R  +   S    RP    R  +R      S +HE  +   R       K  SR S  R 
Sbjct: 32  RRRANSMESRRKSRPSRQHRSAERTPTTASSLTHENNTAPSRHGKQHSCKASSRSSHNRG 91

Query: 66  R----SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
                         H     + K +   +     P   + ++   RL  E ++
Sbjct: 92  STSSSHNHHAHRGPHQSAHRRSKHDAVRDTYQPCPQSPETDLYKGRLPGETER 144


>gnl|CDD|236794 PRK10917, PRK10917, ATP-dependent DNA helicase RecG; Provisional.
          Length = 681

 Score = 33.2 bits (77), Expect = 0.45
 Identities = 28/88 (31%), Positives = 39/88 (44%), Gaps = 12/88 (13%)

Query: 373 GSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVC 432
           GSGKTV   L  L  I          G  A +M+PT  L  Q  +  KK  + LG+RV  
Sbjct: 292 GSGKTVVAALAALAAI--------EAGYQAALMAPTEILAEQHYENLKKLLEPLGIRVAL 343

Query: 433 VYGGTGIS---EQISELKRG-AEIIVCT 456
           + G        E +  +  G A+I++ T
Sbjct: 344 LTGSLKGKERREILEAIASGEADIVIGT 371


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely
           on conserved operon structure. //The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 32.9 bits (75), Expect = 0.46
 Identities = 31/128 (24%), Positives = 55/128 (42%), Gaps = 8/128 (6%)

Query: 20  RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR--------RSRSREAE 71
             KE  R K  +++      +R +E+ R ++LE+R    + +K+          + ++AE
Sbjct: 63  AKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAE 122

Query: 72  RSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
            +K     E K K E E E+ A + +K   E EA        K++    + +AE + K  
Sbjct: 123 EAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAK 182

Query: 132 ETIKKDIK 139
              K   K
Sbjct: 183 AEAKAKAK 190



 Score = 31.7 bits (72), Expect = 1.2
 Identities = 23/99 (23%), Positives = 46/99 (46%), Gaps = 16/99 (16%)

Query: 31  DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
           +R ++ +    + E++R + LE++ E+      + R+ E  R K+  ++   +K  K+ E
Sbjct: 53  NRIQQQKKPAAKKEQERQKKLEQQAEE----AEKQRAAEQARQKELEQRAAAEKAAKQAE 108

Query: 91  EAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKK 129
           +AA       K+ E  + + E  K +       AE K K
Sbjct: 109 QAA-------KQAEEKQKQAEEAKAKQA-----AEAKAK 135



 Score = 29.4 bits (66), Expect = 6.5
 Identities = 27/122 (22%), Positives = 59/122 (48%), Gaps = 7/122 (5%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE---RSKD 75
           ++ K+    K+++R+++       +E+ R  +  R+KE  + +     +++AE   +  +
Sbjct: 56  QQQKKPAAKKEQERQKKLEQQAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAE 115

Query: 76  HSKKEEKDKREKEEEEAAFDPSKLDKEVE-ATRLELEMQKRRDRIERWRAERKKKDIETI 134
             +K+ ++ + K+  EA    +K + E E   + E + Q   +   +  AE KKK  E  
Sbjct: 116 EKQKQAEEAKAKQAAEAK---AKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAK 172

Query: 135 KK 136
           KK
Sbjct: 173 KK 174


>gnl|CDD|112890 pfam04094, DUF390, Protein of unknown function (DUF390).  This is a
           family of long proteins currently only found in the rice
           genome. They have no known function. However they may be
           some kind of transposable element.
          Length = 843

 Score = 33.3 bits (75), Expect = 0.46
 Identities = 25/97 (25%), Positives = 46/97 (47%), Gaps = 3/97 (3%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
           S+  +S +  P+ +  R +E+ R +  DR R +    + + R R  +   R+E +R  + 
Sbjct: 220 SKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAEEAAREEAARARQA 279

Query: 64  RSRSREAE---RSKDHSKKEEKDKREKEEEEAAFDPS 97
              +REAE   R+ + +   E  + E    + A DPS
Sbjct: 280 EEAAREAEAAFRADEAAATSEAARDEAAGAQLAPDPS 316



 Score = 30.2 bits (67), Expect = 4.2
 Identities = 25/99 (25%), Positives = 44/99 (44%), Gaps = 3/99 (3%)

Query: 12  RSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
            S  PS  RP    +    +    + +  RR E DR    +R +E    ++  +R+R+AE
Sbjct: 209 LSEIPS--RPSRHSKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAE 266

Query: 72  RS-KDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
            + ++ + +  + +    E EAAF   +     EA R E
Sbjct: 267 EAAREEAARARQAEEAAREAEAAFRADEAAATSEAARDE 305


>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
           selection and in elongation by RNA polymerase II
           [Transcription].
          Length = 521

 Score = 33.1 bits (75), Expect = 0.47
 Identities = 23/82 (28%), Positives = 38/82 (46%), Gaps = 5/82 (6%)

Query: 24  SRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS-RSREAERSKDHSKKEEK 82
           S    D  R  R       +E  + + LE +K + R  +  S R  E +R KD+ + EE 
Sbjct: 122 SSGCTDTRRSTRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEES 181

Query: 83  DKREKEEEEAAFDPSKLDKEVE 104
           ++  +EE    + PS  ++ VE
Sbjct: 182 EQGLQEE----YTPSYAEEAVE 199


>gnl|CDD|218734 pfam05758, Ycf1, Ycf1.  The chloroplast genomes of most higher
           plants contain two giant open reading frames designated
           ycf1 and ycf2. Although the function of Ycf1 is unknown,
           it is known to be an essential gene.
          Length = 832

 Score = 33.1 bits (76), Expect = 0.49
 Identities = 21/88 (23%), Positives = 35/88 (39%), Gaps = 14/88 (15%)

Query: 12  RSPSPSH-KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREA 70
           R PSP   K+ KE+   ++R             E + D ++E   E     + +  S E 
Sbjct: 216 RIPSPFFTKKLKETSETEER-------------EEETDVEIETTSETKGTKQEQEGSTEE 262

Query: 71  ERSKDHSKKEEKDKREKEEEEAAFDPSK 98
           + S    +KE+ DK E  ++       K
Sbjct: 263 DPSLFSEEKEDPDKTEDLDKLEILKEKK 290


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 33.1 bits (75), Expect = 0.51
 Identities = 31/184 (16%), Positives = 76/184 (41%), Gaps = 9/184 (4%)

Query: 66  RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
             RE      + +++  D +E+E +E A    +L +E++  +++ +  +++    +  A+
Sbjct: 185 ALREDNEKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFAQDNAD 244

Query: 126 RKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDA 185
           +++ ++   +++ K+        S    K   E+   E E     E  K  EE +   D 
Sbjct: 245 KQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQI-EIKKNDEEALKAKDH 303

Query: 186 FMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIVTGVVKKSVEKAKGELMEENQDG 245
               + +E +   K A     +   A    +P     V   ++K+  + + +    N+D 
Sbjct: 304 KAFDLKQESKASEKEAEDKELE---AQKKREP-----VAEDLQKTKPQVEAQPTSLNEDA 355

Query: 246 LEYS 249
           ++ S
Sbjct: 356 IDSS 359



 Score = 31.1 bits (70), Expect = 2.0
 Identities = 22/114 (19%), Positives = 55/114 (48%), Gaps = 10/114 (8%)

Query: 23  ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
             RRD    + R S+   +R+++ ++   +++ +  +  +      +A+ ++D++ K+  
Sbjct: 195 NFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQ------KADFAQDNADKQRD 248

Query: 83  DKREKEEEEA-AFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
           + R+K++E      P+      E  ++    ++    IE+ + E KK D E +K
Sbjct: 249 EVRQKQQEAKNLPKPADTSSPKEDKQVAENQKR---EIEKAQIEIKKNDEEALK 299



 Score = 29.2 bits (65), Expect = 6.8
 Identities = 20/136 (14%), Positives = 53/136 (38%), Gaps = 11/136 (8%)

Query: 7   KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR 66
           KR++        K+    +  +  D  + +   +R   R + ++ +   + +  S  +  
Sbjct: 213 KRAQQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQEAKNLPKPADTSSPKED 272

Query: 67  SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER 126
            + AE  K   +K + + ++ +EE          K  +    +L   K+  +     AE 
Sbjct: 273 KQVAENQKREIEKAQIEIKKNDEE--------ALKAKDHKAFDL---KQESKASEKEAED 321

Query: 127 KKKDIETIKKDIKSNL 142
           K+ + +  ++ +  +L
Sbjct: 322 KELEAQKKREPVAEDL 337


>gnl|CDD|113514 pfam04747, DUF612, Protein of unknown function, DUF612.  This
           family includes several uncharacterized proteins from
           Caenorhabditis elegans.
          Length = 517

 Score = 32.7 bits (73), Expect = 0.55
 Identities = 29/127 (22%), Positives = 64/127 (50%), Gaps = 3/127 (2%)

Query: 31  DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEE 90
           D+R+ + +    +E+ +  +  ++ EK +  K+ ++  EAE+  +  K  EK+ R  E E
Sbjct: 49  DQRKEAFASLELTEQPQQVEKVKKSEKKKAQKQIAKDHEAEQKVNAKKAAEKEARRAEAE 108

Query: 91  ---EAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLG 147
               AA +      + E  R++ E +K+   +++ +AE+KK+     +K  K+  +    
Sbjct: 109 AKKRAAQEEEHKQWKAEQERIQKEQEKKEADLKKLQAEKKKEKAVKAEKAEKAEKTKKAS 168

Query: 148 GSAPMKK 154
             AP+++
Sbjct: 169 TPAPVEE 175


>gnl|CDD|130009 TIGR00934, 2a38euk, potassium uptake protein, Trk family.  The
           proteins of the Trk family are derived from
           Gram-negative and Gram-positive bacteria, yeast and
           wheat. The proteins of E. coli K12 TrkH and TrkG as well
           as several yeast proteins have been functionally
           characterized. The E. coli TrkH and TrkG proteins are
           complexed to two peripheral membrane proteins, TrkA, an
           NAD-binding protein, and TrkE, an ATP-binding protein.
           This complex forms the potassium uptake system. This
           family is specific for the eukaryotic Trk system
           [Transport and binding proteins, Cations and iron
           carrying compounds].
          Length = 800

 Score = 33.0 bits (75), Expect = 0.58
 Identities = 42/314 (13%), Positives = 90/314 (28%), Gaps = 37/314 (11%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
           S+++    R+ +   +  +   R +     R +  H     RD    L   +   R    
Sbjct: 141 SKQRFFLRRTKTLLQR--ELEDRPETGVAGRVTVPHGSAKRRDFQDKLFSGEFVKRDEPD 198

Query: 64  -RSRSREAERSKDHS-KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER 121
             S   +++   D S    E +K  K       DP  L + +   +   E  + +     
Sbjct: 199 QNSPDVKSDTRADESISDLEFEKFAKRRGSRDVDPEDLYRSIMMLQGIHERIREKSSANS 258

Query: 122 WRAERKKKDIETIKKDIKSNLSSGLGG--------------SAPMKKWNLEDDSDEDEND 167
              ER  + I+   +   S                      +  +++    D ++  + +
Sbjct: 259 RSDERSSESIQEQVERRPSTSDIERNSQSLTRRYDDKSFDKAVRLRRSKTIDRAEACDLE 318

Query: 168 NKDENGKTAEEDIDPLDAFM--------QGVHEEMRKVNKPAVPTTADVKPADSGSKPAG 219
             D      +   D   A          +G + + RK        +      +     A 
Sbjct: 319 ELDRAKDFEKMTYDNWKAHHRKKKNFRPRGWNLKFRK-ASRFPKDSDRNYEDNGNHLSAS 377

Query: 220 VVIVTGVVKKSVEKAKGELMEENQD----------GLEYSSEEEQEDLTSTAANLASKQK 269
               +     S E+       + ++             Y S +      S    L  +QK
Sbjct: 378 SSFGSEEPSLSSEENLYPTYNKKREDSRHTLSKTMSTNYLSWQPTIGRNSNFVGLTKEQK 437

Query: 270 KELSKVDHSTIEYL 283
            EL  +++  ++ L
Sbjct: 438 DELGGIEYRALKCL 451



 Score = 30.0 bits (67), Expect = 4.6
 Identities = 21/114 (18%), Positives = 44/114 (38%)

Query: 1   MVRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRG 60
           M++   +R R +S + S    + S   +++  RR S S   R+ +   R  + +      
Sbjct: 242 MLQGIHERIREKSSANSRSDERSSESIQEQVERRPSTSDIERNSQSLTRRYDDKSFDKAV 301

Query: 61  SKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
             RRS++ +   + D  + +     EK   +      +  K        L+ +K
Sbjct: 302 RLRRSKTIDRAEACDLEELDRAKDFEKMTYDNWKAHHRKKKNFRPRGWNLKFRK 355


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
           entry is characterized by proteins with alternating
           conserved and low-complexity regions. Bud13 together
           with Snu17p and a newly identified factor,
           Pml1p/Ylr016c, form a novel trimeric complex called the
           RES complex, the pre-mRNA retention and splicing complex.
           Subunits of this complex are not essential for viability
           of yeasts but they are required for efficient splicing
           in vitro and in vivo. Furthermore, inactivation of this
           complex causes pre-mRNA leakage from the nucleus. Bud13
           contains a unique, phylogenetically conserved C-terminal
           region of unknown function.
          Length = 141

 Score = 31.1 bits (71), Expect = 0.63
 Identities = 26/83 (31%), Positives = 40/83 (48%), Gaps = 4/83 (4%)

Query: 35  RSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAF 94
           R +S       ++  + ER KE+    K R   +E E  K   +KEE++KR +E E+A  
Sbjct: 5   RDKSGRIIDIEEKREEKEREKEE----KERKEEKEKEWGKGLVQKEEREKRLEELEKAKN 60

Query: 95  DPSKLDKEVEATRLELEMQKRRD 117
            P     + E    EL+ Q+R D
Sbjct: 61  KPLARYADDEDYDEELKEQERWD 83


>gnl|CDD|221188 pfam11725, AvrE, Pathogenicity factor.  This family is secreted by
           gram-negative Gammaproteobacteria such as Pseudomonas
           syringae of tomato and the fire blight plant pathogen
           Erwinia amylovora, amongst others. It is an essential
           pathogenicity factor of approximately 198 kDa. Its
           injection into the host-plant is dependent upon the
           bacterial type III or Hrp secretion system. The family
           is long and carries a number of predicted functional
           regions, including an ERMS or endoplasmic reticulum
           membrane retention signal at both the C- and the
           N-termini, a leucine-zipper motif from residues 539-560,
           and a nuclear localisation signal at 1358-1361. This
           conserved AvrE-family of effectors is among the few that
           are required for full virulence of many phytopathogenic
           pseudomonads, erwinias and pantoeas.
          Length = 1771

 Score = 32.8 bits (75), Expect = 0.63
 Identities = 32/176 (18%), Positives = 48/176 (27%), Gaps = 10/176 (5%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
           SR    R     P      ++       RR        R E +        K  +    R
Sbjct: 85  SRGPTLRELLALPEDDGETQAPESSPSARRLTRSEGVARHEMEDLAGRPVVKPDADRQLR 144

Query: 64  -----RSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDR 118
                +S S              K         A F   ++ +EV+A R +   Q R   
Sbjct: 145 QDILNKSSSSRRPPVSKEEGTSSKMPATALASAALFKDDEIRQEVDAARSDQASQSRLS- 203

Query: 119 IERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGK 174
               R+      I       +  L+   GG    +  NLE +         D+ GK
Sbjct: 204 ----RSRGNPPAIPPDAAPRQPMLTRSAGGRFEGEDENLERNLQPQSPITLDKKGK 255


>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355).  This
           family of proteins is found in bacteria and viruses.
           Proteins in this family are typically between 180 and
           214 amino acids in length.
          Length = 125

 Score = 31.1 bits (71), Expect = 0.65
 Identities = 17/77 (22%), Positives = 31/77 (40%), Gaps = 5/77 (6%)

Query: 62  KRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER 121
           K  +     +       K EK + EK+ E       KL K     + E E++K    +E 
Sbjct: 6   KTFTDKEVDKAIAKEKAKWEKKQEEKKSEAE-----KLAKMSAEEKAEYELEKLEKELEE 60

Query: 122 WRAERKKKDIETIKKDI 138
             AE  +++++   K +
Sbjct: 61  LEAELARRELKAEAKKM 77


>gnl|CDD|165391 PHA03118, PHA03118, multifunctional expression regulator;
           Provisional.
          Length = 474

 Score = 32.4 bits (73), Expect = 0.69
 Identities = 27/137 (19%), Positives = 45/137 (32%), Gaps = 24/137 (17%)

Query: 20  RPKESRRDKDR---DRRRRSRSHERRSERDRDRD-------------LERRKEK-----S 58
            P +   D DR   +     ++H RR     D +               R  E       
Sbjct: 69  NPADVCEDADRAYTNPNFEKKAHGRREGYHHDDEKCLVTFLDDINHHGGRDTEPGHAHIE 128

Query: 59  RGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDR 118
            G ++  +S   +  K H  +  ++K  +     A  P +   +       L   +R  R
Sbjct: 129 NGERKSPKSYNQQSRKKHRDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNERDAR 188

Query: 119 IERWRAERKKKDIETIK 135
            +R    RK+ DI T K
Sbjct: 189 RDRI---RKEYDIPTDK 202



 Score = 30.0 bits (67), Expect = 4.2
 Identities = 14/47 (29%), Positives = 19/47 (40%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
           R    R++   PS           D+  D   R R +ER + RDR R
Sbjct: 147 RDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNERDARRDRIR 193



 Score = 29.7 bits (66), Expect = 5.9
 Identities = 12/73 (16%), Positives = 24/73 (32%), Gaps = 4/73 (5%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
              RK  +S +     K     R +  R++  R       S  +        + + R ++
Sbjct: 129 NGERKSPKSYNQQSRKKH----RDESLRNKHGRPSGPPAMSPGEHFDQTHDAEYRLRFNE 184

Query: 63  RRSRSREAERSKD 75
           R +R     +  D
Sbjct: 185 RDARRDRIRKEYD 197


>gnl|CDD|236394 PRK09169, PRK09169, hypothetical protein; Validated.
          Length = 2316

 Score = 32.8 bits (75), Expect = 0.72
 Identities = 17/88 (19%), Positives = 28/88 (31%), Gaps = 5/88 (5%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSE-----RDRDRDLERRKEKS 58
               R R R        P+ +R D     R R R+   R       RD +R L+R     
Sbjct: 17  PADPRPRRRPRLGDAPAPRTARADSGATPRGRPRAGADREPTSEQLRDYERWLDRAAAGQ 76

Query: 59  RGSKRRSRSREAERSKDHSKKEEKDKRE 86
             ++R  +          ++  + D   
Sbjct: 77  LDAQREQQCARLWFLVQQARARKVDPDF 104


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
           biogenesis [Translation, ribosomal structure and
           biogenesis].
          Length = 1077

 Score = 32.4 bits (73), Expect = 0.79
 Identities = 29/130 (22%), Positives = 53/130 (40%), Gaps = 7/130 (5%)

Query: 70  AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRI-ERWRAERKK 128
            E   + +K  E D   ++E E  FD SK+  E  ++  E  M+   + + ++W +  + 
Sbjct: 517 EEYKGESAKSSESDLVVQDEPEDFFDVSKVANESISSNHEKLMESEFEELKKKWSSLAQL 576

Query: 129 KDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE---DENDNKDENGK--TAEEDIDPL 183
           K     K     ++          +K N ED  DE    +N+ ++  G   TAE +    
Sbjct: 577 KS-RFQKDATLDSIEGEEELIQDDEKGNFEDLEDEENSSDNEMEESRGSSVTAENEESAD 635

Query: 184 DAFMQGVHEE 193
           +   +   EE
Sbjct: 636 EVDYETEREE 645


>gnl|CDD|234173 TIGR03346, chaperone_ClpB, ATP-dependent chaperone ClpB.  Members
           of this protein family are the bacterial ATP-dependent
           chaperone ClpB. This protein belongs to the AAA family,
           ATPases associated with various cellular activities
           (pfam00004). This molecular chaperone does not act as a
           protease, but rather serves to disaggregate misfolded
           and aggregated proteins [Protein fate, Protein folding
           and stabilization].
          Length = 852

 Score = 32.2 bits (74), Expect = 0.90
 Identities = 22/66 (33%), Positives = 34/66 (51%), Gaps = 6/66 (9%)

Query: 54  RKEKSRGSKRRSRSRE---AERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
           +KEK   SK R    E   AE  ++++  EE+ K EK   +      ++ +E+E  RLEL
Sbjct: 425 KKEKDEASKERLEDLEKELAELEEEYADLEEQWKAEKAAIQGI---QQIKEEIEQVRLEL 481

Query: 111 EMQKRR 116
           E  +R 
Sbjct: 482 EQAERE 487


>gnl|CDD|218790 pfam05876, Terminase_GpA, Phage terminase large subunit (GpA).
           This family consists of several phage terminase large
           subunit proteins as well as related sequences from
           several bacterial species. The DNA packaging enzyme of
           bacteriophage lambda, terminase, is a heteromultimer
           composed of a small subunit, gpNu1, and a large subunit,
           gpA, products of the Nu1 and A genes, respectively.
           Terminase is involved in the site-specific binding and
           cutting of the DNA in the initial stages of packaging.
           It is now known that gpA is actively involved in late
           stages of packaging, including DNA translocation, and
           that this enzyme contains separate functional domains
           for its early and late packaging activities.
          Length = 552

 Score = 32.2 bits (74), Expect = 0.90
 Identities = 13/34 (38%), Positives = 18/34 (52%), Gaps = 4/34 (11%)

Query: 457 PGRMIDMLAANSGRVTNLRRVT--YIVLDEADRM 488
           PG  + ++ ANS    NLR     Y++LDE D  
Sbjct: 115 PGGSLLLIGANS--PANLRSRPVRYVILDEVDAY 146


>gnl|CDD|215824 pfam00260, Protamine_P1, Protamine P1. 
          Length = 51

 Score = 28.7 bits (64), Expect = 0.93
 Identities = 20/54 (37%), Positives = 25/54 (46%), Gaps = 3/54 (5%)

Query: 4  SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK 57
          +R +  RSRS S   +R    RR   R RRR  R   RR    R R   RR+ +
Sbjct: 1  ARYRCCRSRSRSRCRRRR---RRRCRRRRRRCCRRRRRRVGCCRRRYTRRRRRR 51



 Score = 28.7 bits (64), Expect = 1.0
 Identities = 19/47 (40%), Positives = 22/47 (46%)

Query: 3  RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDR 49
          R R  RSRSRS     +R +  RR +   RRRR R    R    R R
Sbjct: 2  RYRCCRSRSRSRCRRRRRRRCRRRRRRCCRRRRRRVGCCRRRYTRRR 48


>gnl|CDD|182933 PRK11057, PRK11057, ATP-dependent DNA helicase RecQ; Provisional.
          Length = 607

 Score = 32.0 bits (73), Expect = 0.99
 Identities = 15/40 (37%), Positives = 26/40 (65%)

Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
           Y++  P Q + I A++SGRD + +  TG GK++ + +P L
Sbjct: 23  YQQFRPGQQEIIDAVLSGRDCLVVMPTGGGKSLCYQIPAL 62


>gnl|CDD|215814 pfam00242, DNA_pol_viral_N, DNA polymerase (viral) N-terminal
           domain. 
          Length = 379

 Score = 31.7 bits (72), Expect = 0.99
 Identities = 17/83 (20%), Positives = 27/83 (32%), Gaps = 8/83 (9%)

Query: 2   VRSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSH--ERRSERDRD---RDLERRKE 56
            +S+ +RSR    +    + K +   + R    R R H   RR              R  
Sbjct: 241 RQSQIQRSRLGLQA---NQGKLAHGQQGRSGSIRGRKHSTTRRPFGVEPSSSGVTTNRAS 297

Query: 57  KSRGSKRRSRSREAERSKDHSKK 79
            S     +S  RE   S   + +
Sbjct: 298 SSSSCFHQSAVRETAYSSLSTSE 320


>gnl|CDD|130456 TIGR01389, recQ, ATP-dependent DNA helicase RecQ.  The
           ATP-dependent DNA helicase RecQ of E. coli is about 600
           residues long. This model represents bacterial proteins
           with a high degree of similarity in domain architecture
           and in primary sequence to E. coli RecQ. The model
           excludes eukaryotic and archaeal proteins with RecQ-like
           regions, as well as more distantly related bacterial
           helicases related to RecQ [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 591

 Score = 32.0 bits (73), Expect = 1.0
 Identities = 13/40 (32%), Positives = 24/40 (60%)

Query: 346 YEKPTPIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
           Y+   P Q + I  ++ GRD++ +  TG GK++ + +P L
Sbjct: 11  YDDFRPGQEEIISHVLDGRDVLVVMPTGGGKSLCYQVPAL 50


>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
           TolA; Provisional.
          Length = 387

 Score = 31.7 bits (72), Expect = 1.0
 Identities = 24/115 (20%), Positives = 48/115 (41%), Gaps = 14/115 (12%)

Query: 22  KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE 81
            E +R K   ++      ++ +E++R + LE+ +  ++  +++ ++ EA +     +K+ 
Sbjct: 77  AEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQ--EQKKQAEEAAKQAALKQKQA 134

Query: 82  KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKK 136
           ++   K    A     K   E EA R     +K         AE KKK      K
Sbjct: 135 EEAAAKAAAAA-----KAKAEAEAKRAAAAAKKA-------AAEAKKKAEAEAAK 177


>gnl|CDD|113290 pfam04514, BTV_NS2, Bluetongue virus non-structural protein NS2.
           This family includes NS2 proteins from other members of
           the Orbivirus genus. NS2 is a non-specific
           single-stranded RNA-binding protein that forms large
           homomultimers and accumulates in viral inclusion bodies
           of infected cells. Three RNA binding regions have been
           identified in Bluetongue virus serotype 17 at residues
           2-11, 153-166 and 274-286. NS2 multimers also possess
           nucleotidyl phosphatase activity. The precise function
           of NS2 is not known, but it may be involved in the
           transport and condensation of viral mRNAs.
          Length = 363

 Score = 31.8 bits (72), Expect = 1.2
 Identities = 16/74 (21%), Positives = 28/74 (37%), Gaps = 7/74 (9%)

Query: 21  PKESRRDKDRDRRR-RSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKK 79
           P+ S++D+  +RR    R      E       ERR++       R      E  +  S  
Sbjct: 205 PETSKQDQKEERRAAVERRLAELVEMINWNLEERRRDL------RKEQELEENVERDSDD 258

Query: 80  EEKDKREKEEEEAA 93
           E++   + E+ E  
Sbjct: 259 EDEHGEDSEDGETK 272



 Score = 31.4 bits (71), Expect = 1.5
 Identities = 18/74 (24%), Positives = 28/74 (37%), Gaps = 1/74 (1%)

Query: 11  SRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREA 70
             +P  S +  KE RR     R          +  +R RDL + +E     +R S   + 
Sbjct: 202 DMTPETSKQDQKEERRAAVERRLAELVEMINWNLEERRRDLRKEQELEENVERDSDDED- 260

Query: 71  ERSKDHSKKEEKDK 84
           E  +D    E K +
Sbjct: 261 EHGEDSEDGETKPE 274



 Score = 29.5 bits (66), Expect = 5.9
 Identities = 18/131 (13%), Positives = 47/131 (35%), Gaps = 7/131 (5%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRR-------RRSRSHERRSERDRDRDLERRK 55
           R + K  +   P+   +R        + +             + ++  + +R   +ERR 
Sbjct: 165 REKEKEEQPMKPAFKPERWMGGPDSDEDENPLDEEAPDMTPETSKQDQKEERRAAVERRL 224

Query: 56  EKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR 115
            +       +        +   + EE  +R+ ++E+   + S+  +    + +  E  +R
Sbjct: 225 AELVEMINWNLEERRRDLRKEQELEENVERDSDDEDEHGEDSEDGETKPESYITSEYIER 284

Query: 116 RDRIERWRAER 126
              I + + ER
Sbjct: 285 ISEIRKMKDER 295


>gnl|CDD|217198 pfam02718, Herpes_UL31, Herpesvirus UL31-like protein.  This is a
          family of Herpesvirus proteins including UL31, UL53,
          and the product of ORF 69 in some strains. The proteins
          in this family have no known function.
          Length = 262

 Score = 31.1 bits (71), Expect = 1.4
 Identities = 13/44 (29%), Positives = 15/44 (34%), Gaps = 10/44 (22%)

Query: 8  RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDL 51
          RSRS        RP           R  SR   RR+ R   R+ 
Sbjct: 1  RSRSSVRRLPRSRPS----------RSSSRKKARRALRLTLREF 34


>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
           subunit [Translation, ribosomal structure and
           biogenesis].
          Length = 591

 Score = 31.6 bits (71), Expect = 1.4
 Identities = 24/93 (25%), Positives = 42/93 (45%), Gaps = 3/93 (3%)

Query: 46  DRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEA 105
           D D +L+ +KE    ++    S  +E  KD +K + K ++  EEEE       +    + 
Sbjct: 495 DDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEK-KLKMIMMSNKQ 553

Query: 106 TRLELEMQKRRDRIERWRA--ERKKKDIETIKK 136
            +L  +M+    + E      ++KKK I   KK
Sbjct: 554 KKLYKKMKYSNAKKEEQAENLKKKKKQIAKQKK 586


>gnl|CDD|198139 smart01071, CDC37_N, Cdc37 N terminal kinase binding.  Cdc37 is a
           molecular chaperone required for the activity of
           numerous eukaryotic protein kinases. This domain
           corresponds to the N terminal domain which binds
           predominantly to protein kinases and is found N terminal
           to the Hsp (Heat shock protein) 90-binding domain.
           Expression of a construct consisting of only the
           N-terminal domain of Schizosaccharomyces pombe Cdc37 results
           in cellular viability. This indicates that interactions
           with the cochaperone Hsp90 may not be essential for
           Cdc37 function.
          Length = 154

 Score = 30.5 bits (69), Expect = 1.4
 Identities = 26/111 (23%), Positives = 48/111 (43%), Gaps = 13/111 (11%)

Query: 32  RRRRSRSHERRSER-----------DRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
           R ++   H+ R ER             +  L +R +K     R         + +    E
Sbjct: 31  RWKQRDIHQARVERMEEIKNLKYELIMNDHLNKRIDKLLKGLREEELSPETPTYNEMLAE 90

Query: 81  EKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR--RDRIERWRAERKKK 129
            +D+ +KE EEA  D   L +E++  R +L+ +++  R +++    E KKK
Sbjct: 91  LQDQLKKELEEANGDSEGLLEELKKHRDKLKKEQKELRKKLDELEKEEKKK 141


>gnl|CDD|224120 COG1199, DinG, Rad3-related DNA helicases [Transcription / DNA
           replication, recombination, and repair].
          Length = 654

 Score = 31.7 bits (72), Expect = 1.4
 Identities = 23/78 (29%), Positives = 38/78 (48%), Gaps = 11/78 (14%)

Query: 346 YEKPTPIQAQAI----PAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPM 401
             +P P Q +       A+  G  L+  A TG+GKT+A++LP L +  +       +G  
Sbjct: 13  GFEPRPEQREMAEAVAEALKGGEGLLIEAPTGTGKTLAYLLPALAYARE-------EGKK 65

Query: 402 AIIMSPTRELCMQIGKEA 419
            II + T+ L  Q+ +E 
Sbjct: 66  VIISTRTKALQEQLLEED 83


>gnl|CDD|239286 cd02988, Phd_like_VIAF, Phosducin (Phd)-like family, Viral
           inhibitor of apoptosis (IAP)-associated factor (VIAF)
           subfamily; VIAF is a Phd-like protein that functions in
           caspase activation during apoptosis. It was identified
           as an IAP binding protein through a screen of a human
           B-cell library using a prototype IAP. VIAF lacks a
           consensus IAP binding motif and while it does not
           function as an IAP antagonist, it still plays a
           regulatory role in the complete activation of caspases.
           VIAF itself is a substrate for IAP-mediated
           ubiquitination, suggesting that it may be a target of
           IAPs in the prevention of cell death. The similarity of
           VIAF to Phd points to a potential role distinct from
           apoptosis regulation. Phd functions as a cytosolic
           regulator of G protein by specifically binding to G
           protein betagamma (Gbg)-subunits. The C-terminal domain
           of Phd adopts a thioredoxin fold, but it does not
           contain a CXXC motif. Phd interacts with G protein beta
           mostly through the N-terminal helical domain.
          Length = 192

 Score = 30.7 bits (70), Expect = 1.5
 Identities = 17/75 (22%), Positives = 28/75 (37%), Gaps = 11/75 (14%)

Query: 55  KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
           K  S   +       A +    +  E+K   E +EE         D+E +   LE   + 
Sbjct: 16  KPPSPKEEEEEALELAIQEAHENALEKKLLDELDEEL--------DEEEDDRFLE---EY 64

Query: 115 RRDRIERWRAERKKK 129
           RR R+   +A  +K 
Sbjct: 65  RRKRLAEMKALAEKS 79


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 31.6 bits (72), Expect = 1.5
 Identities = 21/90 (23%), Positives = 34/90 (37%)

Query: 52  ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
           ER     R  +      E   SK     EE  + E++ EE   +   L+ E+E    ELE
Sbjct: 309 ERLANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEELEAELE 368

Query: 112 MQKRRDRIERWRAERKKKDIETIKKDIKSN 141
             + R      + E  +  +  ++  I S 
Sbjct: 369 ELESRLEELEEQLETLRSKVAQLELQIASL 398



 Score = 30.4 bits (69), Expect = 3.1
 Identities = 18/119 (15%), Positives = 46/119 (38%), Gaps = 1/119 (0%)

Query: 23  ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSKDHSKKEE 81
           E    +  +   R    E + E  R +  +   + +  +    R     ER +D  ++ +
Sbjct: 361 EELEAELEELESRLEELEEQLETLRSKVAQLELQIASLNNEIERLEARLERLEDRRERLQ 420

Query: 82  KDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
           ++  E  ++    +  +L  E+E    ELE  +          E  ++++E  ++ + +
Sbjct: 421 QEIEELLKKLEEAELKELQAELEELEEELEELQEELERLEEALEELREELEEAEQALDA 479



 Score = 29.6 bits (67), Expect = 7.0
 Identities = 21/109 (19%), Positives = 39/109 (35%), Gaps = 9/109 (8%)

Query: 23  ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
           E   ++    R+      R+    R        E  +  +R ++  +     +   +E +
Sbjct: 708 EELEEELEQLRKELEELSRQISALRKDLARLEAEVEQLEERIAQLSKELTELEAEIEELE 767

Query: 83  DKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
           ++ E+ EEE          E EA   ELE  +     E  +A R+  D 
Sbjct: 768 ERLEEAEEEL--------AEAEAEIEELE-AQIEQLKEELKALREALDE 807


>gnl|CDD|216108 pfam00769, ERM, Ezrin/radixin/moesin family.  This family of
           proteins contain a band 4.1 domain (pfam00373), at their
           amino terminus. This family represents the rest of these
           proteins.
          Length = 244

 Score = 30.9 bits (70), Expect = 1.5
 Identities = 30/180 (16%), Positives = 62/180 (34%), Gaps = 17/180 (9%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
           +R ++   D  R ++      E   E +     E  + +    K      E  R ++ + 
Sbjct: 12  ERMEQMEEDMRRAQKELEEYEETALELEEKLKQEEEEAQLLEKKADELEEENRRLEEEAA 71

Query: 79  KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE--RKKKDIETIKK 136
             E+++   E E             E  +LE E +K+     + + E    ++  E  ++
Sbjct: 72  ASEEERERLEAEVDEA-------TAEVAKLEEEREKKEAETRQLQQELREAQEAHERARQ 124

Query: 137 DIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRK 196
           ++    ++                    E    D+NG+ A  D++  D  M+   EE R 
Sbjct: 125 ELLEAAAAPTAPPHVA-------APVNGEQLEPDDNGEEASADLET-DPDMKDRSEEERV 176


>gnl|CDD|180485 PRK06245, cofG, FO synthase subunit 1; Reviewed.
          Length = 336

 Score = 31.4 bits (72), Expect = 1.5
 Identities = 16/43 (37%), Positives = 19/43 (44%), Gaps = 5/43 (11%)

Query: 500 IIDNVRPDRQTVMFSATFP-----RQMEALARRILNKPIEIQV 537
           II N  P     M +   P      ++ ALAR IL   I IQV
Sbjct: 205 IIQNFSPKPGIPMENHPEPSLEEMLRVVALARLILPPDISIQV 247


>gnl|CDD|215597 PLN03137, PLN03137, ATP-dependent DNA helicase; Q4-like;
           Provisional.
          Length = 1195

 Score = 31.4 bits (71), Expect = 1.6
 Identities = 15/35 (42%), Positives = 21/35 (60%)

Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLL 385
           P Q + I A MSG D+  +  TG GK++ + LP L
Sbjct: 463 PNQREIINATMSGYDVFVLMPTGGGKSLTYQLPAL 497


>gnl|CDD|215434 PLN02813, PLN02813, pfkB-type carbohydrate kinase family protein.
          Length = 426

 Score = 31.3 bits (71), Expect = 1.8
 Identities = 16/67 (23%), Positives = 27/67 (40%), Gaps = 2/67 (2%)

Query: 9  SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSER--DRDRDLERRKEKSRGSKRRSR 66
          S + S SPS   PK +RR +    +R +    R   R  +    +++ +E+  G      
Sbjct: 4  SSTASTSPSLYVPKPNRRLRRVTSQRGAPGLFRIHSRANNAALAIQQDEEQPEGFGPIPE 63

Query: 67 SREAERS 73
              ER 
Sbjct: 64 KAVPERW 70


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 31.6 bits (71), Expect = 1.8
 Identities = 16/55 (29%), Positives = 28/55 (50%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERS 73
           KR +   + K    ++     ER  E++++R+ ER +E  R +K  S S E+  S
Sbjct: 580 KREEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSHESRMS 634


>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
          Length = 177

 Score = 30.3 bits (69), Expect = 1.8
 Identities = 18/68 (26%), Positives = 33/68 (48%), Gaps = 4/68 (5%)

Query: 19 KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSKDHS 77
          K+ K S +      + + ++   R E D +    +RK+K +G K  SR +    +SK   
Sbjct: 1  KKKKSSPKRSKGMAKSKKKT---REELDAEARERKRKKKHKGLKSGSRHNEGNTQSKGKG 57

Query: 78 KKEEKDKR 85
          + ++KD R
Sbjct: 58 QAQKKDPR 65


>gnl|CDD|218561 pfam05340, DUF740, Protein of unknown function (DUF740).  This
           family consists of several uncharacterized plant
           proteins of unknown function.
          Length = 565

 Score = 31.2 bits (70), Expect = 2.0
 Identities = 30/139 (21%), Positives = 54/139 (38%), Gaps = 7/139 (5%)

Query: 33  RRRS---RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEE 89
           +RRS   RS         D D E         +     R+        ++EE+ + E++E
Sbjct: 93  QRRSCDVRSRSTLWSLFHDDDEENLPSSIAPPEIDPEPRKPIVPDLVLEEEEEVEMEEDE 152

Query: 90  EEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGS 149
           E    +P K+  E      E E++  +D I+   ++ KK  ++   K   S  S     S
Sbjct: 153 EYYEKEPGKVVDEKSEEEEEEELKTMKDFID-LESQTKKPSVKDNGKSFWSAASV---FS 208

Query: 150 APMKKWNLEDDSDEDENDN 168
             ++KW  +    +  + N
Sbjct: 209 KKLQKWRQKQKLKKPRSGN 227


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 31.2 bits (71), Expect = 2.1
 Identities = 17/102 (16%), Positives = 40/102 (39%), Gaps = 8/102 (7%)

Query: 40  ERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKL 99
                 D  R++E+R  +           E    +   K+E  ++ +K+ +E      +L
Sbjct: 301 FYEEYLDELREIEKRLSRLE---EEINGIEERIKELEEKEERLEELKKKLKELEKRLEEL 357

Query: 100 DKEVEATRLELEMQKRR-DRIERWRAERKKKDIETIKKDIKS 140
           ++  E      E  K + + +ER +        E ++K+++ 
Sbjct: 358 EERHE----LYEEAKAKKEELERLKKRLTGLTPEKLEKELEE 395


>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR).  This
           family consists of several bovine specific leukaemia
           virus receptors which are thought to function as
           transmembrane proteins, although their exact function is
           unknown.
          Length = 561

 Score = 30.8 bits (69), Expect = 2.2
 Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 5/184 (2%)

Query: 37  RSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP 96
           R H +R E+D+    ++++EK +  +RR  S   E  +D +  +  D   +E  E A   
Sbjct: 88  RRHRQRLEKDKRE--KKKREKEKRGRRRHHSLGTESDEDIAPAQMVDIVTEEMPENALPS 145

Query: 97  SKLDKEVEATRLELEMQKRRDRIERWRAE-RKKKDIETIKKDIKSNLSSGLGGSAPMKKW 155
            + DK+       L++   +   +  +   +K ++ ET K   K ++ +    S   KK 
Sbjct: 146 DEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKK 205

Query: 156 NLEDDSDEDENDNK--DENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
             ++   E + D K   E  K+    +D   A    V E         V  TA     D 
Sbjct: 206 EKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDSEPDE 265

Query: 214 GSKP 217
               
Sbjct: 266 PKDA 269



 Score = 28.9 bits (64), Expect = 8.9
 Identities = 20/117 (17%), Positives = 37/117 (31%), Gaps = 33/117 (28%)

Query: 12  RSPSPSHKRPKESRRDKDRDRRRRSRSHE------------------------------- 40
           +S  P  K  KE  +++D+D+++     +                               
Sbjct: 198 KSKKPKKKEKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGT 257

Query: 41  -RRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP 96
              SE D  +D E  + K     ++ + R+ +  K   KK     R    +  A  P
Sbjct: 258 APDSEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHH-HRCHHSDGGAEQP 313


>gnl|CDD|223003 PHA03169, PHA03169, hypothetical protein; Provisional.
          Length = 413

 Score = 30.7 bits (69), Expect = 2.2
 Identities = 22/92 (23%), Positives = 34/92 (36%), Gaps = 7/92 (7%)

Query: 1  MVRSRRKRSRSRSPSPSHKRPKESRRDKDR-DRRRRSRSHERRSERDRDRDLERRKEKSR 59
          M R RRK  RSR    S  R    R    R    RR  +  R ++               
Sbjct: 1  MSRQRRKAKRSRHTLRSSCRGHCKRHGGTREQAGRRRGTAARAAKPAPPAPTT------S 54

Query: 60 GSKRRSRSREAERSKDHSKKEEKDKREKEEEE 91
          G + R+ + +  R  +   +  ++ R  E+EE
Sbjct: 55 GPQVRAVAEQGHRQTESDTETAEESRHGEKEE 86


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 30.7 bits (69), Expect = 2.2
 Identities = 26/124 (20%), Positives = 59/124 (47%), Gaps = 7/124 (5%)

Query: 10  RSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSRE 69
           R +S   S K+ ++ R+ K+  +       ++ +E++R + LE  KE+ +  +++ ++ E
Sbjct: 66  RIQSQQSSAKKGEQQRKKKEE-QVAEELKPKQAAEQERLKQLE--KERLKAQEQQKQAEE 122

Query: 70  AERSKDHSKKEEKDKREKEEEE----AAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
           AE+     +K+++++  K   E    A    +K   E    +   E +K+ +   +   E
Sbjct: 123 AEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEE 182

Query: 126 RKKK 129
            K K
Sbjct: 183 AKAK 186



 Score = 29.1 bits (65), Expect = 6.8
 Identities = 21/115 (18%), Positives = 50/115 (43%), Gaps = 3/115 (2%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---D 75
           +R K   + K  +   +    E++ + ++ R     ++K   + +   + EA + K   +
Sbjct: 109 ERLKAQEQQKQAEEAEKQAQLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAE 168

Query: 76  HSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
             KK E+  +  EE +A  + +   K+ EA       + + +   + +AE+K + 
Sbjct: 169 AKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEA 223


>gnl|CDD|197664 smart00338, BRLZ, basic region leucine zipper.
          Length = 65

 Score = 27.9 bits (63), Expect = 2.3
 Identities = 23/63 (36%), Positives = 35/63 (55%), Gaps = 9/63 (14%)

Query: 52  ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELE 111
            RR+E++R + RRSR R         KK E ++ E++ E+   +  +L KE+E  R ELE
Sbjct: 7   RRRRERNREAARRSRER---------KKAEIEELERKVEQLEAENERLKKEIERLRRELE 57

Query: 112 MQK 114
             K
Sbjct: 58  KLK 60


>gnl|CDD|131281 TIGR02226, two_anch, N-terminal double-transmembrane domain.  This
           model represents a prokaryotic N-terminal region of
           about 80 amino acids. The predicted membrane topology by
           TMHMM puts the N-terminus outside and spans the membrane
           twice, with a cytosolic region of about 25 amino acids
           between the two transmembrane regions. Member proteins
           tend to be between 600 and 1000 amino acids in length
           [Hypothetical proteins, Domain].
          Length = 82

 Score = 28.5 bits (64), Expect = 2.4
 Identities = 13/34 (38%), Positives = 17/34 (50%), Gaps = 2/34 (5%)

Query: 379 AFVLPLLRHILDQPPLEETD-GPMAIIM-SPTRE 410
           A VLPLL H+L + P    D   +  +   P RE
Sbjct: 14  AAVLPLLIHLLRRRPPRPVDFPALRFLREVPKRE 47


>gnl|CDD|224036 COG1111, MPH1, ERCC4-like helicases [DNA replication,
           recombination, and repair].
          Length = 542

 Score = 30.8 bits (70), Expect = 2.4
 Identities = 27/117 (23%), Positives = 50/117 (42%), Gaps = 13/117 (11%)

Query: 372 TGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV 431
           TG GKT    + +   +          G   + ++PT+ L +Q  +  +K T      + 
Sbjct: 38  TGLGKTFIAAMVIANRLRW-------FGGKVLFLAPTKPLVLQHAEFCRKVTGIPEDEIA 90

Query: 432 CVYGGTGISEQISELKRGAEIIVCTPGRMI-DMLAANSGRVTNLRRVTYIVLDEADR 487
            + G     E+     +  ++ V TP  +  D+ A   GR+ +L  V+ ++ DEA R
Sbjct: 91  ALTGEVRPEEREELWAKK-KVFVATPQVVENDLKA---GRI-DLDDVSLLIFDEAHR 142


>gnl|CDD|225620 COG3078, COG3078, Uncharacterized protein conserved in bacteria
          [Function unknown].
          Length = 169

 Score = 29.8 bits (67), Expect = 2.5
 Identities = 23/71 (32%), Positives = 37/71 (52%), Gaps = 9/71 (12%)

Query: 19 KRPKESRRDK---DRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAERSK 74
          KR K++RR +   D+  RR++R       RDR     +R++K +G    SR S   E S 
Sbjct: 2  KRSKKTRRPRSKADKKARRKTREELDAEARDR-----KRQKKRKGLASGSRHSGGNENSG 56

Query: 75 DHSKKEEKDKR 85
          +  + ++KD R
Sbjct: 57 NKQQNQKKDPR 67


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 30.1 bits (68), Expect = 2.6
 Identities = 12/74 (16%), Positives = 34/74 (45%), Gaps = 1/74 (1%)

Query: 15  SPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEK-SRGSKRRSRSREAERS 73
                 P E   + +   +  +   E+ +E + +   E++K+K  +  K+  + ++ +  
Sbjct: 119 GAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMV 178

Query: 74  KDHSKKEEKDKREK 87
           +    K++K K++K
Sbjct: 179 EPKGSKKKKKKKKK 192


>gnl|CDD|165442 PHA03171, PHA03171, UL37 tegument protein; Provisional.
          Length = 499

 Score = 30.8 bits (69), Expect = 2.7
 Identities = 29/102 (28%), Positives = 46/102 (45%), Gaps = 11/102 (10%)

Query: 7   KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSH----ERRSERDRDRDLERRKEKSRGSK 62
           +R R+  P P   R +++   + + R R  R H       SE +R RDL        G +
Sbjct: 26  QRKRAEDPLPPWLRKEKACALRQQRRHRLQRQHGVIDGENSETERPRDLTAALFAEAGEE 85

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDP--SKLDKE 102
             +   + +R    ++ EE+D   +EEE  A DP  + LD E
Sbjct: 86  --AEEEDNDRECPDTEAEEED---EEEEIEAPDPEVNPLDAE 122


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 30.9 bits (70), Expect = 2.8
 Identities = 17/108 (15%), Positives = 50/108 (46%), Gaps = 1/108 (0%)

Query: 38  SHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPS 97
            HE+      + +LE  +E+    K  +  RE     +   +E +++  +  E       
Sbjct: 470 EHEKELLELYELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKE 529

Query: 98  KLDKEVEATRLELE-MQKRRDRIERWRAERKKKDIETIKKDIKSNLSS 144
           +L++++E     LE +++ +++++  + + + + +E   +++K  L  
Sbjct: 530 ELEEKLEKLENLLEELEELKEKLQLQQLKEELRQLEDRLQELKELLEE 577



 Score = 30.5 bits (69), Expect = 3.2
 Identities = 26/147 (17%), Positives = 51/147 (34%), Gaps = 10/147 (6%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R    ++R            + R ++ R+  R     E + ER  + + E  + +     
Sbjct: 250 RLEELKARLLEIESLELEALKIREEELRELERLLEELEEKIERLEELEREIEELEEELEG 309

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEE-------EEAAFDPSKLDKEVEATRLELEMQKR 115
            R+   E E   +  K  E+   + EE       E       K +          E+++R
Sbjct: 310 LRALLEELEELLEKLKSLEERLEKLEEKLEKLESELEELAEEKNELAKLLEERLKELEER 369

Query: 116 RDRIE---RWRAERKKKDIETIKKDIK 139
            + +E       ER K+  E I++  +
Sbjct: 370 LEELEKELEKALERLKQLEEAIQELKE 396



 Score = 29.7 bits (67), Expect = 5.7
 Identities = 14/113 (12%), Positives = 41/113 (36%), Gaps = 4/113 (3%)

Query: 18  HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
            +  +E    ++     R+   E     ++ + LE R EK          +     ++ +
Sbjct: 294 EELEREIEELEEELEGLRALLEELEELLEKLKSLEERLEK----LEEKLEKLESELEELA 349

Query: 78  KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
           +++ +  +  EE     +    + E E  +    +++  + I+  + E  +  
Sbjct: 350 EEKNELAKLLEERLKELEERLEELEKELEKALERLKQLEEAIQELKEELAELS 402


>gnl|CDD|241525 cd13374, PH_RASAL3, RAS protein activator like-3 Pleckstrin
          homology (PH) domain.  RASAL3 is thought to be a Ras
          GTPase-activating protein. It is involved in positive
          regulation of Ras GTPase activity and of small GTPase
          mediated signal transduction as well as negative
          regulation of Ras protein signal transduction. It
          contains a PH domain, a C2 domain, and a Ras-GAP
          domain. PH domains have diverse functions, but in
          general are involved in targeting proteins to the
          appropriate cellular location or in the interaction
          with a binding partner. They share little sequence
          conservation, but all have a common fold, which is
          electrostatically polarized. Less than 10% of PH
          domains bind phosphoinositide phosphates (PIPs) with
          high affinity and specificity. PH domains are
          distinguished from other PIP-binding domains by their
          specific high-affinity binding to PIPs with two vicinal
          phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or
          PtdIns(3,4,5)P3 which results in targeting some PH
          domain proteins to the plasma membrane. A few display
          strong specificity in lipid binding. Any specificity is
          usually determined by loop regions or insertions in the
          N-terminus of the domain, which are not conserved
          across all PH domains. PH domains are found in cellular
          signaling proteins such as serine/threonine kinase,
          tyrosine kinases, regulators of G-proteins, endocytotic
          GTPases, adaptors, as well as cytoskeletal associated
          molecules and in lipid associated enzymes.
          Length = 180

 Score = 29.9 bits (67), Expect = 2.9
 Identities = 20/75 (26%), Positives = 29/75 (38%), Gaps = 6/75 (8%)

Query: 5  RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLE-RRKEKSR---- 59
          RR R  S S   S    K S  D DR   +       +++    R L  R KE+ +    
Sbjct: 13 RRPRPGSASSGGSIISAKGSGGDPDRKPGKTEPEAAGQNQVHNVRGLLKRLKEEKKARVS 72

Query: 60 -GSKRRSRSREAERS 73
             K  S +R ++ S
Sbjct: 73 GEGKPSSSARGSQES 87


>gnl|CDD|224121 COG1200, RecG, RecG-like helicase [DNA replication, recombination,
           and repair / Transcription].
          Length = 677

 Score = 30.6 bits (70), Expect = 3.1
 Identities = 21/59 (35%), Positives = 30/59 (50%), Gaps = 8/59 (13%)

Query: 373 GSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGLRVV 431
           GSGKTV  +L +L  I          G  A +M+PT  L  Q  +  +K+ + LG+RV 
Sbjct: 293 GSGKTVVALLAMLAAI--------EAGYQAALMAPTEILAEQHYESLRKWLEPLGIRVA 343


>gnl|CDD|236544 PRK09506, mrcB, bifunctional glycosyl transferase/transpeptidase;
          Reviewed.
          Length = 830

 Score = 30.5 bits (69), Expect = 3.2
 Identities = 14/57 (24%), Positives = 20/57 (35%)

Query: 12 RSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
          R P     +P    + K   RR R        +   D +   RK K +G K R +  
Sbjct: 6  REPIGRKGKPSRPVKQKVSRRRYRDDDDYDDYDDYEDEEPMPRKGKGKGRKPRGKRG 62


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 30.7 bits (69), Expect = 3.2
 Identities = 25/159 (15%), Positives = 67/159 (42%), Gaps = 2/159 (1%)

Query: 22  KESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK--K 79
               ++K           +     +   DL +   +    +  S  +E E+ ++      
Sbjct: 213 YYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVL 272

Query: 80  EEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIK 139
           +E  + EKE++    +   L KE E  + EL   +RR   +  + +  +K+++ ++K++K
Sbjct: 273 KENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLEKELK 332

Query: 140 SNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEE 178
                       +K+  ++ +++E+E +  ++  +  E+
Sbjct: 333 KEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKLEQ 371



 Score = 29.6 bits (66), Expect = 6.1
 Identities = 36/160 (22%), Positives = 65/160 (40%), Gaps = 7/160 (4%)

Query: 40  ERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKL 99
           ERR   D ++  E  KE  +  K   + +E     +   KE + KRE EEEE      + 
Sbjct: 307 ERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQ--LEK 364

Query: 100 DKEVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLED 159
            +E      E  + K++   ER  +  K K+ E   K+ +   +  L     ++    E+
Sbjct: 365 LQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLL-----LELSEQEE 419

Query: 160 DSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNK 199
           D  ++E   + +  +  EE ++     +    EE+ K   
Sbjct: 420 DLLKEEKKEELKIVEELEESLETKQGKLTEEKEELEKQAL 459


>gnl|CDD|237178 PRK12705, PRK12705, hypothetical protein; Provisional.
          Length = 508

 Score = 30.4 bits (69), Expect = 3.3
 Identities = 28/144 (19%), Positives = 54/144 (37%), Gaps = 9/144 (6%)

Query: 5   RRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR- 63
           +R+R    +     +  KE+    +           R   + R      R+E  R  +R 
Sbjct: 27  KRQRLAKEAERILQEAQKEAEEKLEAALLEAKELLLRERNQQRQEARREREELQREEERL 86

Query: 64  --RSRSREAERSKDHSKKEEKDKREK----EEEEAAFDPSKLDKE-VEATRLELEMQKRR 116
             +    +A   K  + + + ++REK     E E      +LD E      L  E Q R+
Sbjct: 87  VQKEEQLDARAEKLDNLENQLEEREKALSARELELEELEKQLDNELYRVAGLTPE-QARK 145

Query: 117 DRIERWRAERKKKDIETIKKDIKS 140
             ++   AE +++  + +KK  + 
Sbjct: 146 LLLKLLDAELEEEKAQRVKKIEEE 169


>gnl|CDD|227352 COG5019, CDC3, Septin family protein [Cell division and chromosome
           partitioning / Cytoskeleton].
          Length = 373

 Score = 30.0 bits (68), Expect = 3.4
 Identities = 18/86 (20%), Positives = 36/86 (41%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R+ +      S  PS K   E+R +++    ++  + + R +  R  +LE+   + R   
Sbjct: 288 RTEKLSGLKNSGEPSLKEIHEARLNEEERELKKKFTEKIREKEKRLEELEQNLIEERKEL 347

Query: 63  RRSRSREAERSKDHSKKEEKDKREKE 88
                   ++ +D  K+ EK K  K 
Sbjct: 348 NSKLEEIQKKLEDLEKRLEKLKSNKS 373


>gnl|CDD|227594 COG5269, ZUO1, Ribosome-associated chaperone zuotin [Translation,
           ribosomal structure and biogenesis / Posttranslational
           modification, protein turnover, chaperones].
          Length = 379

 Score = 30.0 bits (67), Expect = 3.4
 Identities = 22/99 (22%), Positives = 41/99 (41%), Gaps = 2/99 (2%)

Query: 43  SERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEE-KDKREKEEEEAAFDPSKLDK 101
            ERDR R  E +  + R   +   +   +R    +KK + + K  KE+E+      K ++
Sbjct: 191 EERDRKRYSEAKNREKRAKLKNQDNARLKRLVQIAKKRDPRIKSFKEQEKEMKKIRKWER 250

Query: 102 EVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKS 140
           E  A    L   K +   +  +AE + + + +     K 
Sbjct: 251 EAGARLKALAALKGKAEAKN-KAEIEAEALASATAVKKK 288


>gnl|CDD|144738 pfam01254, TP2, Nuclear transition protein 2. 
          Length = 132

 Score = 29.0 bits (64), Expect = 3.6
 Identities = 17/69 (24%), Positives = 27/69 (39%), Gaps = 9/69 (13%)

Query: 7   KRSRSRSPSPS---HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKR 63
            +S+S SPSP    HK    S     R         + R      ++LE +  K +  KR
Sbjct: 62  HQSQSPSPSPPPKHHKTTMHSHYSPSRPTTHSCSCPKNR------KNLEGKVSKRKAVKR 115

Query: 64  RSRSREAER 72
             +  + +R
Sbjct: 116 SKQVYKTKR 124


>gnl|CDD|235334 PRK05035, PRK05035, electron transport complex protein RnfC;
           Provisional.
          Length = 695

 Score = 30.3 bits (69), Expect = 3.6
 Identities = 14/51 (27%), Positives = 22/51 (43%)

Query: 85  REKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDIETIK 135
           R  E+E+   + +K   E    RLE E   R  R ++    R  KD + + 
Sbjct: 439 RAIEQEKKKAEEAKARFEARQARLEREKAAREARHKKAAEARAAKDKDAVA 489


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 30.1 bits (68), Expect = 3.6
 Identities = 50/226 (22%), Positives = 86/226 (38%), Gaps = 43/226 (19%)

Query: 67  SREAERSKDHSKKEEKDKREKEEEEAAFDP----------SKLDKEVEATRLEL-EMQKR 115
           +R  E  K   ++EE+ ++E+EEE A  D           SK + EV     EL E+Q R
Sbjct: 105 TRNYEADKLDEEQEERVEKEREEELAG-DAMKKLENRTADSKREMEVLERLEELKELQSR 163

Query: 116 RD-------------RIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSD 162
           R              R ++   E +++D   IK      LS G   +   ++   ++DS+
Sbjct: 164 RADVDVNSMLEALFRREKKEEEEEEEEDEALIKS-----LSFGP-ETEEDRRRADDEDSE 217

Query: 163 EDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVI 222
           +DE DN +     +          +     +     +   P+++  K    G       +
Sbjct: 218 DDEEDNDNTPSPKSGSSSPAKPTSI----LKKSAAKRSEAPSSSKAKKNSRGIPKPRDAL 273

Query: 223 VTGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQ 268
            + VV+K   KA  E   ++      SS E   +   TA N +   
Sbjct: 274 SSLVVRK---KAAPESTSQSP-----SSAEPTSESPQTAGNSSLSS 311


>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
          Length = 1463

 Score = 30.5 bits (68), Expect = 3.6
 Identities = 25/81 (30%), Positives = 33/81 (40%), Gaps = 12/81 (14%)

Query: 53  RRKEKSRGSKR--RSRSREAER----------SKDHSKKEEKDKREKEEEEAAFDPSKLD 100
           RR+E  R S+R  RSRSR   R          S   + K+   KR K E      PS L 
Sbjct: 180 RRQEVVRKSERVARSRSRRPWRDLWSNRRPVPSPQRTSKDRLPKRGKREFSKKMGPSHLT 239

Query: 101 KEVEATRLELEMQKRRDRIER 121
               ++     +  RR R+ R
Sbjct: 240 SSSSSSSSSFSLSGRRGRLAR 260


>gnl|CDD|238432 cd00852, NifB, NifB belongs to a family of iron-molybdenum
           cluster-binding proteins that includes NifX, and NifY,
           all of which are involved in the synthesis of an
           iron-molybdenum cofactor (FeMo-co) that binds the active
           site of the dinitrogenase enzyme as part of nitrogen
           fixation in bacteria. This domain is sometimes found
           fused to a N-terminal domain (the Radical SAM domain) in
           nifB-like proteins.
          Length = 106

 Score = 28.4 bits (64), Expect = 3.7
 Identities = 14/36 (38%), Positives = 20/36 (55%)

Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISEL 446
           LC +IG E K+  +  G+ V+  Y G  I E + EL
Sbjct: 70  LCAKIGDEPKEKLEEAGIEVIEAYAGEYIEEALLEL 105


>gnl|CDD|217970 pfam04220, YihI, Der GTPase activator (YihI).  YihI activates the
          GTPase activity of Der, a 50S ribosomal subunit
          stability factor. The stimulation is specific to Der as
          YihI does not stimulate the GTPase activity of Era or
          ObgE. The interaction of YihI with Der requires only
          the C-terminal 78 amino acids of YihI. A yihI deletion
          mutant is viable and shows a shorter lag period, but
          the same post-lag growth rate as a wild-type strain.
          yihI is expressed during the lag period. Overexpression
          of yihI inhibits cell growth and biogenesis of the 50S
          ribosomal subunit. YihI is an unusual, highly
          hydrophilic protein with an uneven distribution of
          charged residues, resulting in an N-terminal region
          with high pI and a C-terminal region with low pI.
          Length = 169

 Score = 29.2 bits (66), Expect = 3.7
 Identities = 19/73 (26%), Positives = 31/73 (42%), Gaps = 11/73 (15%)

Query: 14 PSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSR-SREAER 72
                 PK + + K + R           E D++    +RK+K +G K  SR + E+E 
Sbjct: 1  RKSGKNGPKLAPKGKKKTR----------YELDQEARERKRKKKRKGLKSGSRHNEESES 50

Query: 73 SKDHSKKEEKDKR 85
           K     ++KD R
Sbjct: 51 QKQKGAAQKKDPR 63


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 30.3 bits (69), Expect = 3.9
 Identities = 25/83 (30%), Positives = 45/83 (54%), Gaps = 3/83 (3%)

Query: 39  HERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSK 98
           H+ R+E +R+   ERR E  R  +R  +  E    K  S  ++++  EK+E+E +     
Sbjct: 61  HKLRAELERELK-ERRNELQRLERRLLQREETLDRKMESLDKKEENLEKKEKELSNKEKN 119

Query: 99  LDKEVEATRLELEMQKRRDRIER 121
           LD++ E   LE  + ++R+ +ER
Sbjct: 120 LDEKEE--ELEELIAEQREELER 140


>gnl|CDD|221432 pfam12128, DUF3584, Protein of unknown function (DUF3584).  This
           protein is found in bacteria and eukaryotes. Proteins in
           this family are typically between 943 and 1234 amino
           acids in length. This family contains a P-loop motif
           suggesting it is a nucleotide binding protein. It may be
           involved in replication.
          Length = 1198

 Score = 30.4 bits (69), Expect = 3.9
 Identities = 20/74 (27%), Positives = 38/74 (51%), Gaps = 4/74 (5%)

Query: 56  EKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKR 115
           E++  S    + +  E+    + + E+ KR + E   A   ++LD +    RL+ E Q  
Sbjct: 613 EEALQSAVAKQKQAEEQLVQANAELEEQKRAEAEARTALKQARLDLQ----RLQNEQQSL 668

Query: 116 RDRIERWRAERKKK 129
           +D++E   AERK++
Sbjct: 669 KDKLELAIAERKQQ 682


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 29.9 bits (68), Expect = 4.0
 Identities = 13/53 (24%), Positives = 30/53 (56%)

Query: 52  ERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVE 104
             + EK R  +++ + ++A   K   ++EE++K +KEEE+   +    +++ E
Sbjct: 416 VEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEE 468



 Score = 29.5 bits (67), Expect = 5.9
 Identities = 14/56 (25%), Positives = 30/56 (53%)

Query: 51  LERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEAT 106
            +  K++    K + +   A + K+  ++EEK+K+E+E+EE   +  +  +E E  
Sbjct: 417 EKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEAEEEKEEEEEK 472


>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
           (TAF4) is one of several TAFs that bind TBP and is
           involved in forming Transcription Factor IID (TFIID)
           complex.  The TATA Binding Protein (TBP) Associated
           Factor 4 (TAF4) is one of several TAFs that bind TBP and
           are involved in forming the Transcription Factor IID
           (TFIID) complex. TFIID is one of seven General
           Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE,
           TFIIF, and TFIIH) that are involved in accurate
           initiation of transcription by RNA polymerase II in
           eukaryotes. TFIID plays an important role in the
           recognition of promoter DNA and assembly of the
           pre-initiation complex. TFIID complex is composed of the
           TBP and at least 13 TAFs. TAFs from various species were
           originally named by their predicted molecular weight or
           their electrophoretic mobility in polyacrylamide gels. A
           new, unified nomenclature for the pol II TAFs has been
           suggested to show the relationship between TAF orthologs
           and paralogs. Several hypotheses are proposed for TAFs
           functions such as serving as activator-binding sites,
           core-promoter recognition or a role in essential
           catalytic activity. Each TAF, with the help of a
           specific activator, is required only for the expression
           of a subset of genes and is not universally involved in
           transcription as are GTFs. In yeast and human cells,
           TAFs have been found as components of other complexes
           besides TFIID.   Several TAFs interact via histone-fold
           (HFD) motifs; HFD is the interaction motif involved in
           heterodimerization of the core histones and their
           assembly into nucleosome octamers. The minimal HFD
           contains three alpha-helices linked by two loops and is
           found in core histones, TAFs and many other
           transcription factors. TFIID has a histone octamer-like
           substructure. TAF4 domain interacts with TAF12 and makes
           a novel histone-like heterodimer that binds DNA and has
           a core promoter function of a subset of genes.
          Length = 212

 Score = 29.6 bits (67), Expect = 4.0
 Identities = 19/84 (22%), Positives = 31/84 (36%), Gaps = 9/84 (10%)

Query: 3   RSRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSK 62
           R   ++   R    S  R K+ R  +  +R       E     + +R+   R  KSR  +
Sbjct: 96  RVDSEKEDERYEITSDVR-KQLRFLEQLER------EEEEKRDEEERERLLRAAKSRSEQ 148

Query: 63  RRSRSREAERSKDHSKKEEKDKRE 86
            R + +  E  K+    EE   R 
Sbjct: 149 SRLKQKAKEMQKEED--EEMRHRA 170


>gnl|CDD|217927 pfam04147, Nop14, Nop14-like family.  Emg1 and Nop14 are novel
           proteins whose interaction is required for the
           maturation of the 18S rRNA and for 40S ribosome
           production.
          Length = 809

 Score = 30.4 bits (69), Expect = 4.0
 Identities = 35/203 (17%), Positives = 77/203 (37%), Gaps = 29/203 (14%)

Query: 39  HERRSERDRDRDLERRKEKSRGSKRRS-RSREAERSKDHSKKEEKDKREKEEEEAAFDP- 96
            ER+  ++ D DL    +          R+ +       + +E+ D+ ++   E  FD  
Sbjct: 191 AERQKAKEEDEDLREELDDDFKDLMSLLRTVKPPPKPPMTPEEKDDEYDQRVRELTFDRR 250

Query: 97  ------SKLDKEV---EATRLE-LEMQKRRDRIERWR--------AERKKKDIETIKKDI 138
                 +K ++E+   EA RL+ LE     +R+ R R         E  K+  + +  + 
Sbjct: 251 AQPTDRTKTEEELAKEEAERLKKLE----AERLRRMRGEEEDDEEEEDSKESADDLDDEF 306

Query: 139 KSNLSSGLGGSAPMKKWN-----LEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
           + +     G     +        ++D+ +ED++D+ +E  +  +   +  D   +   +E
Sbjct: 307 EPDDDDNFGLGQGEEDEEEEEDGVDDEDEEDDDDDLEEEEEDVDLSDEEEDEEDEDSDDE 366

Query: 194 MRKVNKPAVPTTADVKPADSGSK 216
             +  +         K A+S   
Sbjct: 367 DDEEEEEEEKEKKKKKSAESTRS 389


>gnl|CDD|163426 TIGR03714, secA2, accessory Sec system translocase SecA2.  Members
           of this protein family are homologous to SecA and part
           of the accessory Sec system. This system, including both
           five core proteins for export and a variable number of
           proteins for glycosylation, operates in certain
           Gram-positive pathogens for the maturation and delivery
           of serine-rich glycoproteins such as the cell surface
           glycoprotein GspB in Streptococcus gordonii [Protein
           fate, Protein and peptide secretion and trafficking].
          Length = 762

 Score = 30.0 bits (68), Expect = 4.2
 Identities = 39/149 (26%), Positives = 61/149 (40%), Gaps = 27/149 (18%)

Query: 351 PIQAQAIPAIMSGRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRE 410
           P   Q + AI+  +  I   KTG GKT+   +PL         L    G  A++++    
Sbjct: 71  PYDVQVLGAIVLHQGNIAEMKTGEGKTLTATMPLY--------LNALTGKGAMLVTTNDY 122

Query: 411 LCMQIGKEAKKFTKSLGLRVVCVYGGTGISEQISE-----LKR---GAEIIVCTPGR--- 459
           L  +  +E     + LGL V       G+ +   E      KR    ++I+  T      
Sbjct: 123 LAKRDAEEMGPVYEWLGLTV-----SLGVVDDPDEEYDANEKRKIYNSDIVYTTNSALGF 177

Query: 460 --MIDMLAANSGRVTNLRRVTYIVLDEAD 486
             +ID LA+N      LR   Y+++DE D
Sbjct: 178 DYLIDNLASNK-EGKFLRPFNYVIVDEVD 205


>gnl|CDD|215770 pfam00176, SNF2_N, SNF2 family N-terminal domain.  This domain is
           found in proteins involved in a variety of processes
           including transcription regulation (e.g., SNF2, STH1,
           brahma, MOT1), DNA repair (e.g. ERCC6, RAD16, RAD5), DNA
           recombination (e.g. RAD54), and chromatin unwinding
           (e.g. ISWI) as well as a variety of other proteins with
           little functional information (e.g. lodestar, ETL1).
          Length = 301

 Score = 29.6 bits (67), Expect = 4.2
 Identities = 26/126 (20%), Positives = 51/126 (40%), Gaps = 23/126 (18%)

Query: 372 TGSGKT---VAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGL 428
            G GKT   +A +   L+   D+       GP  ++   +         E +K+     L
Sbjct: 25  MGLGKTLQTIALLATYLKEGKDRR------GPTLVVCPLS--TLHNWLNEFEKWA--PAL 74

Query: 429 RVVCVYGGTGISEQISELK----RGAEIIVCTPGRMIDMLAANSGRVTNLRRVT--YIVL 482
           RVV  +G      ++ +         ++++ T     ++L  +   ++ L +V    +VL
Sbjct: 75  RVVVYHGDGRERSKLRQSMAKRLDTYDVVITT----YEVLRKDKKLLSLLNKVEWDRVVL 130

Query: 483 DEADRM 488
           DEA R+
Sbjct: 131 DEAHRL 136


>gnl|CDD|180626 PRK06565, PRK06565, amidase; Validated.
          Length = 566

 Score = 30.1 bits (68), Expect = 4.2
 Identities = 25/76 (32%), Positives = 36/76 (47%), Gaps = 14/76 (18%)

Query: 156 NLEDD--SDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKP---------AV-- 202
           N E D  +  DE  N  + G  + + I  L   ++G+ E+ RK++           AV  
Sbjct: 414 NREGDLAAGMDEYVNMAKRGLKSWDQIPTLPDGLRGL-EKTRKLDLEDWMDGLGLDAVLF 472

Query: 203 PTTADVKPADSGSKPA 218
           PT ADV PAD+   PA
Sbjct: 473 PTVADVGPADADVNPA 488


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 29.2 bits (66), Expect = 4.2
 Identities = 16/54 (29%), Positives = 33/54 (61%), Gaps = 6/54 (11%)

Query: 78  KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKDI 131
           K++E+++ E+ E E   +  ++D+ +E    +L+ +KRR+       ERK+K+I
Sbjct: 99  KEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRE------NERKQKEI 146


>gnl|CDD|233069 TIGR00643, recG, ATP-dependent DNA helicase RecG.  [DNA metabolism,
           DNA replication, recombination, and repair].
          Length = 630

 Score = 30.0 bits (68), Expect = 4.2
 Identities = 25/98 (25%), Positives = 35/98 (35%), Gaps = 18/98 (18%)

Query: 348 KPTPIQAQAIPAIMS--------GRDLIGIAKTGSGKTVAFVLPLLRHILDQPPLEETDG 399
           K T  Q + +  I+          R L G    GSGKT+   L +L  I          G
Sbjct: 235 KLTRAQKRVVKEILQDLKSDVPMNRLLQG--DVGSGKTLVAALAMLAAI--------EAG 284

Query: 400 PMAIIMSPTRELCMQIGKEAKKFTKSLGLRVVCVYGGT 437
               +M+PT  L  Q     +     LG+ V  + G  
Sbjct: 285 YQVALMAPTEILAEQHYNSLRNLLAPLGIEVALLTGSL 322


>gnl|CDD|223649 COG0576, GrpE, Molecular chaperone GrpE (heat shock protein)
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 193

 Score = 29.2 bits (66), Expect = 4.3
 Identities = 18/80 (22%), Positives = 39/80 (48%), Gaps = 6/80 (7%)

Query: 69  EAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE--- 125
           E   +++  + E+ ++ E EEEE   +    +++ E   LE ++++ +D+  R +AE   
Sbjct: 9   EEPDAEETEEAEKSEEEEAEEEEPEEENELEEEQQEIAELEAQLEELKDKYLRAQAEFEN 68

Query: 126 ---RKKKDIETIKKDIKSNL 142
              R +++ E  KK      
Sbjct: 69  LRKRTEREREEAKKYAIEKF 88


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 30.0 bits (68), Expect = 4.4
 Identities = 32/179 (17%), Positives = 61/179 (34%), Gaps = 19/179 (10%)

Query: 18  HKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHS 77
            +R +  +++++       R      E   + + E   +K+ G  RR    E    +  S
Sbjct: 379 MQRAEARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVG--RRKFGPENGEKEAES 436

Query: 78  KKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIER---------------W 122
           KK +K+ + + +E+   D  +  ++ E  ++E    K   R E+               W
Sbjct: 437 KKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPW 496

Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDED--ENDNKDENGKTAEED 179
                       K+D K   SS L  +A            +   E     ++    EED
Sbjct: 497 LKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKSIDLDDDLIDEED 555


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 29.5 bits (67), Expect = 4.5
 Identities = 17/73 (23%), Positives = 33/73 (45%), Gaps = 8/73 (10%)

Query: 55  KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
             K    K R    E        +++E+ + +KEE++      K ++E +  +L  E Q+
Sbjct: 258 LRKVD--KTREEEEEKILKAAEEERQEEAQEKKEEKK------KEEREAKLAKLSPEEQR 309

Query: 115 RRDRIERWRAERK 127
           + +  ER +  RK
Sbjct: 310 KLEEKERKKQARK 322


>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168).  This
           family consists of several hypothetical eukaryotic
           proteins of unknown function.
          Length = 142

 Score = 28.9 bits (65), Expect = 4.6
 Identities = 29/116 (25%), Positives = 48/116 (41%), Gaps = 15/116 (12%)

Query: 63  RRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERW 122
           R  R RE ER +     +EK K+E E           D+E +  R E +  K  ++  + 
Sbjct: 42  RALRRREYERLE---LMDEKWKKETE-----------DEEFQQKREEKKR-KDEEKTAKK 86

Query: 123 RAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEE 178
           RA+R+KK  +  KK      +               D+ +E E D ++E  +  E+
Sbjct: 87  RAKRQKKKQKKKKKKKAKKGNKKEEKEGSKSSEESSDEEEEGEEDKQEEPVEIMEK 142


>gnl|CDD|234229 TIGR03490, Mycoplas_LppA, mycoides cluster lipoprotein, LppA/P72
           family.  Members of this protein family occur in
           Mycoplasma mycoides, Mycoplasma hyopneumoniae, and
           related Mycoplasmas in small paralogous families that
           may also include truncated forms and/or pseudogenes.
           Members are predicted lipoproteins with a conserved
           signal peptidase II processing and lipid attachment
           site. Note that the name for certain characterized
           members, p72, reflects an anomalous apparent molecular
           weight, given a theoretical MW of about 61 kDa.
          Length = 541

 Score = 29.8 bits (67), Expect = 4.6
 Identities = 30/172 (17%), Positives = 63/172 (36%), Gaps = 30/172 (17%)

Query: 16  PSHKRPKESRR----DKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAE 71
           P+   PK  ++    +   +   +S +  +  E     + E++ + S+  +      E E
Sbjct: 42  PNENTPKIPKKPDNKEPSENNNNKSNNENKDEENPSSTNPEKKPDPSKNKE------EIE 95

Query: 72  RSKDHSKKEEKDKREKEEEEAAFDPS-----------KLDKEVEATRLELEMQKRRDRIE 120
           + KD  KK +K  +  +      D             KL KE+      L    ++D   
Sbjct: 96  KPKDEPKKPDKKPQADQPNNVHADQPNNNKVDFSDLDKLKKELSFENFTL--YSQKDPKT 153

Query: 121 RWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDEN 172
                  K D+ T  K I             + K+NL+ +S+++  ++ ++ 
Sbjct: 154 AL--SSLKGDLSTFFKTIFYK-----TNKDILDKYNLKLESNKEPKEDFEKG 198


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be a transcription factor.
          Length = 238

 Score = 29.3 bits (66), Expect = 4.7
 Identities = 30/131 (22%), Positives = 55/131 (41%), Gaps = 19/131 (14%)

Query: 6   RKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRS 65
             ++ +  P    +R   +    D  RR+ SRS   +++      L+ R+ + +  + ++
Sbjct: 112 SPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSSTVQNKEATHERLKEREIRRKKIQAKA 171

Query: 66  RSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAE 125
           R R         K+++K+K   +EE  A          EA   E    K  +R E    E
Sbjct: 172 RKR---------KEKKKEKELTQEERLA----------EAKETERINLKSLERYEEQEEE 212

Query: 126 RKKKDIETIKK 136
           +KK  I+ +KK
Sbjct: 213 KKKAKIQALKK 223


>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
           This family consists of several hypothetical proteins
           from Arabidopsis thaliana and Oryza sativa. The function
           of this family is unknown.
          Length = 564

 Score = 29.8 bits (67), Expect = 4.8
 Identities = 22/138 (15%), Positives = 36/138 (26%), Gaps = 15/138 (10%)

Query: 2   VRSRRKRSRSR----SPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKE- 56
           V   R RS S       +P+  R   S           S        R        R   
Sbjct: 161 VIGPRPRSFSELNLTDRTPAKVRSSRSELGAPSPSGGTSCPSSSGGRRSSIGSRRLRGSA 220

Query: 57  ---KSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQ 113
              K        R      S+    K     R      +A  P K   + +AT+   ++ 
Sbjct: 221 SLRKKVAVLSAPRKP---GSRSSDCKSSPRARSS----SAKSPFKSSIQRKATKALSKLS 273

Query: 114 KRRDRIERWRAERKKKDI 131
            R    +  ++ + +   
Sbjct: 274 LRASPKDTSKSSKSEVAP 291


>gnl|CDD|225995 COG3464, COG3464, Transposase and inactivated derivatives [DNA
           replication, recombination, and repair].
          Length = 402

 Score = 29.7 bits (67), Expect = 4.9
 Identities = 17/115 (14%), Positives = 31/115 (26%), Gaps = 4/115 (3%)

Query: 28  KDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
                + R R   +    D+      ++     S+          S    ++      E 
Sbjct: 241 SRALEQVRRRVRNQFRSEDKRIKALWKRRARLSSRYLCDKNFQNLSLLRYERLSPILGEL 300

Query: 88  EE-EEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER---KKKDIETIKKDI 138
                A      L +E+ A R    ++K    IE           +   T+KK  
Sbjct: 301 YSLYPALRVAYDLAQELAADRRREAVKKLIQWIEDAVKSAIKELARLAATLKKHQ 355


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 28.9 bits (65), Expect = 5.0
 Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 6/109 (5%)

Query: 19  KRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSK 78
           KR +  +  K + +R  S      +E + D D E   E  R  + R +++E  R      
Sbjct: 21  KRSELKKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKR-PEGRKKAKEKLRRDKLKA 79

Query: 79  KEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERK 127
           K+E+ ++EKE+EE      K   E E  R ELE  K++   +  + E+K
Sbjct: 80  KKEEAEKEKEKEERF---MKALAEAEKERAELE--KKKAEAKLMKEEKK 123


>gnl|CDD|221803 pfam12846, AAA_10, AAA-like domain.  This family of domains contains
           a P-loop motif that is characteristic of the AAA
           superfamily. Many of the proteins in this family are
           conjugative transfer proteins.
          Length = 316

 Score = 29.7 bits (67), Expect = 5.1
 Identities = 17/72 (23%), Positives = 30/72 (41%), Gaps = 16/72 (22%)

Query: 369 IAKTGSGKTVAFVLPLLRHILDQPPLEETDGPMAIIMSPTRELCMQIGKEAKKFTKSLGL 428
           +  +GSGK+      LL+ +  +       G   I++ P          E     ++LG 
Sbjct: 7   VGPSGSGKST-----LLKLLALRLLAR---GGRVIVIDP--------KGEYSGLARALGG 50

Query: 429 RVVCVYGGTGIS 440
            V+ +  G+GIS
Sbjct: 51  EVIDLGPGSGIS 62


>gnl|CDD|213994 cd12110, PHP_HisPPase_Hisj_like, Polymerase and Histidinol
           Phosphatase domain of Histidinol phosphate phosphatase
           of Hisj like.  Bacillus subtilis YtvP HisJ has strong
           histidinol phosphate phosphatase (HisPPase) activity.
           The PHP (also called histidinol phosphatase-2/HIS2)
           domain is associated with several types of DNA
           polymerases, such as PolIIIA and family X DNA
           polymerases, stand alone histidinol phosphate
           phosphatases (HisPPases), and a number of
           uncharacterized protein families. HisPPase catalyzes the
           eighth step of histidine biosynthesis, in which
           L-histidinol phosphate undergoes dephosphorylation to
           produce histidinol. The PHP domain has four conserved
           sequence motifs and contains an invariant histidine that
           is involved in metal ion coordination. The PHP domain of
           HisPPase is structurally homologous to other members of
           the PHP family that have a distorted (beta/alpha)7
           barrel fold with a trinuclear metal site on the
           C-terminal side of the barrel.
          Length = 244

 Score = 29.5 bits (67), Expect = 5.2
 Identities = 17/48 (35%), Positives = 25/48 (52%), Gaps = 7/48 (14%)

Query: 270 KELSKVDHSTIEYLPFRKDFYVEVPEIARMTPEEVEKYKEELEGIRVK 317
            E+   +H+    LPF  D Y E    +RM  EE+E Y EE+  ++ K
Sbjct: 30  TEIGFSEHA---PLPFEFDDYPE----SRMAEEELEDYVEEIRRLKEK 70


>gnl|CDD|227935 COG5648, NHP6B, Chromatin-associated proteins containing the HMG
           domain [Chromatin structure and dynamics].
          Length = 211

 Score = 29.1 bits (65), Expect = 5.3
 Identities = 18/94 (19%), Positives = 30/94 (31%), Gaps = 1/94 (1%)

Query: 50  DLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
           D E+       +  R R  + E+ + + K   K       E       K++       L 
Sbjct: 114 DEEKEPYYKEANSDRERY-QREKEEYNKKLPNKAPIGPFIENEPKIRPKVEGPSPDKALV 172

Query: 110 LEMQKRRDRIERWRAERKKKDIETIKKDIKSNLS 143
            E +            +KKK I+  KK  +   S
Sbjct: 173 EETKIISKAWSELDESKKKKYIDKYKKLKEEYDS 206


>gnl|CDD|215507 PLN02939, PLN02939, transferase, transferring glycosyl groups.
          Length = 977

 Score = 29.9 bits (67), Expect = 5.6
 Identities = 21/102 (20%), Positives = 42/102 (41%), Gaps = 5/102 (4%)

Query: 8   RSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRS 67
           RSR+    PS +R   S R      RRR  S +++ +R ++   ++R   S+       +
Sbjct: 18  RSRAPFYLPSRRRLAVSCR-----ARRRGFSSQQKKKRGKNIAPKQRSSNSKLQSNTDEN 72

Query: 68  REAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLE 109
            + E +   +  E   K    +++      + D+ + A   E
Sbjct: 73  GQLENTSLRTVMELPQKSTSSDDDHNRASMQRDEAIAAIDNE 114


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 29.6 bits (67), Expect = 5.7
 Identities = 20/159 (12%), Positives = 54/159 (33%), Gaps = 10/159 (6%)

Query: 55  KEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQK 114
           K+    ++     ++ + +   +  +   K++ ++E  +   ++    ++       ++ 
Sbjct: 64  KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123

Query: 115 RRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGK 174
                +    +    D +    DI                 + ++D D+D+ D++DE  K
Sbjct: 124 IDVLNQADDDDDDDDDDDLDDDDID----------DDDDDEDDDEDDDDDDVDDEDEEKK 173

Query: 175 TAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
            A+E     D       E+  +  + A         AD 
Sbjct: 174 EAKELEKLSDDDDFVWDEDDSEALRQARKDAKLTATADP 212



 Score = 28.8 bits (65), Expect = 9.3
 Identities = 27/179 (15%), Positives = 60/179 (33%), Gaps = 9/179 (5%)

Query: 43  SERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKE 102
           +E +  + L++   KS+     ++    E  +   K  E+  +           +    E
Sbjct: 12  AEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATE 71

Query: 103 VEATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSN--------LSSGLGGSAPMKK 154
            +  + + +   +    +   A++K KD     K  +          L+         + 
Sbjct: 72  SDIPKKKTKTAAKAAAAKA-PAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQA 130

Query: 155 WNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADS 213
            + +DD D+D+ D+ D +    +ED D  D       E+  K     +   +D      
Sbjct: 131 DDDDDDDDDDDLDDDDIDDDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVW 189


>gnl|CDD|221041 pfam11241, DUF3043, Protein of unknown function (DUF3043).  Some
          members in this family of proteins with unknown
          function are annotated as membrane proteins. This
          cannot be confirmed.
          Length = 168

 Score = 28.8 bits (65), Expect = 5.8
 Identities = 17/56 (30%), Positives = 27/56 (48%), Gaps = 8/56 (14%)

Query: 20 RPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKD 75
          RP   RR+ +  RRR     +R++ +   R    R+E      RR+R+R A  + D
Sbjct: 4  RPTPKRREAEAARRRPLVPEDRKAAKKAAR--AARRE------RRARARAAMMAGD 51


>gnl|CDD|153359 cd07675, F-BAR_FNBP1L, The F-BAR (FES-CIP4 Homology and
           Bin/Amphiphysin/Rvs) domain of Formin Binding Protein
           1-Like.  F-BAR domains are dimerization modules that
           bind and bend membranes and are found in proteins
           involved in membrane dynamics and actin reorganization.
           FormiN Binding Protein 1-Like (FNBP1L), also known as
           Toca-1 (Transducer of Cdc42-dependent actin assembly),
           forms a complex with neural Wiskott-Aldrich syndrome
           protein (N-WASP). The FNBP1L/N-WASP complex induces the
           formation of filopodia and endocytic vesicles. FNBP1L is
           required for Cdc42-induced actin assembly and is
           essential for autophagy of intracellular pathogens. It
           contains an N-terminal F-BAR domain, a central
           Cdc42-binding HR1 domain, and a C-terminal SH3 domain.
           F-BAR domains form banana-shaped dimers with a
           positively-charged concave surface that binds to
           negatively-charged lipid membranes. They can induce
           membrane deformation in the form of long tubules.
          Length = 252

 Score = 29.2 bits (65), Expect = 5.8
 Identities = 20/80 (25%), Positives = 42/80 (52%), Gaps = 3/80 (3%)

Query: 52  ERRKEKSRGSKRRSRSREAERSKDHSKKE-EKDKREKEEEEAAFDPSKLDKEVEATRLEL 110
           ER+     G K +       +  D+SKK+ E++ RE E+ + +++  +LD +  AT+ ++
Sbjct: 107 ERKMHLQEGRKAQQYLDMCWKQMDNSKKKFERECREAEKAQQSYE--RLDNDTNATKSDV 164

Query: 111 EMQKRRDRIERWRAERKKKD 130
           E  K++  +    A+  K +
Sbjct: 165 EKAKQQLNLRTHMADESKNE 184


>gnl|CDD|177439 PHA02620, PHA02620, VP3; Provisional.
          Length = 353

 Score = 29.6 bits (66), Expect = 5.8
 Identities = 13/35 (37%), Positives = 19/35 (54%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRS 38
           +++KR  SR  S   K P+ S +   + R R SRS
Sbjct: 319 NKKKRRMSRGSSQKAKGPRASSKTSYKRRSRSSRS 353


>gnl|CDD|173534 PTZ00341, PTZ00341, Ring-infected erythrocyte surface antigen;
            Provisional.
          Length = 1136

 Score = 29.8 bits (66), Expect = 6.1
 Identities = 22/111 (19%), Positives = 56/111 (50%)

Query: 71   ERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKKKD 130
            E  +++ ++ +++  E+ EE       +  +E+E    E   +   + IE +  E  ++ 
Sbjct: 1013 ENIEENVEEYDEENVEEVEENVEEYDEENVEEIEENAEENVEENIEENIEEYDEENVEEI 1072

Query: 131  IETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDID 181
             E I+++I+ N+   +  +    + N+E++ +E+  +N +EN +   E+ D
Sbjct: 1073 EENIEENIEENVEENVEENVEEIEENVEENVEENAEENAEENAEENAEEYD 1123


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positive-charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled coil domains.
          Length = 493

 Score = 29.6 bits (67), Expect = 6.1
 Identities = 23/97 (23%), Positives = 40/97 (41%), Gaps = 8/97 (8%)

Query: 48  DRDLERR-KEKSRGSKRRSRSREAERSKDHSKKEEKDKRE-----KEEEEAAFDPSKLDK 101
           + +LER  KEK      +       R +      EK  R      KEE    ++  KL +
Sbjct: 187 EEELERALKEKREELLSKLEEELLARLESKEAALEKQLRLEFEREKEELRKKYE-EKLRQ 245

Query: 102 EVEATRLELEMQKRRDRIERWRAERKKKDIETIKKDI 138
           E+E      E QK ++ +     E +++  + IK+ +
Sbjct: 246 ELERQAEAHE-QKLKNELALQAIELQREFNKEIKEKV 281


>gnl|CDD|218292 pfam04851, ResIII, Type III restriction enzyme, res subunit. 
          Length = 100

 Score = 27.6 bits (62), Expect = 6.2
 Identities = 10/31 (32%), Positives = 18/31 (58%)

Query: 348 KPTPIQAQAIPAIMSGRDLIGIAKTGSGKTV 378
           +  P Q +AI  ++  +  + +  TGSGKT+
Sbjct: 3   ELRPYQEEAIERLLEKKRGLIVMATGSGKTL 33


>gnl|CDD|205235 pfam13054, DUF3915, Protein of unknown function (DUF3915).  This
          family of proteins is functionally uncharacterized.
          This family of proteins is found in bacteria. Proteins
          in this family are approximately 120 amino acids in
          length.
          Length = 116

 Score = 27.9 bits (62), Expect = 6.5
 Identities = 10/25 (40%), Positives = 14/25 (56%)

Query: 35 RSRSHERRSERDRDRDLERRKEKSR 59
          R   H  R + +R+R+ ER KE  R
Sbjct: 12 RDCHHHEREDFEREREREREKEPQR 36


>gnl|CDD|151146 pfam10628, CotE, Outer spore coat protein E (CotE).  CotE is a
           morphogenic protein that is required for the assembly of
           the outer coat of the endospore and spore resistance to
           lysozyme. CotE also regulates the expression of cotA,
           cotB, cotC and other genes encoding spore outer coat
           proteins. The timing of cotE expression has been shown
           in Bacillus subtilis to affect spore coat morphology but
           not lysozyme resistance.
          Length = 182

 Score = 28.6 bits (64), Expect = 6.8
 Identities = 16/40 (40%), Positives = 19/40 (47%), Gaps = 7/40 (17%)

Query: 154 KWNLEDDSDEDENDNKDENGKTAEEDIDPLDAFMQGVHEE 193
               EDD +EDE    DE      ED+DP   F+ G  EE
Sbjct: 150 DGCEEDDDEEDEEITDDE-----FEDLDP--DFLVGEEEE 182


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 29.5 bits (66), Expect = 7.1
 Identities = 16/56 (28%), Positives = 30/56 (53%), Gaps = 8/56 (14%)

Query: 32  RRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREK 87
           RR   +  +  +ER+R  + +RR+E+ + +       EA+R    +K E + +REK
Sbjct: 255 RRELEKLAKEEAERERQAEEQRRREEEKAA------MEADR--AQAKAEVEKRREK 302


>gnl|CDD|237133 PRK12553, PRK12553, ATP-dependent Clp protease proteolytic subunit;
           Reviewed.
          Length = 207

 Score = 28.8 bits (65), Expect = 7.2
 Identities = 14/40 (35%), Positives = 24/40 (60%), Gaps = 3/40 (7%)

Query: 105 ATRLEL---EMQKRRDRIERWRAERKKKDIETIKKDIKSN 141
           A+ LE+   E+ + R+R+ER  AE   + +E I+KD   +
Sbjct: 143 ASDLEIQAREILRMRERLERILAEHTGQSVEKIRKDTDRD 182


>gnl|CDD|234634 PRK00103, PRK00103, rRNA large subunit methyltransferase;
           Provisional.
          Length = 157

 Score = 28.2 bits (64), Expect = 7.4
 Identities = 23/101 (22%), Positives = 34/101 (33%), Gaps = 26/101 (25%)

Query: 485 ADRM-FDMGFEPQVMRIIDNVRPDRQTVMFSATFPRQMEALARRILNKPIEIQVGGRSVV 543
             R    +  E   + I D  RP       +A   +       RIL       +   +  
Sbjct: 24  LKRFPRYLKLEL--IEIPDEKRPK------NADAEQIKAKEGERILAA-----LPKGA-- 68

Query: 544 CKEVEQHVIVLDEEQKML---KLLELLGIYQDQG-SVIVFV 580
                  VI LDE  K L   +  + L  ++D G S + FV
Sbjct: 69  ------RVIALDERGKQLSSEEFAQELERWRDDGRSDVAFV 103


>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62. 
          Length = 217

 Score = 28.6 bits (64), Expect = 7.5
 Identities = 12/72 (16%), Positives = 26/72 (36%), Gaps = 6/72 (8%)

Query: 49  RDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRL 108
           R LE  K K+   K          S+D + ++ K          A    ++ K     + 
Sbjct: 14  RALESEKYKANKDKGNPEIYNKINSQDKAIEKFK------LLIKAQMAERVKKLHSQEKK 67

Query: 109 ELEMQKRRDRIE 120
           E + + ++ ++ 
Sbjct: 68  EEKKKPKKKKVP 79


>gnl|CDD|222011 pfam13257, DUF4048, Domain of unknown function (DUF4048).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in eukaryotes, and is typically
           between 228 and 257 amino acids in length.
          Length = 242

 Score = 28.6 bits (64), Expect = 7.6
 Identities = 19/73 (26%), Positives = 24/73 (32%), Gaps = 17/73 (23%)

Query: 9   SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
           + SR+  P             R RR  SRS  R   R +   L         S R S S+
Sbjct: 117 TESRTVPPP------------RSRRSGSRSTSRSRLRLQGGSLSS-----SRSSRSSTSK 159

Query: 69  EAERSKDHSKKEE 81
            A   KD    + 
Sbjct: 160 GATSGKDSKSADI 172


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 29.4 bits (66), Expect = 7.6
 Identities = 37/226 (16%), Positives = 73/226 (32%), Gaps = 4/226 (1%)

Query: 48   DRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK----DKREKEEEEAAFDPSKLDKEV 103
            D   E  +EK + +  R  S  A++    + K+       K+  E E           E 
Sbjct: 1172 DAKAEEAREKLQRAAARGESGAAKKVSRQAPKKPAPKKTTKKASESETTEETYGSSAMET 1231

Query: 104  EATRLELEMQKRRDRIERWRAERKKKDIETIKKDIKSNLSSGLGGSAPMKKWNLEDDSDE 163
            E     ++ + R    ++  A  K+K+ E    D+K  L++    SAP +   +E+    
Sbjct: 1232 ENVAEVVKPKGRAGAKKKAPAAAKEKEEEDEILDLKDRLAAYNLDSAPAQSAKMEETVKA 1291

Query: 164  DENDNKDENGKTAEEDIDPLDAFMQGVHEEMRKVNKPAVPTTADVKPADSGSKPAGVVIV 223
                      K         D+        +       +      KPA +  K A     
Sbjct: 1292 VPARRAAARKKPLASVSVISDSDDDDDDFAVEVSLAERLKKKGGRKPAAANKKAAKPPAA 1351

Query: 224  TGVVKKSVEKAKGELMEENQDGLEYSSEEEQEDLTSTAANLASKQK 269
                  +  ++  +L+ E     E      ++ +    A+  +K+ 
Sbjct: 1352 AKKRGPATVQSGQKLLTEMLKPAEAIGISPEKKVRKMRASPFNKKS 1397


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 29.0 bits (66), Expect = 7.7
 Identities = 23/85 (27%), Positives = 45/85 (52%), Gaps = 7/85 (8%)

Query: 39  HERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSK 98
           H+ R+E +++   ERR E  +  ++R   +E    +   K E  +KRE+E E+   +  +
Sbjct: 67  HKLRNEFEKEL-RERRNELQK-LEKRLLQKEENLDR---KLELLEKREEELEKKEKELEQ 121

Query: 99  LDKEVEATRLELE--MQKRRDRIER 121
             +E+E    ELE  ++++   +ER
Sbjct: 122 KQQELEKKEEELEELIEEQLQELER 146


>gnl|CDD|212661 cd07777, FGGY_SHK_like, sedoheptulokinase-like proteins; a
           subfamily of the FGGY family of carbohydrate kinases.
           This subfamily is predominantly composed of
           uncharacterized bacterial and eukaryotic proteins with
           similarity to human sedoheptulokinase (SHK, also known
           as D-altro-heptulose or heptulokinase, EC 2.7.1.14)
           encoded by the carbohydrate kinase-like (CARKL/SHPK)
           gene. SHK catalyzes the ATP-dependent phosphorylation of
           sedoheptulose to produce sedoheptulose 7-phosphate and
           ADP. The presence of Mg2+ or Mn2+ might be required for
           catalytic activity. Members of this subfamily belong to
           the FGGY family of carbohydrate kinases, the monomers of
           which contain two large domains, which are separated by
           a deep cleft that forms the active site. This model
           includes both the N-terminal domain, which adopts a
           ribonuclease H-like fold, and the structurally related
           C-terminal domain.
          Length = 448

 Score = 29.2 bits (66), Expect = 7.7
 Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 259 STAANLASKQKKELSKVDHSTIEYLPFRKDFYVEV 293
            T+A L+     +   V  ++ EY P+ K+ Y+ V
Sbjct: 263 GTSAQLSFLPVFKPETVPPASPEYRPYFKNHYLAV 297


>gnl|CDD|221408 pfam12072, DUF3552, Domain of unknown function (DUF3552).  This
           presumed domain is functionally uncharacterized. This
           domain is found in bacteria, archaea and eukaryotes.
           This domain is about 200 amino acids in length. This
           domain is found associated with pfam00013, pfam01966.
           This domain has a single completely conserved residue A
           that may be functionally important.
          Length = 201

 Score = 28.7 bits (65), Expect = 7.9
 Identities = 28/84 (33%), Positives = 44/84 (52%), Gaps = 4/84 (4%)

Query: 38  SHERRSERDRDRDLERRKEKSRGSKRRSRSREA--ERSKDHSKKEEKDKREKEEEEAAFD 95
            H+ R+E +R+   ERR E  R  ++R   +E   +R  +  +K+E+   EKE+E AA  
Sbjct: 62  IHKLRAEAEREL-KERRNELQR-QEKRLLQKEETLDRKDESLEKKEESLEEKEKELAARQ 119

Query: 96  PSKLDKEVEATRLELEMQKRRDRI 119
               +KE E   L  E Q+  +RI
Sbjct: 120 QQLEEKEEELEELIEEQQQELERI 143


>gnl|CDD|110514 pfam01517, HDV_ag, Hepatitis delta virus delta antigen.  The
           hepatitis delta virus (HDV) encodes a single protein,
           the hepatitis delta antigen (HDAg). The central region
           of this protein has been shown to bind RNA. Several
           interactions are also mediated by a coiled-coil region
           at the N terminus of the protein.
          Length = 194

 Score = 28.7 bits (64), Expect = 7.9
 Identities = 22/84 (26%), Positives = 39/84 (46%), Gaps = 5/84 (5%)

Query: 7   KRSRSRSPSPSHKRPKESRRDKDRDRRRRSRS-----HERRSERDRDRDLERRKEKSRGS 61
           ++ +    +P  KRP+  + + D    +R  +      ERR  R R     ++K+ S G 
Sbjct: 59  RKDKDGEGAPPAKRPRTDQMEVDSGPGKRPHAGGFTDQERRDHRRRKALENKKKQLSSGG 118

Query: 62  KRRSRSREAERSKDHSKKEEKDKR 85
           K  SR  E E  +   + EE+++R
Sbjct: 119 KHLSREEEEELRRLTEEDEERERR 142


>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6.  The surfeit locus
           protein SURF-6 is shown to be a component of the
           nucleolar matrix and has a strong binding capacity for
           nucleic acids.
          Length = 206

 Score = 28.4 bits (64), Expect = 8.2
 Identities = 25/129 (19%), Positives = 56/129 (43%), Gaps = 14/129 (10%)

Query: 25  RRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK---------D 75
           +R++ + R+++ R   ++ E  +  + E  K +   SK+++   E              D
Sbjct: 14  KREQRKARKKQKRKEAKKKEDAQKSEAEEVKNEENKSKKKAAPIENAEGNIVFSKVEFAD 73

Query: 76  HSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIE-----RWRAERKKKD 130
             + ++  K +K++++   D  +L K++EA + +LE        E     +W     K +
Sbjct: 74  GEQAKKDLKLKKKKKKKKTDYKQLLKKLEARKKKLEELDEDKAAEIEEKEKWTKALAKAE 133

Query: 131 IETIKKDIK 139
              +K D K
Sbjct: 134 GVKVKDDEK 142


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 29.1 bits (65), Expect = 8.6
 Identities = 14/71 (19%), Positives = 36/71 (50%), Gaps = 12/71 (16%)

Query: 21 PKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKE 80
           +E  R K ++            E+ ++++L++ K   + +K + ++++A    +  KK 
Sbjct: 15 EEELERKKKKE------------EKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKKS 62

Query: 81 EKDKREKEEEE 91
          EK  R+++ E+
Sbjct: 63 EKKSRKRDVED 73


>gnl|CDD|218545 pfam05300, DUF737, Protein of unknown function (DUF737).  This
           family consists of several uncharacterized mammalian
           proteins of unknown function.
          Length = 187

 Score = 28.2 bits (63), Expect = 8.6
 Identities = 24/122 (19%), Positives = 52/122 (42%), Gaps = 8/122 (6%)

Query: 15  SPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSK 74
                   E +   +  +R    +       +   DL +R E+ +   +   +R A+R +
Sbjct: 48  ELRRLIAGELKGALEDAKRPSEETAGGLQSSEVKEDLLKRYEQEQAIVQEELARIAKRER 107

Query: 75  DHSKKE-------EKDKREKEEEEAAFDPSKLD-KEVEATRLELEMQKRRDRIERWRAER 126
           + ++++       EK   E+E ++A     +L+ KE E  RL+   +++  R+E   +E 
Sbjct: 108 EAAEEQLSRAVLREKASAEQERQKAKHLARQLEEKEAELKRLDAFYKEQLARLEEKNSEF 167

Query: 127 KK 128
            K
Sbjct: 168 YK 169


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 28.3 bits (63), Expect = 9.0
 Identities = 27/128 (21%), Positives = 52/128 (40%), Gaps = 21/128 (16%)

Query: 67  SREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAER 126
            +E E  KD  +++++++ E++EEE        D+E      E E +   D ++    E+
Sbjct: 37  IKENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDLKDIEK 96

Query: 127 KKK-DIETIKKDIKS-NLSSGLGGSAPMKKWNLEDDSDEDENDNKDENGKTAEEDIDPLD 184
           K   DI    +D  + NL S                         +++ KTAE+ +  L 
Sbjct: 97  KNINDIFNSTQDDNAQNLIS-------------------KNYKKNEKSKKTAEDIVKTLF 137

Query: 185 AFMQGVHE 192
             + G ++
Sbjct: 138 GLLNGNNQ 145


>gnl|CDD|226263 COG3740, COG3740, Phage head maturation protease [General function
           prediction only].
          Length = 194

 Score = 28.2 bits (63), Expect = 9.2
 Identities = 15/60 (25%), Positives = 21/60 (35%), Gaps = 6/60 (10%)

Query: 72  RSKDHSKKEEKDKREKEEEEAAFD------PSKLDKEVEATRLELEMQKRRDRIERWRAE 125
           R  D    +         E    +      P+  D  VEA RLE   + RR R E+ +  
Sbjct: 129 RDGDEWGDDGSPIVRIRLEATLLEVSVVTFPAYPDARVEAVRLEELFEVRRTRAEKRKLL 188


>gnl|CDD|226809 COG4372, COG4372, Uncharacterized protein conserved in bacteria
           with the myosin-like domain [Function unknown].
          Length = 499

 Score = 28.8 bits (64), Expect = 9.3
 Identities = 15/100 (15%), Positives = 35/100 (35%), Gaps = 6/100 (6%)

Query: 31  DRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEKDKR--EKE 88
             +               ++L  R   ++        R A   +     +++D +  +K 
Sbjct: 185 KSQVLDLKLRSAQIEQEAQNLATRANAAQARTEELARRAAAAQQTAQAIQQRDAQISQKA 244

Query: 89  EEEAAFDPSKLDKEVEATRLELEMQKRRDRIERWRAERKK 128
           ++ AA      ++E +  RLE        R+E+  A+ + 
Sbjct: 245 QQIAARAEQIRERERQLQRLETAQ----ARLEQEVAQLEA 280


>gnl|CDD|183984 PRK13340, PRK13340, alanine racemase; Reviewed.
          Length = 406

 Score = 28.8 bits (65), Expect = 9.4
 Identities = 14/64 (21%), Positives = 30/64 (46%), Gaps = 9/64 (14%)

Query: 482 LDEADRMFDMGFEPQVMRI-------IDNVRPDRQTVMF-SATFPRQMEALARRILNKPI 533
            +EA R+ ++GF  Q++R+       I+         +       + + A+A++   KPI
Sbjct: 100 NEEARRVRELGFTGQLLRVRSASPAEIEQALRYDLEELIGDDEQAKLLAAIAKKN-GKPI 158

Query: 534 EIQV 537
           +I +
Sbjct: 159 DIHL 162


>gnl|CDD|221313 pfam11917, DUF3435, Protein of unknown function (DUF3435).  This
           family of proteins are functionally uncharacterized.
           This protein is found in eukaryotes. Proteins in this
           family are typically between 435 to 791 amino acids in
           length. This family is related to pfam00589 suggesting
           it may be an integrase enzyme.
          Length = 418

 Score = 28.9 bits (65), Expect = 9.5
 Identities = 14/83 (16%), Positives = 32/83 (38%)

Query: 9   SRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSR 68
           SR+R P        E +   + D   +    +R   +     L  +  K++G+    R  
Sbjct: 265 SRTRDPRRPRDLTDEQKASVEEDPELQELIRKRDHLKKEIIALYGQVAKAKGTPLYERLE 324

Query: 69  EAERSKDHSKKEEKDKREKEEEE 91
           +  R   + ++  + + +K+  E
Sbjct: 325 KRRREVRNERQRLRRELKKKIRE 347


>gnl|CDD|191187 pfam05087, Rota_VP2, Rotavirus VP2 protein.  Rotavirus particles
           consist of three concentric proteinaceous capsid layers.
           The innermost capsid (core) is made of VP2. The genomic
           RNA and the two minor proteins VP1 and VP3 are
           encapsidated within this layer. The N-terminus of
           rotavirus VP2 is necessary for the encapsidation of VP1
           and VP3.
          Length = 887

 Score = 29.1 bits (65), Expect = 9.7
 Identities = 19/101 (18%), Positives = 40/101 (39%), Gaps = 14/101 (13%)

Query: 54  RKEKSRGSKRRSRSREAERSK--DHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLEL- 110
           R  +        R +E +  K    ++ E K+K   ++EE   D      + ++++  L 
Sbjct: 4   RNRREANINNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLK 63

Query: 111 ----------EMQKRRDRIERWRAERKKK-DIETIKKDIKS 140
                     E  K+   + + + E +K+   E ++K I S
Sbjct: 64  IADEVKKSTKEESKQLLEVLKTKEEHQKEIQYEILQKTIPS 104


>gnl|CDD|236944 PRK11642, PRK11642, exoribonuclease R; Provisional.
          Length = 813

 Score = 28.9 bits (65), Expect = 9.7
 Identities = 16/76 (21%), Positives = 34/76 (44%)

Query: 54  RKEKSRGSKRRSRSREAERSKDHSKKEEKDKREKEEEEAAFDPSKLDKEVEATRLELEMQ 113
           R  ++ G   R ++++ +  K   K+ +  K+   E ++AF   K  K   A +   + +
Sbjct: 729 RAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAAKKDARKAK 788

Query: 114 KRRDRIERWRAERKKK 129
           K   + ++  A  K K
Sbjct: 789 KPSAKTQKIAAATKAK 804


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 29.2 bits (65), Expect = 9.7
 Identities = 14/80 (17%), Positives = 30/80 (37%)

Query: 23   ESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRGSKRRSRSREAERSKDHSKKEEK 82
            E  +D+D      +   + + +   D   +  K    G +   +  E          EE 
Sbjct: 4029 EPMQDEDPLEENNTLDEDIQQDDFSDLAEDDEKMNEDGFEENVQENEESTEDGVKSDEEL 4088

Query: 83   DKREKEEEEAAFDPSKLDKE 102
            ++ E  E++A  +  K+D +
Sbjct: 4089 EQGEVPEDQAIDNHPKMDAK 4108


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 28.5 bits (64), Expect = 9.8
 Identities = 16/57 (28%), Positives = 30/57 (52%), Gaps = 1/57 (1%)

Query: 4   SRRKRSRSRSPSPSHKRPKESRRDKDRDRRRRSRSHERRSERDRDRDLERRKEKSRG 60
              K  R+ S   + KR +E    K + ++++ R  ERR +R + ++ E RK+K+  
Sbjct: 167 GSAKPERNVSQEEAKKRLQEWELKKLKQQQQK-REEERRKQRKKQQEEEERKQKAEE 222


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.132    0.369 

Gapped
Lambda     K      H
   0.267   0.0677    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 32,083,254
Number of extensions: 3302346
Number of successful extensions: 8123
Number of sequences better than 10.0: 1
Number of HSP's gapped: 6315
Number of HSP's successfully gapped: 776
Length of query: 615
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 512
Effective length of database: 6,369,140
Effective search space: 3260999680
Effective search space used: 3260999680
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 62 (27.8 bits)