RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy5825
         (596 letters)



>gnl|CDD|187547 cd05236, FAR-N_SDR_e, fatty acyl CoA reductases (FARs), extended
           (e) SDRs.  SDRs are Rossmann-fold NAD(P)H-binding
           proteins, many of which may function as fatty acyl CoA
           reductases (FAR), acting on medium and long chain fatty
           acids, and have been reported to be involved in diverse
           processes such as biosynthesis of insect pheromones,
           plant cuticular wax production, and mammalian wax
           biosynthesis. In Arabidopsis thaliana, proteins with
           this particular architecture have also been identified
           as the MALE STERILITY 2 (MS2) gene product, which is
           implicated in male gametogenesis. Mutations in MS2
           inhibit the synthesis of exine (sporopollenin),
           rendering plants unable to reduce pollen wall fatty
           acids to corresponding alcohols. This N-terminal domain
           shares the catalytic triad (but not the upstream Asn)
           and characteristic NADP-binding motif of the extended
           SDR family. Extended SDRs are distinct from classical
           SDRs. In addition to the Rossmann fold (alpha/beta
           folding pattern with a central beta-sheet) core region
           typical of all SDRs, extended SDRs have a less conserved
           C-terminal extension of approximately 100 amino acids.
           Extended SDRs are a diverse collection of proteins, and
           include isomerases, epimerases, oxidoreductases, and
           lyases; they typically have a TGXXGXXG cofactor binding
           motif. SDRs are a functionally diverse family of
           oxidoreductases that have a single domain with a
           structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 320

 Score =  278 bits (712), Expect = 4e-89
 Identities = 113/297 (38%), Positives = 167/297 (56%), Gaps = 16/297 (5%)

Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
           P++G IY+L+R K  ++ +ERL E  KD+LFDR +N        K+  I GD+S+P+LG+
Sbjct: 25  PDIGKIYLLIRGKSGQSAEERLRELLKDKLFDRGRNLNPLFE-SKIVPIEGDLSEPNLGL 83

Query: 294 SSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVST 353
           S  D Q +   +++IIH AA++ FDE + +A ++N+  T  LL+LA RC +LKA +HVST
Sbjct: 84  SDEDLQTLIEEVNIIIHCAATVTFDERLDEALSINVLGTLRLLELAKRCKKLKAFVHVST 143

Query: 354 LYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGE 413
            Y +  R+ I+E+ YPP    E L  +++  +  ELE  +  L GG + N+Y+FTKA+ E
Sbjct: 144 AYVNGDRQLIEEKVYPPPADPEKLIDILELMDDLELERATPKLLGG-HPNTYTFTKALAE 202

Query: 414 SVVEKYLYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKK 473
            +V K    LPL +VRPSIV +T KEP  GW +N  GP G       G++ T  A  +  
Sbjct: 203 RLVLKERGNLPLVIVRPSIVGATLKEPFPGWIDNFNGPDGLFLAYGKGILRTMNADPNAV 262

Query: 474 CDLIPVDVATNMMLGVVWKTALDHGHVAPPASLVAPIPRTDPPVYNLSISSSYPITW 530
            D+IPVDV  N +L     + +                  +  VY+   S   P TW
Sbjct: 263 ADIIPVDVVANALLAAAAYSGVR--------------KPRELEVYHCGSSDVNPFTW 305


>gnl|CDD|219687 pfam07993, NAD_binding_4, Male sterility protein.  This family
           represents the C-terminal region of the male sterility
           protein in a number of Arabidopsis and Drosophila. A
           sequence-related jojoba acyl CoA reductase is also
           included.
          Length = 245

 Score =  190 bits (486), Expect = 9e-57
 Identities = 80/250 (32%), Positives = 120/250 (48%), Gaps = 31/250 (12%)

Query: 239 IYILLRSKKNKTVQERLAEQF-KDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHD 297
           IY L+R+K  ++  ERL ++  K  LFDRLK         ++  ++GD+S+P+LG+S  D
Sbjct: 25  IYCLVRAKDGESALERLRQELLKYGLFDRLK------ALERIIPVAGDLSEPNLGLSDED 78

Query: 298 QQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTH 357
            Q +   + VIIH AA++ F E   D    N+  TRE+L LA +  +     HVST Y +
Sbjct: 79  FQELAEEVDVIIHNAATVNFVEPYSDLRATNVLGTREVLRLAKQMKK-LPFHHVSTAYVN 137

Query: 358 SYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVE 417
             R  + EE                    +  E   ++L G    N Y+ +K + E +V 
Sbjct: 138 GERGGLLEE-----------------KPYKLDEDEPALLGG--LPNGYTQSKWLAEQLVR 178

Query: 418 KYLYKLPLAMVRPSIVVSTWKEPIVGWSNNLY-GPGGAAAGAALGLIHTFYAKHDKKCDL 476
           +    LP+ + RPSI+     E   GW N    GP G   GA LG++       D + DL
Sbjct: 179 EAAGGLPVVIYRPSIITG---ESRTGWINGDDFGPRGLLGGAGLGVLPDILGDPDARLDL 235

Query: 477 IPVDVATNMM 486
           +PVD   N +
Sbjct: 236 VPVDYVANAI 245


>gnl|CDD|215538 PLN02996, PLN02996, fatty acyl-CoA reductase.
          Length = 491

 Score =  130 bits (329), Expect = 2e-32
 Identities = 83/289 (28%), Positives = 133/289 (46%), Gaps = 41/289 (14%)

Query: 234 PEVGGIYILLRSKKNKTVQERL-AEQFKDELFDRLKNEQAD----ILQRKVHIISGDISQ 288
           P V  +Y+LLR+   K+  +RL  E    +LF  L+ +  +    ++  KV  + GDIS 
Sbjct: 36  PNVKKLYLLLRASDAKSATQRLHDEVIGKDLFKVLREKLGENLNSLISEKVTPVPGDISY 95

Query: 289 PSLGIS-SHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKA 347
             LG+  S+ ++ +   I ++++ AA+  FDE    A  +N      +L+ A +C ++K 
Sbjct: 96  DDLGVKDSNLREEMWKEIDIVVNLAATTNFDERYDVALGINTLGALNVLNFAKKCVKVKM 155

Query: 348 ILHVSTLYTHSYRE--------------------DIQEEFYPPLFSYEDLAHVMQTTNQE 387
           +LHVST Y    +                     DI EE        E L   +   +  
Sbjct: 156 LLHVSTAYVCGEKSGLILEKPFHMGETLNGNRKLDINEE---KKLVKEKLKE-LNEQDAS 211

Query: 388 ELEILSSM---------LFGGIYNNSYSFTKAIGESVVEKYLYKLPLAMVRPSIVVSTWK 438
           E EI  +M         L G  + N+Y FTKA+GE ++  +   LPL ++RP+++ ST+K
Sbjct: 212 EEEITQAMKDLGMERAKLHG--WPNTYVFTKAMGEMLLGNFKENLPLVIIRPTMITSTYK 269

Query: 439 EPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVATNMML 487
           EP  GW   L        G   G +  F A  +   D+IP D+  N M+
Sbjct: 270 EPFPGWIEGLRTIDSVIVGYGKGKLTCFLADPNSVLDVIPADMVVNAMI 318


>gnl|CDD|215279 PLN02503, PLN02503, fatty acyl-CoA reductase 2.
          Length = 605

 Score =  109 bits (273), Expect = 5e-25
 Identities = 89/315 (28%), Positives = 143/315 (45%), Gaps = 59/315 (18%)

Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKD-ELFDRL-----KNEQADILQRKVHIISGDIS 287
           P+VG IY+L+++K  +   ERL  +  D ELF  L     K+ Q+ +L + V ++ G++ 
Sbjct: 144 PDVGKIYLLIKAKDKEAAIERLKNEVIDAELFKCLQETHGKSYQSFMLSKLVPVV-GNVC 202

Query: 288 QPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKA 347
           + +LG+       I   + VII++AA+  FDE    A  +N +    L+  A +C +LK 
Sbjct: 203 ESNLGLEPDLADEIAKEVDVIINSAANTTFDERYDVAIDINTRGPCHLMSFAKKCKKLKL 262

Query: 348 ILHVSTLYTHSYRE-------------------------------DIQEEFYPPLFSYE- 375
            L VST Y +  R+                               DI+ E    L S   
Sbjct: 263 FLQVSTAYVNGQRQGRIMEKPFRMGDCIARELGISNSLPHNRPALDIEAEIKLALDSKRH 322

Query: 376 -----DLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKYLYKLPLAMVRP 430
                  A  M+     +L +  + L+G  + ++Y FTKA+GE V+      +P+ ++RP
Sbjct: 323 GFQSNSFAQKMK-----DLGLERAKLYG--WQDTYVFTKAMGEMVINSMRGDIPVVIIRP 375

Query: 431 SIVVSTWKEPIVGW--SNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVATNMMLG 488
           S++ STWK+P  GW   N +  P     G   G +  F A  +   D++P D+  N  L 
Sbjct: 376 SVIESTWKDPFPGWMEGNRMMDPIVLYYGK--GQLTGFLADPNGVLDVVPADMVVNATLA 433

Query: 489 VVWKTALDHGHVAPP 503
            + K    HG  A P
Sbjct: 434 AMAK----HGGAAKP 444


>gnl|CDD|187573 cd05263, MupV_like_SDR_e, Pseudomonas fluorescens MupV-like,
           extended (e) SDRs.  This subgroup of extended SDR family
           domains has the characteristic active site tetrad and a
           well-conserved NAD(P)-binding motif. This subgroup is
           not well characterized; its members are annotated as
           having a variety of putative functions. One
           characterized member is Pseudomonas fluorescens MupV, a
           protein involved in the biosynthesis of mupirocin, a
           polyketide-derived antibiotic. Extended SDRs are
           distinct from classical SDRs. In addition to the
           Rossmann fold (alpha/beta folding pattern with a central
           beta-sheet) core region typical of all SDRs, extended
           SDRs have a less conserved C-terminal extension of
           approximately 100 amino acids. Extended SDRs are a
           diverse collection of proteins, and include isomerases,
           epimerases, oxidoreductases, and lyases; they typically
           have a TGXXGXXG cofactor binding motif. SDRs are a
           functionally diverse family of oxidoreductases that have
           a single domain with a structurally conserved Rossmann
           fold, an NAD(P)(H)-binding region, and a structurally
           diverse C-terminal region. Sequence identity between
           different SDR enzymes is typically in the 15-30% range;
           they catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 293

 Score = 98.2 bits (245), Expect = 1e-22
 Identities = 61/250 (24%), Positives = 99/250 (39%), Gaps = 45/250 (18%)

Query: 239 IYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQ 298
           + +L+RS+      ER+              E+A +   +V ++ GD++QP+LG+S+   
Sbjct: 25  VLVLVRSESLGEAHERI--------------EEAGLEADRVRVLEGDLTQPNLGLSAAAS 70

Query: 299 QFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTHS 358
           + +   +  +IH AAS  F    +DA+  NI  T  +L+LA R    +   +VST Y   
Sbjct: 71  RELAGKVDHVIHCAASYDFQAPNEDAWRTNIDGTEHVLELAARLDI-QRFHYVSTAYVAG 129

Query: 359 YREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEK 418
            RE    E                                  + N Y  +KA  E +V  
Sbjct: 130 NREGNIRE----------TELNPGQN----------------FKNPYEQSKAEAEQLVRA 163

Query: 419 YLYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAA-LGLIHTFYAKHDKKCDLI 477
              ++PL + RPSIVV   K    G    + G        A LG           + +L+
Sbjct: 164 AATQIPLTVYRPSIVVGDSK---TGRIEKIDGLYELLNLLAKLGRWLPMPGNKGARLNLV 220

Query: 478 PVDVATNMML 487
           PVD   + ++
Sbjct: 221 PVDYVADAIV 230


>gnl|CDD|187546 cd05235, SDR_e1, extended (e) SDRs, subgroup 1.  This family
           consists of an SDR module of multidomain proteins
           identified as putative polyketide synthases, fatty acid
           synthases (FAS), and nonribosomal peptide synthases,
           among others. However, unlike the usual ketoreductase
           modules of FAS and polyketide synthase, these domains
           are related to the extended SDRs, and have canonical
           NAD(P)-binding motifs and an active site tetrad.
           Extended SDRs are distinct from classical SDRs. In
           addition to the Rossmann fold (alpha/beta folding
           pattern with a central beta-sheet) core region typical
           of all SDRs, extended SDRs have a less conserved
           C-terminal extension of approximately 100 amino acids.
           Extended SDRs are a diverse collection of proteins, and
           include isomerases, epimerases, oxidoreductases, and
           lyases; they typically have a TGXXGXXG cofactor binding
           motif. SDRs are a functionally diverse family of
           oxidoreductases that have a single domain with a
           structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 290

 Score = 71.9 bits (177), Expect = 7e-14
 Identities = 56/208 (26%), Positives = 90/208 (43%), Gaps = 37/208 (17%)

Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
             V  IY L+R+K  +   ERL +  K E    L +E       ++ ++ GD+S+P+LG+
Sbjct: 23  KNVSKIYCLVRAKDEEAALERLIDNLK-EYGLNLWDELE---LSRIKVVVGDLSKPNLGL 78

Query: 294 SSHDQQFIQHHIHVIIHAAASL----RFDELIQDAFTLNIQATRELLDLATRCSQLKAIL 349
           S  D Q +   + VIIH  A++     ++EL       N+  T+ELL LA    +LK + 
Sbjct: 79  SDDDYQELAEEVDVIIHNGANVNWVYPYEEL----KPANVLGTKELLKLAAT-GKLKPLH 133

Query: 350 HVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIY-NNSYSFT 408
            VSTL   S                       +  N  + E    ML       N Y  +
Sbjct: 134 FVSTLSVFS----------------------AEEYNALDDEESDDMLESQNGLPNGYIQS 171

Query: 409 KAIGESVVEKYL-YKLPLAMVRPSIVVS 435
           K + E ++ +     LP+A++RP  +  
Sbjct: 172 KWVAEKLLREAANRGLPVAIIRPGNIFG 199


>gnl|CDD|220648 pfam10243, MIP-T3, Microtubule-binding protein MIP-T3.  This
           protein, which interacts with both microtubules and
           TRAF3 (tumour necrosis factor receptor-associated factor
           3), is conserved from worms to humans. The N-terminal
           region is the microtubule binding domain and is
           well-conserved; the C-terminal 100 residues, also
           well-conserved, constitute the coiled-coil region which
           binds to TRAF3. The central region of the protein is
           rich in lysine and glutamic acid and carries KKE motifs
           which may also be necessary for tubulin-binding, but
           this region is the least well-conserved.
          Length = 506

 Score = 70.3 bits (172), Expect = 1e-12
 Identities = 42/210 (20%), Positives = 87/210 (41%), Gaps = 7/210 (3%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
           ++K     A+       S      +  +K+ K +  +++EKEKE+  +EK K     KE+
Sbjct: 68  ESKLSSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKPKEE 127

Query: 80  EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS-HKHKDKDRERDKDEKKEQKESK 138
            KD+   +E + K         EKEK+KEKK ++ +   + K ++R R K   K+  + K
Sbjct: 128 PKDRKPKEEAKEKRPPK-----EKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKK 182

Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHK 198
             +K        K+  +  + +   P       ++   +   K++E  +S   ++    +
Sbjct: 183 PPNKKKEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPM-EEDESRQ 241

Query: 199 HKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
             +  +    +  K   + S      + S+
Sbjct: 242 SSEISRRSSSSLKKPDPSPSMASPETRESS 271



 Score = 61.1 bits (148), Expect = 8e-10
 Identities = 37/144 (25%), Positives = 72/144 (50%), Gaps = 1/144 (0%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
             K++ K+K   KEKEKEK+ K+ ++     +EK++++V +K + +K  K K  + +KE 
Sbjct: 132 KPKEEAKEKRPPKEKEKEKE-KKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKKKEP 190

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
            +E+K ++      K K  E D +E++E++E     +  ++S   ++ +  S  IS    
Sbjct: 191 PEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISRRSS 250

Query: 166 PAPTPTQKSPVKTKEKEKEKESST 189
            +      SP     + +E    T
Sbjct: 251 SSLKKPDPSPSMASPETRESSKRT 274



 Score = 54.1 bits (130), Expect = 1e-07
 Identities = 37/229 (16%), Positives = 77/229 (33%), Gaps = 9/229 (3%)

Query: 16  PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS 75
           P   K K++     P                 ++++K ++R + K + KK  +K      
Sbjct: 128 PKDRKPKEEAKEKRPPKEKEKEKEKKVEEPRDREEEKKRERVRAKSRPKKPPKKKPPNKK 187

Query: 76  SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK---------EKSHKHKDKDRER 126
            +  E++K     +E  + KP+E    +E++KE+ D K         E+    +  +  R
Sbjct: 188 KEPPEEEKQRQAAREAVKGKPEEPDVNEEREKEEDDGKDRETTTSPMEEDESRQSSEISR 247

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
                 ++ +   S     +  +SK   +  +    PP   P   + +P + K KE    
Sbjct: 248 RSSSSLKKPDPSPSMASPETRESSKRTETRPRTSLRPPSARPASARPAPPRVKRKEIVTV 307

Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPE 235
                       +   +    +    E        ++    AG +   E
Sbjct: 308 LQDAQGVGKIVSNVILEGKKSEDEDDENFVVEAAAQAPDIVAGGEDEAE 356



 Score = 45.3 bits (107), Expect = 7e-05
 Identities = 26/161 (16%), Positives = 63/161 (39%), Gaps = 18/161 (11%)

Query: 71  KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
           K A S    ++     ++   K    K   +++ K +  K+++++  + K++ +++ +  
Sbjct: 65  KCAESKLSSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKP 124

Query: 131 KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT 190
           K+E K+ K   +                    PP       +K   + +++E+EK+    
Sbjct: 125 KEEPKDRKPKEEAKEKR---------------PPKEKEKEKEKKVEEPRDREEEKKRERV 169

Query: 191 HDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
             K    K  KK         KE   + K++++ + +   K
Sbjct: 170 RAKSRPKKPPKKKP---PNKKKEPPEEEKQRQAAREAVKGK 207



 Score = 35.2 bits (81), Expect = 0.090
 Identities = 37/188 (19%), Positives = 73/188 (38%), Gaps = 37/188 (19%)

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
           S ++  K  +   S     K K  K+ K +S K ++K++E+ K+EKK++KE         
Sbjct: 72  SSDEAVKRVEKGGSKGPAAKTKPAKEPKNESGKEEEKEKEQVKEEKKKKKEKP------- 124

Query: 146 SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKH 205
                                     ++ P   K KE+ KE      +  K K KK ++ 
Sbjct: 125 --------------------------KEEPKDRKPKEEAKEKR-PPKEKEKEKEKKVEEP 157

Query: 206 GDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFD 265
            D+   K+++ + + K   K    PK  P           K+ +  +E +  + ++   +
Sbjct: 158 RDREEEKKRE-RVRAKSRPKKP--PKKKPPNKKKEPPEEEKQRQAAREAVKGKPEEPDVN 214

Query: 266 RLKNEQAD 273
             + ++ D
Sbjct: 215 EEREKEED 222


>gnl|CDD|223528 COG0451, WcaG, Nucleoside-diphosphate-sugar epimerases [Cell
           envelope biogenesis, outer membrane / Carbohydrate
           transport and metabolism].
          Length = 314

 Score = 64.2 bits (156), Expect = 4e-11
 Identities = 61/299 (20%), Positives = 100/299 (33%), Gaps = 72/299 (24%)

Query: 253 ERLAEQFKD-ELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQQFIQHHIHVIIHA 311
           ERL     D    DRL  +  D L   V  +  D++   L       +  +     +IH 
Sbjct: 18  ERLLAAGHDVRGLDRL-RDGLDPLLSGVEFVVLDLTDRDL-----VDELAKGVPDAVIHL 71

Query: 312 AASLRFDELI----QDAFTLNIQATRELLDLATRCSQLKAILHVST---LYTHSYREDIQ 364
           AA     +       +   +N+  T  LL+ A R + +K  +  S+   +Y       I 
Sbjct: 72  AAQSSVPDSNASDPAEFLDVNVDGTLNLLEAA-RAAGVKRFVFASSVSVVYGDPPPLPID 130

Query: 365 EEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKY--LYK 422
           E+  PP                                N Y  +K   E ++  Y  LY 
Sbjct: 131 EDLGPPRP-----------------------------LNPYGVSKLAAEQLLRAYARLYG 161

Query: 423 LPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFYAKHDKKCDLIPVDVA 482
           LP+ ++RP                N+YGPG      + G++  F  +  K   +I +   
Sbjct: 162 LPVVILRPF---------------NVYGPGD-KPDLSSGVVSAFIRQLLKGEPIIVIGGD 205

Query: 483 TNMMLGVVWKTALDHGHVAPPASLVAP-IPRTDPPVYNLSISSSYPITWLEYMNSVQAA 540
            +           D  +V   A  +   +   D  V+N+  S +  IT  E   +V  A
Sbjct: 206 GSQ--------TRDFVYVDDVADALLLALENPDGGVFNIG-SGTAEITVRELAEAVAEA 255


>gnl|CDD|233557 TIGR01746, Thioester-redct, thioester reductase domain.  This model
           includes the terminal domain from the fungal alpha
           aminoadipate reductase enzyme (also known as
           aminoadipate semialdehyde dehydrogenase) which is
           involved in the biosynthesis of lysine, as well as the
           reductase-containing component of the myxochelin
           biosynthetic gene cluster, MxcG. The mechanism of
           reduction involves activation of the substrate by
           adenylation and transfer to a covalently-linked
           pantetheine cofactor as a thioester. This thioester is
           then reduced to give an aldehyde (thus releasing the
           product) and a regenerated pantetheine thiol. (In
           myxochelin biosynthesis this aldehyde is further reduced
           to an alcohol or converted to an amine by an
           aminotransferase.) This is a fundamentally different
           reaction than beta-ketoreductase domains of polyketide
           synthases which act at a carbonyl two carbons removed
           from the thioester and form an alcohol as a product.
           This domain is invariably found at the C-terminus of the
           proteins which contain it (presumably because it results
           in the release of the product). The majority of hits to
           this model are non-ribosomal peptide synthetases in
           which this domain is similarly located proximal to a
           thiolation domain (pfam00550). In some cases this domain
           is found at the end of a polyketide synthetase enzyme,
           but is unlike ketoreductase domains which are found
           before the thiolase domains. Exceptions to this observed
           relationship with the thiolase domain include three
           proteins which consist of stand-alone reductase domains
           (GP|466833 from M. leprae, GP|435954 from Anabaena and
           OMNI|NTL02SC1199 from Strep. coelicolor) and one protein
           (OMNI|NTL01NS2636 from Nostoc) which contains N-terminal
           homology with a small group of hypothetical proteins but
           no evidence of a thiolation domain next to the putative
           reductase domain. Below the noise cutoff to this model
           are proteins containing more distantly related
           ketoreductase and dehydratase/epimerase domains. It has
           been suggested that an NADP-binding motif can be found in
           the N-terminal portion of this domain that may form a
           Rossmann-type fold.
          Length = 367

 Score = 60.9 bits (148), Expect = 6e-10
 Identities = 41/205 (20%), Positives = 81/205 (39%), Gaps = 40/205 (19%)

Query: 232 CYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSL 291
                  +  L+R+   +   ERL E  +            D+ + ++ +++GD+S+P L
Sbjct: 21  RRSTQAKVICLVRAASEEHAMERLREALRSYRL-----WHEDLARERIEVVAGDLSEPRL 75

Query: 292 GISSHDQQFIQHHIHVIIHAAASLRF----DELIQDAFTLNIQATRELLDLATRCSQLKA 347
           G+S  + + +  ++  I+H  A + +     EL       N+  TRE+L LA    + K 
Sbjct: 76  GLSDAEWERLAENVDTIVHNGALVNWVYPYSELRGA----NVLGTREVLRLAAS-GRAKP 130

Query: 348 ILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYN-NSYS 406
           + +VST+        +                    T  E+   ++            Y+
Sbjct: 131 LHYVSTI-------SVGAAIDLS-------------TVTEDDATVT----PPPGLAGGYA 166

Query: 407 FTKAIGESVVEKY-LYKLPLAMVRP 430
            +K + E +V +     LP+ +VRP
Sbjct: 167 QSKWVAELLVREASDRGLPVTIVRP 191


>gnl|CDD|240271 PTZ00108, PTZ00108, DNA topoisomerase 2-like protein; Provisional.
          Length = 1388

 Score = 61.6 bits (150), Expect = 8e-10
 Identities = 45/198 (22%), Positives = 82/198 (41%), Gaps = 2/198 (1%)

Query: 34   TSSSTSNPTNSSSSKKDKKDKDRD-KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
                 S     +S  +  K K ++ K+K+      +K     +SK  + D+    + +  
Sbjct: 1155 EQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPD 1214

Query: 93   ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
              K   S S++E  +E+K K +KS   + K ++ +  +  E  +  SS  +         
Sbjct: 1215 NKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNA 1274

Query: 153  PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPK 212
            P   S +   PPPP+  P  +S   +K     K+      + S    KKK K   KT  K
Sbjct: 1275 PKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARK 1334

Query: 213  EKDAKSKEKESHKSSAGP 230
            +K +K++ K++  S +  
Sbjct: 1335 KK-SKTRVKQASASQSSR 1351



 Score = 46.2 bits (110), Expect = 5e-05
 Identities = 28/205 (13%), Positives = 61/205 (29%), Gaps = 2/205 (0%)

Query: 55   DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
            +  +EKE  K+ + K K+   + +  K     K++++K+    + S +       K    
Sbjct: 1145 EEVEEKEIAKEQRLKSKTKGKASKLRK-PKLKKKEKKKKKSSADKSKKASVVGNSKRVDS 1203

Query: 115  KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS 174
               +  D   +  K       +     +      +S +     +  S             
Sbjct: 1204 DEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSD 1263

Query: 175  PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA-KSKEKESHKSSAGPKCY 233
             +  + K K      +  ++S     K+         K     K K K+  + S      
Sbjct: 1264 DLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRLEGSLAALKK 1323

Query: 234  PEVGGIYILLRSKKNKTVQERLAEQ 258
             +        + K    V++  A Q
Sbjct: 1324 KKKSEKKTARKKKSKTRVKQASASQ 1348



 Score = 45.8 bits (109), Expect = 6e-05
 Identities = 33/194 (17%), Positives = 63/194 (32%), Gaps = 11/194 (5%)

Query: 4    SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSS-SSKKDKKDKDRDKEKEK 62
            SV  +S    +      +   D+    S+ +         +       K+ K +     K
Sbjct: 1193 SVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSK 1252

Query: 63   EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
              +D ++  S   SKE +      +    + S P  S     +          + K   K
Sbjct: 1253 SSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKK 1312

Query: 123  DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ---LISHPPPPAPTPTQKSPVKTK 179
              E      K++K+S+  +     S    + AS SQ   L+  P        +K    + 
Sbjct: 1313 RLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPR-------KKKSDSSS 1365

Query: 180  EKEKEKESSTTHDK 193
            E + + E   + D+
Sbjct: 1366 EDDDDSEVDDSEDE 1379



 Score = 45.4 bits (108), Expect = 8e-05
 Identities = 23/137 (16%), Positives = 49/137 (35%), Gaps = 9/137 (6%)

Query: 1    MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
             +  +       +A       +             S+  +  +S + KK KK        
Sbjct: 1261 SSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRPDGESNGGSKPSSPTKKKVKKRL-EGSLA 1319

Query: 61   EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
              +KK K + K+A         K  SK + ++ S  + S   +  +K+K D   +     
Sbjct: 1320 ALKKKKKSEKKTA--------RKKKSKTRVKQASASQSSRLLRRPRKKKSDSSSEDDDDS 1371

Query: 121  DKDRERDKDEKKEQKES 137
            + D   D+D++ ++ + 
Sbjct: 1372 EVDDSEDEDDEDDEDDD 1388



 Score = 45.0 bits (107), Expect = 1e-04
 Identities = 38/179 (21%), Positives = 72/179 (40%), Gaps = 4/179 (2%)

Query: 47   SKKDKKDKDRDKEKEKEKKD---KEKDK-SAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
            ++ +KK+K+ +K K    KD   ++ DK      +++E ++    +++R +SK K  +S+
Sbjct: 1109 AELEKKEKELEKLKNTTPKDMWLEDLDKFEEALEEQEEVEEKEIAKEQRLKSKTKGKASK 1168

Query: 103  KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
              K K KK +K+K     DK ++       ++ +S    K+     N K  +SGS     
Sbjct: 1169 LRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSDEKRKLDDKPDNKKSNSSGSDQEDD 1228

Query: 163  PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                             +K    +SS  +D+ S     K+ K  +          S   
Sbjct: 1229 EEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPP 1287



 Score = 41.6 bits (98), Expect = 0.001
 Identities = 25/149 (16%), Positives = 47/149 (31%), Gaps = 12/149 (8%)

Query: 5    VKSSSSSSSAHPSPHKNKDKDSSAIPSTST--SSSTSNPTNS-SSSKKDKKDKDRDKEKE 61
             K   SS     S   N  K S      S+   S    P N+       +       ++ 
Sbjct: 1233 TKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPSKRP 1292

Query: 62   KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH---- 117
              + +     S+ + K+ +K    S    +K+ K ++ ++ K+K K +  +   S     
Sbjct: 1293 DGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRL 1352

Query: 118  -----KHKDKDRERDKDEKKEQKESKSSS 141
                 K K      D D+ +         
Sbjct: 1353 LRRPRKKKSDSSSEDDDDSEVDDSEDEDD 1381


>gnl|CDD|236304 PRK08581, PRK08581, N-acetylmuramoyl-L-alanine amidase; Validated.
          Length = 619

 Score = 60.6 bits (147), Expect = 1e-09
 Identities = 38/238 (15%), Positives = 79/238 (33%), Gaps = 21/238 (8%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
             K S++ +++H S   N D+ S    S  T  + +N   S+    DKK    D      
Sbjct: 30  PQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNN-NTSNQDNNDKKFSTIDSSTSDS 88

Query: 64  KKDKEKDKSAVSSKEKEKDK-----------------VSSKEKERKESKPKESSSEKEKK 106
               +     +      +                   + +   +  + +   +S +    
Sbjct: 89  NNIIDFIYKNLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDISDYEQPRNSEKSTND 148

Query: 107 KEKKDKKEKSHKHKDKDRERDK-DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
             K       +    +  ++DK D +K    + +     +   NS +P   +Q  +  P 
Sbjct: 149 SNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQ-SNSQPA 207

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHK-KKDKHGDKTNPKEKDAKSKEKE 222
              T  QKS  K  +   +    +  D++S+   K +KD        K + + +K  +
Sbjct: 208 SDDTANQKSSSKDNQSMSDSALDSILDQYSEDAKKTQKDYASQSKKDKTETSNTKNPQ 265



 Score = 42.1 bits (99), Expect = 8e-04
 Identities = 23/200 (11%), Positives = 68/200 (34%), Gaps = 3/200 (1%)

Query: 29  IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
           +P+ ++ ++ ++     S+ K      +    ++  KD     S  + K    +  +   
Sbjct: 17  LPTLTSPTAYADDPQKDSTAKTTSHDSKKSNDDETSKD---TSSKDTDKADNNNTSNQDN 73

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
            ++K S    S+S+     +   K            +   D+         +   ++S  
Sbjct: 74  NDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLLTKNKYDDNYSLTTLIQNLFNLNSDI 133

Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDK 208
           +  E    S+  ++        + K+   T+  +++K  +      +  K    +K  + 
Sbjct: 134 SDYEQPRNSEKSTNDSNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNS 193

Query: 209 TNPKEKDAKSKEKESHKSSA 228
             P + +  + +  S  ++ 
Sbjct: 194 PKPTQPNQSNSQPASDDTAN 213



 Score = 35.1 bits (81), Expect = 0.12
 Identities = 34/159 (21%), Positives = 63/159 (39%), Gaps = 8/159 (5%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
           S+K+ + + S+      N+   SS     STS+   N    +   +       D    ++
Sbjct: 156 SIKNDTDTQSSKQDKADNQKAPSSNNTKPSTSNKQPNSPKPTQPNQSNSQPASDDTANQK 215

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK-----EKSHK 118
              K+    + S+ +   D+ S  E  +K  K   S S+K+K +    K      +   K
Sbjct: 216 SSSKDNQSMSDSALDSILDQYS--EDAKKTQKDYASQSKKDKTETSNTKNPQLPTQDELK 273

Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
           HK K  +  +++   Q  ++S+S   +    S    SGS
Sbjct: 274 HKSKPAQSFENDVN-QSNTRSTSLFETGPSLSNNDDSGS 311


>gnl|CDD|212494 cd08946, SDR_e, extended (e) SDRs.  Extended SDRs are distinct from
           classical SDRs. In addition to the Rossmann fold
           (alpha/beta folding pattern with a central beta-sheet)
           core region typical of all SDRs, extended SDRs have a
           less conserved C-terminal extension of approximately 100
           amino acids. Extended SDRs are a diverse collection of
           proteins, and include isomerases, epimerases,
           oxidoreductases, and lyases; they typically have a
           TGXXGXXG cofactor binding motif. SDRs are a functionally
           diverse family of oxidoreductases that have a single
           domain with a structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 200

 Score = 57.3 bits (139), Expect = 2e-09
 Identities = 33/178 (18%), Positives = 59/178 (33%), Gaps = 53/178 (29%)

Query: 307 VIIHAAASLRFDELIQDA---FTLNIQATRELLDLATRCSQLKAILHVSTLYTHSYREDI 363
           V++H AA +       +    F  N+  T  LL+ A +   +K  ++ S+   +   E +
Sbjct: 33  VVVHLAALVGVPASWDNPDEDFETNVVGTLNLLEAARKAG-VKRFVYASSASVYGSPEGL 91

Query: 364 QEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVEKYL--Y 421
            EE   P                                + Y  +K   E ++  Y   Y
Sbjct: 92  PEEEETPPRP----------------------------LSPYGVSKLAAEHLLRSYGESY 123

Query: 422 KLPLAMVRPSIVVSTWKEPIVGWSNNLYGPGGAAAGAALGLIHTFY--AKHDKKCDLI 477
            LP+ ++R +               N+YGPG        G+++ F   A   K   + 
Sbjct: 124 GLPVVILRLA---------------NVYGPGQRPRLD--GVVNDFIRRALEGKPLTVF 164


>gnl|CDD|225857 COG3320, COG3320, Putative dehydrogenase domain of multifunctional
           non-ribosomal peptide synthetases and related enzymes
           [Secondary metabolites biosynthesis, transport, and
           catabolism].
          Length = 382

 Score = 56.2 bits (136), Expect = 2e-08
 Identities = 42/198 (21%), Positives = 81/198 (40%), Gaps = 27/198 (13%)

Query: 239 IYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGISSHDQ 298
           +  L+R++ ++    RL + F       L     ++   +V +++GD+++P LG+S    
Sbjct: 28  VICLVRAQSDEAALARLEKTF------DLYRHWDELSADRVEVVAGDLAEPDLGLSERTW 81

Query: 299 QFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYTHS 358
           Q +  ++ +IIH AA +       +    N+  T E+L LA    + K + +VS++    
Sbjct: 82  QELAENVDLIIHNAALVNHVFPYSELRGANVLGTAEVLRLAAT-GKPKPLHYVSSI---- 136

Query: 359 YREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVV-E 417
               + E  Y                  +  EI  +   G      Y  +K + E +V E
Sbjct: 137 ---SVGETEYY------------SNFTVDFDEISPTRNVGQGLAGGYGRSKWVAEKLVRE 181

Query: 418 KYLYKLPLAMVRPSIVVS 435
                LP+ + RP  +  
Sbjct: 182 AGDRGLPVTIFRPGYITG 199


>gnl|CDD|235250 PRK04195, PRK04195, replication factor C large subunit;
           Provisional.
          Length = 482

 Score = 55.7 bits (135), Expect = 4e-08
 Identities = 26/75 (34%), Positives = 49/75 (65%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
           T S  + K  K      EK++E++ KEK K A + K+KE+++   KEK+ +E + +E  +
Sbjct: 403 TGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEKEEEEEEA 462

Query: 102 EKEKKKEKKDKKEKS 116
           E+EK++E++ KK+++
Sbjct: 463 EEEKEEEEEKKKKQA 477



 Score = 48.8 bits (117), Expect = 5e-06
 Identities = 21/82 (25%), Positives = 52/82 (63%), Gaps = 3/82 (3%)

Query: 57  DKEKE---KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           ++E E     KK  +K K  V   EK++++   ++K++  +  K+   E+E+K++K+++K
Sbjct: 396 EEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455

Query: 114 EKSHKHKDKDRERDKDEKKEQK 135
           E+  +  ++++E ++++KK+Q 
Sbjct: 456 EEEEEEAEEEKEEEEEKKKKQA 477



 Score = 48.8 bits (117), Expect = 6e-06
 Identities = 21/76 (27%), Positives = 44/76 (57%), Gaps = 5/76 (6%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
            T       +K +K R++EK+++KK     K     +E+EK+K     KE ++ + +E +
Sbjct: 408 ATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEK-----KEEEKEEEEEEA 462

Query: 101 SEKEKKKEKKDKKEKS 116
            E+++++E+K KK+ +
Sbjct: 463 EEEKEEEEEKKKKQAT 478



 Score = 43.8 bits (104), Expect = 2e-04
 Identities = 20/82 (24%), Positives = 45/82 (54%), Gaps = 3/82 (3%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
           E+E +     K A    +K K  V   EK+R+E K ++       KK++++++E+  K +
Sbjct: 396 EEEIEFLTGSKKA---TKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKE 452

Query: 121 DKDRERDKDEKKEQKESKSSSK 142
           ++  E +++ ++E++E +   K
Sbjct: 453 EEKEEEEEEAEEEKEEEEEKKK 474



 Score = 43.4 bits (103), Expect = 3e-04
 Identities = 15/77 (19%), Positives = 41/77 (53%)

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
            +E+ +    SK+  K      EK  K+ + ++   +K+    KK ++E+  + + K+ E
Sbjct: 395 TEEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEE 454

Query: 126 RDKDEKKEQKESKSSSK 142
           ++++E++ ++E +   +
Sbjct: 455 KEEEEEEAEEEKEEEEE 471



 Score = 40.7 bits (96), Expect = 0.002
 Identities = 20/81 (24%), Positives = 46/81 (56%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
           E+E + ++  +K  K+ K     +EK++++EKK+KK+K+   K K+ E +++++K+++E 
Sbjct: 396 EEEIEFLTGSKKATKKIKKIVEKAEKKREEEKKEKKKKAFAGKKKEEEEEEEKEKKEEEK 455

Query: 138 KSSSKIVSSSHNSKEPASGSQ 158
           +   +        +E     Q
Sbjct: 456 EEEEEEAEEEKEEEEEKKKKQ 476


>gnl|CDD|173412 PTZ00121, PTZ00121, MAEBL; Provisional.
          Length = 2084

 Score = 54.0 bits (129), Expect = 2e-07
 Identities = 37/179 (20%), Positives = 78/179 (43%), Gaps = 2/179 (1%)

Query: 49   KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
               + +    E E  ++  E  +      +K+ D    K +E+K++   +  +E++KKK 
Sbjct: 1348 AKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKA 1407

Query: 109  KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
             + KK  + K K  + ++  +EKK+  E+K  ++    +  +K+ A  ++        A 
Sbjct: 1408 DELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAE 1467

Query: 169  TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
               +    K K +E +K       K ++   KK D+       K+K  ++K+ E  K +
Sbjct: 1468 EAKKADEAKKKAEEAKKADEAK--KKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKA 1524



 Score = 53.2 bits (127), Expect = 4e-07
 Identities = 35/182 (19%), Positives = 77/182 (42%), Gaps = 14/182 (7%)

Query: 51   KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE-KERKESKPKESSSEKEKKKEK 109
            KK +++ K  E +KK +E  K+  + K+ E+ K  +   K++ E   K + + K + +  
Sbjct: 1296 KKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAA 1355

Query: 110  KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
             D+ E + +  +   ++ ++ KK+   +K  ++    +  +K+ A   +           
Sbjct: 1356 ADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDK----------- 1404

Query: 170  PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
              +   +K     K+K          K K  +  K  ++   K  +AK K +E+ K+   
Sbjct: 1405 -KKADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK-KADEAKKKAEEAKKAEEA 1462

Query: 230  PK 231
             K
Sbjct: 1463 KK 1464



 Score = 53.2 bits (127), Expect = 4e-07
 Identities = 37/186 (19%), Positives = 85/186 (45%), Gaps = 6/186 (3%)

Query: 46   SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
            + +K K D+ + K +E +K D+ K K+      +E  K +   K++ E   K + + K +
Sbjct: 1298 AEEKKKADEAKKKAEEAKKADEAKKKA------EEAKKKADAAKKKAEEAKKAAEAAKAE 1351

Query: 106  KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
             +   D+ E + +  +   ++ ++ KK+   +K  ++    +  +K+ A   +  +    
Sbjct: 1352 AEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELK 1411

Query: 166  PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
             A    +K+    K+ E++K++     K  + K   + K   +   K ++AK K +E+ K
Sbjct: 1412 KAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKK 1471

Query: 226  SSAGPK 231
            +    K
Sbjct: 1472 ADEAKK 1477



 Score = 51.7 bits (123), Expect = 9e-07
 Identities = 38/185 (20%), Positives = 78/185 (42%), Gaps = 2/185 (1%)

Query: 49   KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
            ++KK  D  K+ E++KK  E  K A  +K+ ++ K  ++E ++K    K+ + E +K  E
Sbjct: 1287 EEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAE 1346

Query: 109  KKDKKEKSHKHKDKDRERDK--DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
                + ++   + +  E      EKK+++  K +      +   K+     +        
Sbjct: 1347 AAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKK 1406

Query: 167  APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
            A    + +  K K  E +K++          K  ++ K  D+   K ++AK  E+   K+
Sbjct: 1407 ADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKA 1466

Query: 227  SAGPK 231
                K
Sbjct: 1467 EEAKK 1471



 Score = 51.7 bits (123), Expect = 1e-06
 Identities = 42/175 (24%), Positives = 79/175 (45%), Gaps = 10/175 (5%)

Query: 51   KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
            KK ++   + E+ KK +E+ K     K+KE ++    E+ +K  +  +  + +E KK ++
Sbjct: 1613 KKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEE 1672

Query: 111  DKKEKSHKHKDKDRERDKDE--KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
            DKK+     K ++ E+   E  KKE +E+K + ++       K+ A   +          
Sbjct: 1673 DKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELK--------KA 1724

Query: 169  TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
                K   +  +KE E++     +     + KKK  H  K   K+ +   KEKE+
Sbjct: 1725 EEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEA 1779



 Score = 51.7 bits (123), Expect = 1e-06
 Identities = 36/193 (18%), Positives = 79/193 (40%), Gaps = 11/193 (5%)

Query: 44   SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
            + ++KK  ++  +  E  K + +   D++  + ++ E  +   +E ++K    K+ + EK
Sbjct: 1331 ADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEK 1390

Query: 104  EK-----KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
            +K     KK ++DKK      K  + ++    KK+  E+K  ++    +  +K+ A  ++
Sbjct: 1391 KKADEAKKKAEEDKK------KADELKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK 1444

Query: 159  LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
                    A    +    K K +E +K            K  +  K  ++   K  +AK 
Sbjct: 1445 KADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKK 1504

Query: 219  KEKESHKSSAGPK 231
              +   K+    K
Sbjct: 1505 AAEAKKKADEAKK 1517



 Score = 51.3 bits (122), Expect = 2e-06
 Identities = 44/220 (20%), Positives = 96/220 (43%), Gaps = 4/220 (1%)

Query: 48   KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            +  ++  +  ++K++E K K       + ++K+ D+   K +E K+   +   +   KKK
Sbjct: 1360 EAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKK 1419

Query: 108  EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
              + KK+   K K  + ++  +E K+  E+K  ++    +  +K+ A  ++        A
Sbjct: 1420 ADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKA 1479

Query: 168  PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
                +    K K +E +K++    +     + KKK     K    +K  ++K+ E  K +
Sbjct: 1480 EEAKKADEAKKKAEEAKKKAD---EAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKA 1536

Query: 228  AGPKCYPEVGGIYILLRSKKNKTVQE-RLAEQFKDELFDR 266
               K   E      L ++++ K  +E + AE+ K    D+
Sbjct: 1537 DEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDK 1576



 Score = 50.9 bits (121), Expect = 2e-06
 Identities = 42/183 (22%), Positives = 82/183 (44%), Gaps = 3/183 (1%)

Query: 51   KKDKDRDKEKEKEKKDKEKDKSAVSSKEK--EKDKVSSKEKERKESKPKESSSEKEKKKE 108
            KK ++  K  E +KK +E  K A ++K+K  E  K +   K   E+   E+ + +EK + 
Sbjct: 1309 KKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEA 1368

Query: 109  KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
             + KKE++ K  D  +++ +++KK  +  K + +    +   K+ A+  +        A 
Sbjct: 1369 AEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAE 1428

Query: 169  TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
               +    K K +E +K            K ++  K  ++   K  +AK K +E+ K+  
Sbjct: 1429 EKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAK-KADEAKKKAEEAKKADE 1487

Query: 229  GPK 231
              K
Sbjct: 1488 AKK 1490



 Score = 47.1 bits (111), Expect = 3e-05
 Identities = 27/149 (18%), Positives = 76/149 (51%)

Query: 48   KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            KK +++      +E +K +++K K+  + K +E +K +++  +++  + K++   K+K+ 
Sbjct: 1653 KKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEA 1712

Query: 108  EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
            E+K K E+  K +++++ + ++ KKE +E K  ++        K+  +  +         
Sbjct: 1713 EEKKKAEELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEE 1772

Query: 168  PTPTQKSPVKTKEKEKEKESSTTHDKHSK 196
                +++ ++ +  E++++     DK  K
Sbjct: 1773 IRKEKEAVIEEELDEEDEKRRMEVDKKIK 1801



 Score = 45.5 bits (107), Expect = 8e-05
 Identities = 45/216 (20%), Positives = 94/216 (43%), Gaps = 3/216 (1%)

Query: 47   SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-KPKESSSEKEK 105
             ++ KK  D  K+K +EKK  ++ K      +K+ D++      +K++ + K+ + EK+K
Sbjct: 1373 KEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKK 1432

Query: 106  KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK-EPASGSQLISHPP 164
              E K K E++ K  +  ++ ++ +K E+ + K+     +     K E A  +       
Sbjct: 1433 ADEAKKKAEEAKKADEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKKA 1492

Query: 165  PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESH 224
              A     ++    + K+K  E+    +     + KK ++       K+ + K K  E  
Sbjct: 1493 EEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELK 1552

Query: 225  KSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK 260
            K+    K   E        +++++K +  R AE+ K
Sbjct: 1553 KAEELKKA-EEKKKAEEAKKAEEDKNMALRKAEEAK 1587



 Score = 45.5 bits (107), Expect = 8e-05
 Identities = 47/255 (18%), Positives = 102/255 (40%), Gaps = 17/255 (6%)

Query: 22   KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
            +++++  I     +         ++ K ++  K  + +K +EKK  ++ K A   K+ ++
Sbjct: 1247 EERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADE 1306

Query: 82   DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD----KDRERDKDEKKEQKES 137
             K  ++E  +K  + K+ + E +KK +   KK +  K        + E   DE +  +E 
Sbjct: 1307 AKKKAEEA-KKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEK 1365

Query: 138  KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKH 197
              +++        K  A+  +        A    +    K K +E +K++       +  
Sbjct: 1366 AEAAEKKKEEAKKKADAAKKK--------AEEKKKADEAKKKAEEDKKKADELKKAAAAK 1417

Query: 198  KHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAE 257
            K   + K   +   K  +AK K +E+ K+    K   E           K K  + + A+
Sbjct: 1418 KKADEAKKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKA----EEAKKKAEEAKKAD 1473

Query: 258  QFKDELFDRLKNEQA 272
            + K +  +  K ++A
Sbjct: 1474 EAKKKAEEAKKADEA 1488



 Score = 44.0 bits (103), Expect = 2e-04
 Identities = 43/241 (17%), Positives = 90/241 (37%), Gaps = 19/241 (7%)

Query: 43   NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
             +    ++ K  D  K+K +E K  ++ K      +K+ D+     + +K++   + + E
Sbjct: 1461 EAKKKAEEAKKADEAKKKAEEAKKADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEE 1520

Query: 103  KEKKKE--KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP------- 153
             +K  E  K ++ +K+ + K  + ++  DE K+ +E K + +   +    K         
Sbjct: 1521 AKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMAL 1580

Query: 154  --ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNP 211
              A  ++              +   K K +E +K            K +++ K  ++   
Sbjct: 1581 RKAEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKK 1640

Query: 212  KEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
            KE + K K +E  K+    K               K     ++ AE+ K    D  K  +
Sbjct: 1641 KEAEEKKKAEELKKAEEENKIKAA--------EEAKKAEEDKKKAEEAKKAEEDEKKAAE 1692

Query: 272  A 272
            A
Sbjct: 1693 A 1693



 Score = 44.0 bits (103), Expect = 3e-04
 Identities = 37/174 (21%), Positives = 81/174 (46%), Gaps = 3/174 (1%)

Query: 49   KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
            +D K  +  K+ E+ KKD E+ K A   +E+  +++   E+ R     +  ++ K ++  
Sbjct: 1221 EDAKKAEAVKKAEEAKKDAEEAKKA--EEERNNEEIRKFEEARMAHFARRQAAIKAEEAR 1278

Query: 109  KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPA-SGSQLISHPPPPA 167
            K D+ +K+ + K  D  +  +EKK+  E+K  ++    +  +K+ A    +        A
Sbjct: 1279 KADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKA 1338

Query: 168  PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                + +     E E   + +   ++ ++   KKK++   K +  +K A+ K+K
Sbjct: 1339 EEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKK 1392



 Score = 43.2 bits (101), Expect = 4e-04
 Identities = 51/226 (22%), Positives = 91/226 (40%), Gaps = 28/226 (12%)

Query: 46   SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
            + +K K D+ + K +E +K D+ K K+  + K +E  K    E+ +K  + K+ + E +K
Sbjct: 1427 AEEKKKADEAKKKAEEAKKADEAKKKAEEAKKAEEAKK--KAEEAKKADEAKKKAEEAKK 1484

Query: 106  KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
              E K K E++ K  D+ ++  + +KK  +  K+     +      E A  +        
Sbjct: 1485 ADEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADEAKKAEEAKKAD------- 1537

Query: 166  PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
                  +K+  K K  E +K       +  K   + K    DK     K  ++K+ E  +
Sbjct: 1538 ----EAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEAR 1593

Query: 226  SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
                 K Y E          KK K      AE+ K     ++K E+
Sbjct: 1594 IEEVMKLYEE---------EKKMK------AEEAKKAEEAKIKAEE 1624



 Score = 41.7 bits (97), Expect = 0.001
 Identities = 47/215 (21%), Positives = 93/215 (43%), Gaps = 2/215 (0%)

Query: 48   KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K ++K K  + +K +E K  E+ K A  +K+ E+DK  +  K  +  K +E+  E+  K 
Sbjct: 1541 KAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEVMKL 1600

Query: 108  EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
             +++KK K+ + K  +  + K E+ ++ E +           ++E     +L        
Sbjct: 1601 YEEEKKMKAEEAKKAEEAKIKAEELKKAEEEKKKVEQLKKKEAEEKKKAEELKKAEEENK 1660

Query: 168  PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
                +++  K  E++K+K       +  + K  +  K   +   K ++ K KE E  K +
Sbjct: 1661 IKAAEEA--KKAEEDKKKAEEAKKAEEDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKA 1718

Query: 228  AGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDE 262
               K   E   I      K+ +  +++  E  KDE
Sbjct: 1719 EELKKAEEENKIKAEEAKKEAEEDKKKAEEAKKDE 1753



 Score = 38.2 bits (88), Expect = 0.015
 Identities = 50/228 (21%), Positives = 93/228 (40%), Gaps = 18/228 (7%)

Query: 48   KKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            K D+  K  + +K  E KK +EK K+    K +E  K   K+K  +  K +E  +   +K
Sbjct: 1523 KADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNMALRK 1582

Query: 107  KEKKDKKEKSH-KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
             E+  K E++  +   K  E +K  K E+ +    +KI        E    ++       
Sbjct: 1583 AEEAKKAEEARIEEVMKLYEEEKKMKAEEAKKAEEAKI------KAEELKKAEEEKKKVE 1636

Query: 166  PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
                   +   K +E +K +E +         K ++  K  ++    E+D K   +   K
Sbjct: 1637 QLKKKEAEEKKKAEELKKAEEENKIKAAEEAKKAEEDKKKAEEAKKAEEDEKKAAEALKK 1696

Query: 226  SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQF-KDELFDRLKNEQA 272
             +   K   E+         KK +  +++ AE+  K E  +++K E+A
Sbjct: 1697 EAEEAKKAEEL---------KKKEAEEKKKAEELKKAEEENKIKAEEA 1735



 Score = 38.2 bits (88), Expect = 0.017
 Identities = 40/203 (19%), Positives = 92/203 (45%), Gaps = 5/203 (2%)

Query: 20   KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
            +++ K + A+   +  +  +       +++ KK ++  K +E+ K   E+ K     KE 
Sbjct: 1685 EDEKKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAEEAK-----KEA 1739

Query: 80   EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
            E+DK  ++E ++ E + K+ +  K+++++K ++  K  +   ++   ++DEK+  +  K 
Sbjct: 1740 EEDKKKAEEAKKDEEEKKKIAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEKRRMEVDKK 1799

Query: 140  SSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH 199
               I  +  N  E      L+ +          K    +K  + E+  +    K +K+  
Sbjct: 1800 IKDIFDNFANIIEGGKEGNLVINDSKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNE 1859

Query: 200  KKKDKHGDKTNPKEKDAKSKEKE 222
              +D + +    KEKD K  ++E
Sbjct: 1860 NGEDGNKEADFNKEKDLKEDDEE 1882



 Score = 37.4 bits (86), Expect = 0.022
 Identities = 45/226 (19%), Positives = 93/226 (41%), Gaps = 16/226 (7%)

Query: 51   KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
            KKD +  K+ E+E+ ++E  K   +       + ++ + E      +   +E++KK ++ 
Sbjct: 1236 KKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKADELKKAEEKKKADEA 1295

Query: 111  DKKEKSHKHKDKDRERDKDEKKEQKES-KSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
             K E+  K K  + ++  +E K+  E+ K + +    +  +K+ A  ++  +        
Sbjct: 1296 KKAEE--KKKADEAKKKAEEAKKADEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAE 1353

Query: 170  PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK---DKHGDKTNPKEKDAKSKEKESHKS 226
                     +EK +  E      K      KKK    K  D+   K ++ K K  E  K+
Sbjct: 1354 AAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAEEKKKADEAKKKAEEDKKKADELKKA 1413

Query: 227  SAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQA 272
            +A  K   E           K K  +++ A++ K +  +  K ++A
Sbjct: 1414 AAAKKKADEA----------KKKAEEKKKADEAKKKAEEAKKADEA 1449



 Score = 36.3 bits (83), Expect = 0.065
 Identities = 31/185 (16%), Positives = 76/185 (41%), Gaps = 4/185 (2%)

Query: 48   KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE---KDKVSSKEKERKESKPKESSSEKE 104
            ++++K ++  K ++ +K +  K         +E    ++  + E+ RK  + + +   + 
Sbjct: 1209 EEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARR 1268

Query: 105  KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPP 164
            +   K ++  K+ + K  + ++  DE K+ +E K + +    +  +K+     +      
Sbjct: 1269 QAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAK 1328

Query: 165  PPAPTPTQKSPVKTKEKE-KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
              A    +K+    K  E  + E+    D+    + K +     K   K+K   +K+K  
Sbjct: 1329 KKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKKADAAKKKAE 1388

Query: 224  HKSSA 228
             K  A
Sbjct: 1389 EKKKA 1393



 Score = 31.6 bits (71), Expect = 1.4
 Identities = 45/241 (18%), Positives = 96/241 (39%), Gaps = 24/241 (9%)

Query: 49   KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE-------SSS 101
            ++ +  +  K  E  +K ++  K+  + K ++  K  +  K  +  K +E         +
Sbjct: 1143 EEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEELRKAEDARKA 1202

Query: 102  EKEKKKEKKDKKEKSHKHKDKDR----ERDKDEKKEQKESKSSSKIVSSSHNSK-EPASG 156
            E  +K E++ K E++ K +D  +    ++ ++ KK+ +E+K + +  ++    K E A  
Sbjct: 1203 EAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARM 1262

Query: 157  SQLISHPPPPAPTPTQKSP--VKTKEKEKEKESSTTHDKHSKHKHKKK---DKHGDKTNP 211
            +              +K+    K +EK+K  E+    +K    + KKK    K  D+   
Sbjct: 1263 AHFARRQAAIKAEEARKADELKKAEEKKKADEAKKAEEKKKADEAKKKAEEAKKADEAKK 1322

Query: 212  KEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
            K ++AK K   + K +   K   E           + +   +      +       K E+
Sbjct: 1323 KAEEAKKKADAAKKKAEEAKKAAEA-------AKAEAEAAADEAEAAEEKAEAAEKKKEE 1375

Query: 272  A 272
            A
Sbjct: 1376 A 1376



 Score = 31.6 bits (71), Expect = 1.5
 Identities = 25/117 (21%), Positives = 49/117 (41%), Gaps = 8/117 (6%)

Query: 39   SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
            S     S+ K+    K+   E+    +  + +K+  + ++  K+   +KEK+ KE   +E
Sbjct: 1824 SKEMEDSAIKEVADSKNMQLEEADAFEKHKFNKNNENGEDGNKEADFNKEKDLKEDDEEE 1883

Query: 99   SSSEKEKKKEKKDKKE--------KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
                 E +K  KD  E            +   D + DKDE  ++   ++  +I+  S
Sbjct: 1884 IEEADEIEKIDKDDIEREIPNNNMAGKNNDIIDDKLDKDEYIKRDAEETREEIIKIS 1940



 Score = 30.1 bits (67), Expect = 4.3
 Identities = 40/222 (18%), Positives = 90/222 (40%), Gaps = 11/222 (4%)

Query: 40   NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKERKESKPKE 98
             P+        K+D   D+  E+     E+ K   + K E+ +    +K+K     K +E
Sbjct: 1073 KPSYKDFDFDAKEDNRADEATEEAFGKAEEAKKTETGKAEEARKAEEAKKKAEDARKAEE 1132

Query: 99   SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
            +   ++ +K ++ +K +  K  +  R+ +   K E+      +K   ++  ++E     +
Sbjct: 1133 ARKAEDARKAEEARKAEDAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEVRKAEE 1192

Query: 159  LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
            L       A    +    +  E+E++ E +   +   K +  KK +   K     ++AK 
Sbjct: 1193 L-----RKAEDARKAEAARKAEEERKAEEARKAEDAKKAEAVKKAEEAKK---DAEEAKK 1244

Query: 219  KEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK 260
             E+E  +++   + + E    +   R    K  + R A++ K
Sbjct: 1245 AEEE--RNNEEIRKFEEARMAHFARRQAAIKAEEARKADELK 1284


>gnl|CDD|220271 pfam09507, CDC27, DNA polymerase subunit Cdc27.  This protein forms
           the C subunit of DNA polymerase delta. It carries the
           essential residues for binding to the Pol1 subunit of
           polymerase alpha, from residues 293-332, which are
           characterized by the motif D--G--VT, referred to as the
           DPIM motif. The first 160 residues of the protein form
           the minimal domain for binding to the B subunit, Cdc1,
           of polymerase delta, the final 10 C-terminal residues,
           362-372, being the DNA sliding clamp, PCNA, binding
           motif.
          Length = 427

 Score = 52.1 bits (125), Expect = 4e-07
 Identities = 30/196 (15%), Positives = 74/196 (37%), Gaps = 6/196 (3%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
              S  S         K +D+S     +T+  T   T+  ++   +    +        K
Sbjct: 167 PPKSIMSPEVKVKSAKKTQDTS---KETTTEKTEGKTSVKAASLKRNPPKKSNIMSSFFK 223

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
            K K+K       +   K  S+E+  K     E  S +    ++ + +++     ++   
Sbjct: 224 KKTKEKKEKKEASESTVKEESEEESGKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDS 283

Query: 126 RDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEK 185
            ++ E+KE+++ K   K++      +E     +    P     +   + P   K++E+++
Sbjct: 284 EEETEEKEKEKRKRLKKMMEDEDEDEEMEIVPES---PVEEEESEEPEPPPLPKKEEEKE 340

Query: 186 ESSTTHDKHSKHKHKK 201
           E + + D   +   ++
Sbjct: 341 EVTVSPDGGRRRGRRR 356



 Score = 45.6 bits (108), Expect = 5e-05
 Identities = 40/222 (18%), Positives = 70/222 (31%), Gaps = 21/222 (9%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
             + +               K ++     S+    S  +     K  KK +D  KE   E
Sbjct: 135 VKRRTGVGLPPVAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTE 194

Query: 64  KKDKEKDKSAVSSKEKEKDKV-------SSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           K + +    A S K     K          K KE+KE K    S+ KE+ +E+  K++  
Sbjct: 195 KTEGKTSVKAASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVI 254

Query: 117 HKHK-----DKDRERDKDEKKEQKESKSSSKIVSSSHNSKE-----PASGS----QLISH 162
            + +       D + D+DE K   E   S +        K               ++   
Sbjct: 255 LEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKEKEKRKRLKKMMEDEDEDEEMEIV 314

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
           P  P      + P      +KE+E           + + + +
Sbjct: 315 PESPVEEEESEEPEPPPLPKKEEEKEEVTVSPDGGRRRGRRR 356



 Score = 40.6 bits (95), Expect = 0.002
 Identities = 36/213 (16%), Positives = 62/213 (29%), Gaps = 25/213 (11%)

Query: 24  KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
           KDS+ +          N  N S +  +   +         K+        V+       K
Sbjct: 96  KDSNVLYDVDYDILKENLHNCSKNSLEYGKQAGPITNPNVKRRTGVGLPPVAPAASPALK 155

Query: 84  VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
            ++  K      PK   S + K K  K  ++ S           K+   E+ E K+S K 
Sbjct: 156 PTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTS-----------KETTTEKTEGKTSVKA 204

Query: 144 VSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH---- 199
            S   N  + ++                +++   T ++E E+ES                
Sbjct: 205 ASLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVKEESEEESGKRDVILEDESAEPTG 264

Query: 200 ----------KKKDKHGDKTNPKEKDAKSKEKE 222
                     K   +  D     E+  K K K 
Sbjct: 265 LDEDEDEDEPKPSGERSDSEEETEEKEKEKRKR 297



 Score = 38.7 bits (90), Expect = 0.008
 Identities = 29/181 (16%), Positives = 66/181 (36%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
            K +S S+    S  ++  +D      ++  +      +    K   +  D ++E E+++
Sbjct: 232 KKEASESTVKEESEEESGKRDVILEDESAEPTGLDEDEDEDEPKPSGERSDSEEETEEKE 291

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
           K+K K    +   E E +++    +   E +  E        K++++K+E +       R
Sbjct: 292 KEKRKRLKKMMEDEDEDEEMEIVPESPVEEEESEEPEPPPLPKKEEEKEEVTVSPDGGRR 351

Query: 125 ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKE 184
              +   K++        +V+      E  S  +    P  P P  +  +     +K K 
Sbjct: 352 RGRRRVMKKKTFKDEEGYLVTKKVYEWESFSEDEAEPPPTKPKPKVSTPAVPAAAKKPKA 411

Query: 185 K 185
            
Sbjct: 412 P 412



 Score = 29.4 bits (66), Expect = 6.7
 Identities = 20/96 (20%), Positives = 37/96 (38%)

Query: 136 ESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHS 195
            + ++S  +  + N K P+S        P       +K+   +KE   EK    T  K +
Sbjct: 146 VAPAASPALKPTANGKRPSSKPPKSIMSPEVKVKSAKKTQDTSKETTTEKTEGKTSVKAA 205

Query: 196 KHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
             K     K    ++  +K  K K+++   S +  K
Sbjct: 206 SLKRNPPKKSNIMSSFFKKKTKEKKEKKEASESTVK 241


>gnl|CDD|115057 pfam06375, BLVR, Bovine leukaemia virus receptor (BLVR).  This
           family consists of several bovine specific leukaemia
           virus receptors which are thought to function as
           transmembrane proteins, although their exact function is
           unknown.
          Length = 561

 Score = 52.0 bits (124), Expect = 6e-07
 Identities = 47/181 (25%), Positives = 76/181 (41%), Gaps = 10/181 (5%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
           P N+  S +D KD +          DK    S     +K ++  +SK  E+ +    E  
Sbjct: 139 PENALPSDEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKK 198

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
           S+K KKKEKK+K+          +ERDKD+KKE +  KS    +  S  S    + +   
Sbjct: 199 SKKPKKKEKKEKE----------KERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEA 248

Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKE 220
           S     + T     P + K+ E E+   +   K  K + +K++K   K +   +   S  
Sbjct: 249 SLANTVSGTAPDSEPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKKHHHHRCHHSDG 308

Query: 221 K 221
            
Sbjct: 309 G 309



 Score = 40.8 bits (95), Expect = 0.002
 Identities = 33/126 (26%), Positives = 64/126 (50%), Gaps = 8/126 (6%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES------KPKESS 100
            KK+KK+K+++++K+K KK+ E  KS + + +      +S  +  + S           S
Sbjct: 203 KKKEKKEKEKERDKDK-KKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDS 261

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
              E K  + ++ +KS KHK K + ++K+EKK++K+     +   S   +++P     + 
Sbjct: 262 EPDEPKDAEAEETKKSPKHKKKKQRKEKEEKKKKKK-HHHHRCHHSDGGAEQPVQNGAVE 320

Query: 161 SHPPPP 166
             P PP
Sbjct: 321 EEPLPP 326



 Score = 32.7 bits (74), Expect = 0.57
 Identities = 34/191 (17%), Positives = 70/191 (36%), Gaps = 8/191 (4%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
           +E+  ++   K+   +K+++KEK+ ++       + D +    +  +    +     + S
Sbjct: 86  EERRHRQRLEKDKREKKKREKEKRGRRRHHSLGTESDEDIAPAQMVDIVTEEMPENALPS 145

Query: 147 SHNSKEPASGSQL--ISHPPPPAPT---PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK 201
             + K+P    +   I    P A +   P QK       K  EK      +K SK   KK
Sbjct: 146 DEDDKDPNDPYRALDIDLDKPLADSEKLPVQKHRNAETSKSPEKGDVPAVEKKSKKPKKK 205

Query: 202 KDKHGDKTNPKEKDAKSKEKESHK---SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQ 258
           + K  +K   K+K  + +  +S       +              L +  + T  +   ++
Sbjct: 206 EKKEKEKERDKDKKKEVEGFKSLLLALDDSPASAASVAEADEASLANTVSGTAPDSEPDE 265

Query: 259 FKDELFDRLKN 269
            KD   +  K 
Sbjct: 266 PKDAEAEETKK 276


>gnl|CDD|218177 pfam04615, Utp14, Utp14 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 728

 Score = 51.6 bits (124), Expect = 8e-07
 Identities = 38/182 (20%), Positives = 71/182 (39%), Gaps = 16/182 (8%)

Query: 43  NSSSSKKDKKDKD---RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
            + + KK++ D +     +E E E++  E++    S K   + K   +  E++    K  
Sbjct: 381 RAEARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLK 440

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQL 159
              K + KEKK+  E+     +++ + +K   K  K S+ + K        +E       
Sbjct: 441 KENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEE------- 493

Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
                 P    T       K+++ +K+SS+  DK +    K   K   K   KEK     
Sbjct: 494 -----NPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVK-KKKKKEKSIDLD 547

Query: 220 EK 221
           + 
Sbjct: 548 DD 549



 Score = 43.5 bits (103), Expect = 3e-04
 Identities = 29/145 (20%), Positives = 50/145 (34%), Gaps = 1/145 (0%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
            S    +  PS      +          + S      + +  K+KK+ D ++E E E++ 
Sbjct: 406 ESDEEENEEPSKKNVGRRKFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEA 465

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           K +  +    K  EK +   +E+E  E  P    +     K  K +  K       D+  
Sbjct: 466 KVEKVANKLLKRSEKAQKEEEEEELDEENP-WLKTTSSVGKSAKKQDSKKKSSSKLDKAA 524

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSK 151
           +K  K   K  K   K  S   +  
Sbjct: 525 NKISKAAVKVKKKKKKEKSIDLDDD 549



 Score = 39.7 bits (93), Expect = 0.005
 Identities = 23/143 (16%), Positives = 57/143 (39%), Gaps = 7/143 (4%)

Query: 22  KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
           + K           S      N +  K+ K+  + ++ +++E+   EK  + +  + ++ 
Sbjct: 422 RRKFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKA 481

Query: 82  DKVSSKEKERKE-------SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
            K   +E+  +E       S   +S+ +++ KK+   K +K+     K   + K +KK++
Sbjct: 482 QKEEEEEELDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKE 541

Query: 135 KESKSSSKIVSSSHNSKEPASGS 157
           K       ++    + K      
Sbjct: 542 KSIDLDDDLIDEEDSIKLDVDDE 564



 Score = 39.3 bits (92), Expect = 0.007
 Identities = 37/218 (16%), Positives = 78/218 (35%), Gaps = 26/218 (11%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV--------SSKEKEKDKV 84
           S S       +      +     R K  + ++ + +++ S +        +   K+++  
Sbjct: 332 SDSEEEDEDDDEDDDDGENPWMLRKKLGKLKEGEDDEENSGLLSMKFMQRAEARKKEEND 391

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           +  E+ R+E + +E S E+E ++  K    +  K   ++ E++ + KK +KE+K+  K  
Sbjct: 392 AEIEELRRELEGEEESDEEENEEPSKKNVGRR-KFGPENGEKEAESKKLKKENKNEFKEK 450

Query: 145 SSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESS-----------TTHDK 193
             S   +E      L             K   ++++ +KE+E             T+   
Sbjct: 451 KESDEEEE------LEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPWLKTTSSVG 504

Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPK 231
            S  K   K K   K +           +  K     K
Sbjct: 505 KSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEK 542



 Score = 34.6 bits (80), Expect = 0.15
 Identities = 21/126 (16%), Positives = 40/126 (31%), Gaps = 3/126 (2%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
           +   S         +    +  A      S             ++           K  K
Sbjct: 449 EKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEEEELDEENPWLKTTSSVGKSAK 508

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
            ++  K + S  +K  +K+S    + K+ K KE S + +     ++    S K    D E
Sbjct: 509 KQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKSIDLDDDLIDEE---DSIKLDVDDEE 565

Query: 126 RDKDEK 131
            + DE+
Sbjct: 566 DEDDEE 571



 Score = 33.1 bits (76), Expect = 0.42
 Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 17/188 (9%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKERKESKPKESSSE 102
           S   + +  D + + E + E  D  ++   +  K  K K+    +E     S      +E
Sbjct: 324 SEEDEDEDSDSEEEDEDDDEDDDDGENPWMLRKKLGKLKEGEDDEENSGLLSMKFMQRAE 383

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
             KK+E   + E+  +  + + E D++E +E  +     +     +  KE  S       
Sbjct: 384 ARKKEENDAEIEELRRELEGEEESDEEENEEPSKKNVGRRKFGPENGEKEAES------- 436

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
                    +K   + K + KEK+ S   ++    +  K +K  +K   + + A+ +E+E
Sbjct: 437 ---------KKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQKEEEE 487

Query: 223 SHKSSAGP 230
                  P
Sbjct: 488 EELDEENP 495



 Score = 33.1 bits (76), Expect = 0.45
 Identities = 29/144 (20%), Positives = 53/144 (36%), Gaps = 31/144 (21%)

Query: 10  SSSSAHPSPHKNKDKDSSAIP-STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK--- 65
           S ++      K K+K             S     +    + D++     K+K+  K+   
Sbjct: 528 SKAAVKVKKKKKKEKSIDLDDDLIDEEDSIKLDVDDEEDEDDEELPFLFKQKDLIKEAFA 587

Query: 66  ------DKEKDKSAVSSKEKEKDK--------------VSSKEKERKESKPKESSSEKEK 105
                 + EK+K  V  +E  K+               +  ++K+RK  +   +  E  K
Sbjct: 588 GDDVVAEFEKEKKEVIEEEDPKEIDLTLPGWGSWAGDGIKKRKKKRKRKRRFLTKIEGVK 647

Query: 106 KKEKKDKK-------EKSHKHKDK 122
           K+++KDKK       EK +K   K
Sbjct: 648 KEKRKDKKLKNVIINEKRNKKAAK 671



 Score = 32.7 bits (75), Expect = 0.60
 Identities = 22/126 (17%), Positives = 44/126 (34%), Gaps = 8/126 (6%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
                +      S    K+  +       +           +  +   +K   + ++ +K
Sbjct: 424 KFGPENGEKEAESKKLKKENKNEFKEKKESDEEEELEDEEEAKVEKVANKLLKRSEKAQK 483

Query: 65  KDKEKD--------KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           +++E++        K+  S  +  K + S K+   K  K     S+   K +KK KKEKS
Sbjct: 484 EEEEEELDEENPWLKTTSSVGKSAKKQDSKKKSSSKLDKAANKISKAAVKVKKKKKKEKS 543

Query: 117 HKHKDK 122
               D 
Sbjct: 544 IDLDDD 549


>gnl|CDD|218752 pfam05793, TFIIF_alpha, Transcription initiation factor IIF, alpha
           subunit (TFIIF-alpha).  Transcription initiation factor
           IIF, alpha subunit (TFIIF-alpha) or RNA polymerase
           II-associating protein 74 (RAP74) is the large subunit
           of transcription factor IIF (TFIIF), which is essential
           for accurate initiation and stimulates elongation by RNA
           polymerase II.
          Length = 528

 Score = 51.1 bits (122), Expect = 1e-06
 Identities = 43/182 (23%), Positives = 64/182 (35%), Gaps = 13/182 (7%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K  K+       E E+ + EK         K KD     E +  ES       ++EK K
Sbjct: 183 MKAAKNGPAAFGDEDEETEGEKGGGGRGKDLKIKDLEGDDEDDGDESDKGGEDGDEEKSK 242

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK---------IVSSSHNSKEPASGSQ 158
           +KK K  K+ K  D D++  +    +  E  S            I  SS +  +P     
Sbjct: 243 KKKKKLAKNKKKLDDDKKGKRGGDDDADEYDSDDGDDEGREEDYISDSSASGNDPEERED 302

Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
            +S   P  P   Q        +E E+E +      SK   K K   G K    + D+ S
Sbjct: 303 KLSPEIPAKPEIEQDE----DSEESEEEKNEEEGGLSKKGKKLKKLKGKKNGLDKDDSDS 358

Query: 219 KE 220
            +
Sbjct: 359 GD 360



 Score = 48.4 bits (115), Expect = 7e-06
 Identities = 47/247 (19%), Positives = 81/247 (32%), Gaps = 25/247 (10%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-EKEKEK 64
                         K  +                N       KK K+  D D  E + + 
Sbjct: 217 DLEGDDEDDGDESDKGGEDGDEEKSKKKKKKLAKNKKKLDDDKKGKRGGDDDADEYDSDD 276

Query: 65  KDKE-------KDKSAVSSKEKEKDKVSSKEKERK--------------ESKPKESSSEK 103
            D E        D SA  +  +E++   S E   K              E   +E    K
Sbjct: 277 GDDEGREEDYISDSSASGNDPEEREDKLSPEIPAKPEIEQDEDSEESEEEKNEEEGGLSK 336

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
           + KK KK K +K+   KD     D  +  +     S S + +     KEP     + S+P
Sbjct: 337 KGKKLKKLKGKKNGLDKDDSDSGDDSDDSDIDGEDSVSLVTAK--KQKEPKKEEPVDSNP 394

Query: 164 PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK-KDKHGDKTNPKEKDAKSKEKE 222
             P  +   +   ++K+K K K ++      +    KK K ++  K++  +   ++    
Sbjct: 395 SSPGNSGPARPSPESKDKGKRKAANEVSKSPASVPAKKLKTENAPKSSSGKSTPQTFSGS 454

Query: 223 SHKSSAG 229
              S+A 
Sbjct: 455 KSSSNAA 461


>gnl|CDD|219408 pfam07423, DUF1510, Protein of unknown function (DUF1510).  This
           family consists of several hypothetical bacterial
           proteins of around 200 residues in length. The function
           of this family is unknown.
          Length = 214

 Score = 47.4 bits (113), Expect = 5e-06
 Identities = 21/87 (24%), Positives = 42/87 (48%), Gaps = 4/87 (4%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
              D+    E+E +K D ++       KE+EK+  +S++KE K    KE    +E+ +E+
Sbjct: 39  SPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEE 98

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKE 136
            ++       +++    +K E   +KE
Sbjct: 99  DEESSD----ENEKETEEKTESNVEKE 121



 Score = 44.7 bits (106), Expect = 4e-05
 Identities = 21/84 (25%), Positives = 40/84 (47%)

Query: 21  NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
           +   D +A        S    T      K+++ +  + E +++K D EK+      + +E
Sbjct: 38  SSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEE 97

Query: 81  KDKVSSKEKERKESKPKESSSEKE 104
           +D+ SS E E++  +  ES+ EKE
Sbjct: 98  EDEESSDENEKETEEKTESNVEKE 121



 Score = 44.3 bits (105), Expect = 5e-05
 Identities = 22/120 (18%), Positives = 49/120 (40%), Gaps = 6/120 (5%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
           S+ S  +      + K D ++    +E ++E+K+      A +S++KE    + KE E  
Sbjct: 38  SSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKE------AANSEDKEDKGDAEKEDEES 91

Query: 93  ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
           E + +E   E   + EK+ +++     + +           ++    +    S S +  E
Sbjct: 92  EEENEEEDEESSDENEKETEEKTESNVEKEITNPSWKPVGTEQTGPHAMTFDSGSQDWNE 151



 Score = 44.0 bits (104), Expect = 7e-05
 Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 1/84 (1%)

Query: 31  STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
           S S  ++        S  ++  + +  KE+EKE  + E DK      EKE ++   + +E
Sbjct: 39  SPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSE-DKEDKGDAEKEDEESEEENEE 97

Query: 91  RKESKPKESSSEKEKKKEKKDKKE 114
             E    E+  E E+K E   +KE
Sbjct: 98  EDEESSDENEKETEEKTESNVEKE 121



 Score = 36.6 bits (85), Expect = 0.020
 Identities = 15/70 (21%), Positives = 33/70 (47%)

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
            + +E+K  +     E ++ K+++KE ++    +D+   + E +E +E        SS  
Sbjct: 46  ADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEENEEEDEESSDE 105

Query: 149 NSKEPASGSQ 158
           N KE    ++
Sbjct: 106 NEKETEEKTE 115



 Score = 32.4 bits (74), Expect = 0.49
 Identities = 20/86 (23%), Positives = 34/86 (39%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
             SS S  +A       K  D                 NS   +     +  D+E E+E 
Sbjct: 36  FPSSPSDQAAADEQEAKKSDDQETAEIEEVKEEEKEAANSEDKEDKGDAEKEDEESEEEN 95

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKEKE 90
           ++++++ S  + KE E+   S+ EKE
Sbjct: 96  EEEDEESSDENEKETEEKTESNVEKE 121



 Score = 29.3 bits (66), Expect = 4.3
 Identities = 14/71 (19%), Positives = 26/71 (36%), Gaps = 7/71 (9%)

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           SS   +    + +   S+ ++  E ++ KE+       ++E    E KE K         
Sbjct: 38  SSPSDQAAADEQEAKKSDDQETAEIEEVKEE-------EKEAANSEDKEDKGDAEKEDEE 90

Query: 145 SSSHNSKEPAS 155
           S   N +E   
Sbjct: 91  SEEENEEEDEE 101


>gnl|CDD|217051 pfam02463, SMC_N, RecF/RecN/SMC N terminal domain.  This domain is
           found at the N terminus of SMC proteins. The SMC
           (structural maintenance of chromosomes) superfamily
           proteins have ATP-binding domains at the N- and
           C-termini, and two extended coiled-coil domains
           separated by a hinge in the middle. The eukaryotic SMC
           proteins form two kinds of heterodimers: the SMC1/SMC3
           and the SMC2/SMC4 types. These heterodimers constitute
           an essential part of higher order complexes, which are
           involved in chromatin and DNA dynamics. This family also
           includes the RecF and RecN proteins that are involved in
           DNA metabolism and recombination.
          Length = 1162

 Score = 49.2 bits (117), Expect = 5e-06
 Identities = 34/231 (14%), Positives = 84/231 (36%), Gaps = 31/231 (13%)

Query: 43  NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
                + +   ++ +KE+E   +  +++K     K+ +++++    KE +E K +    E
Sbjct: 248 RDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLE 307

Query: 103 --KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
             K   +EK  + EK  K  +K+ +++K+E +E ++     +I   +   +E        
Sbjct: 308 RRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEE-------- 359

Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKE 220
                      Q   ++ K ++ E+E        S+          ++   K ++ K  +
Sbjct: 360 ----------EQLEKLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAK 409

Query: 221 KESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQ 271
                S                L  ++ K   + + E  +     + K  +
Sbjct: 410 LLLELSE-----------QEEDLLKEEKKEELKIVEELEESLETKQGKLTE 449



 Score = 38.0 bits (88), Expect = 0.014
 Identities = 38/241 (15%), Positives = 87/241 (36%), Gaps = 24/241 (9%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           S++ +K K+R  +K  E+ +   +      + K   ++  KE+ +K  +  +   + E +
Sbjct: 166 SREKRKKKER-LKKLIEETENLAELIIDLEELK-LQELKLKEQAKKALEYYQLKEKLELE 223

Query: 107 KEK--KDKKEKSHKHKDKDR------ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           +E        K ++ +          E+++ E  +Q+  K    +      +KE     +
Sbjct: 224 EENLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQELEKEEEILAQVLKENKEEEKEKK 283

Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
           L            Q+  +K   KE+E+  S       +    ++     +   K+ + K 
Sbjct: 284 L------------QEEELKLLAKEEEELKSELLKLERRKVDDEEKLKESEKELKKLE-KE 330

Query: 219 KEKESHKSSAGPKCYPEVGGIYILLRSKK-NKTVQERLAEQFKDELFDRLKNEQADILQR 277
            +KE  +     K   E+         ++      +   EQ ++EL  + K E   +   
Sbjct: 331 LKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEKLEQLEEELLAKKKLESERLSSA 390

Query: 278 K 278
            
Sbjct: 391 A 391



 Score = 37.6 bits (87), Expect = 0.019
 Identities = 29/192 (15%), Positives = 72/192 (37%), Gaps = 20/192 (10%)

Query: 34  TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
                      S  +  K +K+  KEKE+ ++ +++ K     +E E+++    EK +++
Sbjct: 309 RKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKLQEK 368

Query: 94  SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
            +  E     +KK E +     +   +++   ++++EK+ +   + S +        K  
Sbjct: 369 LEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLELSEQEEDLLKEEK-- 426

Query: 154 ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
                             ++     +E E+  E+        K + +K+     K   + 
Sbjct: 427 ------------------KEELKIVEELEESLETKQGKLTEEKEELEKQALKLLKDKLEL 468

Query: 214 KDAKSKEKESHK 225
           K ++   KE+  
Sbjct: 469 KKSEDLLKETKL 480



 Score = 37.6 bits (87), Expect = 0.021
 Identities = 39/229 (17%), Positives = 81/229 (35%), Gaps = 17/229 (7%)

Query: 54  KDRDKEKEKEKKDKEK-DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
               K KE+ KK  E          E+E        K  +E         +++++E +  
Sbjct: 198 LQELKLKEQAKKALEYYQLKEKLELEEENLLYLDYLKLNEERIDLLQELLRDEQEEIESS 257

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
           K++  K ++   +  K+ K+E+KE K   +        +E    S+L+           +
Sbjct: 258 KQELEKEEEILAQVLKENKEEEKEKKLQEEE-LKLLAKEEEELKSELL-----------K 305

Query: 173 KSPVKTKEKEKEKESSTTHDKHSKH----KHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
               K  ++EK KES     K  K     K + ++   +    + K    +E+E      
Sbjct: 306 LERRKVDDEEKLKESEKELKKLEKELKKEKEEIEELEKELKELEIKREAEEEEEEQLEKL 365

Query: 229 GPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
             K       +    + +  +       ++ + EL +  + E   +L+ 
Sbjct: 366 QEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKNEEEKEAKLLLEL 414



 Score = 36.5 bits (84), Expect = 0.053
 Identities = 37/241 (15%), Positives = 83/241 (34%), Gaps = 34/241 (14%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
             KK K +K+     + ++   + ++     ++K K+K   +EK R + + +E    +  
Sbjct: 712 ELKKLKLEKEELLADKVQEAQDKINEELKLLEQKIKEKEEEEEKSRLKKEEEEEEKSELS 771

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
            KEK+  +E+    K K  E  +++ K Q+E   + +                       
Sbjct: 772 LKEKELAEEEEKTEKLKVEEEKEEKLKAQEEELRALEE---------------------- 809

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
                  K   +  E+E+         K  + +    +        KE+    K  E   
Sbjct: 810 -----ELKEEAELLEEEQLLIEQEEKIKEEELEELALEL-------KEEQKLEKLAEEEL 857

Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGD 285
                +   E     +LL+ ++ +  + +   + K+E     K E  +  Q+   +   +
Sbjct: 858 ERLEEEITKEELLQELLLKEEELEEQKLKDELESKEEKEKEEKKELEEESQKDNLLEEKE 917

Query: 286 I 286
            
Sbjct: 918 N 918



 Score = 30.3 bits (68), Expect = 3.8
 Identities = 22/127 (17%), Positives = 46/127 (36%), Gaps = 12/127 (9%)

Query: 48   KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
              +K+ ++ +KE+E+E+  +                    E E KE +  +   +KE+ +
Sbjct: 946  ADEKEKEEDNKEEEEERNKRLLLAKEELGNVNLMAI---AEFEEKEERYNKDELKKERLE 1002

Query: 108  EKKDKKEK---SHKHKDKDRERDKDEKKEQKESKSSSKIVSSS------HNSKEPASGSQ 158
            E+K +  +       +      +      +  +K    +           +S +P SG  
Sbjct: 1003 EEKKELLREIIEETCQRFKEFLELFVSINRGLNKVFFYLELGGSAELRLEDSDDPFSGGI 1062

Query: 159  LISHPPP 165
             IS  PP
Sbjct: 1063 EISARPP 1069



 Score = 29.9 bits (67), Expect = 4.6
 Identities = 16/100 (16%), Positives = 40/100 (40%), Gaps = 6/100 (6%)

Query: 43   NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK----- 97
              +         + + E+   ++  EK+K   + +E+E+        + +          
Sbjct: 923  RIAEEAIILLKYESEPEELLLEEADEKEKEEDNKEEEEERNKRLLLAKEELGNVNLMAIA 982

Query: 98   ESSSEKEKKKEKKDKKEK-SHKHKDKDRERDKDEKKEQKE 136
            E   ++E+  + + KKE+   + K+  RE  ++  +  KE
Sbjct: 983  EFEEKEERYNKDELKKERLEEEKKELLREIIEETCQRFKE 1022


>gnl|CDD|187557 cd05246, dTDP_GD_SDR_e, dTDP-D-glucose 4,6-dehydratase, extended
           (e) SDRs.  This subgroup contains dTDP-D-glucose
           4,6-dehydratase and related proteins, members of the
           extended-SDR family, with the characteristic Rossmann
           fold core region, active site tetrad and NAD(P)-binding
           motif. dTDP-D-glucose 4,6-dehydratase is closely related
           to other sugar epimerases of the SDR family.
           dTDP-D-glucose 4,6-dehydratase catalyzes the second of
           four steps in the dTDP-L-rhamnose pathway (the
           dehydration of dTDP-D-glucose to
           dTDP-4-keto-6-deoxy-D-glucose) in the synthesis of
           L-rhamnose, a cell wall component of some pathogenic
           bacteria. In many Gram-negative bacteria, L-rhamnose is
           an important constituent of lipopolysaccharide
           O-antigen. The larger N-terminal portion of
           dTDP-D-Glucose 4,6-dehydratase forms a Rossmann fold
           NAD-binding domain, while the C-terminus binds the sugar
           substrate. Extended SDRs are distinct from classical
           SDRs. In addition to the Rossmann fold (alpha/beta
           folding pattern with a central beta-sheet) core region
           typical of all SDRs, extended SDRs have a less conserved
           C-terminal extension of approximately 100 amino acids.
           Extended SDRs are a diverse collection of proteins, and
           include isomerases, epimerases, oxidoreductases, and
           lyases; they typically have a TGXXGXXG cofactor binding
           motif. SDRs are a functionally diverse family of
           oxidoreductases that have a single domain with a
           structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 315

 Score = 48.3 bits (116), Expect = 6e-06
 Identities = 43/176 (24%), Positives = 62/176 (35%), Gaps = 52/176 (29%)

Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAF---TLNIQATRELLDL 338
           + GDI    L     D+ F +  I  +IH AA    D  I D       N+  T  LL+ 
Sbjct: 56  VKGDICDAEL----VDRLFEEEKIDAVIHFAAESHVDRSISDPEPFIRTNVLGTYTLLEA 111

Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
           A +    +  +H+ST           +E Y  L    +           E   L+     
Sbjct: 112 ARKYGVKR-FVHIST-----------DEVYGDLLDDGEF---------TETSPLAP---- 146

Query: 399 GIYNNSYSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPG 452
               + YS +KA  + +V  Y   Y LP+ + R               SNN YGP 
Sbjct: 147 ---TSPYSASKAAADLLVRAYHRTYGLPVVITRC--------------SNN-YGPY 184


>gnl|CDD|235962 PRK07201, PRK07201, short chain dehydrogenase; Provisional.
          Length = 657

 Score = 48.8 bits (117), Expect = 7e-06
 Identities = 39/142 (27%), Positives = 60/142 (42%), Gaps = 31/142 (21%)

Query: 234 PEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGDISQPSLGI 293
                +++L+R    +    RL         DR            V  + GD+++P LG+
Sbjct: 24  RREATVHVLVR----RQSLSRLEALAAYWGADR------------VVPLVGDLTEPGLGL 67

Query: 294 SSHDQQFIQHHIHVIIHAAA--SLRFDELIQDAFTLNIQATRELLDLATRCSQLKAIL-- 349
           S  D   +   I  ++H AA   L  DE  Q A   N+  TR +++LA R   L+A    
Sbjct: 68  SEADIAELG-DIDHVVHLAAIYDLTADEEAQRA--ANVDGTRNVVELAER---LQAATFH 121

Query: 350 HVST-----LYTHSYREDIQEE 366
           HVS+      Y   +RED  +E
Sbjct: 122 HVSSIAVAGDYEGVFREDDFDE 143


>gnl|CDD|233503 TIGR01642, U2AF_lg, U2 snRNP auxiliary factor, large subunit,
           splicing factor.  These splicing factors consist of an
           N-terminal arginine-rich low complexity domain followed
           by three tandem RNA recognition motifs (pfam00076). The
           well-characterized members of this family are auxiliary
           components of the U2 small nuclear ribonucleoprotein
           splicing factor (U2AF). These proteins are closely
           related to the CC1-like subfamily of splicing factors
           (TIGR01622). Members of this subfamily are found in
           plants, metazoa and fungi.
          Length = 509

 Score = 48.4 bits (115), Expect = 8e-06
 Identities = 19/122 (15%), Positives = 57/122 (46%), Gaps = 11/122 (9%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER--------KESKPKESSSEK 103
           +D++ D+E+EK  + +++D+S+   + + +D+   +++ R        ++S+P++     
Sbjct: 1   RDEEPDREREK-SRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYD 59

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
            +   +  +     + +D+ R R +  +  ++  +   +  S S+  ++      L    
Sbjct: 60  SRSP-RSLRYSSVRRSRDRPRRRSRSVRSIEQH-RRRLRDRSPSNQWRKDDKKRSLWDIK 117

Query: 164 PP 165
           PP
Sbjct: 118 PP 119



 Score = 41.4 bits (97), Expect = 0.001
 Identities = 19/132 (14%), Positives = 48/132 (36%), Gaps = 12/132 (9%)

Query: 81  KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
           +D+   +E+E+   + ++ SSE+ +++ +   + +    + ++R   +D +   +    S
Sbjct: 1   RDEEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDS 60

Query: 141 SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHK 200
                S   S    S                +   V++ E+ + +    +   +   K  
Sbjct: 61  RSP-RSLRYSSVRRSRD----------RPRRRSRSVRSIEQHRRRLRDRS-PSNQWRKDD 108

Query: 201 KKDKHGDKTNPK 212
           KK    D   P 
Sbjct: 109 KKRSLWDIKPPG 120



 Score = 31.0 bits (70), Expect = 2.0
 Identities = 22/167 (13%), Positives = 57/167 (34%), Gaps = 16/167 (9%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR-DKEKEKEKKDKEKDKSAVSSKE 78
           +  D++           S+  P   S  +   +D+ R  +E+   +  + +D+    S+ 
Sbjct: 3   EEPDREREKSRGRDRDRSSERPRRRSRDRSRFRDRHRRSRERSYREDSRPRDRRRYDSRS 62

Query: 79  KEKDKVS----SKEKERKESKP-----------KESSSEKEKKKEKKDKKEKSHKHKDKD 123
               + S    S+++ R+ S+            ++ S   + +K+ K +     K    +
Sbjct: 63  PRSLRYSSVRRSRDRPRRRSRSVRSIEQHRRRLRDRSPSNQWRKDDKKRSLWDIKPPGYE 122

Query: 124 RERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTP 170
                  K  Q  S   +    +  + ++  +   +I+  P      
Sbjct: 123 LVTADQAKASQVFSVPGTAPRPAMTDPEKLLAEGSIITPLPVLPYQQ 169


>gnl|CDD|235401 PRK05306, infB, translation initiation factor IF-2; Validated.
          Length = 746

 Score = 47.9 bits (115), Expect = 1e-05
 Identities = 19/159 (11%), Positives = 59/159 (37%), Gaps = 1/159 (0%)

Query: 47  SKKDKKDKDRDKEKEKEKK-DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
           +K+         EK KE   + +   S V  +E  K++   + +E  +++ +E+++ + +
Sbjct: 10  AKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAE 69

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
           ++ K +    +   +  +     +      E +++    +++   K   +  +     P 
Sbjct: 70  EEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKGPKPK 129

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
                 + +    + K  +        +  + K KK+  
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKP 168



 Score = 47.5 bits (114), Expect = 2e-05
 Identities = 26/178 (14%), Positives = 56/178 (31%), Gaps = 13/178 (7%)

Query: 53  DKDRDKEKEKEKKDKEKD-----KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K R  E  KE     K+     K      +     V  +E  ++E+K +     K + +
Sbjct: 2   SKVRVYELAKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAE 61

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           E    + +     +       +E  E   +  ++   +    ++                
Sbjct: 62  EAAAAEAEEEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAE--------AAARR 113

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
           P   + +  K   K K+K+      +  K     K +   +   + +  K K+K + K
Sbjct: 114 PKAKKAAKKKKGPKPKKKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEK 171



 Score = 44.8 bits (107), Expect = 1e-04
 Identities = 23/166 (13%), Positives = 63/166 (37%), Gaps = 1/166 (0%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSS-KEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
             +      +  EK K+   + KS  S+ +E+E  K  +K +  +E+K +   +   + +
Sbjct: 10  AKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAE 69

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           E+   +  +    ++  E     +   + ++  +   + +   +  A  +      P P 
Sbjct: 70  EEAKAEAAAAAPAEEAAEAAAAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKGPKPK 129

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
               ++   +  ++ K  +         + + KKK +   +  P+E
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEKIPRE 175



 Score = 38.7 bits (91), Expect = 0.008
 Identities = 30/236 (12%), Positives = 70/236 (29%), Gaps = 45/236 (19%)

Query: 49  KDKKDKDRDKEKEKEKKD-KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
              +  +  KE     K+  EK K      +     V  +E  ++E+K +     K + +
Sbjct: 2   SKVRVYELAKELGVSSKELLEKLKELGIEVKSHSSTVEEEEARKEEAKREAEEEAKAEAE 61

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           E    + +     +       +E  E   +  +                         PA
Sbjct: 62  EAAAAEAEEEAKAEAAAAAPAEEAAEAAAAAEA----------------------AARPA 99

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
                +       + K K+++    K    K KKK          ++    K +   +  
Sbjct: 100 EDEAARPAEAAARRPKAKKAAK---KKKGPKPKKKKPKRKAARGGKRGKGGKGRRRRRGR 156

Query: 228 AGPKCYPEVGGIYILLRSKKNKTVQERLAEQFK-------DELFDRLKNEQADILQ 276
              +            + KK +   E++  +          EL +++  + A++++
Sbjct: 157 RRRR------------KKKKKQKPTEKIPREVVIPETITVAELAEKMAVKAAEVIK 200



 Score = 38.7 bits (91), Expect = 0.009
 Identities = 25/162 (15%), Positives = 50/162 (30%), Gaps = 20/162 (12%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
             K       ++E  KE+  +E ++ A +  E+     + +E + + +    +    E  
Sbjct: 30  EVKSHSSTVEEEEARKEEAKREAEEEAKAEAEEAAAAEAEEEAKAEAAAAAPAEEAAEAA 89

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
              +     +     +  E      K +K +K                        P P 
Sbjct: 90  AAAEAAARPAEDEAARPAEAAARRPKAKKAAKKKKG--------------------PKPK 129

Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDK 208
              P +K+    K  +  K       +  + K KKK K  +K
Sbjct: 130 KKKPKRKAARGGKRGKGGKGRRRRRGRRRRRKKKKKQKPTEK 171


>gnl|CDD|219838 pfam08432, DUF1742, Fungal protein of unknown function (DUF1742).
           This is a family of fungal proteins of unknown function.
          Length = 182

 Score = 44.7 bits (106), Expect = 3e-05
 Identities = 25/65 (38%), Positives = 38/65 (58%), Gaps = 2/65 (3%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           KK K+  +E EK KK+ E+ +     K+K K K   K+K+ K+   K+  SEK+ +KE +
Sbjct: 62  KKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKK-KDKDKD-KKDDKKDDKSEKKDEKEAE 119

Query: 111 DKKEK 115
           DK E 
Sbjct: 120 DKLED 124



 Score = 43.1 bits (102), Expect = 1e-04
 Identities = 23/77 (29%), Positives = 37/77 (48%), Gaps = 3/77 (3%)

Query: 45  SSSKKDKKDKDR---DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
              KK+ ++K +    K+K K+KKDK+KDK      +K + K   + +++ E   K  S 
Sbjct: 72  EKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSE 131

Query: 102 EKEKKKEKKDKKEKSHK 118
                 E K +K   HK
Sbjct: 132 TLSTLSELKPRKYALHK 148



 Score = 42.0 bits (99), Expect = 2e-04
 Identities = 23/85 (27%), Positives = 41/85 (48%), Gaps = 6/85 (7%)

Query: 70  DKSAVSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           D     +K+K+K   +++   +KE +E +  +   +K KKK+ KDK +K  K  DK    
Sbjct: 54  DAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKS--- 110

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSK 151
           +K ++KE ++         S   S 
Sbjct: 111 EKKDEKEAEDKLEDLTKSYSETLST 135



 Score = 41.2 bits (97), Expect = 5e-04
 Identities = 24/86 (27%), Positives = 44/86 (51%), Gaps = 9/86 (10%)

Query: 58  KEKEKEKKDKE-KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           +  E +KK KE  ++     KE E+ +   K K +K+   K+   +K+KK +KKD K   
Sbjct: 56  EYTEAKKKKKELAEEIEKVKKEYEEKQ---KWKWKKKKSKKKKDKDKDKKDDKKDDKS-- 110

Query: 117 HKHKDKDRERDKDEKKEQKESKSSSK 142
              + KD +  +D+ ++  +S S + 
Sbjct: 111 ---EKKDEKEAEDKLEDLTKSYSETL 133



 Score = 33.9 bits (78), Expect = 0.12
 Identities = 25/95 (26%), Positives = 42/95 (44%), Gaps = 8/95 (8%)

Query: 84  VSSKEKERKESKPKESSSEKEKKK---EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
           +   E    + K KE + E EK K   E+K K +   K   K +++DKD+K ++K+ KS 
Sbjct: 52  IYDAEYTEAKKKKKELAEEIEKVKKEYEEKQKWKWKKKKSKKKKDKDKDKKDDKKDDKSE 111

Query: 141 SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
            K    + +  E  + S           T ++  P
Sbjct: 112 KKDEKEAEDKLEDLTKS-----YSETLSTLSELKP 141



 Score = 31.6 bits (72), Expect = 0.64
 Identities = 13/44 (29%), Positives = 20/44 (45%), Gaps = 1/44 (2%)

Query: 43  NSSSSKKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVS 85
                K  K DK  DK EK+ EK+ ++K +    S  +    +S
Sbjct: 94  KKDKDKDKKDDKKDDKSEKKDEKEAEDKLEDLTKSYSETLSTLS 137


>gnl|CDD|240274 PTZ00112, PTZ00112, origin recognition complex 1 protein;
           Provisional.
          Length = 1164

 Score = 46.9 bits (111), Expect = 3e-05
 Identities = 35/146 (23%), Positives = 57/146 (39%), Gaps = 24/146 (16%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD----- 57
           +++ SSSSSS +           + +  S+ TS  +    + SS    K  K+       
Sbjct: 120 HNLDSSSSSSISSSL-------TNISFFSSPTSIYSCLSNSLSSKHSPKVIKENQSTHVN 172

Query: 58  -----KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
                  + KE  +K+  K    +     DK+    +     K   +   KEK KEK   
Sbjct: 173 ISSDNSPRNKEISNKQLKKQTNVTHTTCYDKMRRSPRNTSTIKNNTNDKNKEKNKEKD-- 230

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESK 138
                K+  KDR+ DK  K+  ++SK
Sbjct: 231 -----KNIKKDRDGDKQTKRNSEKSK 251



 Score = 46.1 bits (109), Expect = 5e-05
 Identities = 48/232 (20%), Positives = 88/232 (37%), Gaps = 20/232 (8%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
           YS  S+S SS    SP   K+  S+ +      SS ++P N   S K  K +        
Sbjct: 147 YSCLSNSLSSKH--SPKVIKENQSTHV----NISSDNSPRNKEISNKQLKKQTNVTHTTC 200

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
             K +   ++  + K    DK   K KE+ ++  K+   +K+ K+   +K +  + H D 
Sbjct: 201 YDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNIKKDRDGDKQTKR-NSEKSKVQNSHFDV 259

Query: 123 ------DRERDKDEKKEQKESKSSSKIVSSSHN-SKEPASGSQLISHPPPPAPTPTQKSP 175
                  +E  KDEK      +SS  +   S    K+    S          P       
Sbjct: 260 RILRSYTKENKKDEKNVVSGIRSSVLLKRKSQCLRKDSYVYSNHQKKAKTGDPKNIIHRN 319

Query: 176 VKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
             +     +  SS+ H   ++  ++      + ++P +K   +K   + K++
Sbjct: 320 NGSSNSNNDDTSSSNHLGSNRISNR------NPSSPYKKQTTTKHTNNTKNN 365



 Score = 46.1 bits (109), Expect = 5e-05
 Identities = 56/269 (20%), Positives = 108/269 (40%), Gaps = 32/269 (11%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE--- 63
           SS +S       +K   K ++   +T       +P N+S+ K +  DK+++K KEK+   
Sbjct: 174 SSDNSPRNKEISNKQLKKQTNVTHTTCYDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNI 233

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-----------KKEKKDK 112
           KKD++ DK    + EK K + S  +     S  KE+  +++            K++ +  
Sbjct: 234 KKDRDGDKQTKRNSEKSKVQNSHFDVRILRSYTKENKKDEKNVVSGIRSSVLLKRKSQCL 293

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
           ++ S+ + +  ++    + K      + S   ++   S     GS  IS+  P +P    
Sbjct: 294 RKDSYVYSNHQKKAKTGDPKNIIHRNNGSSNSNNDDTSSSNHLGSNRISNRNPSSP---- 349

Query: 173 KSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKC 232
                      +K+++T H  ++K+    K K   K N   +   +  K   +SS  P  
Sbjct: 350 ----------YKKQTTTKHTNNTKNNKYNKTKTTQKFNHPLRHHATINK---RSSMLP-M 395

Query: 233 YPEVGGIYILLRSKKNKTVQERLAEQFKD 261
             + G           +   E +A+  KD
Sbjct: 396 SEQKGRGASEKSEYIKEFTMEEVAKLTKD 424



 Score = 30.7 bits (69), Expect = 2.5
 Identities = 34/177 (19%), Positives = 63/177 (35%), Gaps = 20/177 (11%)

Query: 55  DRDKEKEKEKKDKEKDKSAVSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEKKD 111
           D ++  +   K+ +   + + + +KEK   D  SS       +     SS         +
Sbjct: 93  DLNERSKTPIKNNDNVTTPIKANKKEKHNLDSSSSSSISSSLTNISFFSSPTSIYSCLSN 152

Query: 112 KKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPT 171
                H              K  KE++S+   +SS ++ +     ++ +        T  
Sbjct: 153 SLSSKHS------------PKVIKENQSTHVNISSDNSPRNKEISNKQLKKQTNVTHTTC 200

Query: 172 QKSPVKTKEKEKEKESSTTHDKHSKHKHK----KKDKHGDKTNPKEKDAKSKEKESH 224
                ++       +++T      K+K K    KKD+ GDK   K    KSK + SH
Sbjct: 201 YDKMRRSPRNTSTIKNNTNDKNKEKNKEKDKNIKKDRDGDKQT-KRNSEKSKVQNSH 256


>gnl|CDD|240388 PTZ00372, PTZ00372, endonuclease 4-like protein; Provisional.
          Length = 413

 Score = 44.7 bits (106), Expect = 9e-05
 Identities = 30/132 (22%), Positives = 51/132 (38%), Gaps = 8/132 (6%)

Query: 12  SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
           S    +  K+K +  S I S     S    +  S  +         K+K+++K ++ K K
Sbjct: 10  SFFSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKK 69

Query: 72  SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
           S    K+K++ K    E E K           +  K+ K  K+K  K K  +   +   K
Sbjct: 70  SEKKKKKKKEKKEPKSEGETKL--------GFKTPKKSKKTKKKPPKPKPNEDVDNAFNK 121

Query: 132 KEQKESKSSSKI 143
             +   KS+  I
Sbjct: 122 IAELAEKSNVYI 133



 Score = 41.6 bits (98), Expect = 0.001
 Identities = 31/121 (25%), Positives = 55/121 (45%), Gaps = 8/121 (6%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPST--STSSSTSNPTNSSSSKKDKK------DKDRD 57
             S ++  +   P      +   +     ST S   N   ++S+KKDKK       K + 
Sbjct: 11  FFSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKS 70

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           ++K+K+KK+K++ KS   +K   K    SK+ ++K  KPK +        +  +  EKS+
Sbjct: 71  EKKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSN 130

Query: 118 K 118
            
Sbjct: 131 V 131



 Score = 40.9 bits (96), Expect = 0.001
 Identities = 27/122 (22%), Positives = 45/122 (36%), Gaps = 3/122 (2%)

Query: 26  SSAIPSTSTSSSTSNP-TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV 84
            S     S     S   +N     K+      ++E +      +KDK    + E +K   
Sbjct: 12  FSGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSE 71

Query: 85  SSKE-KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE-RDKDEKKEQKESKSSSK 142
             K+ K+ K+    E  ++   K  KK KK K    K K  E  D    K  + ++ S+ 
Sbjct: 72  KKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSNV 131

Query: 143 IV 144
            +
Sbjct: 132 YI 133



 Score = 39.7 bits (93), Expect = 0.003
 Identities = 29/119 (24%), Positives = 44/119 (36%), Gaps = 6/119 (5%)

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
           +K K    S       V SKE     S+ +   +    KK+KK+ K    K K + +++ 
Sbjct: 17  QKSKLQPISYIYSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKK 76

Query: 128 KDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
           K EKKE K    +     +   SK+          PP P P     +      +  EK 
Sbjct: 77  KKEKKEPKSEGETKLGFKTPKKSKKTK------KKPPKPKPNEDVDNAFNKIAELAEKS 129



 Score = 33.2 bits (76), Expect = 0.38
 Identities = 18/111 (16%), Positives = 36/111 (32%), Gaps = 7/111 (6%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
           YS     S         +     +++          +     S  KK KK + ++ + E 
Sbjct: 28  YSNVLVLSKEILSTFSEEENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEG 87

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           E K   K         K+  K   K  + K ++  +++  K  +  +K   
Sbjct: 88  ETKLGFK-------TPKKSKKTKKKPPKPKPNEDVDNAFNKIAELAEKSNV 131



 Score = 33.2 bits (76), Expect = 0.43
 Identities = 26/132 (19%), Positives = 49/132 (37%), Gaps = 8/132 (6%)

Query: 31  STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
           S +T  S   P +   S      K+      +E+  K    S    K+++K+  S K+ E
Sbjct: 13  SGTTQKSKLQPISYIYSNVLVLSKEILSTFSEEEN-KVATTSTKKDKKEDKNNESKKKSE 71

Query: 91  RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNS 150
           +K+ K       K++KKE K + E     K   + +   +K  + +         +    
Sbjct: 72  KKKKK-------KKEKKEPKSEGETKLGFKTPKKSKKTKKKPPKPKPNEDVDNAFNKIAE 124

Query: 151 KEPASGSQLISH 162
               S   + +H
Sbjct: 125 LAEKSNVYIGAH 136



 Score = 32.4 bits (74), Expect = 0.71
 Identities = 16/65 (24%), Positives = 25/65 (38%)

Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
                  S  K K+++K  ES    +K  K K +KK+   +           K K++ K 
Sbjct: 46  ENKVATTSTKKDKKEDKNNESKKKSEKKKKKKKEKKEPKSEGETKLGFKTPKKSKKTKKK 105

Query: 227 SAGPK 231
              PK
Sbjct: 106 PPKPK 110


>gnl|CDD|114011 pfam05262, Borrelia_P83, Borrelia P83/100 protein.  This family
           consists of several Borrelia P83/P100 antigen proteins.
          Length = 489

 Score = 44.6 bits (105), Expect = 1e-04
 Identities = 27/95 (28%), Positives = 46/95 (48%), Gaps = 5/95 (5%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            +D  DK RD+ ++K+++ K   K A +S  KE  +V+  +K   E    E      KK 
Sbjct: 239 AQDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEI-----KKN 293

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           +++  K K HK  D  +E    EK+ + +   + K
Sbjct: 294 DEEALKAKDHKAFDLKQESKASEKEAEDKELEAQK 328



 Score = 41.5 bits (97), Expect = 0.001
 Identities = 31/104 (29%), Positives = 58/104 (55%), Gaps = 9/104 (8%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-KPKESSSEKEKK 106
           +  +  ++ DK++    K ++K   A  + +K++D+V  K++E K   KP ++SS KE K
Sbjct: 214 RAQQLKEELDKKQIDADKAQQKADFAQDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDK 273

Query: 107 K---EKKDKKEKSH---KHKDKDRERDKDEKKE--QKESKSSSK 142
           +    +K + EK+    K  D++  + KD K    ++ESK+S K
Sbjct: 274 QVAENQKREIEKAQIEIKKNDEEALKAKDHKAFDLKQESKASEK 317



 Score = 41.1 bits (96), Expect = 0.001
 Identities = 24/111 (21%), Positives = 49/111 (44%), Gaps = 3/111 (2%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           +SS K+ K    ++++E EK   E  K+   + + +  K    ++E K S+ +    E E
Sbjct: 266 TSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDHKAFDLKQESKASEKEAEDKELE 325

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS---KIVSSSHNSKE 152
            +K+++   E   K K +   +     ++  +S +     K+V    N  E
Sbjct: 326 AQKKREPVAEDLQKTKPQVEAQPTSLNEDAIDSSNPVYGLKVVDPITNLSE 376



 Score = 39.2 bits (91), Expect = 0.006
 Identities = 30/182 (16%), Positives = 71/182 (39%), Gaps = 4/182 (2%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
            K  +  +E  +K     +     KE+E  + + + ++ KE   K+     + +++    
Sbjct: 180 KKVVEALREDNEKGVNFRRDMTDLKERESQEDAKRAQQLKEELDKKQIDADKAQQKADFA 239

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS--SHNSKEPASGSQLISHPPPPAPTP 170
           ++ + K +D+ R++ ++ K   K + +SS       + N K     +Q+           
Sbjct: 240 QDNADKQRDEVRQKQQEAKNLPKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALK 299

Query: 171 TQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD--KTNPKEKDAKSKEKESHKSSA 228
            +       ++E +       DK  + + K++    D  KT P+ +   +   E    S+
Sbjct: 300 AKDHKAFDLKQESKASEKEAEDKELEAQKKREPVAEDLQKTKPQVEAQPTSLNEDAIDSS 359

Query: 229 GP 230
            P
Sbjct: 360 NP 361



 Score = 39.2 bits (91), Expect = 0.006
 Identities = 33/191 (17%), Positives = 68/191 (35%), Gaps = 39/191 (20%)

Query: 43  NSSSSKKDKKDKDRDKEKEKEKKDK--EKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
           N S    D     +  E  +E  +K     +     KE+E     S+E  ++  + KE  
Sbjct: 168 NVSDVDTDSISDKKVVEALREDNEKGVNFRRDMTDLKERE-----SQEDAKRAQQLKEEL 222

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
            +K+   +K  +++      + D++RD+  +K+Q+                         
Sbjct: 223 DKKQIDADKA-QQKADFAQDNADKQRDEVRQKQQEAKNL--------------------- 260

Query: 161 SHPPPPAPTPTQKSPVKTK---EKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAK 217
                P P  T       +    +++E E +    K +  +  K   H  K    ++++K
Sbjct: 261 -----PKPADTSSPKEDKQVAENQKREIEKAQIEIKKNDEEALKAKDH--KAFDLKQESK 313

Query: 218 SKEKESHKSSA 228
           + EKE+     
Sbjct: 314 ASEKEAEDKEL 324


>gnl|CDD|219746 pfam08208, RNA_polI_A34, DNA-directed RNA polymerase I subunit
           RPA34.5.  This is a family of proteins conserved from
           yeasts to human. Subunit A34.5 of RNA polymerase I is a
           non-essential subunit which is thought to help Pol I
           overcome topological constraints imposed on ribosomal
           DNA during the process of transcription.
          Length = 193

 Score = 42.8 bits (101), Expect = 1e-04
 Identities = 23/61 (37%), Positives = 33/61 (54%), Gaps = 2/61 (3%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
           S KE         E E +E K K+   +KE KKEKK+KK+K  K  +    + K +KK++
Sbjct: 135 SEKETTAKVEKEAEVEEEEKKEKKK--KKEVKKEKKEKKDKKEKMVEPKGSKKKKKKKKK 192

Query: 135 K 135
           K
Sbjct: 193 K 193



 Score = 40.9 bits (96), Expect = 6e-04
 Identities = 20/65 (30%), Positives = 34/65 (52%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
            +   +  + EKE      KE+  E+E+KKEKK KKE   + K+K  +++K  + +  + 
Sbjct: 126 SELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGSKK 185

Query: 138 KSSSK 142
           K   K
Sbjct: 186 KKKKK 190



 Score = 40.9 bits (96), Expect = 7e-04
 Identities = 27/72 (37%), Positives = 40/72 (55%), Gaps = 7/72 (9%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
            S S+  +K+     EKE E +++EK       KEK+K K   KEK+ K+ K ++    K
Sbjct: 129 GSESETSEKETTAKVEKEAEVEEEEK-------KEKKKKKEVKKEKKEKKDKKEKMVEPK 181

Query: 104 EKKKEKKDKKEK 115
             KK+KK KK+K
Sbjct: 182 GSKKKKKKKKKK 193



 Score = 38.5 bits (90), Expect = 0.004
 Identities = 22/86 (25%), Positives = 43/86 (50%), Gaps = 10/86 (11%)

Query: 25  DSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV 84
             +     S   S S  +   ++ K +K+ + ++E++KEKK K++ K     K+ +K+K+
Sbjct: 118 YGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKM 177

Query: 85  SSKEKERKESKPKESSSEKEKKKEKK 110
              +  +K          K+KKK+KK
Sbjct: 178 VEPKGSKK----------KKKKKKKK 193



 Score = 34.7 bits (80), Expect = 0.070
 Identities = 22/89 (24%), Positives = 43/89 (48%), Gaps = 10/89 (11%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
           PT   +      +   + E  +++   + +K A   +E++K+K   K+KE K        
Sbjct: 115 PTGYGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEK--KKKKEVK-------- 164

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
            EK++KK+KK+K  +    K K +++ K 
Sbjct: 165 KEKKEKKDKKEKMVEPKGSKKKKKKKKKK 193



 Score = 34.7 bits (80), Expect = 0.075
 Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%)

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
             D        + +        + ++    E   +KEKKK+K+ KKEK  K KDK  +  
Sbjct: 120 APDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEK-KDKKEKMV 178

Query: 128 KDEKKEQKESKSSSK 142
           + +  ++K+ K   K
Sbjct: 179 EPKGSKKKKKKKKKK 193



 Score = 32.8 bits (75), Expect = 0.29
 Identities = 18/73 (24%), Positives = 31/73 (42%)

Query: 11  SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKD 70
             +    P +   +  ++   T+              K+ KK K+  KEK+++K  KEK 
Sbjct: 118 YGAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKM 177

Query: 71  KSAVSSKEKEKDK 83
                SK+K+K K
Sbjct: 178 VEPKGSKKKKKKK 190



 Score = 31.2 bits (71), Expect = 1.0
 Identities = 19/69 (27%), Positives = 32/69 (46%), Gaps = 3/69 (4%)

Query: 87  KEKERKESKPKESSSEKEKKKEK--K-DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
                  S+ + S  E   K EK  + +++EK  K K K+ +++K EKK++KE     K 
Sbjct: 123 GPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKG 182

Query: 144 VSSSHNSKE 152
                  K+
Sbjct: 183 SKKKKKKKK 191



 Score = 30.1 bits (68), Expect = 2.6
 Identities = 16/68 (23%), Positives = 36/68 (52%), Gaps = 1/68 (1%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
            + +    ++ S+ +  ++    +   E E ++E+K +K+K  K   K+++  KD+K++ 
Sbjct: 119 GAPDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKK-KKEVKKEKKEKKDKKEKM 177

Query: 135 KESKSSSK 142
            E K S K
Sbjct: 178 VEPKGSKK 185



 Score = 29.7 bits (67), Expect = 3.1
 Identities = 14/63 (22%), Positives = 28/63 (44%)

Query: 165 PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESH 224
           PP+   ++    + +   K ++ +   ++  K K KKK+   +K   K+K  K  E +  
Sbjct: 124 PPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVEPKGS 183

Query: 225 KSS 227
           K  
Sbjct: 184 KKK 186



 Score = 28.1 bits (63), Expect = 9.8
 Identities = 15/65 (23%), Positives = 25/65 (38%)

Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
           AP           E  +++ ++    +    + +KK+K   K   KEK  K  +KE    
Sbjct: 120 APDGPPSELGSESETSEKETTAKVEKEAEVEEEEKKEKKKKKEVKKEKKEKKDKKEKMVE 179

Query: 227 SAGPK 231
             G K
Sbjct: 180 PKGSK 184


>gnl|CDD|235640 PRK05901, PRK05901, RNA polymerase sigma factor; Provisional.
          Length = 509

 Score = 44.2 bits (105), Expect = 1e-04
 Identities = 26/171 (15%), Positives = 52/171 (30%), Gaps = 14/171 (8%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIP--STSTSSSTSNPTNSSSSKKDKKDKDRDKE 59
           +    +      A  S  K  ++    +   S     +     +    KK K        
Sbjct: 29  SKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMVKDTDDATESDIPKKKTKTAAKAAAA 88

Query: 60  KEKEKK------DKEKDKSAVSSKEKEKDKVSSKEKERKESKPK------ESSSEKEKKK 107
           K   KK      D  K     ++ +K+ D    K+ +             +   + +   
Sbjct: 89  KAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKDIDVLNQADDDDDDDDDDDLDDDDID 148

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           +  D ++      D D + + +EKKE KE +  S       +  +  +  Q
Sbjct: 149 DDDDDEDDDEDDDDDDVDDEDEEKKEAKELEKLSDDDDFVWDEDDSEALRQ 199



 Score = 40.7 bits (96), Expect = 0.002
 Identities = 19/124 (15%), Positives = 41/124 (33%), Gaps = 1/124 (0%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
               K   A    +          S S     K++ ++  + K+K  ++ D+  +     
Sbjct: 3   TASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGM 62

Query: 80  EKDKVSSKE-KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
            KD   + E    K+     + +   K   KK  K++    K  +++   D+  +    K
Sbjct: 63  VKDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVK 122

Query: 139 SSSK 142
               
Sbjct: 123 DIDV 126



 Score = 39.2 bits (92), Expect = 0.006
 Identities = 17/126 (13%), Positives = 37/126 (29%), Gaps = 5/126 (3%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE-----KDKV 84
            ST    +          K   K K +    ++E K+  + K     +  +        V
Sbjct: 4   ASTKAELAAEEEAKKKLKKLAAKSKSKGFITKEEIKEALESKKKTPEQIDQVLIFLSGMV 63

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
              +   +   PK+ +    K    K   +K  K +    ++ + +    K+   +    
Sbjct: 64  KDTDDATESDIPKKKTKTAAKAAAAKAPAKKKLKDELDSSKKAEKKNALDKDDDLNYVKD 123

Query: 145 SSSHNS 150
               N 
Sbjct: 124 IDVLNQ 129


>gnl|CDD|234017 TIGR02794, tolA_full, TolA protein.  TolA couples the inner
           membrane complex of itself with TolQ and TolR to the
           outer membrane complex of TolB and OprL (also called
           Pal). Most of the length of the protein consists of
           low-complexity sequence that may differ in both length
           and composition from one species to another,
           complicating efforts to discriminate TolA (the most
           divergent gene in the tol-pal system) from paralogs such
           as TonB. Selection of members of the seed alignment and
           criteria for setting scoring cutoffs are based largely
           on conserved operon structure. //The Tol-Pal complex is
           required for maintaining outer membrane integrity. Also
           involved in transport (uptake) of colicins and
           filamentous DNA, and implicated in pathogenesis.
           Transport is energized by the proton motive force. TolA
           is an inner membrane protein that interacts with
           periplasmic TolB and with outer membrane porins ompC,
           phoE and lamB [Transport and binding proteins, Other,
           Cellular processes, Pathogenesis].
          Length = 346

 Score = 43.3 bits (102), Expect = 2e-04
 Identities = 18/91 (19%), Positives = 49/91 (53%), Gaps = 1/91 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           + ++ +K R  E+ ++K+ +++  +  ++K+ E+    ++EK+++  + K      E K 
Sbjct: 76  QAEEAEKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKA-KQAAEAKA 134

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           + + + EK  K + K +  ++ + K   E+K
Sbjct: 135 KAEAEAEKKAKEEAKKQAEEEAKAKAAAEAK 165



 Score = 37.9 bits (88), Expect = 0.013
 Identities = 23/93 (24%), Positives = 39/93 (41%), Gaps = 4/93 (4%)

Query: 48  KKDKKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKES---SSEK 103
           ++    +   K+ E+  K   EK K A  +K K+  +  +K +   E K KE     +E+
Sbjct: 95  EQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQAAEAKAKAEAEAEKKAKEEAKKQAEE 154

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E K +   + +K      K  E +   K E K 
Sbjct: 155 EAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKA 187



 Score = 36.0 bits (83), Expect = 0.043
 Identities = 21/88 (23%), Positives = 36/88 (40%), Gaps = 3/88 (3%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           K   +     E E EKK KE+   A    E+E    ++ E ++K ++ K+ +  + K K 
Sbjct: 127 KQAAEAKAKAEAEAEKKAKEE---AKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKA 183

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           +   K K+ + K K          E   
Sbjct: 184 EAKAKAKAEEAKAKAEAAKAKAAAEAAA 211



 Score = 35.2 bits (81), Expect = 0.087
 Identities = 22/135 (16%), Positives = 56/135 (41%), Gaps = 11/135 (8%)

Query: 8   SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
            S   S  P P    +   + +      +  +N          KK+++R K+ E++ ++ 
Sbjct: 21  GSLYHSVKPEPGGGGEIIQAVLVDPGAVAQQANRIQQQKKPAAKKEQERQKKLEQQAEEA 80

Query: 68  EKDKSA-------VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
           EK ++A       +  +   +      E+  K+++ K+  +E+ K K+  + K       
Sbjct: 81  EKQRAAEQARQKELEQRAAAEKAAKQAEQAAKQAEEKQKQAEEAKAKQAAEAK----AKA 136

Query: 121 DKDRERDKDEKKEQK 135
           + + E+   E+ +++
Sbjct: 137 EAEAEKKAKEEAKKQ 151



 Score = 32.5 bits (74), Expect = 0.61
 Identities = 22/90 (24%), Positives = 40/90 (44%), Gaps = 1/90 (1%)

Query: 48  KKDKKDKDRD-KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            K K + + + K KE+ KK  E++  A ++ E +K    +K+K   E+K K  +  K K 
Sbjct: 132 AKAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKAKAKA 191

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           +E K K E +      +     + +     
Sbjct: 192 EEAKAKAEAAKAKAAAEAAAKAEAEAAAAA 221



 Score = 32.1 bits (73), Expect = 0.83
 Identities = 24/88 (27%), Positives = 47/88 (53%), Gaps = 1/88 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K  ++K +  E+ K K+  E  K+   ++ ++K K  +K++  +E+K K ++  K+K  
Sbjct: 111 AKQAEEKQKQAEEAKAKQAAEA-KAKAEAEAEKKAKEEAKKQAEEEAKAKAAAEAKKKAA 169

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
           E K K E   K K + + + K E+ + K
Sbjct: 170 EAKKKAEAEAKAKAEAKAKAKAEEAKAK 197



 Score = 30.6 bits (69), Expect = 2.7
 Identities = 19/88 (21%), Positives = 35/88 (39%), Gaps = 3/88 (3%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           KK K++  +  E+E + K   + K       + K K  ++ K + E+K K  + E + K 
Sbjct: 142 KKAKEEAKKQAEEEAKAKAAAEAKKK---AAEAKKKAEAEAKAKAEAKAKAKAEEAKAKA 198

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
           E    K  +      + E       E +
Sbjct: 199 EAAKAKAAAEAAAKAEAEAAAAAAAEAE 226



 Score = 29.4 bits (66), Expect = 5.6
 Identities = 17/87 (19%), Positives = 38/87 (43%), Gaps = 4/87 (4%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK-ERKESKPKESSSEKEKKKEK 109
           K+ ++  K K   +  K+  ++   ++ + K K  +K K + +E+K K   +E  K K  
Sbjct: 150 KQAEEEAKAKAAAEAKKKAAEAKKKAEAEAKAKAEAKAKAKAEEAKAK---AEAAKAKAA 206

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKE 136
            +   K+          + + K ++ E
Sbjct: 207 AEAAAKAEAEAAAAAAAEAERKADEAE 233



 Score = 29.4 bits (66), Expect = 6.0
 Identities = 12/61 (19%), Positives = 24/61 (39%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K K + +   + E + K K ++  A +   K K    +  K   E+    ++  + K  
Sbjct: 171 AKKKAEAEAKAKAEAKAKAKAEEAKAKAEAAKAKAAAEAAAKAEAEAAAAAAAEAERKAD 230

Query: 108 E 108
           E
Sbjct: 231 E 231


>gnl|CDD|218440 pfam05110, AF-4, AF-4 proto-oncoprotein.  This family consists of
           AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental
           retardation syndrome) nuclear proteins. These proteins
           have been linked to human diseases such as acute
           lymphoblastic leukaemia and mental retardation. The
           family also contains a Drosophila AF4 protein homologue
           Lilliputian which contains an AT-hook domain.
           Lilliputian represents a novel pair-rule gene that acts
           in cytoskeleton regulation, segmentation and
           morphogenesis in Drosophila.
          Length = 1154

 Score = 43.8 bits (103), Expect = 3e-04
 Identities = 51/252 (20%), Positives = 82/252 (32%), Gaps = 34/252 (13%)

Query: 4   SVKSSSSSSSAHPS-------PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR 56
           S  S S    + P           +K+  S A   T    S+   + +  SK       R
Sbjct: 625 SSSSDSPEDESLPPSSQSPGNTESSKE--SCASLRTPVCRSSVG-SQNDLSKDRLLSPMR 681

Query: 57  DKEKEKEKKDKEKDK-----------SAVSSKEKEKDKVSSKEKERKESKP-KESSSEKE 104
           + E     +D E+             S +     +K       ++   S P K++S    
Sbjct: 682 ETELLSPLRDSEERYSLWVKIDLDLLSRIPGHPYKKGVPPKPAEKDSLSAPKKQTSKTAS 741

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP- 163
           +K   K K+    KHK+ +     + KK++ E KSSS   SSS +    +S  +      
Sbjct: 742 EKSSSKGKR----KHKNDEEADKIESKKQRLEEKSSSCSPSSSSSHHHSSSNKESRKSSR 797

Query: 164 -------PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
                  P P+   +  SP       K           S        K   K++   K  
Sbjct: 798 NKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTSSSSGPFSASSTKSSSKSSSTSKHR 857

Query: 217 KSKEKESHKSSA 228
           K++ K S  S  
Sbjct: 858 KTEGKGSSTSKE 869



 Score = 43.0 bits (101), Expect = 4e-04
 Identities = 48/188 (25%), Positives = 70/188 (37%), Gaps = 30/188 (15%)

Query: 12  SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
                 P K  +KDS + P   TS + S  ++S   +K K D++ DK + K+++ +EK  
Sbjct: 714 PYKKGVPPKPAEKDSLSAPKKQTSKTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSS 773

Query: 72  S-----AVSSKEKEKDKVSSKEKERKESK----PKESSSEKEKKKEKKDKKEK------- 115
           S     + S      +K S K    KE +    P    S    K E   +K         
Sbjct: 774 SCSPSSSSSHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDTS 833

Query: 116 ---------SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
                    S K   K     K  K E K S +S +   SS ++   AS     S P PP
Sbjct: 834 SSSGPFSASSTKSSSKSSSTSKHRKTEGKGSSTSKEHKGSSGDTPNKAS-----SFPVPP 888

Query: 167 APTPTQKS 174
               + K 
Sbjct: 889 LSNGSSKP 896



 Score = 38.7 bits (90), Expect = 0.010
 Identities = 50/248 (20%), Positives = 82/248 (33%), Gaps = 27/248 (10%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD-KKDKDRDKEKEKEKK 65
           SSS  S    +  K   +++     +S     ++ + SSSS    +     D E E    
Sbjct: 363 SSSEDSDEEQATEKPPSRNTPPSAPSSNPEPAASSSGSSSSSSGSESSSGSDSESESSSS 422

Query: 66  DKEKDK-SAVSSKEKEK------------DKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
           D E+++    +S E E             +KV+  +    ES       ++  +KE K K
Sbjct: 423 DSEENEPPRTASPEPEPPSTNKWQLDNWLNKVNPHKVSPAESVSSNPPIKQPMEKEGKVK 482

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS-------SSHNSKEPASGSQLISHPP- 164
              S  H +      K   KE++  +++ K          S   S+ P     +    P 
Sbjct: 483 SSGSQYHPESKEPPPKSSSKEKRRPRTAQKGPESGRGKQKSPAQSEAPPQRRTVGKKQPK 542

Query: 165 -PPAPTPTQKSPVKTKEKE----KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
            P   +   +      E E        S  T     K   K   K   +  PK     + 
Sbjct: 543 KPEKASAGDERTGLRPESEPGTLPYGSSVQTPPDRPKAATKGSRKPSPRKEPKSSVPPAA 602

Query: 220 EKESHKSS 227
           EK  +KS 
Sbjct: 603 EKRKYKSP 610



 Score = 35.7 bits (82), Expect = 0.082
 Identities = 28/105 (26%), Positives = 43/105 (40%), Gaps = 1/105 (0%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
           SS S SS+    H + +K+S             +P++  SS   K +    K   +++  
Sbjct: 773 SSCSPSSSSSHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSSSPKPEHPSRKRPRRQEDT 832

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
                   +S  K   K SS  K RK ++ K SS+ KE K    D
Sbjct: 833 SSSSGPFSASSTKSSSKSSSTSKHRK-TEGKGSSTSKEHKGSSGD 876



 Score = 31.4 bits (71), Expect = 1.7
 Identities = 44/182 (24%), Positives = 67/182 (36%), Gaps = 23/182 (12%)

Query: 54  KDRDKEKEKEKKDKEKDKSA-VSSKEKEK---DKVSSKEKERKESKPKESSSEKEKKKEK 109
                     KK   K  S   SSK K K   D+ + K + +K+   ++SSS        
Sbjct: 723 PAEKDSLSAPKKQTSKTASEKSSSKGKRKHKNDEEADKIESKKQRLEEKSSSCSPSSSS- 781

Query: 110 KDKKEKSHKHKDKDRE-RDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
                 SH H   ++E R     KE++   S S  +SSS  S +P   S+    P     
Sbjct: 782 ------SHHHSSSNKESRKSSRNKEEEMLPSPSSPLSSS--SPKPEHPSR--KRPRRQED 831

Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
           T +   P            S++    +  KH+K +  G  T+ KE    S +  +  SS 
Sbjct: 832 TSSSSGP-----FSASSTKSSSKSSSTS-KHRKTEGKGSSTS-KEHKGSSGDTPNKASSF 884

Query: 229 GP 230
             
Sbjct: 885 PV 886


>gnl|CDD|113514 pfam04747, DUF612, Protein of unknown function, DUF612.  This
           family includes several uncharacterized proteins from
           Caenorhabditis elegans.
          Length = 517

 Score = 42.0 bits (97), Expect = 8e-04
 Identities = 43/179 (24%), Positives = 82/179 (45%), Gaps = 1/179 (0%)

Query: 16  PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS 75
           P+   ++ K++ A    +           S  KK +K   +D E E++   K+  +    
Sbjct: 44  PNSINDQRKEAFASLELTEQPQQVEKVKKSEKKKAQKQIAKDHEAEQKVNAKKAAEKEAR 103

Query: 76  SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD-KKEKSHKHKDKDRERDKDEKKEQ 134
             E E  K +++E+E K+ K ++   +KE++K++ D KK ++ K K+K  + +K EK E+
Sbjct: 104 RAEAEAKKRAAQEEEHKQWKAEQERIQKEQEKKEADLKKLQAEKKKEKAVKAEKAEKAEK 163

Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
            +  S+   V      K+ A+       P P  PT T   P +  ++   K++     K
Sbjct: 164 TKKASTPAPVEEEIVVKKVANDRSAAPAPEPKTPTNTPAEPAEQVQEITGKKNKKNKKK 222


>gnl|CDD|217393 pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the
           protein product of the dentatorubral-pallidoluysian
           atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive
           neurodegenerative disorder. It is caused by the
           expansion of a CAG repeat in the DRPLA gene on
           chromosome 12p. This results in an extended
           polyglutamine region in atrophin-1, that is thought to
           confer toxicity to the protein, possibly through
           altering its interactions with other proteins. The
           expansion of a CAG repeat is also the underlying defect
           in six other neurodegenerative disorders, including
           Huntington's disease. One interaction of expanded
           polyglutamine repeats that is thought to be pathogenic
           is that with the short glutamine repeat in the
           transcriptional coactivator CREB binding protein, CBP.
           This interaction draws CBP away from its usual nuclear
           location to the expanded polyglutamine repeat protein
           aggregates that are characteristic of the polyglutamine
           neurodegenerative disorders. This interferes with
           CBP-mediated transcription and causes cytotoxicity.
          Length = 979

 Score = 42.4 bits (99), Expect = 8e-04
 Identities = 50/193 (25%), Positives = 88/193 (45%), Gaps = 26/193 (13%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKK-DKKDKDRDKEKEKEK 64
           K  ++S     SP  N+D+ SS   S S +S++SN + + S+KK +KK K+   E     
Sbjct: 22  KKQTASPDGRASP-TNEDQRSSGRNSPSAASTSSNDSKAESTKKPNKKIKE---EATSPL 77

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKEKERKE----SKPKESSSEKEKKKEKKDKKEKSHKHK 120
           K  ++ +   +S  +E ++V++K+ + +E    + P E   E E + E  D +  + +  
Sbjct: 78  KSTKRQREKPASDTEEPERVTAKKSKTQELSRPNSPSEGEGEGEGEGESSDSRSVNEEGS 137

Query: 121 DKDRERDKDEKK--------EQKESKSSSKIVSSSHNSKEPAS-----GSQLISHPPPPA 167
              ++ D+D +         +  ES S S         + P S     G+ L    PPP 
Sbjct: 138 SDPKDIDQDNRSSSPSIPSPQDNESDSDSSAQQQLLQPQGPPSIQVPPGAALAPSAPPPT 197

Query: 168 PT----PTQKSPV 176
           P+    P Q SP+
Sbjct: 198 PSAQAVPPQGSPI 210



 Score = 36.2 bits (83), Expect = 0.055
 Identities = 23/80 (28%), Positives = 45/80 (56%), Gaps = 8/80 (10%)

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
           K +K +E + EK K++ ++  +E+  + K+K++ER   E++ ++E++ ++K  SSSH S+
Sbjct: 576 KLAKKREEAVEKAKREAEQKAREEREREKEKEKER---EREREREAERAAKASSSSHESR 632

Query: 152 E-----PASGSQLISHPPPP 166
                         S  PPP
Sbjct: 633 MSEPQLSGPAHMRPSFEPPP 652



 Score = 32.0 bits (72), Expect = 1.0
 Identities = 25/85 (29%), Positives = 47/85 (55%), Gaps = 5/85 (5%)

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           SSK  +++E   +++  E E+K  ++ ++EK  K K+++RER+++ ++  K S SS +  
Sbjct: 574 SSKLAKKREEAVEKAKREAEQKAREEREREKE-KEKEREREREREAERAAKASSSSHE-- 630

Query: 145 SSSHNSKEPASGSQLISHPPPPAPT 169
             S  S+   SG   +     P PT
Sbjct: 631 --SRMSEPQLSGPAHMRPSFEPPPT 653



 Score = 30.4 bits (68), Expect = 3.4
 Identities = 24/63 (38%), Positives = 36/63 (57%), Gaps = 4/63 (6%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK-ESKPKESSSEK 103
           +SSK  KK   R++  EK K++ E+       +EKEK+K   +E+ER+ E   K SSS  
Sbjct: 573 ASSKLAKK---REEAVEKAKREAEQKAREEREREKEKEKEREREREREAERAAKASSSSH 629

Query: 104 EKK 106
           E +
Sbjct: 630 ESR 632


>gnl|CDD|215521 PLN02967, PLN02967, kinase.
          Length = 581

 Score = 42.0 bits (98), Expect = 8e-04
 Identities = 29/110 (26%), Positives = 51/110 (46%), Gaps = 3/110 (2%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVSSKEKERKESKPKESSSE- 102
           S  K  +  K   K+   E  +  ++ S  V +++   DK S K   R   K   +SS+ 
Sbjct: 69  SKKKPTRSVKRATKKTVVEISEPLEEGSELVVNEDAALDKESKKTPRRTRRKAAAASSDV 128

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
           +E+K EKK +K +  K K  +   D+  + E  + + S  + S  + S+E
Sbjct: 129 EEEKTEKKVRKRRKVK-KMDEDVEDQGSESEVSDVEESEFVTSLENESEE 177



 Score = 35.0 bits (80), Expect = 0.12
 Identities = 33/148 (22%), Positives = 57/148 (38%), Gaps = 9/148 (6%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDK-------KDK 54
           A S K   S+ +    P +N    S   P+ S   +T       S   ++       +D 
Sbjct: 46  AGSRKKIESALAVDEEPDEN-GAVSKKKPTRSVKRATKKTVVEISEPLEEGSELVVNEDA 104

Query: 55  DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
             DKE +K  +   +  +A SS  +E+       K RK  K  E   ++  + E  D +E
Sbjct: 105 ALDKESKKTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMDEDVEDQGSESEVSDVEE 164

Query: 115 KSHKHK-DKDRERDKDEKKEQKESKSSS 141
                  + + E + D +K+  E  S +
Sbjct: 165 SEFVTSLENESEEELDLEKDDGEDISHT 192



 Score = 29.2 bits (65), Expect = 6.5
 Identities = 23/124 (18%), Positives = 41/124 (33%), Gaps = 26/124 (20%)

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
           +++ ES         E     K K  +S K   K    +  E  E+              
Sbjct: 49  RKKIESALAVDEEPDENGAVSKKKPTRSVKRATKKTVVEISEPLEE-------------- 94

Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK-----EKEKESSTTHDKHSKHKHKKKD 203
                  GS+L+ +        ++K+P +T+ K        +E  T      + K KK D
Sbjct: 95  -------GSELVVNEDAALDKESKKTPRRTRRKAAAASSDVEEEKTEKKVRKRRKVKKMD 147

Query: 204 KHGD 207
           +  +
Sbjct: 148 EDVE 151


>gnl|CDD|220102 pfam09073, BUD22, BUD22.  BUD22 has been shown in yeast to be a
           nuclear protein involved in bud-site selection. It plays
           a role in positioning the proximal bud pole signal. More
           recently it has been shown to be involved in ribosome
           biogenesis.
          Length = 424

 Score = 41.7 bits (98), Expect = 9e-04
 Identities = 28/141 (19%), Positives = 67/141 (47%), Gaps = 18/141 (12%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
            + + +K K  K+  KS     +K++ K SS + + +ES+ ++ S  +E  ++  D +E+
Sbjct: 142 IETKAKKGKAKKKTKKS-----KKKEAKESSDKDDEEESESEDESKSEESAEDDSDDEEE 196

Query: 116 SHKHKDKDRERDK-------DEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
                +   + D        +E+ E+  S + ++  S S + +  +  S         + 
Sbjct: 197 EDSDSEDYSQYDGMLVDSSDEEEGEEAPSINYNEDTSESESDESDSEIS------ESRSV 250

Query: 169 TPTQKSPVKTKEKEKEKESST 189
           + +++S   +K+ +++K SST
Sbjct: 251 SDSEESSPPSKKPKEKKTSST 271



 Score = 29.8 bits (67), Expect = 4.9
 Identities = 27/173 (15%), Positives = 52/173 (30%), Gaps = 26/173 (15%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
           S   N  +S+ +  + D  +  E       ++ S  S K KEK   S+            
Sbjct: 225 SINYNEDTSESESDESD-SEISESRSVSDSEESSPPSKKPKEKKTSSTFLPSLMGGYFSG 283

Query: 99  SSSEKEKKKEKKDKKEKSHKHKDKDR---------------ERDKDEKKEQKESKSSSKI 143
           S  E +  ++    +      K K+R                  K  KKE+++ +   + 
Sbjct: 284 SEDEDDDDEDIDPDQVVKKPVKRKNRRGQRARQAIWEKKYGSGAKHVKKEREKEQKEREG 343

Query: 144 VSSSHNSKEPA----------SGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
             S   +++            +     S        P +    K K+   +K 
Sbjct: 344 RQSEWEARQAKREGGDAKAGRAAEPTGSRTQQKGDRPKRGEKKKPKKPSVDKP 396



 Score = 29.0 bits (65), Expect = 7.5
 Identities = 23/116 (19%), Positives = 46/116 (39%), Gaps = 18/116 (15%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSA--------VSSKEKEKDK---------VSS 86
           +  SKK +  +  DK+ E+E + +++ KS            +E    +         V S
Sbjct: 155 TKKSKKKEAKESSDKDDEEESESEDESKSEESAEDDSDDEEEEDSDSEDYSQYDGMLVDS 214

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE-KKEQKESKSSS 141
            ++E  E  P  + +E   + E  +   +  + +      +     K+ KE K+SS
Sbjct: 215 SDEEEGEEAPSINYNEDTSESESDESDSEISESRSVSDSEESSPPSKKPKEKKTSS 270


>gnl|CDD|215641 PLN03237, PLN03237, DNA topoisomerase 2; Provisional.
          Length = 1465

 Score = 41.8 bits (98), Expect = 0.001
 Identities = 32/218 (14%), Positives = 74/218 (33%), Gaps = 6/218 (2%)

Query: 16   PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS---KKDKKDKDRDKEKEKEKKDKEKDKS 72
            P+P K   K S +  +  T  S++  T + +     K +    +      ++K++E +  
Sbjct: 1205 PAPKKTTKKASESETTEETYGSSAMETENVAEVVKPKGRAGAKKKAPAAAKEKEEEDEIL 1264

Query: 73   AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
             +  +    +  S+  +  K  +  ++   +     KK          D D + D    +
Sbjct: 1265 DLKDRLAAYNLDSAPAQSAKMEETVKAVPARRAAARKK-PLASVSVISDSDDDDDDFAVE 1323

Query: 133  EQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHD 192
                 +   K       + + A+     +    PA   + +      E  K  E+     
Sbjct: 1324 VSLAERLKKKGGRKPAAANKKAAKPPAAAKKRGPATVQSGQ--KLLTEMLKPAEAIGISP 1381

Query: 193  KHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
            +    K +    +    +   + A +KE ES ++ +G 
Sbjct: 1382 EKKVRKMRASPFNKKSGSVLGRAATNKETESSENVSGS 1419



 Score = 31.8 bits (72), Expect = 1.3
 Identities = 34/186 (18%), Positives = 62/186 (33%), Gaps = 13/186 (6%)

Query: 55   DRDK-EKEKEKKDKEKDKSA----VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
            DRDK   E E   K   KS     + + EKE DK+  ++ + +E++ K   +    +   
Sbjct: 1134 DRDKLNIEVEDLKKTTPKSLWLKDLDALEKELDKLDKEDAKAEEAREKLQRAAARGESGA 1193

Query: 110  KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
              K  +    K   ++  K   + +   ++       + N  E           P     
Sbjct: 1194 AKKVSRQAPKKPAPKKTTKKASESETTEETYGSSAMETENVAEVV--------KPKGRAG 1245

Query: 170  PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
              +K+P   KEKE+E E     D+ + +          K     K   ++   + K    
Sbjct: 1246 AKKKAPAAAKEKEEEDEILDLKDRLAAYNLDSAPAQSAKMEETVKAVPARRAAARKKPLA 1305

Query: 230  PKCYPE 235
                  
Sbjct: 1306 SVSVIS 1311



 Score = 31.4 bits (71), Expect = 1.6
 Identities = 33/191 (17%), Positives = 67/191 (35%), Gaps = 5/191 (2%)

Query: 2    AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
            AY++ S+ + S+          K   A  + +     ++ +  S S  D  D   +    
Sbjct: 1272 AYNLDSAPAQSAKMEET----VKAVPARRAAARKKPLASVSVISDSDDDDDDFAVEVSLA 1327

Query: 62   KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
            +  K K   K A ++K+  K   ++K K    +         E  K  +       K   
Sbjct: 1328 ERLKKKGGRKPAAANKKAAKPPAAAK-KRGPATVQSGQKLLTEMLKPAEAIGISPEKKVR 1386

Query: 122  KDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK 181
            K R    ++K      ++++   + S  +   +S S+         P P + +  +T   
Sbjct: 1387 KMRASPFNKKSGSVLGRAATNKETESSENVSGSSSSEKDEIDVSAKPRPQRANRKQTTYV 1446

Query: 182  EKEKESSTTHD 192
              + ES +  D
Sbjct: 1447 LSDSESESADD 1457


>gnl|CDD|220365 pfam09726, Macoilin, Transmembrane protein.  This entry is a highly
           conserved protein present in eukaryotes.
          Length = 680

 Score = 41.4 bits (97), Expect = 0.001
 Identities = 43/274 (15%), Positives = 86/274 (31%), Gaps = 13/274 (4%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
           SS   +S            +    S S++S T     S   K      +           
Sbjct: 220 SSKGLTSTKELVPVQNSGGNH-SLSKSSNSQTPELEYSEKGKDHHHSHNHQHHSIGINNH 278

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKES--KPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
             K   +     +  +  S+K +    S    KE++S            + S   K  +R
Sbjct: 279 HSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAAGSIGSKSSKSAKHSNR 338

Query: 125 ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP---PPPAPTPTQKSPVKTKEK 181
            +     K    +  S    S S N  +    S+  S        A   +    V+    
Sbjct: 339 NKSNSSPKSHSSANGSVPSSSVSDNESKQKRASKSSSGARDSKKDASGMSANGTVENCIP 398

Query: 182 EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGI-- 239
           E +    +T     + +   K    +    ++ +++ + + S  +S       ++G +  
Sbjct: 399 ENK---ISTPSAIERLEQDIKKLQAELQQARQNESELRNQISLLTSLERSLKSDLGQLKK 455

Query: 240 -YILLRSKKNKTVQERLAE-QFKDELFDRLKNEQ 271
              +L++K N  V  +  + Q    +  RLK+E 
Sbjct: 456 ENDMLQTKLNSMVSAKQKDKQSMQSMEKRLKSEA 489



 Score = 36.1 bits (83), Expect = 0.059
 Identities = 22/92 (23%), Positives = 40/92 (43%), Gaps = 4/92 (4%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE----KDKSAVSSKEKEKDKVSSKE 88
             +    +   +  S     ++    +  + KK+ +    K  S VS+K+K+K  + S E
Sbjct: 423 QQARQNESELRNQISLLTSLERSLKSDLGQLKKENDMLQTKLNSMVSAKQKDKQSMQSME 482

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
           K  K       ++EK+  +EKK KKE+     
Sbjct: 483 KRLKSEADSRVNAEKQLAEEKKRKKEEEETAA 514



 Score = 29.5 bits (66), Expect = 5.6
 Identities = 17/104 (16%), Positives = 36/104 (34%), Gaps = 12/104 (11%)

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
           DK++ +   +  +S+K +    NS    S S           +   ++P     ++ +  
Sbjct: 213 DKEKSEASSKGLTSTKELVPVQNSGGNHSLS----------KSSNSQTPELEYSEKGKDH 262

Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
             +    H  H     + H    + K +  +  E  S+KS    
Sbjct: 263 HHSH--NHQHHSIGINNHHSKHADSKLQTIEVIENHSNKSRPSS 304



 Score = 28.7 bits (64), Expect = 9.8
 Identities = 30/157 (19%), Positives = 52/157 (33%), Gaps = 11/157 (7%)

Query: 73  AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
            +S  +KEK + SSK     +      +S            + S+    +    +K +  
Sbjct: 208 TLSVTDKEKSEASSKGLTSTKELVPVQNS-----GGNHSLSKSSNSQTPELEYSEKGKDH 262

Query: 133 EQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHD 192
               +     I  ++H+SK   S  Q I      +      S      KE    SS+   
Sbjct: 263 HHSHNHQHHSIGINNHHSKHADSKLQTIEVIENHSNKSRPSSSSTNGSKETTSNSSSAAA 322

Query: 193 KHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
                K  K  KH +      ++  +   +SH S+ G
Sbjct: 323 GSIGSKSSKSAKHSN------RNKSNSSPKSHSSANG 353


>gnl|CDD|233496 TIGR01622, SF-CC1, splicing factor, CC1-like family.  This model
           represents a subfamily of RNA splicing factors including
           the Pad-1 protein (N. crassa), CAPER (M. musculus) and
           CC1.3 (H. sapiens). These proteins are characterized by
           an N-terminal arginine-rich, low complexity domain
           followed by three (or in the case of 4 H. sapiens
           paralogs, two) RNA recognition domains (rrm: pfam00706).
           These splicing factors are closely related to the U2AF
           splicing factor family (TIGR01642). A homologous gene
           from Plasmodium falciparum was identified in the course
           of the analysis of that genome at TIGR and was included
           in the seed.
          Length = 457

 Score = 41.4 bits (97), Expect = 0.001
 Identities = 14/84 (16%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 56  RDKEKEKEK-KDKEKDKSAVSSKEKEKDK-VSSKEKER-KESKPKESSSEKEKKKEKKDK 112
           RD+E+ + +   +  DK    S+ + + +  S + ++R      +  S  +   +  + +
Sbjct: 3   RDRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYRPR 62

Query: 113 KEKSHKHKDKDRERDKDEKKEQKE 136
            ++S++  D+   R+  E   + E
Sbjct: 63  GDRSYRRDDRRSGRNTKEPLTEAE 86



 Score = 38.7 bits (90), Expect = 0.009
 Identities = 9/93 (9%), Positives = 36/93 (38%), Gaps = 8/93 (8%)

Query: 45  SSSKKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKE-KERKESKPKESSSE 102
              +   ++  R  +K +E  + + + +   S + +++D    +  + R  S  +     
Sbjct: 4   DRERGRLRNDTRRSDKGRERSRRRSRSRDR-SRRRRDRDYYRGRRGRSRSRSPNRYYRPR 62

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
            ++   + D+     +     +E   + +++ +
Sbjct: 63  GDRSYRRDDR-----RSGRNTKEPLTEAERDDR 90



 Score = 34.5 bits (79), Expect = 0.14
 Identities = 8/57 (14%), Positives = 26/57 (45%)

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
            +E+ R  +  + S   +E+ + +   +++S + +D+D  R +  +   +      +
Sbjct: 4   DRERGRLRNDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYR 60



 Score = 32.9 bits (75), Expect = 0.46
 Identities = 18/112 (16%), Positives = 41/112 (36%), Gaps = 13/112 (11%)

Query: 19  HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
            +N  + S               +   S  +D+  + RD++  + ++ + + +S      
Sbjct: 10  LRNDTRRSDK---------GRERSRRRSRSRDRSRRRRDRDYYRGRRGRSRSRSPNRYYR 60

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
              D+       R + +   ++ E   + E+ D+     +   K RERD  E
Sbjct: 61  PRGDRSY----RRDDRRSGRNTKEPLTEAERDDRTVFVLQLALKARERDLYE 108



 Score = 30.2 bits (68), Expect = 3.5
 Identities = 12/110 (10%), Positives = 36/110 (32%), Gaps = 26/110 (23%)

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
            +D++    +    +  R  D+ +E+   +S S+  S     ++   G            
Sbjct: 2   YRDRERGRLR----NDTRRSDKGRERSRRRSRSRDRSRRRRDRDYYRGR----------- 46

Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK-HGDKTNPKEKDAK 217
                     + + + +  +  +       +++ D+  G  T     +A+
Sbjct: 47  ----------RGRSRSRSPNRYYRPRGDRSYRRDDRRSGRNTKEPLTEAE 86


>gnl|CDD|234229 TIGR03490, Mycoplas_LppA, mycoides cluster lipoprotein, LppA/P72
           family.  Members of this protein family occur in
           Mycoplasma mycoides, Mycoplasma hyopneumoniae, and
           related Mycoplasmas in small paralogous families that
           may also include truncated forms and/or pseudogenes.
           Members are predicted lipoproteins with a conserved
           signal peptidase II processing and lipid attachment
           site. Note that the name for certain characterized
           members, p72, reflects an anomalous apparent molecular
           weight, given a theoretical MW of about 61 kDa.
          Length = 541

 Score = 41.4 bits (97), Expect = 0.001
 Identities = 31/106 (29%), Positives = 50/106 (47%), Gaps = 6/106 (5%)

Query: 29  IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
           I S S  S  S  T SS+SK+ +K  +    +   K  K+ D    + +  E +   S  
Sbjct: 13  ISSISFLSVVSCSTTSSNSKQPEKKPEIKPNENTPKIPKKPD----NKEPSENNNNKSNN 68

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
           + + E  P  SS+  EKK +    KE+  K KD+ ++ DK  + +Q
Sbjct: 69  ENKDEENP--SSTNPEKKPDPSKNKEEIEKPKDEPKKPDKKPQADQ 112



 Score = 36.4 bits (84), Expect = 0.041
 Identities = 26/110 (23%), Positives = 44/110 (40%), Gaps = 4/110 (3%)

Query: 26  SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
           S    ST++S+S            +   K   K   KE  +   +KS   +K++E    +
Sbjct: 20  SVVSCSTTSSNSKQPEKKPEIKPNENTPKIPKKPDNKEPSENNNNKSNNENKDEENPSST 79

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
           + EK+   SK KE   + + + +K DKK       D+      D+    K
Sbjct: 80  NPEKKPDPSKNKEEIEKPKDEPKKPDKKP----QADQPNNVHADQPNNNK 125


>gnl|CDD|178945 PRK00247, PRK00247, putative inner membrane protein translocase
           component YidC; Validated.
          Length = 429

 Score = 41.4 bits (97), Expect = 0.001
 Identities = 16/90 (17%), Positives = 36/90 (40%), Gaps = 5/90 (5%)

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK-----IVSSSHNSKEPASGSQ 158
           EK + K  KKE + K +  +RE +++ ++E+  + + ++     + +      + +    
Sbjct: 339 EKNEAKARKKEIAQKRRAAEREINREARQERAAAMARARARRAAVKAKKKGLIDASPNED 398

Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESS 188
             S       +P Q     T E  +E    
Sbjct: 399 TPSENEESKGSPPQVEATTTAEPNREPSQE 428



 Score = 31.0 bits (70), Expect = 1.7
 Identities = 21/100 (21%), Positives = 40/100 (40%), Gaps = 12/100 (12%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           K R  EK + K  K+        +  +K + + +E  R+    ++  +    +   +   
Sbjct: 334 KTRTAEKNEAKARKK--------EIAQKRRAAEREINREA---RQERAAAMARARARRAA 382

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
            K+ K    D   ++D   E +ESK S   V  +  + EP
Sbjct: 383 VKAKKKGLIDASPNEDTPSENEESKGSPPQV-EATTTAEP 421


>gnl|CDD|216461 pfam01370, Epimerase, NAD dependent epimerase/dehydratase family.
           This family of proteins utilise NAD as a cofactor. The
           proteins in this family use nucleotide-sugar substrates
           for a variety of chemical reactions.
          Length = 233

 Score = 40.3 bits (95), Expect = 0.001
 Identities = 31/158 (19%), Positives = 46/158 (29%), Gaps = 49/158 (31%)

Query: 300 FIQHHIHVIIHAAASLRFDELIQDAFTL---NIQATRELLDLATRCSQLKAILHVSTLYT 356
             +     +IH AA        +D       N+  T  LL+ A R    K  +  S+   
Sbjct: 59  LAEVQPDAVIHLAAQSGVGASFEDPADFIRANVLGTLRLLEAARRAGV-KRFVFASS--- 114

Query: 357 HSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVV 416
                    E Y  +                +  I      G    + Y+  K   E +V
Sbjct: 115 --------SEVYGDV---------------ADPPITEDTPLG--PLSPYAAAKLAAERLV 149

Query: 417 EKY--LYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGPG 452
           E Y   Y L   ++R                 N+YGPG
Sbjct: 150 EAYARAYGLRAVILRLF---------------NVYGPG 172


>gnl|CDD|240370 PTZ00342, PTZ00342, acyl-CoA synthetase; Provisional.
          Length = 746

 Score = 41.2 bits (97), Expect = 0.002
 Identities = 27/103 (26%), Positives = 42/103 (40%), Gaps = 9/103 (8%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
            ++EK       +    ++KE++K    S E E     P E   +KEK ++ KD KEK+ 
Sbjct: 219 NKEEKNNGSNVNNNGNKNNKEEQKGNDLSNELEDISLGPLEY--DKEKLEKIKDLKEKAK 276

Query: 118 KHKDKDRERDKDEKKEQKESKSS-------SKIVSSSHNSKEP 153
           K        D   K +    K         + IV +S  S +P
Sbjct: 277 KLGISIILFDDMTKNKTTNYKIQNEDPDFITSIVYTSGTSGKP 319


>gnl|CDD|183610 PRK12585, PRK12585, putative monovalent cation/H+ antiporter
           subunit G; Reviewed.
          Length = 197

 Score = 39.3 bits (91), Expect = 0.002
 Identities = 27/97 (27%), Positives = 52/97 (53%), Gaps = 11/97 (11%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           RD+ +  +K D +K KS +  +E+        EK R+E +  E   E E+++EK D++E 
Sbjct: 104 RDQLRSVKKDDIKKKKSLIIRQEQ-------IEKARQEREELEERMEWERREEKIDERE- 155

Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
               ++++RER++   +EQ +  S  +I+    +  E
Sbjct: 156 --DQEEQEREREEQTIEEQSDD-SEHEIIEQDESETE 189


>gnl|CDD|172341 PRK13808, PRK13808, adenylate kinase; Provisional.
          Length = 333

 Score = 40.3 bits (94), Expect = 0.002
 Identities = 25/139 (17%), Positives = 53/139 (38%), Gaps = 6/139 (4%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
            +++  ++  P+      K S+   S +   S       ++       K      +  KK
Sbjct: 196 AANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKK----KAAKTAVSAKKAAKTAAKAAKK 251

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
            K+  K A+    K   K + K  +   K +K    +++ + K +KK  K+ +   K K 
Sbjct: 252 AKKTAKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKA 311

Query: 124 RERDKDEKKEQKESKSSSK 142
             +      + K++K  +K
Sbjct: 312 TAKAPKRGAKGKKAKKVTK 330



 Score = 39.9 bits (93), Expect = 0.003
 Identities = 32/144 (22%), Positives = 59/144 (40%), Gaps = 6/144 (4%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           ++++KK  K        +K     +     VS K+  K  VS+K+  +  +K  + + + 
Sbjct: 196 AANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKAKKT 255

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
            KK  KK  K      K   +   K  K   K +K  +K  +     K+ A+GS+  +  
Sbjct: 256 AKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAK--AKKKAGKKAAAGSKAKA-- 311

Query: 164 PPPAPTPTQKSPVKTKEKEKEKES 187
              A  P + +  K  +K  +K +
Sbjct: 312 --TAKAPKRGAKGKKAKKVTKKRA 333



 Score = 36.4 bits (84), Expect = 0.032
 Identities = 28/131 (21%), Positives = 55/131 (41%), Gaps = 1/131 (0%)

Query: 27  SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVS 85
           +A+ + +   +   P   S +KK         +K  +KK  +   SA  ++K   K    
Sbjct: 192 AAVGAANAKKAAKTPAAKSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKK 251

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
           +K+  +K  K    + +K  KK  K   + +       + + K +KK  K++ + SK  +
Sbjct: 252 AKKTAKKALKKAAKAVKKAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKA 311

Query: 146 SSHNSKEPASG 156
           ++   K  A G
Sbjct: 312 TAKAPKRGAKG 322



 Score = 31.0 bits (70), Expect = 1.9
 Identities = 27/124 (21%), Positives = 45/124 (36%), Gaps = 4/124 (3%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
              +  +S+   S  K   K  +A  + S   +      ++   K    K   K  +  K
Sbjct: 209 KSGAKKASAKAKSAAKKVSKKKAAKTAVSAKKAAKTAAKAAKKAKKTAKKALKKAAKAVK 268

Query: 65  KDKEK-DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK---KEKSHKHK 120
           K  +K  K+A  + +        K K +K++  K ++  K K   K  K   K K  K  
Sbjct: 269 KAAKKAAKAAAKAAKGAAKATKGKAKAKKKAGKKAAAGSKAKATAKAPKRGAKGKKAKKV 328

Query: 121 DKDR 124
            K R
Sbjct: 329 TKKR 332


>gnl|CDD|219953 pfam08648, DUF1777, Protein of unknown function (DUF1777).  This is
           a family of eukaryotic proteins of unknown function.
           Some of the proteins in this family are putative nucleic
           acid binding proteins.
          Length = 158

 Score = 38.7 bits (90), Expect = 0.002
 Identities = 24/109 (22%), Positives = 48/109 (44%), Gaps = 5/109 (4%)

Query: 35  SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
             S S     S  +   + +DR + + +  + +E+D+   S          S+   R  S
Sbjct: 2   GRSRSRSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRS 61

Query: 95  KPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
                S    +++++K +++K  + + K RER K  K+E  E KS  ++
Sbjct: 62  ----RSRSPSRRRDRKRERDKDAR-EPKKRERQKLIKEEDLEGKSDEEV 105



 Score = 31.4 bits (71), Expect = 0.72
 Identities = 14/94 (14%), Positives = 40/94 (42%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
           ++  +      S S          S S ++D++ + R +   + ++ +   +    S+  
Sbjct: 7   RSPRRSRRRGRSRSRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRRHRSRSRSP 66

Query: 80  EKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
            + +   +E+++   +PK+   +K  K+E  + K
Sbjct: 67  SRRRDRKRERDKDAREPKKRERQKLIKEEDLEGK 100



 Score = 30.6 bits (69), Expect = 1.2
 Identities = 15/100 (15%), Positives = 40/100 (40%), Gaps = 5/100 (5%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           + R +   + ++         S   +E+ +  S+ +ER   +   S S    ++ +  ++
Sbjct: 3   RSRSRSPRRSRRRGRSR----SRDRRERRRERSRSRERDRRRRSRSRSPHRSRRSRSPRR 58

Query: 114 EKSH-KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
            +S  +   + R+R ++  K+ +E K   +         E
Sbjct: 59  HRSRSRSPSRRRDRKRERDKDAREPKKRERQKLIKEEDLE 98


>gnl|CDD|215214 PLN02381, PLN02381, valyl-tRNA synthetase.
          Length = 1066

 Score = 40.3 bits (94), Expect = 0.003
 Identities = 25/106 (23%), Positives = 51/106 (48%), Gaps = 6/106 (5%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           S +  + +K    ++E E++KK +EK      +KEKE  K+ + +KE K     + +S+ 
Sbjct: 2   SRTESEAEKKILTEEELERKKKKEEK------AKEKELKKLKAAQKEAKAKLQAQQASDG 55

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
               +K +KK +    +D++ E   D      + K  S  ++  ++
Sbjct: 56  TNVPKKSEKKSRKRDVEDENPEDFIDPDTPFGQKKRLSSQMAKQYS 101



 Score = 39.9 bits (93), Expect = 0.004
 Identities = 20/89 (22%), Positives = 36/89 (40%), Gaps = 2/89 (2%)

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           +  E E+K    +E   E++KKKE+K K+++  K K   +E     + +Q    ++    
Sbjct: 4   TESEAEKKILTEEEL--ERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKK 61

Query: 145 SSSHNSKEPASGSQLISHPPPPAPTPTQK 173
           S   + K             P  P   +K
Sbjct: 62  SEKKSRKRDVEDENPEDFIDPDTPFGQKK 90



 Score = 31.8 bits (72), Expect = 1.5
 Identities = 17/90 (18%), Positives = 38/90 (42%)

Query: 35  SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
           S + S       ++++ + K + +EK KEK+ K+   +   +K K + + +S      + 
Sbjct: 2   SRTESEAEKKILTEEELERKKKKEEKAKEKELKKLKAAQKEAKAKLQAQQASDGTNVPKK 61

Query: 95  KPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
             K+S     + +  +D  +       K R
Sbjct: 62  SEKKSRKRDVEDENPEDFIDPDTPFGQKKR 91


>gnl|CDD|177089 CHL00189, infB, translation initiation factor 2; Provisional.
          Length = 742

 Score = 40.2 bits (94), Expect = 0.003
 Identities = 28/133 (21%), Positives = 44/133 (33%), Gaps = 8/133 (6%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKD--RDKEKEKEKKDKE---KDKSAV 74
           ++  KDS      +          +    K    KD  + K K+K+K  K+    D    
Sbjct: 36  ESDIKDSLLNLDINKKLHEKLDKKNKKFNKTDDLKDSKKTKLKQKKKIKKKLHIDDDYDN 95

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
               K   K  +        +  +  +EK KKK   +K       K K     KDE  + 
Sbjct: 96  FFDSKNNSKQFAGPLAISLMRKPKPKTEKLKKKITVNKST---NKKKKKVLSSKDELIKY 152

Query: 135 KESKSSSKIVSSS 147
             +K  S  + S 
Sbjct: 153 DNNKPKSISIHSP 165



 Score = 35.6 bits (82), Expect = 0.080
 Identities = 31/168 (18%), Positives = 52/168 (30%), Gaps = 34/168 (20%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE--KKKEKKDKKE 114
              K    K + +     S    + +K   ++ ++K  K  ++   K+  K K K+ KK 
Sbjct: 24  KNLKHSSYKIRLESDIKDSLLNLDINKKLHEKLDKKNKKFNKTDDLKDSKKTKLKQKKKI 83

Query: 115 KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS 174
           K   H D D +   D K   K+      I                              S
Sbjct: 84  KKKLHIDDDYDNFFDSKNNSKQFAGPLAI------------------------------S 113

Query: 175 PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
            ++  + + EK          K  +KKK K     +   K   +K K 
Sbjct: 114 LMRKPKPKTEKLKKKITVN--KSTNKKKKKVLSSKDELIKYDNNKPKS 159


>gnl|CDD|218538 pfam05285, SDA1, SDA1.  This family consists of several SDA1
           protein homologues. SDA1 is a Saccharomyces cerevisiae
           protein which is involved in the control of the actin
           cytoskeleton. The protein is essential for cell
           viability and is localised in the nucleus.
          Length = 317

 Score = 39.3 bits (92), Expect = 0.004
 Identities = 43/183 (23%), Positives = 64/183 (34%), Gaps = 39/183 (21%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE----- 104
           DK+ +  D E E+EK +  K     S +E  ++      +E +    KE +SE       
Sbjct: 124 DKEIESSDSEDEEEKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASELATTRIL 183

Query: 105 ------KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
                 K +E + +K        K + RDKD                   +S E      
Sbjct: 184 TPADFAKIQELRLEKGVDKALGGKLKRRDKDA---------------PERHSDELVDADD 228

Query: 159 LISHPPPPAPTPTQKSPVKTKEK--EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
           +        P   +K    TKE+     KE     +K    K KK  +    TN KEK A
Sbjct: 229 IE------GPAKKKKQ---TKEERIATAKEGREDREKFGSRKGKKDKEGKSTTN-KEK-A 277

Query: 217 KSK 219
           + K
Sbjct: 278 RKK 280



 Score = 29.2 bits (66), Expect = 6.9
 Identities = 19/100 (19%), Positives = 48/100 (48%), Gaps = 2/100 (2%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKS--AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           K K+++ + KE E+  +  + D        +E E      +  + +  K  ESS  ++++
Sbjct: 77  KWKEEERKKKEAEQGLESDDDDDEEEEWEVEEDEDSDDEGEWIDVESDKEIESSDSEDEE 136

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
           ++ +  K+      ++  E D++E  E++E+++  +  S 
Sbjct: 137 EKDEAAKKAKEDSDEELSEEDEEEAAEEEEAEAEKEKASE 176


>gnl|CDD|219655 pfam07946, DUF1682, Protein of unknown function (DUF1682).  The
           members of this family are all hypothetical eukaryotic
           proteins of unknown function. One member is described as
           being an adipocyte-specific protein, but no evidence of
           this was found.
          Length = 322

 Score = 39.2 bits (92), Expect = 0.004
 Identities = 24/71 (33%), Positives = 43/71 (60%), Gaps = 6/71 (8%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S     K DK R++E+EK  K  E+++   + ++KE+     K+KE +E+K  + S E++
Sbjct: 254 SPEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEE-----KKKEEREAKLAKLSPEEQ 308

Query: 105 KKKE-KKDKKE 114
           +K E K+ KK+
Sbjct: 309 RKLEEKERKKQ 319



 Score = 32.2 bits (74), Expect = 0.64
 Identities = 17/64 (26%), Positives = 36/64 (56%), Gaps = 13/64 (20%)

Query: 48  KKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           +K  K  + ++ E+ +EKK+++K +     +E +  K+S +E+ + E        EKE+K
Sbjct: 270 EKILKAAEEERQEEAQEKKEEKKKEE----REAKLAKLSPEEQRKLE--------EKERK 317

Query: 107 KEKK 110
           K+ +
Sbjct: 318 KQAR 321



 Score = 31.5 bits (72), Expect = 1.4
 Identities = 20/64 (31%), Positives = 35/64 (54%), Gaps = 3/64 (4%)

Query: 75  SSKEKEK-DKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
           S +   K DK   +E+E   K ++ +     +EKK+EKK ++ ++   K    E+ K E+
Sbjct: 254 SPEVLRKVDKTREEEEEKILKAAEEERQEEAQEKKEEKKKEEREAKLAKLSPEEQRKLEE 313

Query: 132 KEQK 135
           KE+K
Sbjct: 314 KERK 317



 Score = 28.8 bits (65), Expect = 8.1
 Identities = 14/62 (22%), Positives = 36/62 (58%), Gaps = 1/62 (1%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
           +K +++   K  +  E + +E + EK K+++KK+++E        + +R  +EK+ +K++
Sbjct: 262 DKTREEEEEKILKAAEEERQEEAQEK-KEEKKKEEREAKLAKLSPEEQRKLEEKERKKQA 320

Query: 138 KS 139
           + 
Sbjct: 321 RK 322


>gnl|CDD|226096 COG3566, COG3566, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 379

 Score = 39.5 bits (92), Expect = 0.004
 Identities = 16/102 (15%), Positives = 41/102 (40%), Gaps = 4/102 (3%)

Query: 48  KKDKKDKDRDKEK-EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
               +    D      E+    + +   +S +   +KV + EK+   ++ K  S +   K
Sbjct: 184 PITTRRIGVDGISLSLEETKASEVEHLSASLKTATEKVDALEKDLHAAQAKLDSGQALTK 243

Query: 107 KE---KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
           +E   KK +  K+    +     D+D +      +++++++ 
Sbjct: 244 EELDAKKAELSKALAALEAANAADEDPQDRDAAVEAAARLMG 285


>gnl|CDD|222636 pfam14265, DUF4355, Domain of unknown function (DUF4355).  This
           family of proteins is found in bacteria and viruses.
           Proteins in this family are typically between 180 and
           214 amino acids in length.
          Length = 125

 Score = 37.2 bits (87), Expect = 0.004
 Identities = 24/78 (30%), Positives = 38/78 (48%), Gaps = 1/78 (1%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           + E+EK   DKE DK+    K K + K   K+ E ++   K S+ EK + + +K +KE  
Sbjct: 1   EPEEEKTFTDKEVDKAIAKEKAKWEKKQEEKKSEAEKLA-KMSAEEKAEYELEKLEKELE 59

Query: 117 HKHKDKDRERDKDEKKEQ 134
               +  R   K E K+ 
Sbjct: 60  ELEAELARRELKAEAKKM 77



 Score = 31.8 bits (73), Expect = 0.34
 Identities = 16/59 (27%), Positives = 33/59 (55%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           +KE DK  +KEK + E K +E  SE EK  +   +++  ++ +  ++E ++ E +  + 
Sbjct: 10  DKEVDKAIAKEKAKWEKKQEEKKSEAEKLAKMSAEEKAEYELEKLEKELEELEAELARR 68



 Score = 29.5 bits (67), Expect = 2.3
 Identities = 17/78 (21%), Positives = 35/78 (44%), Gaps = 12/78 (15%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           E E+EK   +K+     +KEK K       ++++E K  E+    +   E+K + E    
Sbjct: 1   EPEEEKTFTDKEVDKAIAKEKAK------WEKKQEEKKSEAEKLAKMSAEEKAEYEL--- 51

Query: 119 HKDKDRERDKDEKKEQKE 136
              +  E++ +E + +  
Sbjct: 52  ---EKLEKELEELEAELA 66


>gnl|CDD|218684 pfam05672, MAP7, MAP7 (E-MAP-115) family.  The organisation of
           microtubules varies with the cell type and is presumably
           controlled by tissue-specific microtubule-associated
           proteins (MAPs). The 115-kDa epithelial MAP
           (E-MAP-115/MAP7) has been identified as a
           microtubule-stabilising protein predominantly expressed
           in cell lines of epithelial origin. The binding of this
           microtubule associated protein is nucleotide
           independent.
          Length = 171

 Score = 37.8 bits (87), Expect = 0.005
 Identities = 18/118 (15%), Positives = 64/118 (54%), Gaps = 1/118 (0%)

Query: 26  SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
             A  S +    T+  T++  + +   +K R   +++E++++E+ +     + + ++   
Sbjct: 3   GKAENSAALGKPTAGTTDAEEATRLLAEKRRQAREQREQEEQERREQEEQDRLEREELKR 62

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
              +ER   + +    E+E+ +EK++K ++  + ++K +E+++ E+ ++++ ++ ++ 
Sbjct: 63  RAAEERLRREEEARRQEEERAREKEEKAKRKAEEEEK-QEQEEQERIQKQKEEAEARA 119



 Score = 29.7 bits (66), Expect = 2.7
 Identities = 17/89 (19%), Positives = 47/89 (52%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            ++K++K + K +E+EK+++E+ +     KE+ + +   + +  +  + K     ++++ 
Sbjct: 83  AREKEEKAKRKAEEEEKQEQEEQERIQKQKEEAEARAREEAERMRLEREKHFQQIEQERL 142

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E+K + E+  K   K     + +K++ K 
Sbjct: 143 ERKKRLEEIMKRTRKSEVSPQVKKEDPKV 171


>gnl|CDD|226894 COG4499, COG4499, Predicted membrane protein [Function unknown].
          Length = 434

 Score = 39.0 bits (91), Expect = 0.005
 Identities = 18/85 (21%), Positives = 39/85 (45%), Gaps = 1/85 (1%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           K  ++ K       +K +  +    K+    + K  E K+       +E + K+EK  ++
Sbjct: 351 KLYEEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDASEEAEAKAKEEKLKQE 410

Query: 114 EKSHKHKDKDRERDKDEKKEQKESK 138
           E   K K++  E DK+++++ +  K
Sbjct: 411 ENEKKQKEQADE-DKEKRQKDERKK 434



 Score = 39.0 bits (91), Expect = 0.006
 Identities = 18/79 (22%), Positives = 40/79 (50%), Gaps = 5/79 (6%)

Query: 38  TSNPTNSSSSKKDKKDKDRDKE-----KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
             + T+ S  K+ +  K+ +K+     K+  +   E D S  +  + +++K+  +E E+K
Sbjct: 356 VKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDASEEAEAKAKEEKLKQEENEKK 415

Query: 93  ESKPKESSSEKEKKKEKKD 111
           + +  +   EK +K E+K 
Sbjct: 416 QKEQADEDKEKRQKDERKK 434



 Score = 38.2 bits (89), Expect = 0.009
 Identities = 23/84 (27%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 40  NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
               S++     K ++  KE  K+ +D  K    V     E D     E + KE K K+ 
Sbjct: 354 EEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVK---DETDASEEAEAKAKEEKLKQE 410

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKD 123
            +EK++K++  + KEK  K + K 
Sbjct: 411 ENEKKQKEQADEDKEKRQKDERKK 434



 Score = 30.5 bits (69), Expect = 2.8
 Identities = 14/75 (18%), Positives = 31/75 (41%), Gaps = 2/75 (2%)

Query: 69  KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
           K    V S          +  +    K ++ + +  + K++ D   +  + K K+ +  +
Sbjct: 351 KLYEEVKSNTDLSGDKRQELLKEYNKKLQDYTKKLGEVKDETDA-SEEAEAKAKEEKLKQ 409

Query: 129 DE-KKEQKESKSSSK 142
           +E +K+QKE     K
Sbjct: 410 EENEKKQKEQADEDK 424


>gnl|CDD|236877 PRK11192, PRK11192, ATP-dependent RNA helicase SrmB; Provisional.
          Length = 434

 Score = 39.2 bits (92), Expect = 0.006
 Identities = 15/54 (27%), Positives = 28/54 (51%), Gaps = 5/54 (9%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
            EK+  +   K  +   EKK+++K+K +   +H+D      K+  K +K S +S
Sbjct: 384 SEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDT-----KNIGKRRKPSGTS 432



 Score = 36.8 bits (86), Expect = 0.029
 Identities = 19/61 (31%), Positives = 28/61 (45%), Gaps = 6/61 (9%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
           T + S KK  K   +   K  EKK+KEK+K  V  + ++   +       K  KP  +S 
Sbjct: 380 TKAPSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIG------KRRKPSGTSE 433

Query: 102 E 102
           E
Sbjct: 434 E 434



 Score = 35.7 bits (83), Expect = 0.058
 Identities = 15/55 (27%), Positives = 23/55 (41%), Gaps = 3/55 (5%)

Query: 62  KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           K K   EK     S K   K     ++KE+++ KPK     ++ K   K +K   
Sbjct: 379 KTKAPSEKKTGKPSKKVLAKRA---EKKEKEKEKPKVKKRHRDTKNIGKRRKPSG 430



 Score = 33.4 bits (77), Expect = 0.32
 Identities = 18/58 (31%), Positives = 24/58 (41%), Gaps = 2/58 (3%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR--ERDKDEKKEQKESKSSSK 142
            E   K   P E  + K  KK    + EK  K K+K +  +R +D K   K  K S  
Sbjct: 374 DELRPKTKAPSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGT 431



 Score = 32.6 bits (75), Expect = 0.52
 Identities = 15/59 (25%), Positives = 28/59 (47%), Gaps = 1/59 (1%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
           R K K   +K   K    V +K  EK K   KEK + + + +++ +  +++K     +E
Sbjct: 377 RPKTKAPSEKKTGKPSKKVLAKRAEK-KEKEKEKPKVKKRHRDTKNIGKRRKPSGTSEE 434



 Score = 31.8 bits (73), Expect = 1.1
 Identities = 11/53 (20%), Positives = 17/53 (32%)

Query: 165 PPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAK 217
            P+   T K   K   K  EK+         K +H+     G +  P     +
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTSEE 434



 Score = 30.7 bits (70), Expect = 2.4
 Identities = 14/60 (23%), Positives = 22/60 (36%), Gaps = 9/60 (15%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
              EK+     K   A  +++KEK+K   K K+R             K   K+ K   + 
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDT---------KNIGKRRKPSGTS 432



 Score = 29.9 bits (68), Expect = 4.6
 Identities = 15/59 (25%), Positives = 26/59 (44%), Gaps = 3/59 (5%)

Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
           P  K+P    EK+  K S     K ++ K K+K+K   K   ++     K ++   +S 
Sbjct: 378 PKTKAP---SEKKTGKPSKKVLAKRAEKKEKEKEKPKVKKRHRDTKNIGKRRKPSGTSE 433



 Score = 29.5 bits (67), Expect = 5.1
 Identities = 15/61 (24%), Positives = 21/61 (34%), Gaps = 8/61 (13%)

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
            P    T      V  K  EK+++         K K K K +H D  N  ++   S   E
Sbjct: 382 APSEKKTGKPSKKVLAKRAEKKEK--------EKEKPKVKKRHRDTKNIGKRRKPSGTSE 433

Query: 223 S 223
            
Sbjct: 434 E 434


>gnl|CDD|220684 pfam10310, DUF2413, Protein of unknown function (DUF2413).  This is
           a family of proteins conserved in fungi. The function is
           not known.
          Length = 436

 Score = 39.0 bits (91), Expect = 0.006
 Identities = 25/124 (20%), Positives = 48/124 (38%), Gaps = 16/124 (12%)

Query: 34  TSSSTSNPTNSSSSKKDKKDKDRD-------KEKEKEKKDKEKDKSAVSSKEKEKDKVSS 86
              + +       + KD  + D D        E+ ++ K  +K         KE  +  +
Sbjct: 10  DEKAPTKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKP--------KEASRPGT 61

Query: 87  KEKERKESKPKESS-SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
               +K SKP ESS +  E+K  K  K  +S +      +    E +E++E + +   ++
Sbjct: 62  PRNPKKSSKPTESSAASSEEKPAKPRKSAESTRSSHPKSKAPSTESEEEEEPEETPDPIA 121

Query: 146 SSHN 149
           S   
Sbjct: 122 SIGG 125



 Score = 33.6 bits (77), Expect = 0.33
 Identities = 21/118 (17%), Positives = 36/118 (30%), Gaps = 12/118 (10%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDK-----VSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           +++   K+  K   S    E D+     +   E+  K   PK+          +  KK  
Sbjct: 10  DEKAPTKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKPKEASRPGTPRNPKK-- 67

Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS-HPPPPAPTPTQ 172
                 K  E      +E+      S   + S + K  A  ++      P   P P  
Sbjct: 68  ----SSKPTESSAASSEEKPAKPRKSAESTRSSHPKSKAPSTESEEEEEPEETPDPIA 121



 Score = 30.9 bits (70), Expect = 2.0
 Identities = 38/186 (20%), Positives = 64/186 (34%), Gaps = 26/186 (13%)

Query: 105 KKKEKKDKKEKSHKHKDKD--RERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
            KK KK    K     D+D     D+ E+ E+ +     K  S     + P   S+    
Sbjct: 15  TKKPKKGDASKDSTEDDEDILEFLDELEQSEKAKPPKKPKEASRPGTPRNPKKSSK---- 70

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
              P  +    S  K  +  K  ES+ +    SK            T  +E++   +  +
Sbjct: 71  ---PTESSAASSEEKPAKPRKSAESTRSSHPKSK---------APSTESEEEEEPEETPD 118

Query: 223 SHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHII 282
              S  G   +   G I     S  +  V+   AEQ  +E    ++ E+A +   +V   
Sbjct: 119 PIASIGG--WWSLWGSITSTATSTASAAVK--QAEQAVNE----IQQEEAQLWAEQVRGN 170

Query: 283 SGDISQ 288
            G +  
Sbjct: 171 VGALRD 176


>gnl|CDD|222665 pfam14303, NAM-associated, No apical meristem-associated C-terminal
           domain.  This domain is found in a number of different
           types of plant proteins including NAM-like proteins.
          Length = 147

 Score = 37.4 bits (87), Expect = 0.006
 Identities = 30/99 (30%), Positives = 52/99 (52%), Gaps = 3/99 (3%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNP-TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
              +K K   +  S  ++S+  N   +  S+ + K+ + R K KEK ++DK K K   + 
Sbjct: 26  KKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKLRRDKLKAKKEEAE 85

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           KEKEK++     K   E++ + +  EK+K + K  K+EK
Sbjct: 86  KEKEKEERF--MKALAEAEKERAELEKKKAEAKLMKEEK 122



 Score = 36.6 bits (85), Expect = 0.010
 Identities = 22/91 (24%), Positives = 40/91 (43%), Gaps = 3/91 (3%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE---KEKKKE 108
           K++ + K K  E K   K K   S+           E E  ES  +    E   K K+K 
Sbjct: 13  KNEPKWKSKRSELKKASKKKKKRSNSSPGSTSNEENEDEDDESTAESKRPEGRKKAKEKL 72

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
           ++DK +   +  +K++E+++   K   E++ 
Sbjct: 73  RRDKLKAKKEEAEKEKEKEERFMKALAEAEK 103


>gnl|CDD|236944 PRK11642, PRK11642, exoribonuclease R; Provisional.
          Length = 813

 Score = 39.0 bits (91), Expect = 0.007
 Identities = 21/92 (22%), Positives = 39/92 (42%), Gaps = 8/92 (8%)

Query: 36  SSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS----AVSSKEKEKDKVSSKEKER 91
           SS   P N   + ++K  K    +K  +++   K  +    +    EK+    ++K+  R
Sbjct: 726 SSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAAKKDAR 785

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           K  KP    S K +K     K +++ K K  +
Sbjct: 786 KAKKP----SAKTQKIAAATKAKRAAKKKVAE 813



 Score = 32.0 bits (73), Expect = 1.0
 Identities = 13/73 (17%), Positives = 33/73 (45%)

Query: 70  DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
           D S +SS+   ++   +  ++ K+    +   ++ +  +K + +  S    +K  +    
Sbjct: 721 DFSLISSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKAKPKAA 780

Query: 130 EKKEQKESKSSSK 142
           +K  +K  K S+K
Sbjct: 781 KKDARKAKKPSAK 793


>gnl|CDD|235175 PRK03918, PRK03918, chromosome segregation protein; Provisional.
          Length = 880

 Score = 39.3 bits (92), Expect = 0.007
 Identities = 25/85 (29%), Positives = 48/85 (56%), Gaps = 2/85 (2%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEK-DKVSSKEKERKE-SKPKESSSEKEKKKEKKD 111
           ++  KEKEKE ++  ++ + +SS+  E  +++   EKE KE  + KE   E EK+ E  +
Sbjct: 192 EELIKEKEKELEEVLREINEISSELPELREELEKLEKEVKELEELKEEIEELEKELESLE 251

Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
             ++  + K ++ E   +E K++ E
Sbjct: 252 GSKRKLEEKIRELEERIEELKKEIE 276



 Score = 37.7 bits (88), Expect = 0.018
 Identities = 17/88 (19%), Positives = 46/88 (52%), Gaps = 2/88 (2%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKE--KEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
           KD +++ E+E+++  K +++   + +E  + + ++    KE +E + K S  E E+ +E+
Sbjct: 608 KDAEKELEREEKELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELREE 667

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKES 137
             +  +       + E  +  ++E K++
Sbjct: 668 YLELSRELAGLRAELEELEKRREEIKKT 695



 Score = 35.4 bits (82), Expect = 0.11
 Identities = 17/95 (17%), Positives = 49/95 (51%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           K+ ++ ++R +E +K+ K+ EK    +  + +  ++  +K++E +  K + +    EK +
Sbjct: 331 KELEEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLE 390

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           ++ ++ EK+ +  +++  +      E K+     K
Sbjct: 391 KELEELEKAKEEIEEEISKITARIGELKKEIKELK 425



 Score = 35.0 bits (81), Expect = 0.12
 Identities = 22/96 (22%), Positives = 48/96 (50%), Gaps = 1/96 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK-ERKESKPKESSSEKEKK 106
           K+ KK ++   +  +E  + EK    +  + +E +K  S+E+ E    +  E S E    
Sbjct: 619 KELKKLEEELDKAFEELAETEKRLEELRKELEELEKKYSEEEYEELREEYLELSRELAGL 678

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           + + ++ EK  +   K  E+ K+E +E++++K   +
Sbjct: 679 RAELEELEKRREEIKKTLEKLKEELEEREKAKKELE 714



 Score = 32.3 bits (74), Expect = 0.87
 Identities = 20/92 (21%), Positives = 43/92 (46%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            +K+++ ++  K+ ++ +K  E+ +      E+ K K    E+ +K          +++ 
Sbjct: 334 EEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLEKEL 393

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           +E +  KE+  +   K   R  + KKE KE K
Sbjct: 394 EELEKAKEEIEEEISKITARIGELKKEIKELK 425



 Score = 31.2 bits (71), Expect = 2.1
 Identities = 18/96 (18%), Positives = 40/96 (41%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            + +K+ +  +  K K ++   + +  +   +KE +++  K KE KE K K     K  +
Sbjct: 241 EELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEKVKELKELKEKAEEYIKLSE 300

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
             ++   E     K   R  ++    E++  +   K
Sbjct: 301 FYEEYLDELREIEKRLSRLEEEINGIEERIKELEEK 336



 Score = 31.2 bits (71), Expect = 2.1
 Identities = 16/55 (29%), Positives = 30/55 (54%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           EKE K+ E+ K  +   EKE + +   +++ +E   +     +E KKE ++ +EK
Sbjct: 227 EKEVKELEELKEEIEELEKELESLEGSKRKLEEKIRELEERIEELKKEIEELEEK 281



 Score = 30.4 bits (69), Expect = 3.5
 Identities = 21/95 (22%), Positives = 44/95 (46%), Gaps = 7/95 (7%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           K+ +K+ +    +  E   +  +        +E +K+  + KE +E K +    EKE + 
Sbjct: 196 KEKEKELEEVLREINEISSELPE------LREELEKLEKEVKELEELKEEIEELEKELES 249

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
            +  K++   K ++   ER ++ KKE +E +   K
Sbjct: 250 LEGSKRKLEEKIREL-EERIEELKKEIEELEEKVK 283



 Score = 29.6 bits (67), Expect = 5.5
 Identities = 19/102 (18%), Positives = 49/102 (48%), Gaps = 5/102 (4%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSA-----VSSKEKEKDKVSSKEKERKESKPKESS 100
             K+++ ++ + K KE EK+ +E ++         +K++E +++  +       K ++  
Sbjct: 334 EEKEERLEELKKKLKELEKRLEELEERHELYEEAKAKKEELERLKKRLTGLTPEKLEKEL 393

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
            E EK KE+ +++      +  + +++  E K+  E    +K
Sbjct: 394 EELEKAKEEIEEEISKITARIGELKKEIKELKKAIEELKKAK 435


>gnl|CDD|217756 pfam03839, Sec62, Translocation protein Sec62. 
          Length = 217

 Score = 37.9 bits (88), Expect = 0.007
 Identities = 21/65 (32%), Positives = 34/65 (52%), Gaps = 1/65 (1%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           R  E EK K +K+K    + +K   +DK   K K   +++  E   +K   +EKK++K+K
Sbjct: 14  RALESEKYKANKDKGNPEIYNKINSQDKAIEKFKLLIKAQMAE-RVKKLHSQEKKEEKKK 72

Query: 116 SHKHK 120
             K K
Sbjct: 73  PKKKK 77



 Score = 36.7 bits (85), Expect = 0.019
 Identities = 19/77 (24%), Positives = 32/77 (41%), Gaps = 7/77 (9%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S   K  KDK      E   K   +DK    + EK K  + ++  ER +    +   E++
Sbjct: 18  SEKYKANKDK---GNPEIYNKINSQDK----AIEKFKLLIKAQMAERVKKLHSQEKKEEK 70

Query: 105 KKKEKKDKKEKSHKHKD 121
           KK +KK    + +  + 
Sbjct: 71  KKPKKKKVPLQVNPAQL 87



 Score = 35.5 bits (82), Expect = 0.047
 Identities = 25/85 (29%), Positives = 40/85 (47%), Gaps = 12/85 (14%)

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
           +++D  + K  V + E EK K ++K+K   E   K +S +K  +K K          K +
Sbjct: 2   KRQDFFRAKRVVRALESEKYK-ANKDKGNPEIYNKINSQDKAIEKFKLLI-------KAQ 53

Query: 123 DRERDK----DEKKEQKESKSSSKI 143
             ER K     EKKE+K+     K+
Sbjct: 54  MAERVKKLHSQEKKEEKKKPKKKKV 78


>gnl|CDD|184900 PRK14907, rplD, 50S ribosomal protein L4; Provisional.
          Length = 295

 Score = 38.4 bits (89), Expect = 0.008
 Identities = 24/109 (22%), Positives = 44/109 (40%), Gaps = 3/109 (2%)

Query: 1   MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
           MA + K++   ++    P   K   S     T  ++ T++   +  + K KK K      
Sbjct: 1   MAETKKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTT 60

Query: 61  EKEKKDKEKD---KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           +K     EK    K    +K+  K +  S E     +K  +++S+  KK
Sbjct: 61  KKVTVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSKLPKK 109



 Score = 35.3 bits (81), Expect = 0.076
 Identities = 24/105 (22%), Positives = 45/105 (42%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           KK   +   +EK+   K+   S  ++K K+  K +S +  +K +K K++ S K   K+  
Sbjct: 5   KKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKVT 64

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
            K EK+   K +   +   +K+        +      + SK P  
Sbjct: 65  VKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSKLPKK 109



 Score = 34.9 bits (80), Expect = 0.082
 Identities = 23/112 (20%), Positives = 40/112 (35%), Gaps = 2/112 (1%)

Query: 32  TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER 91
           T+   +T     ++      K+  + K+  K    K   K+A   K K     + K   +
Sbjct: 7   TTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKVTVK 66

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
            E           KK  K  K+  S +  +   +  K+  K  K+  +S KI
Sbjct: 67  FEKTESVKKESVAKKTVK--KEAVSAEVFEASNKLFKNTSKLPKKLFASEKI 116



 Score = 31.8 bits (72), Expect = 0.84
 Identities = 19/102 (18%), Positives = 40/102 (39%), Gaps = 2/102 (1%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKP-KES 99
           T  ++ KK  ++K    +K    K+  K K +A ++  K   K +  +K +      K+ 
Sbjct: 4   TKKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAKKAAKVKKTKSVKTTTKKV 63

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           + + EK +  K +       K +    +  E   +    +S 
Sbjct: 64  TVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSK 105



 Score = 29.9 bits (67), Expect = 3.7
 Identities = 21/107 (19%), Positives = 34/107 (31%), Gaps = 6/107 (5%)

Query: 83  KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           K ++K+K  +E KP    +   K+  K  K  K+   K       K  K   K++KS   
Sbjct: 5   KKTTKKKTTEEKKPAAKKATTSKETAKTKKTAKTTSTKAAK----KAAKV--KKTKSVKT 58

Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESST 189
                    E     +  S            + V     +  K +S 
Sbjct: 59  TTKKVTVKFEKTESVKKESVAKKTVKKEAVSAEVFEASNKLFKNTSK 105


>gnl|CDD|215565 PLN03083, PLN03083, E3 UFM1-protein ligase 1 homolog; Provisional.
          Length = 803

 Score = 38.6 bits (90), Expect = 0.009
 Identities = 22/96 (22%), Positives = 41/96 (42%)

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
           ++ +KE D  ++ +        S K     ES P  S+S+K  KK+K           + 
Sbjct: 378 DQIEKEMDAFSIQASSAGLIGSSEKSLGSNESSPAASNSDKGSKKKKGKSTSTKGGTAES 437

Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
             + ++D  K+ K+++   +  SS   S   A G +
Sbjct: 438 IPDDEEDAPKKGKKNQKKGRDKSSKVPSDSKAGGKK 473



 Score = 37.9 bits (88), Expect = 0.016
 Identities = 26/110 (23%), Positives = 45/110 (40%), Gaps = 2/110 (1%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
           A+S+++SS+              +SS   S S   S      S+S+K    +   D E++
Sbjct: 386 AFSIQASSAGLIGSSEKSLG-SNESSPAASNSDKGSKKKKGKSTSTKGGTAESIPDDEED 444

Query: 62  KEKKDKEKDKSAVSSKEKEK-DKVSSKEKERKESKPKESSSEKEKKKEKK 110
             KK K+  K       K   D  +  +KE  +S+   ++   E+   KK
Sbjct: 445 APKKGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQEDNNNIPPEEWVMKK 494



 Score = 34.0 bits (78), Expect = 0.28
 Identities = 25/112 (22%), Positives = 38/112 (33%), Gaps = 15/112 (13%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
           S  S  SN ++ ++S  DK  K            K+K KS  +     +     +E   K
Sbjct: 400 SEKSLGSNESSPAASNSDKGSK------------KKKGKSTSTKGGTAESIPDDEEDAPK 447

Query: 93  ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           + K  +     +  K   D K    K   K +E   D      E     KI+
Sbjct: 448 KGKKNQKKGRDKSSKVPSDSKAGGKKESVKSQE---DNNNIPPEEWVMKKIL 496


>gnl|CDD|237629 PRK14160, PRK14160, heat shock protein GrpE; Provisional.
          Length = 211

 Score = 37.4 bits (87), Expect = 0.010
 Identities = 14/67 (20%), Positives = 33/67 (49%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           K+ K  K  + E++  K+++ K++     ++ E +++  +E      +  E   E+ K +
Sbjct: 3   KECKDAKHENMEEDCCKENENKEEDKGKEEDLEFEEIEKEEIIEDSEESNEVKIEELKDE 62

Query: 108 EKKDKKE 114
             K K+E
Sbjct: 63  NNKLKEE 69



 Score = 29.0 bits (65), Expect = 6.3
 Identities = 11/80 (13%), Positives = 36/80 (45%), Gaps = 5/80 (6%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
            +++ K+ K +  ++      E +++     +K ++E    E   ++E  ++ ++  E  
Sbjct: 1   MEKECKDAKHENMEEDCCKENENKEE-----DKGKEEDLEFEEIEKEEIIEDSEESNEVK 55

Query: 117 HKHKDKDRERDKDEKKEQKE 136
            +    +  + K+E K+ + 
Sbjct: 56  IEELKDENNKLKEENKKLEN 75


>gnl|CDD|150884 pfam10278, Med19, Mediator of RNA pol II transcription subunit 19. 
           Med19 represents a family of conserved proteins which
           are members of the multi-protein co-activator Mediator
           complex. Mediator is required for activation of RNA
           polymerase II transcription by DNA binding
           transactivators.
          Length = 178

 Score = 37.1 bits (86), Expect = 0.010
 Identities = 20/61 (32%), Positives = 35/61 (57%)

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           K+K K   K+   ++  P+E+ S+ E  K  + K +K     DK+R++ K EKK++K+  
Sbjct: 110 KKKHKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRH 169

Query: 139 S 139
           S
Sbjct: 170 S 170



 Score = 34.0 bits (78), Expect = 0.11
 Identities = 28/100 (28%), Positives = 35/100 (35%), Gaps = 17/100 (17%)

Query: 148 HNSKEPASGSQLISHPPPPAPTPTQ-----KSPVKTKEKEKEKE------------SSTT 190
                P +  QL      P P P Q       P K K K K K+            S + 
Sbjct: 76  GKELLPLTSVQLAGFRLHPGPLPEQYRLMHIQPPKKKHKHKHKKHRTQDPLPEETPSDSE 135

Query: 191 HDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGP 230
             K  + KHKKK    DK   K+K  K K+K+ H      
Sbjct: 136 GLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHSPEHPG 175



 Score = 32.9 bits (75), Expect = 0.29
 Identities = 23/66 (34%), Positives = 35/66 (53%), Gaps = 4/66 (6%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           K K K KK + +D      +E   D    K  E+K  K K+   +KE+KK+KK+KK+K  
Sbjct: 112 KHKHKHKKHRTQDPL---PEETPSDSEGLKGHEKKHKK-KKHEDDKERKKKKKEKKKKKK 167

Query: 118 KHKDKD 123
           +H  + 
Sbjct: 168 RHSPEH 173



 Score = 32.1 bits (73), Expect = 0.41
 Identities = 22/70 (31%), Positives = 32/70 (45%), Gaps = 2/70 (2%)

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
            +  K   K K  K+ + ++   E E   + +  K    KHK K  E DK+ KK++KE K
Sbjct: 106 IQPPKKKHKHKH-KKHRTQDPLPE-ETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKK 163

Query: 139 SSSKIVSSSH 148
              K  S  H
Sbjct: 164 KKKKRHSPEH 173



 Score = 31.3 bits (71), Expect = 0.88
 Identities = 19/59 (32%), Positives = 30/59 (50%), Gaps = 4/59 (6%)

Query: 19  HKNKDKDS---SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK-EKKDKEKDKSA 73
           HK+K K       +P  + S S     +    KK K + D++++K+K EKK K+K  S 
Sbjct: 113 HKHKHKKHRTQDPLPEETPSDSEGLKGHEKKHKKKKHEDDKERKKKKKEKKKKKKRHSP 171


>gnl|CDD|218188 pfam04641, Rtf2, Replication termination factor 2.  It is vital for
           effective cell-replication that replication is not
           stalled at any point by, for instance, damaged bases.
           Rtf2 stabilizes the replication fork stalled at the
           site-specific replication barrier RTS1 by preventing
           replication restart until completion of DNA synthesis by
           a converging replication fork initiated at a flanking
           origin. The RTS1 element terminates replication forks
           that are moving in the cen2-distal direction while
           allowing forks moving in the cen2-proximal direction to
           pass through the region. Rtf2 contains a C2HC2 motif
           related to the C3HC4 RING-finger motif, and would appear
           to fold up, creating a RING finger-like structure but
           forming only one functional Zn2+ ion-binding site.
          Length = 254

 Score = 37.7 bits (88), Expect = 0.010
 Identities = 22/89 (24%), Positives = 39/89 (43%), Gaps = 7/89 (7%)

Query: 40  NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
           NPT         + ++  + K+K+KK K+K K   ++    +  VSS          + S
Sbjct: 163 NPTEEEVELLKARLEEE-RAKKKKKKKKKKTKKNNATGSSAEATVSSA------VPTELS 215

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
           S   +  + KK KK++S    ++  E  K
Sbjct: 216 SGAGQVGEAKKLKKKRSIAPDNEKSEVYK 244



 Score = 33.9 bits (78), Expect = 0.17
 Identities = 25/106 (23%), Positives = 39/106 (36%), Gaps = 7/106 (6%)

Query: 79  KEKDKVS-SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
            E+D V  +  +E  E        E+ KKK+KK KK      K K         +    S
Sbjct: 155 SEEDVVPLNPTEEEVELLKARLEEERAKKKKKKKKK------KTKKNNATGSSAEATVSS 208

Query: 138 KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
              +++ S +    E     +  S  P    +   KS   + +KEK
Sbjct: 209 AVPTELSSGAGQVGEAKKLKKKRSIAPDNEKSEVYKSLFTSHKKEK 254



 Score = 33.1 bits (76), Expect = 0.33
 Identities = 13/80 (16%), Positives = 27/80 (33%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
            ++E E  K   E++++    K+K+K    +            S+   E         E 
Sbjct: 165 TEEEVELLKARLEEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEA 224

Query: 116 SHKHKDKDRERDKDEKKEQK 135
               K +    D ++ +  K
Sbjct: 225 KKLKKKRSIAPDNEKSEVYK 244



 Score = 33.1 bits (76), Expect = 0.35
 Identities = 19/75 (25%), Positives = 30/75 (40%), Gaps = 7/75 (9%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           +KK KK K +  +K         + +  S+   E    + +  E K+ K K S +   +K
Sbjct: 181 AKKKKKKKKKKTKKNNATGSS-AEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPDNEK 239

Query: 107 KE------KKDKKEK 115
            E         KKEK
Sbjct: 240 SEVYKSLFTSHKKEK 254



 Score = 32.7 bits (75), Expect = 0.46
 Identities = 19/93 (20%), Positives = 31/93 (33%), Gaps = 4/93 (4%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           E+E E      ++     K+K+K K + K      S     SS    +      +    K
Sbjct: 166 EEEVELLKARLEEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAK 225

Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
              K R    D +K    S+    + +S    K
Sbjct: 226 KLKKKRSIAPDNEK----SEVYKSLFTSHKKEK 254



 Score = 29.2 bits (66), Expect = 4.8
 Identities = 16/78 (20%), Positives = 33/78 (42%), Gaps = 1/78 (1%)

Query: 122 KDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP-PPPAPTPTQKSPVKTKE 180
           ++    K +KK++K++K ++   SS+  +   A  ++L S           +K      +
Sbjct: 177 EEERAKKKKKKKKKKTKKNNATGSSAEATVSSAVPTELSSGAGQVGEAKKLKKKRSIAPD 236

Query: 181 KEKEKESSTTHDKHSKHK 198
            EK +   +    H K K
Sbjct: 237 NEKSEVYKSLFTSHKKEK 254


>gnl|CDD|191249 pfam05279, Asp-B-Hydro_N, Aspartyl beta-hydroxylase N-terminal
           region.  This family includes the N-terminal regions of
           the junctin, junctate and aspartyl beta-hydroxylase
           proteins. Junctate is an integral ER/SR membrane calcium
           binding protein, which comes from an alternatively
           spliced form of the same gene that generates aspartyl
           beta-hydroxylase and junctin. Aspartyl beta-hydroxylase
           catalyzes the post-translational hydroxylation of
           aspartic acid or asparagine residues contained within
           epidermal growth factor (EGF) domains of proteins.
          Length = 240

 Score = 37.6 bits (87), Expect = 0.011
 Identities = 22/98 (22%), Positives = 40/98 (40%), Gaps = 4/98 (4%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP----KESSSEKE 104
           K+ +  +      ++  D+++   A    E+ +D    +E   ++ K     K S  E E
Sbjct: 128 KEPQLDEDKFLLAEDSDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKASEQENE 187

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
             KE  +K E++    D   E D DE+    E   + K
Sbjct: 188 DSKEPVEKAERTKAETDDVTEEDYDEEDNPVEDSKAIK 225



 Score = 35.7 bits (82), Expect = 0.051
 Identities = 31/176 (17%), Positives = 71/176 (40%), Gaps = 7/176 (3%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           K+    +     +E E   +E+ + AV  K K+K +   KE+ +   +    S ++E   
Sbjct: 68  KEKSTSEPTVPPEEAEPHAEEEGQLAVR-KTKQKVEEEVKEQLQSLLEKIVVSKQEEDGP 126

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS--GSQLISHPPP 165
            K+ + ++           D D+++E  E+    +    S++ +E AS    Q +     
Sbjct: 127 GKEPQLDEDKFLL----AEDSDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKAS 182

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                  K PV+  E+ K +    T + + +  +  +D    K    ++  + +++
Sbjct: 183 EQENEDSKEPVEKAERTKAETDDVTEEDYDEEDNPVEDSKAIKEELAKEPVEEQQE 238



 Score = 32.6 bits (74), Expect = 0.46
 Identities = 23/159 (14%), Positives = 55/159 (34%), Gaps = 9/159 (5%)

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           KEK  S  +   +E +  + +E +    K K+   E+ K++ +   ++     +++D   
Sbjct: 68  KEKSTSEPTVPPEEAEPHAEEEGQLAVRKTKQKVEEEVKEQLQSLLEKIVVSKQEEDGPG 127

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
            + +  E K   +          + E     +          T +++     KEK  E+E
Sbjct: 128 KEPQLDEDKFLLAED--SDDRQETLEAGKVHEETEDSYHVEETASEQYKQDMKEKASEQE 185

Query: 187 SSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
           +          K   +     K    +   +  ++E + 
Sbjct: 186 N-------EDSKEPVEKAERTKAETDDVTEEDYDEEDNP 217



 Score = 32.2 bits (73), Expect = 0.56
 Identities = 17/73 (23%), Positives = 40/73 (54%), Gaps = 4/73 (5%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
           T S   K+D K+K  ++E E  K+  EK +   +    E D V+ ++ + +++  ++S +
Sbjct: 168 TASEQYKQDMKEKASEQENEDSKEPVEKAERTKA----ETDDVTEEDYDEEDNPVEDSKA 223

Query: 102 EKEKKKEKKDKKE 114
            KE+  ++  +++
Sbjct: 224 IKEELAKEPVEEQ 236


>gnl|CDD|227446 COG5116, RPN2, 26S proteasome regulatory complex component
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 926

 Score = 38.4 bits (89), Expect = 0.011
 Identities = 19/96 (19%), Positives = 39/96 (40%), Gaps = 4/96 (4%)

Query: 21  NKDKDSSAIP--STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
           + +++    P      S  +    N++      K   R K+K KEK   +K+   + S  
Sbjct: 759 DLEEEEFEYPRMYEEASGKSVRKVNTAVLSTTIKAAARAKQKPKEKGPNDKEI-KIESPS 817

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
            E +       +++E K  ++ +    KK+K  K +
Sbjct: 818 VETEG-ERCTIKQREEKGIDAPAILNVKKKKPYKVD 852



 Score = 33.8 bits (77), Expect = 0.27
 Identities = 20/82 (24%), Positives = 37/82 (45%), Gaps = 11/82 (13%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSS--KEKERKESKPKES-SSEKEKKKEKKDKKE 114
           +E+E E     ++ S  S ++     +S+  K   R + KPKE   ++KE K E      
Sbjct: 761 EEEEFEYPRMYEEASGKSVRKVNTAVLSTTIKAAARAKQKPKEKGPNDKEIKIE------ 814

Query: 115 KSHKHKDKDRERDKDEKKEQKE 136
                 + + ER   +++E+K 
Sbjct: 815 --SPSVETEGERCTIKQREEKG 834



 Score = 31.1 bits (70), Expect = 2.0
 Identities = 15/67 (22%), Positives = 32/67 (47%), Gaps = 8/67 (11%)

Query: 71  KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
           K+A  +K+K K+K  + ++ + ES   E+  E+   K++++        K  D     + 
Sbjct: 792 KAAARAKQKPKEKGPNDKEIKIESPSVETEGERCTIKQREE--------KGIDAPAILNV 843

Query: 131 KKEQKES 137
           KK++   
Sbjct: 844 KKKKPYK 850


>gnl|CDD|185628 PTZ00449, PTZ00449, 104 kDa microneme/rhoptry antigen; Provisional.
          Length = 943

 Score = 38.5 bits (89), Expect = 0.011
 Identities = 48/209 (22%), Positives = 76/209 (36%), Gaps = 23/209 (11%)

Query: 17  SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
           +P + +D D    P     +S   P        DK+ ++ + E  KE  + ++      +
Sbjct: 497 APIEEEDSDKHDEPPEGPEASGLPPKAPG----DKEGEEGEHEDSKESDEPKEGGKPGET 552

Query: 77  KEKEKDKVSSKEKERKESK----------PKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           KE E  K     KE K SK          PK+    K+ ++ KK K+ +S +   +    
Sbjct: 553 KEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPTR---P 609

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS--HPPPPAPTPTQKSPVKTK----E 180
              +  E  +   S K   S  + K P    +  S   P  P    + K P   K     
Sbjct: 610 KSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPFDP 669

Query: 181 KEKEKESSTTHDKHSKHKHKKKDKHGDKT 209
           K KEK      D  +K K  K     D++
Sbjct: 670 KFKEKFYDDYLDAAAKSKETKTTVVLDES 698



 Score = 31.6 bits (71), Expect = 1.6
 Identities = 43/202 (21%), Positives = 70/202 (34%), Gaps = 28/202 (13%)

Query: 48  KKDKKDKDRDKEKEKEKKDKE-KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           KK KK     +E++ +K D+  +   A     K       +E E ++SK  ES   KE  
Sbjct: 490 KKSKKKLAPIEEEDSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHEDSK--ESDEPKEGG 547

Query: 107 KEKKDKKEKSHKHKDKDRE----------------RDKDEKKEQKESKSSSKIVSSSH-- 148
           K  + K+ +  K     +E                +D    K+ +E K   +  S+    
Sbjct: 548 KPGETKEGEVGKKPGPAKEHKPSKIPTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQRPT 607

Query: 149 ---NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK--- 202
              + K P       S   P +P   ++ P   +    E+       K  K     K   
Sbjct: 608 RPKSPKLPELLDIPKSPKRPESPKSPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPF 667

Query: 203 -DKHGDKTNPKEKDAKSKEKES 223
             K  +K      DA +K KE+
Sbjct: 668 DPKFKEKFYDDYLDAAAKSKET 689



 Score = 29.3 bits (65), Expect = 6.7
 Identities = 41/243 (16%), Positives = 65/243 (26%), Gaps = 13/243 (5%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
            S K        HP   +   K      S    +   +P         K  K    E  K
Sbjct: 575 LSKKPEFPKDPKHPKDPEEPKKPKRPR-SAQRPTRPKSPKLPELLDIPKSPKR--PESPK 631

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
             K     +   S +  E  K+    K  K  KP      KEK  +         K    
Sbjct: 632 SPKRPPPPQRPSSPERPEGPKIIKSPKPPKSPKPPFDPKFKEKFYDDYLDAAAKSKETKT 691

Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKS-------- 174
               D+  +   KE+   +     +     P    +    P  P   P  +         
Sbjct: 692 TVVLDESFESILKETLPETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFT 751

Query: 175 -PVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCY 233
            P + +    E  + T        + K++D H +   P E   +      H     P  +
Sbjct: 752 PPEEERTFFHETPADTPLPDILAEEFKEEDIHAETGEPDEAMKRPDSPSEH-EDKPPGDH 810

Query: 234 PEV 236
           P +
Sbjct: 811 PSL 813


>gnl|CDD|237035 PRK12280, rplW, 50S ribosomal protein L23; Reviewed.
          Length = 158

 Score = 36.7 bits (85), Expect = 0.012
 Identities = 20/64 (31%), Positives = 36/64 (56%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
            +   KE  KE ++KE  K+    KEK++ KV+ K  ++K +K  +++++K  KK    K
Sbjct: 95  SEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTTKK 154

Query: 113 KEKS 116
           +E  
Sbjct: 155 EEGK 158



 Score = 33.2 bits (76), Expect = 0.17
 Identities = 23/70 (32%), Positives = 36/70 (51%), Gaps = 3/70 (4%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
            +E EK+ KE  K     +EKE  K   ++KE+KE K  E  ++K+  K  K+  +K+ K
Sbjct: 92  PEESEKEQKEVSKET---EEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATK 148

Query: 119 HKDKDRERDK 128
                +E  K
Sbjct: 149 KTTTKKEEGK 158



 Score = 32.0 bits (73), Expect = 0.40
 Identities = 18/65 (27%), Positives = 34/65 (52%), Gaps = 1/65 (1%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
           S KE+++    ++EKE  ++K ++   +KEKK  +K  K+KS K      ++   +   +
Sbjct: 95  SEKEQKEVSKETEEKEAIKAKKEK-KEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTTK 153

Query: 135 KESKS 139
           KE   
Sbjct: 154 KEEGK 158



 Score = 32.0 bits (73), Expect = 0.46
 Identities = 17/67 (25%), Positives = 34/67 (50%), Gaps = 1/67 (1%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA-VSSKEKEKDKVSSKEKERKESKPKES 99
           P  S   +K+   +  +KE  K KK+K++ K   V+ K  +K    + +   K++  K +
Sbjct: 92  PEESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTT 151

Query: 100 SSEKEKK 106
           + ++E K
Sbjct: 152 TKKEEGK 158



 Score = 30.1 bits (68), Expect = 2.0
 Identities = 16/66 (24%), Positives = 37/66 (56%), Gaps = 2/66 (3%)

Query: 55  DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK--EKERKESKPKESSSEKEKKKEKKDK 112
           +  ++++KE   + ++K A+ +K+++K+K   K  EK  K+   K + +  +K  +K   
Sbjct: 93  EESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTT 152

Query: 113 KEKSHK 118
           K++  K
Sbjct: 153 KKEEGK 158



 Score = 29.0 bits (65), Expect = 3.9
 Identities = 20/66 (30%), Positives = 30/66 (45%)

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
           E  E + KE S E E+K+  K KKEK  K + K  E+   +K  +    ++ K    +  
Sbjct: 93  EESEKEQKEVSKETEEKEAIKAKKEKKEKKEKKVAEKLAKKKSTKTTKNTTKKATKKTTT 152

Query: 150 SKEPAS 155
            KE   
Sbjct: 153 KKEEGK 158


>gnl|CDD|240246 PTZ00053, PTZ00053, methionine aminopeptidase 2; Provisional.
          Length = 470

 Score = 38.2 bits (89), Expect = 0.013
 Identities = 25/113 (22%), Positives = 46/113 (40%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
              +  + + K+ K+++K   + K+ +K K    + +   ++    + E E K+  K KK
Sbjct: 1   AMNENGENEVKQQKQQNKQKGTKKKNKKSKKDVDDDDAFLAELISENQEAENKQNNKKKK 60

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP 166
           +K  K K K+     D   +     SS+    +SH  K      Q      PP
Sbjct: 61  KKKKKKKKKNLGEAYDLAYDLPVVWSSAAFQDNSHIRKLGNWPEQEWKQTQPP 113



 Score = 37.8 bits (88), Expect = 0.016
 Identities = 24/102 (23%), Positives = 47/102 (46%), Gaps = 6/102 (5%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
             + + + + K +   K++ K+K  K+K+K    SK+   D  +   +   E++  E+  
Sbjct: 1   AMNENGENEVKQQ---KQQNKQKGTKKKNKK---SKKDVDDDDAFLAELISENQEAENKQ 54

Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
             +KKK+KK KK+K +  +  D   D          + +S I
Sbjct: 55  NNKKKKKKKKKKKKKNLGEAYDLAYDLPVVWSSAAFQDNSHI 96



 Score = 33.9 bits (78), Expect = 0.23
 Identities = 16/89 (17%), Positives = 34/89 (38%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
            N  N    +K +  +   K+K K+ K    D  A  ++   +++ +  ++  K+ K K+
Sbjct: 4   ENGENEVKQQKQQNKQKGTKKKNKKSKKDVDDDDAFLAELISENQEAENKQNNKKKKKKK 63

Query: 99  SSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
              +K+   E  D              +D
Sbjct: 64  KKKKKKNLGEAYDLAYDLPVVWSSAAFQD 92


>gnl|CDD|224117 COG1196, Smc, Chromosome segregation ATPases [Cell division and
           chromosome partitioning].
          Length = 1163

 Score = 38.2 bits (89), Expect = 0.013
 Identities = 27/239 (11%), Positives = 90/239 (37%), Gaps = 24/239 (10%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           +++ ++ +++ E+ K + ++ +++     +E  + K   +E E + S  +E   E E + 
Sbjct: 259 QEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRERLEELENEL 318

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           E+ +++ +  K K +  + + +E++   E              +     S L        
Sbjct: 319 EELEERLEELKEKIEALKEELEERETLLEELEQLLAELEEAKEELEEKLSAL-------- 370

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK---------KDKHGDKTNPKEKDAKS 218
                   ++   +   +E +    + ++ +++           ++  ++ + + +D K 
Sbjct: 371 -----LEELEELFEALREELAELEAELAEIRNELEELKREIESLEERLERLSERLEDLKE 425

Query: 219 KEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
           + KE        +   E+  +   L   + +  + R   +  +     L+ E   + + 
Sbjct: 426 ELKELEAELEELQ--TELEELNEELEELEEQLEELRDRLKELERELAELQEELQRLEKE 482



 Score = 31.6 bits (72), Expect = 1.5
 Identities = 33/242 (13%), Positives = 84/242 (34%), Gaps = 53/242 (21%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD------KVSSKEKERKESKPKESSS 101
           ++  +  +R +E + E ++ E        KE  K+      ++S  E+E +E        
Sbjct: 206 ERQAEKAERYQELKAELRELELALLLAKLKELRKELEELEEELSRLEEELEE-------L 258

Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
           ++E ++ +K+ +E   + ++   E ++ +++  +  +   ++       +E         
Sbjct: 259 QEELEEAEKEIEELKSELEELREELEELQEELLELKEEIEELEGEISLLRE--------- 309

Query: 162 HPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                          + +E E E E      +  K K +   +  ++     ++ +    
Sbjct: 310 ---------------RLEELENELEELEERLEELKEKIEALKEELEERETLLEELEQLLA 354

Query: 222 ESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKD--ELFDRLKNEQADILQRKV 279
           E  ++                   +K   + E L E F+   E    L+ E A+I     
Sbjct: 355 ELEEAKE--------------ELEEKLSALLEELEELFEALREELAELEAELAEIRNELE 400

Query: 280 HI 281
            +
Sbjct: 401 EL 402


>gnl|CDD|214395 CHL00204, ycf1, Ycf1; Provisional.
          Length = 1832

 Score = 38.2 bits (89), Expect = 0.016
 Identities = 24/89 (26%), Positives = 42/89 (47%), Gaps = 6/89 (6%)

Query: 57   DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
            +  +   KK  +K K  + S EK+  ++ ++ +E KE   +    E E  KEKK   E  
Sbjct: 1493 NGNENVNKKINQKKKGFIPSNEKKSIEIENRNQEEKEPAGQG---ELESDKEKKGNLESV 1549

Query: 117  HKHKDKDRERDKDE---KKEQKESKSSSK 142
              +++K+ E D  E   KK + + +  S 
Sbjct: 1550 LSNQEKNIEEDYAESDIKKRKNKKQYKSN 1578



 Score = 35.8 bits (83), Expect = 0.076
 Identities = 22/80 (27%), Positives = 42/80 (52%), Gaps = 6/80 (7%)

Query: 44   SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
             S+ KK  + ++R++E EKE   + + +S    KEK+ +  S    + K  +   + S+ 
Sbjct: 1511 PSNEKKSIEIENRNQE-EKEPAGQGELES---DKEKKGNLESVLSNQEKNIEEDYAESDI 1566

Query: 104  EKKKEKKDKKEKSHKHKDKD 123
            +K+K K  K+ KS+   + D
Sbjct: 1567 KKRKNK--KQYKSNTEAELD 1584



 Score = 30.1 bits (68), Expect = 4.6
 Identities = 17/38 (44%), Positives = 25/38 (65%), Gaps = 2/38 (5%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
           K+ E K S   E  ++K+KKKEK  KKE+ +K ++K R
Sbjct: 731 KDAEFKISDSVEEKTKKKKKKEK--KKEEEYKREEKAR 766


>gnl|CDD|222447 pfam13904, DUF4207, Domain of unknown function (DUF4207).  This
           family is found in eukaryotes; it has several conserved
           tryptophan residues. The function is not known.
          Length = 261

 Score = 37.0 bits (86), Expect = 0.018
 Identities = 34/179 (18%), Positives = 76/179 (42%), Gaps = 12/179 (6%)

Query: 26  SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS 85
            S   S S  S + + T SS S      +   + K + +  +E  ++ +S+K+ ++ K  
Sbjct: 46  DSESSSNSVPSLSLSSTASSLSDSSTYSRSLKEVKLERQA-QEAYENWLSAKQAQRQKKL 104

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
            K  E K+ +      E+EK++E+ + +++  K K ++  R    K +Q   + + K   
Sbjct: 105 QKLLEEKQKQ------EREKEREEAELRQRLAKEKYEEWCRQ---KAQQAAKQRTPKHKK 155

Query: 146 SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
            +  S   +      + P         K  ++  E +K K+     ++  + + KK+ +
Sbjct: 156 EAAESASSSLSGS--AKPERNVSQEEAKKRLQEWELKKLKQQQQKREEERRKQRKKQQE 212



 Score = 30.9 bits (70), Expect = 1.5
 Identities = 18/85 (21%), Positives = 42/85 (49%), Gaps = 6/85 (7%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES----SSEKEKKKEKKD 111
           R K ++  K+   K K   +          S + ER  S+ +        E +K K+++ 
Sbjct: 139 RQKAQQAAKQRTPKHKKEAAESASSSLS-GSAKPERNVSQEEAKKRLQEWELKKLKQQQQ 197

Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
           K+E+  + K + ++++++E+K++ E
Sbjct: 198 KREEE-RRKQRKKQQEEEERKQKAE 221


>gnl|CDD|218026 pfam04321, RmlD_sub_bind, RmlD substrate binding domain.
           L-rhamnose is a saccharide required for the virulence of
           some bacteria. Its precursor, dTDP-L-rhamnose, is
           synthesised by four different enzymes, the final one of
           which is RmlD. The RmlD substrate binding domain is
           responsible for binding a sugar nucleotide.
          Length = 284

 Score = 37.2 bits (87), Expect = 0.019
 Identities = 21/67 (31%), Positives = 29/67 (43%), Gaps = 13/67 (19%)

Query: 306 HVIIHAAASLRFD--ELIQD-AFTLNIQATRELLDLATRCSQLKAIL-HVSTLY------ 355
            V+++AAA    D  E   + A+ +N         LA  C+   A L H+ST Y      
Sbjct: 51  DVVVNAAAYTAVDKAESEPELAYAVNALGPGN---LAEACAARGAPLIHISTDYVFDGAK 107

Query: 356 THSYRED 362
              YRED
Sbjct: 108 GGPYRED 114


>gnl|CDD|219924 pfam08597, eIF3_subunit, Translation initiation factor eIF3
           subunit.  This is a family of proteins which are
           subunits of the eukaryotic translation initiation factor
           3 (eIF3). In yeast it is called Hcr1. The Saccharomyces
           cerevisiae protein eIF3j (HCR1) has been shown to be
           required for processing of 20S pre-rRNA and binds to 18S
           rRNA and eIF3 subunits Rpg1p and Prt1p.
          Length = 242

 Score = 36.9 bits (86), Expect = 0.020
 Identities = 25/97 (25%), Positives = 51/97 (52%), Gaps = 3/97 (3%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
              + +  KDK D + + +  K+  D+E+D+     +EK K    +K K+  ++K +E  
Sbjct: 13  APPAKAVVKDKWDDEDEDDDVKDSWDEEEDEEK--EEEKAKVAAKAKAKKALKAKIEEKE 70

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
             K +K+EK  ++ +    +D+  E+ +  +K Q+ES
Sbjct: 71  KAKREKEEKGLRELEEDTPEDELAEKLR-LRKLQEES 106



 Score = 30.4 bits (69), Expect = 2.0
 Identities = 13/58 (22%), Positives = 32/58 (55%)

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
             +++E++E K K ++  K KK  K   +EK    ++K+ +  ++ +++  E + + K
Sbjct: 39  EEEDEEKEEEKAKVAAKAKAKKALKAKIEEKEKAKREKEEKGLRELEEDTPEDELAEK 96


>gnl|CDD|236545 PRK09510, tolA, cell envelope integrity inner membrane protein
           TolA; Provisional.
          Length = 387

 Score = 37.1 bits (86), Expect = 0.022
 Identities = 19/98 (19%), Positives = 42/98 (42%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           K+   E K K + ++A  +  + K K  ++   +  ++ K+ +  + KKK   + K+K+ 
Sbjct: 161 KKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAA 220

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
                   +   E K   E  +++K    +  +K  A 
Sbjct: 221 AEAKAAAAKAAAEAKAAAEKAAAAKAAEKAAAAKAAAE 258



 Score = 36.7 bits (85), Expect = 0.032
 Identities = 21/89 (23%), Positives = 35/89 (39%), Gaps = 1/89 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-K 106
           KK   +  +  E E  KK   + K    ++   K    +K+K   E+K K ++  K+K  
Sbjct: 161 KKAAAEAKKKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAA 220

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
            E K    K+        E+    K  +K
Sbjct: 221 AEAKAAAAKAAAEAKAAAEKAAAAKAAEK 249



 Score = 35.6 bits (82), Expect = 0.079
 Identities = 17/90 (18%), Positives = 37/90 (41%), Gaps = 1/90 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK-ESSSEKEKK 106
           KK + +  +    E +KK + +  +  +++ K+K +  +K+K   E+K K  + ++    
Sbjct: 169 KKAEAEAAKKAAAEAKKKAEAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAAAEAKAAAA 228

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           K   + K  + K             K   E
Sbjct: 229 KAAAEAKAAAEKAAAAKAAEKAAAAKAAAE 258



 Score = 34.8 bits (80), Expect = 0.11
 Identities = 12/91 (13%), Positives = 44/91 (48%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           +K  K  +  ++K+++++ +E  +   + +E+ K     +   +++ K  E ++++   K
Sbjct: 71  QKSAKRAEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAALK 130

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           +K+ ++  +        + + + K+    +K
Sbjct: 131 QKQAEEAAAKAAAAAKAKAEAEAKRAAAAAK 161



 Score = 34.8 bits (80), Expect = 0.14
 Identities = 28/181 (15%), Positives = 75/181 (41%), Gaps = 1/181 (0%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-KESKPKESSSEKEKKK 107
           + ++   +  E++++KK++++ +     +  E++++   EKER    + K+ + E  K+ 
Sbjct: 68  QQQQKSAKRAEEQRKKKEQQQAEELQQKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
             K K+ +    K     + K E + ++ + ++ K  + +    E  +  +  +     A
Sbjct: 128 ALKQKQAEEAAAKAAAAAKAKAEAEAKRAAAAAKKAAAEAKKKAEAEAAKKAAAEAKKKA 187

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
                       +K+ E E+       +K K   + K        E  A +++  + K++
Sbjct: 188 EAEAAAKAAAEAKKKAEAEAKKKAAAEAKKKAAAEAKAAAAKAAAEAKAAAEKAAAAKAA 247

Query: 228 A 228
            
Sbjct: 248 E 248


>gnl|CDD|187564 cd05254, dTDP_HR_like_SDR_e, dTDP-6-deoxy-L-lyxo-4-hexulose
           reductase and related proteins, extended (e) SDRs.
           dTDP-6-deoxy-L-lyxo-4-hexulose reductase, an extended
           SDR, synthesizes dTDP-L-rhamnose from
           alpha-D-glucose-1-phosphate, providing the precursor of
           L-rhamnose, an essential cell wall component of many
           pathogenic bacteria. This subgroup has the
           characteristic active site tetrad and NADP-binding
           motif. This subgroup also contains human MAT2B, the
           regulatory subunit of methionine adenosyltransferase
           (MAT); MAT catalyzes S-adenosylmethionine synthesis. The
           human gene encoding MAT2B encodes two major splicing
           variants which are induced in human cell liver cancer
           and regulate HuR, an mRNA-binding protein which
           stabilizes the mRNA of several cyclins, to affect cell
           proliferation. Both MAT2B variants include this extended
           SDR domain. Extended SDRs are distinct from classical
           SDRs. In addition to the Rossmann fold (alpha/beta
           folding pattern with a central beta-sheet) core region
           typical of all SDRs, extended SDRs have a less conserved
           C-terminal extension of approximately 100 amino acids.
           Extended SDRs are a diverse collection of proteins, and
           include isomerases, epimerases, oxidoreductases, and
           lyases; they typically have a TGXXGXXG cofactor binding
           motif. SDRs are a functionally diverse family of
           oxidoreductases that have a single domain with a
           structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 280

 Score = 36.8 bits (86), Expect = 0.022
 Identities = 31/148 (20%), Positives = 49/148 (33%), Gaps = 48/148 (32%)

Query: 300 FIQHHIHVIIHAAASLRFDELIQD---AFTLNIQATRELLDLATRCSQLKAIL-HVSTLY 355
              +   VII+ AA  R D+   D   A+ +N+ A   L        ++ A L H+ST Y
Sbjct: 51  IRDYKPDVIINCAAYTRVDKCESDPELAYRVNVLAPENLARA---AKEVGARLIHISTDY 107

Query: 356 -----THSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKA 410
                   Y+E             ED  + +                     N Y  +K 
Sbjct: 108 VFDGKKGPYKE-------------EDAPNPL---------------------NVYGKSKL 133

Query: 411 IGESVVEKYLYKLPLAMVRPSIVVSTWK 438
           +GE  V     +    ++R S +    K
Sbjct: 134 LGEVAVLNANPR--YLILRTSWLYGELK 159


>gnl|CDD|191187 pfam05087, Rota_VP2, Rotavirus VP2 protein.  Rotavirus particles
           consist of three concentric proteinaceous capsid layers.
           The innermost capsid (core) is made of VP2. The genomic
           RNA and the two minor proteins VP1 and VP3 are
           encapsidated within this layer. The N-terminus of
           rotavirus VP2 is necessary for the encapsidation of VP1
           and VP3.
          Length = 887

 Score = 37.6 bits (87), Expect = 0.023
 Identities = 21/90 (23%), Positives = 45/90 (50%), Gaps = 2/90 (2%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           + +  + DR +EK+ EK+D +K++  +  K  +K +    +      K + S    +   
Sbjct: 8   EANINNNDRMQEKDDEKQD-QKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLKIAD 66

Query: 108 E-KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E KK  KE+S +  +  + +++ +K+ Q E
Sbjct: 67  EVKKSTKEESKQLLEVLKTKEEHQKEIQYE 96



 Score = 31.4 bits (71), Expect = 1.5
 Identities = 19/91 (20%), Positives = 32/91 (35%), Gaps = 14/91 (15%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
             +E      D      +EK+ +K    +K R E K        EK  +KK++    +  
Sbjct: 5   NRREANINNND----RMQEKDDEKQD--QKNRMELK--------EKVLDKKEEVVTDNVD 50

Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNS 150
                +  ++  K   E K S+K  S     
Sbjct: 51  SPVKEQSSQENLKIADEVKKSTKEESKQLLE 81



 Score = 30.3 bits (68), Expect = 3.6
 Identities = 17/80 (21%), Positives = 40/80 (50%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
            N  +    K D+K   +++ + KEK   +K++    + +    + SS+E  +   + K+
Sbjct: 11  INNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQENLKIADEVKK 70

Query: 99  SSSEKEKKKEKKDKKEKSHK 118
           S+ E+ K+  +  K ++ H+
Sbjct: 71  STKEESKQLLEVLKTKEEHQ 90



 Score = 29.9 bits (67), Expect = 4.7
 Identities = 12/59 (20%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 84  VSSKEKERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           ++ + +        +   EK+ +K+++K++ E   K  DK  E   D      + +SS 
Sbjct: 1   MAYRNRREANINNNDRMQEKDDEKQDQKNRMELKEKVLDKKEEVVTDNVDSPVKEQSSQ 59


>gnl|CDD|217503 pfam03344, Daxx, Daxx Family.  The Daxx protein (also known as the
           Fas-binding protein) is thought to play a role in
           apoptosis, but the precise role played by Daxx remains to
           be determined. Daxx forms a complex with Axin.
          Length = 715

 Score = 37.2 bits (86), Expect = 0.023
 Identities = 21/88 (23%), Positives = 50/88 (56%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
            K  D E+E+ +K +E+++   SS+  +  K SS   E      +ES  E+  ++E++++
Sbjct: 393 MKQDDTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMASQESEEEESVEEEEEEE 452

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSS 140
           +E+  + ++ + E  +DE++E++    +
Sbjct: 453 EEEEEEEQESEEEEGEDEEEEEEVEADN 480



 Score = 31.8 bits (72), Expect = 1.2
 Identities = 31/157 (19%), Positives = 69/157 (43%), Gaps = 16/157 (10%)

Query: 50  DKKDKDRDKEKEKEKKDKEK---DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           D ++++R K +E+E++       D S  SS   E   ++S+E E +ES  +E   E+E++
Sbjct: 397 DTEEEERRKRQERERQGTSSRSSDPSKASSTSGESPSMASQESEEEESVEEEEEEEEEEE 456

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKE---SKSSSKIVSSSHNSKEPASGSQLISHP 163
           +E+++ +E+  + ++++ E + D   E++    S+          +++   S    IS  
Sbjct: 457 EEEQESEEEEGEDEEEEEEVEADNGSEEEMEGSSEGDGDGEEPEEDAERRNSEMAGISRM 516

Query: 164 PP----------PAPTPTQKSPVKTKEKEKEKESSTT 190
                       P     +    ++ + E   E S  
Sbjct: 517 SEGQQPRGSSVQPESPQEEPLQPESMDAESVGEESDE 553


>gnl|CDD|130141 TIGR01069, mutS2, MutS2 family protein.  Function of MutS2 is
           unknown. It should not be considered a DNA mismatch
           repair protein. It is likely a DNA mismatch binding
           protein of unknown cellular function [DNA metabolism,
           Other].
          Length = 771

 Score = 37.1 bits (86), Expect = 0.026
 Identities = 41/145 (28%), Positives = 66/145 (45%), Gaps = 13/145 (8%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           ++ K+++R+K+ E EK+ +E  K+    KE E      KEK+  ++K  +S  +  K KE
Sbjct: 553 EELKERERNKKLELEKEAQEALKALK--KEVESIIRELKEKKIHKAKEIKSIEDLVKLKE 610

Query: 109 KKDK------KEKSHKHKDKDRERDKDEKKEQKESKSSSKI-VSSSH-NSKEPASGSQLI 160
            K K        ++ K  DK R R   +K +  +    +K  V+      K   S  + I
Sbjct: 611 TKQKIPQKPTNFQADKIGDKVRIRYFGQKGKIVQILGGNKWNVTVGGMRMKVHGSELEKI 670

Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEK 185
           +  PPP      K P  TK + KE 
Sbjct: 671 NKAPPPKKF---KVPKTTKPEPKEA 692



 Score = 36.3 bits (84), Expect = 0.051
 Identities = 30/96 (31%), Positives = 48/96 (50%), Gaps = 5/96 (5%)

Query: 43  NSSSSKKDKKDKDRDKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
             S+ +K+ + K+   E   KE+EK  KE ++     KE+E++K    EKE +E   K  
Sbjct: 519 KLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERERNKKLELEKEAQE-ALKAL 577

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
             E E    ++ K++K HK K+     D  + KE K
Sbjct: 578 KKEVE-SIIRELKEKKIHKAKEIKSIEDLVKLKETK 612



 Score = 30.6 bits (69), Expect = 2.9
 Identities = 25/127 (19%), Positives = 53/127 (41%), Gaps = 9/127 (7%)

Query: 68  EKDKSAVSSKEKEKDKV----SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           E+ K+     ++E + +    S+ EKE ++         KE++K KK+ +++  + K+++
Sbjct: 500 EQAKTFYGEFKEEINVLIEKLSALEKELEQKNEHLEKLLKEQEKLKKELEQEMEELKERE 559

Query: 124 RERDKDEKKEQKES-----KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
           R +  + +KE +E+     K    I+      K   +                QK P K 
Sbjct: 560 RNKKLELEKEAQEALKALKKEVESIIRELKEKKIHKAKEIKSIEDLVKLKETKQKIPQKP 619

Query: 179 KEKEKEK 185
              + +K
Sbjct: 620 TNFQADK 626


>gnl|CDD|176924 cd09071, FAR_C, C-terminal domain of fatty acyl CoA reductases.
           C-terminal domain of fatty acyl CoA reductases, a family
           of SDR-like proteins. SDRs or short-chain
           dehydrogenases/reductases are Rossmann-fold
           NAD(P)H-binding proteins. Many proteins in this FAR_C
           family may function as fatty acyl-CoA reductases (FARs),
           acting on medium and long chain fatty acids, and have
           been reported to be involved in diverse processes such
           as the biosynthesis of insect pheromones, plant
           cuticular wax production, and mammalian wax
           biosynthesis. In Arabidopsis thaliana, proteins with
           this particular architecture have also been identified
           as the MALE STERILITY 2 (MS2) gene product, which is
           implicated in male gametogenesis. Mutations in MS2
           inhibit the synthesis of exine (sporopollenin),
           rendering plants unable to reduce pollen wall fatty
           acids to corresponding alcohols. The function of this
           C-terminal domain is unclear.
          Length = 92

 Score = 34.1 bits (79), Expect = 0.026
 Identities = 10/23 (43%), Positives = 17/23 (73%)

Query: 574 FLHMIPGMIMDTVLRCLNKPPRI 596
           FLH++P  ++D +LR L + PR+
Sbjct: 1   FLHLLPAYLLDLLLRLLGRKPRL 23


>gnl|CDD|240433 PTZ00482, PTZ00482, membrane-attack complex/perforin (MACPF)
           Superfamily; Provisional.
          Length = 844

 Score = 37.2 bits (86), Expect = 0.028
 Identities = 28/146 (19%), Positives = 55/146 (37%), Gaps = 14/146 (9%)

Query: 23  DKDSSAIPSTSTSSSTSNPTNSSSSKKDK-KDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
           D +  A  +TS  SST + +      +D+  D   + ++  +    +   S      K+ 
Sbjct: 100 DDEDDAGNATSGESSTDDDSLLELPDRDEDADTQANNDQTNDFDQDDSSNSQTDQGLKQS 159

Query: 82  DKVSSKEKERKESKPKESS--------SEKEKKKEKKDKKEKSHKHK-----DKDRERDK 128
             +SS EK  +E K +  +        ++ E+   K   K KS         D   +   
Sbjct: 160 VNLSSAEKLIEEKKGQTENTFKFYNFGNDGEEAAAKDGGKSKSSDPGPLNDSDGQGDDGD 219

Query: 129 DEKKEQKESKSSSKIVSSSHNSKEPA 154
            E  E+ ++ S+++   +   S  P 
Sbjct: 220 PESAEEDKAASNTRAAYTKATSVFPG 245



 Score = 32.5 bits (74), Expect = 0.78
 Identities = 26/164 (15%), Positives = 60/164 (36%), Gaps = 6/164 (3%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
           ++  N   S  D  D + D   E ++ D     S  SS + +         E  +++   
Sbjct: 77  ASFLNQRKSLDDDDDDEFDFLYEDDEDDAGNATSGESSTDDDSLLELPDRDEDADTQANN 136

Query: 99  SSSEKEKKKEKK-DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
             +    + +    + ++  K        +K  ++++ +++++ K   +  N  E A+  
Sbjct: 137 DQTNDFDQDDSSNSQTDQGLKQSVNLSSAEKLIEEKKGQTENTFKF-YNFGNDGEEAAAK 195

Query: 158 QL-ISHPPPPAPTPT---QKSPVKTKEKEKEKESSTTHDKHSKH 197
               S    P P      Q      +  E++K +S T   ++K 
Sbjct: 196 DGGKSKSSDPGPLNDSDGQGDDGDPESAEEDKAASNTRAAYTKA 239


>gnl|CDD|218734 pfam05758, Ycf1, Ycf1.  The chloroplast genomes of most higher
           plants contain two giant open reading frames designated
           ycf1 and ycf2. Although the function of Ycf1 is unknown,
           it is known to be an essential gene.
          Length = 832

 Score = 37.3 bits (87), Expect = 0.028
 Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 2/71 (2%)

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           KK KE   S    +E+E D       E K +K ++  S +E      ++KE   K +D D
Sbjct: 224 KKLKET--SETEEREEETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDLD 281

Query: 124 RERDKDEKKEQ 134
           +     EKK++
Sbjct: 282 KLEILKEKKDE 292



 Score = 31.5 bits (72), Expect = 1.5
 Identities = 20/79 (25%), Positives = 36/79 (45%), Gaps = 9/79 (11%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           KK K   +  E E++++E D       E E    +   K+ +E   +E  S   ++KE  
Sbjct: 224 KKLK---ETSETEEREEETD------VEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDP 274

Query: 111 DKKEKSHKHKDKDRERDKD 129
           DK E   K +    ++D++
Sbjct: 275 DKTEDLDKLEILKEKKDEE 293



 Score = 31.1 bits (71), Expect = 2.0
 Identities = 22/87 (25%), Positives = 35/87 (40%), Gaps = 4/87 (4%)

Query: 29  IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
           IPS      T     +S +++ +++ D + E   E K   K +   S++E        KE
Sbjct: 217 IPS---PFFTKKLKETSETEEREEETDVEIETTSETK-GTKQEQEGSTEEDPSLFSEEKE 272

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEK 115
              K     +    KEKK E+    EK
Sbjct: 273 DPDKTEDLDKLEILKEKKDEELFWFEK 299



 Score = 30.4 bits (69), Expect = 3.4
 Identities = 17/72 (23%), Positives = 33/72 (45%), Gaps = 4/72 (5%)

Query: 81  KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
           K K +S+ +ER+E    E+  E E   E K  K++     ++D     +EK++  +++  
Sbjct: 225 KLKETSETEEREE----ETDVEIETTSETKGTKQEQEGSTEEDPSLFSEEKEDPDKTEDL 280

Query: 141 SKIVSSSHNSKE 152
            K+        E
Sbjct: 281 DKLEILKEKKDE 292


>gnl|CDD|219563 pfam07767, Nop53, Nop53 (60S ribosomal biogenesis).  This nucleolar
           family of proteins are involved in 60S ribosomal
           biogenesis. They are specifically involved in the
           processing beyond the 27S stage of 25S rRNA maturation.
           This family contains sequences that bear similarity to
           the glioma tumour suppressor candidate region gene 2
           protein (p60). This protein has been found to interact
           with herpes simplex type 1 regulatory proteins.
          Length = 387

 Score = 36.6 bits (85), Expect = 0.029
 Identities = 23/92 (25%), Positives = 43/92 (46%), Gaps = 1/92 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS-SKEKERKESKPKESSSEKEKK 106
             D  +++ D E   E  + E +      + K K K   +KEK RKE + +    ++ KK
Sbjct: 245 SDDDGEEESDDESAWEGFESEYEPINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQLKK 304

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           K  +  + K    +   +E+ +  KKEQ++ +
Sbjct: 305 KLAQLARLKEIAKEVAQKEKARARKKEQRKER 336



 Score = 31.2 bits (71), Expect = 1.6
 Identities = 21/94 (22%), Positives = 42/94 (44%), Gaps = 6/94 (6%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKE------KEKDKVSSKEKERKESKPKESSSEKE 104
           K +K R + +  E+K  EK     S  +       E+     +E+   ES  +   SE E
Sbjct: 208 KAEKKRQELERVEEKKLEKMAPEASRLDEMSEGLLEESDDDGEEESDDESAWEGFESEYE 267

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
              +    K K+   ++K++ R + E++ ++E +
Sbjct: 268 PINKPVRPKRKTKAQRNKEKRRKELEREAKEEKQ 301



 Score = 30.1 bits (68), Expect = 3.3
 Identities = 24/100 (24%), Positives = 42/100 (42%), Gaps = 14/100 (14%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
           S      S  +      R K K K +++KEK       + KE ++   + KE K+ K K 
Sbjct: 257 SAWEGFESEYEPINKPVRPKRKTKAQRNKEK-------RRKELER---EAKEEKQLKKKL 306

Query: 99  SSSEK----EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
           +   +     K+  +K+K     K + K+R   K  K+ +
Sbjct: 307 AQLARLKEIAKEVAQKEKARARKKEQRKERGEKKKLKRRK 346


>gnl|CDD|220839 pfam10661, EssA, WXG100 protein secretion system (Wss), protein
           EssA.  The WXG100 protein secretion system (Wss) is
           responsible for the secretion of WXG100 proteins
           (pfam06013) such as ESAT-6 and CFP-10 in Mycobacterium
           tuberculosis or EsxA and EsxB in Staphylococcus aureus.
           In S. aureus, the Wss seems to be encoded by a locus of
           eight CDS, called ess (eSAT-6 secretion system). This
           locus encodes, amongst several other proteins, EssA, a
           protein predicted to possess one transmembrane domain.
           Due to its predicted membrane location and its absolute
           requirement for WXG100 protein secretion, it has been
           speculated that EssA could form a secretion apparatus in
           conjunction with the polytopic membrane protein EsaA,
           YukC (pfam10140) and YukAB, which is a membrane-bound
           ATPase containing FtsK/SpoIIIE domains (pfam01580)
           called EssC in S. aureus and Snm1/Snm2 in Mycobacterium
           tuberculosis. Proteins homologous to EssA, YukC, EsaA
           and YukD seem absent from mycobacteria.
          Length = 145

 Score = 35.2 bits (81), Expect = 0.033
 Identities = 25/119 (21%), Positives = 50/119 (42%), Gaps = 7/119 (5%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVS-----SKEKEKDKVSSK 87
           S + S          K D+  K   ++ EK+ ++ E DK  +      ++E+   K +++
Sbjct: 3   SAADSYLEDDGKMQFKVDRLQKTDQEKNEKKLRETELDKLGIELFTTETEEEINKKKNAE 62

Query: 88  EKERKESKPKESSSEKEKKKEKKDKKEK--SHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           +KE ++ +    S +KE     K+ K+   S +++    E       E   S S S  +
Sbjct: 63  QKEMEDIENSLFSEDKEGNVAVKETKDSLFSSEYEVTSNEAASSGNAETSTSSSISNTI 121


>gnl|CDD|227507 COG5180, PBP1, Protein interacting with poly(A)-binding protein
           [RNA processing and modification].
          Length = 654

 Score = 36.6 bits (84), Expect = 0.036
 Identities = 47/194 (24%), Positives = 72/194 (37%), Gaps = 18/194 (9%)

Query: 34  TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
            SS+ SN  N        K      +   ++K   + +     + ++ + VS+ E    +
Sbjct: 240 ESSNASNKENRQEKPAAAKQPHHMDDDGTKRKMVIEIEGLSLLENRKPEAVSAPEAVSPQ 299

Query: 94  SKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEP 153
           SK +  SS +EK+K+ K+KK  S+  K        D  K   E   S        +S E 
Sbjct: 300 SKSEGPSSGQEKEKQIKEKKSFSYGWKHTKF----DSSKNLLEVIKSKFKSLFDISSGEL 355

Query: 154 ASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK-------KDKHG 206
             GS+    PP  A   +  + V   +KE  +  S    K    KH         +  HG
Sbjct: 356 KWGSK----PPWEAKAVSIATKVSKPKKESVRSGSKAAKKSPSTKHTTRSSTSLRRRNHG 411

Query: 207 DK---TNPKEKDAK 217
                 NP   DAK
Sbjct: 412 SFFGAKNPHTNDAK 425



 Score = 29.7 bits (66), Expect = 5.9
 Identities = 39/196 (19%), Positives = 61/196 (31%), Gaps = 16/196 (8%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
            K +++    H      K K    I   S   +      S+      + K       +EK
Sbjct: 252 EKPAAAKQPHHMDDDGTKRKMVIEIEGLSLLENRKPEAVSAPEAVSPQSKSEGPSSGQEK 311

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKE----KERKESKPKESSSEKEKKKEKKDKKEK----- 115
           + + K+K + S   K     SSK      + K     + SS + K   K   + K     
Sbjct: 312 EKQIKEKKSFSYGWKHTKFDSSKNLLEVIKSKFKSLFDISSGELKWGSKPPWEAKAVSIA 371

Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQL---ISHPPPPAPTPTQ 172
           +   K K        K  +K   +     SS+   +    GS       H          
Sbjct: 372 TKVSKPKKESVRSGSKAAKKSPSTKHTTRSSTSLRRR-NHGSFFGAKNPHTNDAKRVLFG 430

Query: 173 KS---PVKTKEKEKEK 185
           KS    +K+KE   EK
Sbjct: 431 KSFNMFIKSKEAHDEK 446


>gnl|CDD|215774 pfam00183, HSP90, Hsp90 protein. 
          Length = 529

 Score = 36.7 bits (85), Expect = 0.036
 Identities = 19/63 (30%), Positives = 34/63 (53%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           KE EKE  D+E+++     KE+E+     +E+  +E + +E   + +K KE   + E  +
Sbjct: 29  KEVEKEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLN 88

Query: 118 KHK 120
           K K
Sbjct: 89  KTK 91



 Score = 33.2 bits (76), Expect = 0.38
 Identities = 15/52 (28%), Positives = 30/52 (57%), Gaps = 3/52 (5%)

Query: 84  VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
              KE E++    +E   ++EKK+E++   +K    ++ D E +K+EKK++ 
Sbjct: 26  WVEKEVEKEVPDEEEEEEKEEKKEEEEKTTDKE---EEVDEEEEKEEKKKKT 74



 Score = 32.0 bits (73), Expect = 0.93
 Identities = 18/56 (32%), Positives = 38/56 (67%), Gaps = 3/56 (5%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           K+  D++ ++EKE++K+++EK        ++E++K   +EK++K  K KE+++E E
Sbjct: 33  KEVPDEEEEEEKEEKKEEEEKTTDKEEEVDEEEEK---EEKKKKTKKVKETTTEWE 85



 Score = 31.3 bits (71), Expect = 1.6
 Identities = 16/71 (22%), Positives = 39/71 (54%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
             K++KK+++     ++E+ D+E++K     K K+  + +++ +   ++KP  + + K+ 
Sbjct: 42  EEKEEKKEEEEKTTDKEEEVDEEEEKEEKKKKTKKVKETTTEWELLNKTKPIWTRNPKDV 101

Query: 106 KKEKKDKKEKS 116
            KE+     KS
Sbjct: 102 TKEEYAAFYKS 112



 Score = 30.5 bits (69), Expect = 2.9
 Identities = 18/53 (33%), Positives = 34/53 (64%), Gaps = 4/53 (7%)

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           KE + +    E+E++KE+K ++E+    K  D+E + DE++E++E K  +K V
Sbjct: 29  KEVEKEVPDEEEEEEKEEKKEEEE----KTTDKEEEVDEEEEKEEKKKKTKKV 77


>gnl|CDD|219882 pfam08524, rRNA_processing, rRNA processing.  This is a family of
           proteins that are involved in rRNA processing. In a
           localisation study they were found to localise to the
           nucleus and nucleolus. The family also includes other
           metazoa members from plants to mammals where the protein
           has been named BR22 and is associated with TTF-1,
           thyroid transcription factor 1. In the lungs, the family
           binds TTF-1 to form a complex which influences the
           expression of the key lung surfactant protein-B (SP-B)
           and -C (SP-C), the small hydrophobic surfactant proteins
           that maintain surface tension in alveoli.
          Length = 150

 Score = 34.9 bits (80), Expect = 0.040
 Identities = 17/76 (22%), Positives = 41/76 (53%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           +       +++  +++ KS+   ++ EK K   ++KE  + + +E   ++  K++K+ +K
Sbjct: 43  EKEGYAVPEKESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKRQKELEK 102

Query: 114 EKSHKHKDKDRERDKD 129
            +  K K K+RER + 
Sbjct: 103 IELSKKKQKERERRRK 118



 Score = 32.9 bits (75), Expect = 0.22
 Identities = 27/97 (27%), Positives = 50/97 (51%), Gaps = 1/97 (1%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
               D+  K   K+  +E K KE  ++       +K+ +   EKE   + P++ S+EK+ 
Sbjct: 1   MGSVDQNQKKNGKKFTREYKVKEIQRNLTKKARLKKEYLKLLEKE-GYAVPEKESAEKQV 59

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           K  K+D+K +  K  D+ +E  K  K+EQ+E + + +
Sbjct: 60  KSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKR 96



 Score = 32.2 bits (73), Expect = 0.31
 Identities = 24/83 (28%), Positives = 44/83 (53%)

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           +  EK   SSKE  + E K K    ++  K+ K++++EK    + K+ E+ +  KK+QKE
Sbjct: 53  ESAEKQVKSSKEDRKFEKKKKLDEKKEIAKQRKREQREKELAKRQKELEKIELSKKKQKE 112

Query: 137 SKSSSKIVSSSHNSKEPASGSQL 159
            +   K ++    S +P  G ++
Sbjct: 113 RERRRKKLTKKTKSGQPLMGPRI 135


>gnl|CDD|220818 pfam10595, UPF0564, Uncharacterized protein family UPF0564.  This
           family of proteins has no known function. However, one
           of the members is annotated as an EF-hand family
           protein.
          Length = 349

 Score = 36.3 bits (84), Expect = 0.042
 Identities = 42/209 (20%), Positives = 72/209 (34%), Gaps = 16/209 (7%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE-KEKEKKDKEKDKSAVSSKE 78
               K  S+          + P   S      KDK +++E K K +      +   SS+ 
Sbjct: 104 ILPRKLRSSTSEREPKKFKAKPVPKSIYIPLLKDKMQEEELKRKIRVQMRAQELLQSSRL 163

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES- 137
             +      ++   + K +     K KKK  K K+ KS    +K  E+ + +  E+K+S 
Sbjct: 164 PPRMAKHEAQERLTKKKKRGQKKSKYKKKTFKPKRAKSIPDFEKLHEKFQKQLAEKKKSK 223

Query: 138 --------------KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
                         KSSS+      N        +      PP+    +      + + K
Sbjct: 224 RPTVPEPFNFQESHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERK 283

Query: 184 EKESSTTHDKHSKHKHKKKDKHGDKTNPK 212
           EKE+    +K    + KKK K       +
Sbjct: 284 EKEAKEQQEKKELEQRKKKKKEMAPKVKQ 312



 Score = 34.0 bits (78), Expect = 0.20
 Identities = 26/105 (24%), Positives = 45/105 (42%), Gaps = 13/105 (12%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S     +   D++     E+  K   +    S +K +  V     ERKE + KE   +KE
Sbjct: 236 SHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERKEKEAKEQQEKKE 295

Query: 105 KKKEKKDKKEKSH-------------KHKDKDRERDKDEKKEQKE 136
            ++ KK KKE +              K +++ +E+    +KE+KE
Sbjct: 296 LEQRKKKKKEMAPKVKQRFEANDPAQKLQEERKEQLAKLRKEEKE 340



 Score = 32.1 bits (73), Expect = 0.78
 Identities = 19/82 (23%), Positives = 43/82 (52%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
           S+KK +      + + KEK+ KE+ +     + K+K K  + + +++      +   +E+
Sbjct: 267 STKKWESLVKFLRTERKEKEAKEQQEKKELEQRKKKKKEMAPKVKQRFEANDPAQKLQEE 326

Query: 106 KKEKKDKKEKSHKHKDKDRERD 127
           +KE+  K  K  K ++K+ E++
Sbjct: 327 RKEQLAKLRKEEKEREKEYEQE 348



 Score = 30.1 bits (68), Expect = 3.2
 Identities = 24/116 (20%), Positives = 52/116 (44%), Gaps = 3/116 (2%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
            HK+  +      + S       PT        KK +   K    E+K+KE  +     +
Sbjct: 236 SHKSSSRTYLDQENISAGEENLKPTRRKLPPSTKKWESLVKFLRTERKEKEAKEQQEKKE 295

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
            +++ K   K+KE      +   +    +K ++++KE+  K + +++ER+K+ ++E
Sbjct: 296 LEQRKK---KKKEMAPKVKQRFEANDPAQKLQEERKEQLAKLRKEEKEREKEYEQE 348



 Score = 29.7 bits (67), Expect = 3.9
 Identities = 27/174 (15%), Positives = 71/174 (40%), Gaps = 22/174 (12%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK--- 115
           E+ +E++++ ++KS       +K     + +E+K++           ++E K  K K   
Sbjct: 68  EQNEERREEVREKSKAILLSSQKPFSFYEREEQKKAILPRKLRSSTSEREPKKFKAKPVP 127

Query: 116 --SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQK 173
              +    KD+ ++++ K++ +    + +++ SS      A                   
Sbjct: 128 KSIYIPLLKDKMQEEELKRKIRVQMRAQELLQSSRLPPRMAKHEA--------------- 172

Query: 174 SPVKTKEKEKEKESSTTHDKHS-KHKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
              +  +K+K  +  + + K + K K  K     +K + K +   +++K+S + 
Sbjct: 173 -QERLTKKKKRGQKKSKYKKKTFKPKRAKSIPDFEKLHEKFQKQLAEKKKSKRP 225


>gnl|CDD|191716 pfam07263, DMP1, Dentin matrix protein 1 (DMP1).  This family
           consists of several mammalian dentin matrix protein 1
           (DMP1) sequences. The dentin matrix acidic
           phosphoprotein 1 (DMP1) gene has been mapped to human
           chromosome 4q21. DMP1 is a bone and teeth specific
           protein initially identified from mineralised dentin.
           DMP1 is primarily localised in the nuclear compartment
           of undifferentiated osteoblasts. In the nucleus, DMP1
           acts as a transcriptional component for activation of
           osteoblast-specific genes like osteocalcin. During the
           early phase of osteoblast maturation, Ca(2+) surges into
           the nucleus from the cytoplasm, triggering the
           phosphorylation of DMP1 by a nuclear isoform of casein
           kinase II. This phosphorylated DMP1 is then exported out
           into the extracellular matrix, where it regulates
           nucleation of hydroxyapatite. DMP1 is a unique molecule
           that initiates osteoblast differentiation by
           transcription in the nucleus and orchestrates
           mineralised matrix formation extracellularly, at later
           stages of osteoblast maturation. The DMP1 gene has been
           found to be ectopically expressed in lung cancer
           although the reason for this is unknown.
          Length = 514

 Score = 36.6 bits (84), Expect = 0.042
 Identities = 32/149 (21%), Positives = 66/149 (44%), Gaps = 4/149 (2%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK 69
           SS S+  +   +++  S +     + S   NP N++S  +D++D +  +E   +     +
Sbjct: 332 SSESSQEADLPSQENSSESQEEVVSESRGDNPDNTTSHSEDQEDSESSEEDSLDTPSSSE 391

Query: 70  DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE----KKKEKKDKKEKSHKHKDKDRE 125
            +S     + E ++  S  +E  ES   E+SS +E         + + ++S   +D   E
Sbjct: 392 SQSTEEQADSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSE 451

Query: 126 RDKDEKKEQKESKSSSKIVSSSHNSKEPA 154
            D  + ++   SK  S    S+ +S+E  
Sbjct: 452 EDDSDSQDSSRSKEDSNSTESASSSEEDG 480



 Score = 29.6 bits (66), Expect = 4.7
 Identities = 32/212 (15%), Positives = 76/212 (35%), Gaps = 25/212 (11%)

Query: 31  STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
           +    S ++     S S++  + + ++  +E + ++  ++    SS+  ++  + S+E  
Sbjct: 288 TMEVKSDSTENAGLSQSREHSRSESQEDSEENQSQEDSQEVQDPSSESSQEADLPSQENS 347

Query: 91  RKESKPKESSSEKE-----------------KKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
                   S S++E                 + +E  +  E+         E    E++ 
Sbjct: 348 --------SESQEEVVSESRGDNPDNTTSHSEDQEDSESSEEDSLDTPSSSESQSTEEQA 399

Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
             ES  S      S  S E  + S         A T ++    ++++  + +E  +    
Sbjct: 400 DSESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSEEDDSDSQD 459

Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
            S+ K          ++ ++   K+ E ES K
Sbjct: 460 SSRSKEDSNSTESASSSEEDGQPKNTEIESRK 491



 Score = 29.6 bits (66), Expect = 5.5
 Identities = 43/234 (18%), Positives = 89/234 (38%), Gaps = 9/234 (3%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
           V   S   S+H    +  D+   +    ST S   N   SS+  K K+ K  D+E+   +
Sbjct: 194 VGGGSEGESSHGDGSEFDDEGMQSDDPESTRSERGNSRMSSAGLKSKESKGEDEEQASTQ 253

Query: 65  KDKEKDKSAVSSKE-------KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
              E       S++        E+D     +         +S+      + ++  + +S 
Sbjct: 254 DSGESQSVEYPSRKFFRKSRISEEDGRGELDDSNTMEVKSDSTENAGLSQSREHSRSESQ 313

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKI-VSSSHNSKEPASGSQLISHPPPPAPTPTQKSPV 176
           +  ++++ ++  ++ +   S+SS +  + S  NS E        S    P  T +     
Sbjct: 314 EDSEENQSQEDSQEVQDPSSESSQEADLPSQENSSESQEEVVSESRGDNPDNTTSHSEDQ 373

Query: 177 KTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKT-NPKEKDAKSKEKESHKSSAG 229
           +  E  +E    T     S+   ++ D   +++ +  E+  +S E E+  S  G
Sbjct: 374 EDSESSEEDSLDTPSSSESQSTEEQADSESNESLSSSEESPESTEDENSSSQEG 427



 Score = 28.9 bits (64), Expect = 9.3
 Identities = 23/78 (29%), Positives = 43/78 (55%), Gaps = 1/78 (1%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
           +S+ S SS+  SP   +D++SS+     S S+ST + +  S S++D + ++ D + +   
Sbjct: 402 ESNESLSSSEESPESTEDENSSSQEGLQSHSASTESRSQESQSEQDSRSEEDDSDSQDSS 461

Query: 65  KDKEKDKSAVSSKEKEKD 82
           + KE   S  S+   E+D
Sbjct: 462 RSKEDSNSTESASSSEED 479


>gnl|CDD|227600 COG5275, COG5275, BRCT domain type II [General function prediction
           only].
          Length = 276

 Score = 35.9 bits (82), Expect = 0.044
 Identities = 18/80 (22%), Positives = 27/80 (33%), Gaps = 1/80 (1%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
           ST+        S+ S+  K     +KE    K      K+ + +    K  V      RK
Sbjct: 14  STTPDEYFEQQSTRSRS-KPRIISNKETTTSKDVVHPVKTELDTTSDSKPVVHQTRATRK 72

Query: 93  ESKPKESSSEKEKKKEKKDK 112
            ++PK   S   K K     
Sbjct: 73  PAQPKAEKSTTSKSKSHTTT 92


>gnl|CDD|223496 COG0419, SbcC, ATPase involved in DNA repair [DNA replication,
           recombination, and repair].
          Length = 908

 Score = 36.3 bits (84), Expect = 0.050
 Identities = 38/301 (12%), Positives = 103/301 (34%), Gaps = 41/301 (13%)

Query: 50  DKKDKDRDKEKEKEKKDKEK----DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
           +K +K  +  KE  K+ K K    +       E  +D + + E+E KE K  E   E+++
Sbjct: 167 EKYEKLSELLKEVIKEAKAKIEELEGQLSELLEDIEDLLEALEEELKELKKLEEIQEEQE 226

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
           ++E + + E   +   +  E  +  ++ +        +   +   +E             
Sbjct: 227 EEELEQEIEALEERLAELEEEKERLEELKARLLEIESLELEALKIRE------------- 273

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHK 225
                 +   ++   +E E++     +   + +  +++  G +        +  E+   K
Sbjct: 274 -----EELRELERLLEELEEKIERLEELEREIEELEEELEGLR-----ALLEELEELLEK 323

Query: 226 SSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQRKVHIISGD 285
             +              L  +  K  ++    + + E     KNE A +L+ ++  +   
Sbjct: 324 LKS--------------LEERLEKLEEKLEKLESELEELAEEKNELAKLLEERLKELEER 369

Query: 286 ISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQL 345
           + +    +    ++  Q    +             +++      +  +EL +L     +L
Sbjct: 370 LEELEKELEKALERLKQLEEAIQELKEELAELSAALEEIQEELEELEKELEELERELEEL 429

Query: 346 K 346
           +
Sbjct: 430 E 430



 Score = 33.2 bits (76), Expect = 0.47
 Identities = 15/88 (17%), Positives = 42/88 (47%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           + + +  +E+   +K++ + +  +   EKE  ++  +  E  E +       +EK ++ +
Sbjct: 480 ELELEELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLEKLE 539

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESK 138
           +  E+  + K+K + +   E+  Q E +
Sbjct: 540 NLLEELEELKEKLQLQQLKEELRQLEDR 567



 Score = 33.2 bits (76), Expect = 0.53
 Identities = 14/92 (15%), Positives = 43/92 (46%), Gaps = 1/92 (1%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           ++  +    + E++ E+ + E  +     + +E+ +   +E E+ E + ++   E E+  
Sbjct: 649 EELLQAALEELEEKVEELEAEIRRELQRIENEEQLEEKLEELEQLEEELEQLREELEELL 708

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
           +K   + +    + + R+ + +E K++ E   
Sbjct: 709 KKL-GEIEQLIEELESRKAELEELKKELEKLE 739



 Score = 29.3 bits (66), Expect = 7.8
 Identities = 16/88 (18%), Positives = 42/88 (47%), Gaps = 3/88 (3%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDK--SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           ++ +++  +EKE+ +  +E ++    +   E+E  ++   E+  KE   ++    +   +
Sbjct: 484 EELEEELSREKEEAELREEIEELEKELRELEEELIELLELEEALKEELEEKLEKLENLLE 543

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
           E ++ KEK    +    E  + E + Q+
Sbjct: 544 ELEELKEKLQL-QQLKEELRQLEDRLQE 570


>gnl|CDD|240403 PTZ00400, PTZ00400, DnaK-type molecular chaperone; Provisional.
          Length = 663

 Score = 36.3 bits (84), Expect = 0.051
 Identities = 26/100 (26%), Positives = 51/100 (51%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           KE E+ K+  EK K  V +K + +  + S EK+  + K K S ++K++ K+K  K   + 
Sbjct: 551 KEAEEYKEQDEKKKELVDAKNEAETLIYSVEKQLSDLKDKISDADKDELKQKITKLRSTL 610

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
             +D D  +DK ++ ++   K S +     ++  + +  S
Sbjct: 611 SSEDVDSIKDKTKQLQEASWKISQQAYKQGNSDNQQSEQS 650


>gnl|CDD|200340 TIGR03927, T7SS_EssA_Firm, type VII secretion protein EssA.
           Members of this family are associated with type VII
           secretion of WXG100 family targets in the Firmicutes,
           but not in the Actinobacteria. This highly divergent
           protein family consists largely of a central region of
           highly polar low-complexity sequence containing
           occasional LF motifs in weak repeats about 17 residues
           in length, flanked by hydrophobic N- and C-terminal
           regions [Protein fate, Protein and peptide secretion and
           trafficking].
          Length = 150

 Score = 34.7 bits (80), Expect = 0.054
 Identities = 18/114 (15%), Positives = 52/114 (45%), Gaps = 2/114 (1%)

Query: 28  AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK 87
            +PS  +++          ++ +KKD + + +  +E+ + +K+      ++K   +   +
Sbjct: 10  FLPSADSAAENDGKLQVKPNRYEKKDIEINTDYLQEETELDKELFTPEEQKKITFQKHKE 69

Query: 88  EKERKESKPKESSSEKEKKKEKKDKKEK--SHKHKDKDRERDKDEKKEQKESKS 139
           + E++E K +  S    +    K  K++  S +++      +   ++E K++ S
Sbjct: 70  KPEQEELKNQLFSENATENNTVKATKKQLFSSEYEQTSSSSESTSEEETKKTSS 123


>gnl|CDD|222581 pfam14181, YqfQ, YqfQ-like protein.  The YqfQ-like protein family
           includes the B. subtilis YqfQ protein, also known as
           VrrA, which is functionally uncharacterized. This family
           of proteins is found in bacteria. Proteins in this
           family are typically between 146 and 237 amino acids in
           length. There are two conserved sequence motifs: QYGP
           and PKLY.
          Length = 155

 Score = 34.7 bits (80), Expect = 0.055
 Identities = 21/69 (30%), Positives = 33/69 (47%), Gaps = 3/69 (4%)

Query: 29  IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
           I    +SS          S  D+ +++   E + E K+K+K +      EKEK K  ++ 
Sbjct: 87  IFRELSSSDDEEEETEEEST-DETEQEDPPETKTESKEKKKREVPKPKTEKEKPK--TEP 143

Query: 89  KERKESKPK 97
           K+ K SKPK
Sbjct: 144 KKPKPSKPK 152



 Score = 32.8 bits (75), Expect = 0.20
 Identities = 15/63 (23%), Positives = 30/63 (47%), Gaps = 4/63 (6%)

Query: 76  SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
           S   ++++ + +E   +  +        E K E K+KK++       ++E+ K E K+ K
Sbjct: 92  SSSDDEEEETEEESTDETEQEDPP----ETKTESKEKKKREVPKPKTEKEKPKTEPKKPK 147

Query: 136 ESK 138
            SK
Sbjct: 148 PSK 150



 Score = 32.8 bits (75), Expect = 0.22
 Identities = 12/59 (20%), Positives = 25/59 (42%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
             ++E+E +++  D++      + K +   K+K        E    K + K+ K  K K
Sbjct: 94  SDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSKPK 152



 Score = 31.6 bits (72), Expect = 0.48
 Identities = 12/52 (23%), Positives = 24/52 (46%)

Query: 91  RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           R+ S   +   E E++   + ++E   + K + +E+ K E  + K  K   K
Sbjct: 89  RELSSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPK 140



 Score = 30.9 bits (70), Expect = 0.99
 Identities = 14/57 (24%), Positives = 27/57 (47%), Gaps = 1/57 (1%)

Query: 86  SKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           S   + +E   +ES+ E E++   + K E   + K ++  + K EK++ K      K
Sbjct: 92  SSSDDEEEETEEESTDETEQEDPPETKTESK-EKKKREVPKPKTEKEKPKTEPKKPK 147



 Score = 30.9 bits (70), Expect = 1.1
 Identities = 17/65 (26%), Positives = 31/65 (47%), Gaps = 4/65 (6%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           SSS  ++++ + +   E E++D  + K+    K+K +      EKE    KPK    + +
Sbjct: 92  SSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKE----KPKTEPKKPK 147

Query: 105 KKKEK 109
             K K
Sbjct: 148 PSKPK 152



 Score = 28.2 bits (63), Expect = 7.5
 Identities = 16/63 (25%), Positives = 27/63 (42%), Gaps = 3/63 (4%)

Query: 174 SPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKT---NPKEKDAKSKEKESHKSSAGP 230
           S    +E+E E+ES+   ++    + K + K   K     PK +  K K +      + P
Sbjct: 92  SSSDDEEEETEEESTDETEQEDPPETKTESKEKKKREVPKPKTEKEKPKTEPKKPKPSKP 151

Query: 231 KCY 233
           K Y
Sbjct: 152 KLY 154


>gnl|CDD|220371 pfam09736, Bud13, Pre-mRNA-splicing factor of RES complex.  This
           entry is characterized by proteins with alternating
           conserved and low-complexity regions. Bud13 together
           with Snu17p and a newly identified factor,
           Pml1p/Ylr016c, form a novel trimeric complex called the
           RES complex (pre-mRNA retention and splicing complex).
           Subunits of this complex are not essential for viability
           of yeasts but they are required for efficient splicing
           in vitro and in vivo. Furthermore, inactivation of this
           complex causes pre-mRNA leakage from the nucleus. Bud13
           contains a unique, phylogenetically conserved C-terminal
           region of unknown function.
          Length = 141

 Score = 34.2 bits (79), Expect = 0.063
 Identities = 29/91 (31%), Positives = 47/91 (51%), Gaps = 20/91 (21%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           +DK  +  D E+++E+K++EK                 +EKERKE K KE      +K+E
Sbjct: 5   RDKSGRIIDIEEKREEKEREK-----------------EEKERKEEKEKEWGKGLVQKEE 47

Query: 109 KKDKKEKSHKHKDKDRER---DKDEKKEQKE 136
           ++ + E+  K K+K   R   D+D  +E KE
Sbjct: 48  REKRLEELEKAKNKPLARYADDEDYDEELKE 78


>gnl|CDD|188306 TIGR03319, RNase_Y, ribonuclease Y.  Members of this family are
           RNase Y, an endoribonuclease. The member from Bacillus
           subtilis, YmdA, has been shown to be involved in
           turnover of yitJ riboswitch [Transcription, Degradation
           of RNA].
          Length = 514

 Score = 35.7 bits (83), Expect = 0.063
 Identities = 28/106 (26%), Positives = 53/106 (50%), Gaps = 15/106 (14%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK- 103
            S+++  K    + +KE E   KE     + +KE+     +  E+E KE + +    E+ 
Sbjct: 28  GSAEELAKRIIEEAKKEAETLKKEA---LLEAKEEVHKLRAELERELKERRNELQRLERR 84

Query: 104 --------EKKKEKKDKKEKSHKHKDK---DRERDKDEKKEQKESK 138
                   ++K E  DKKE++ + K+K   ++E++ DEK+E+ E  
Sbjct: 85  LLQREETLDRKMESLDKKEENLEKKEKELSNKEKNLDEKEEELEEL 130


>gnl|CDD|227519 COG5192, BMS1, GTP-binding protein required for 40S ribosome
            biogenesis [Translation, ribosomal structure and
            biogenesis].
          Length = 1077

 Score = 35.9 bits (82), Expect = 0.077
 Identities = 22/72 (30%), Positives = 40/72 (55%), Gaps = 7/72 (9%)

Query: 63   EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE---KEKKKEKKDKKEKSHKH 119
            E ++K + K  +  KE+ KD    +EKER ES  +    E   KEK++E++ +K     +
Sbjct: 1010 ECREKHEIKDRIV-KERIKD---QEEKERMESLQRAKEEEIGKKEKEREQRIRKTIHDNY 1065

Query: 120  KDKDRERDKDEK 131
            K+  ++R K ++
Sbjct: 1066 KEMAKKRLKKKR 1077



 Score = 32.8 bits (74), Expect = 0.68
 Identities = 15/68 (22%), Positives = 32/68 (47%)

Query: 57   DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
            +  ++ E KD+   +     +EKE+ +   + KE +  K ++   ++ +K    + KE +
Sbjct: 1010 ECREKHEIKDRIVKERIKDQEEKERMESLQRAKEEEIGKKEKEREQRIRKTIHDNYKEMA 1069

Query: 117  HKHKDKDR 124
             K   K R
Sbjct: 1070 KKRLKKKR 1077


>gnl|CDD|148630 pfam07133, Merozoite_SPAM, Merozoite surface protein (SPAM).  This
           family consists of several Plasmodium falciparum SPAM
           (secreted polymorphic antigen associated with
           merozoites) proteins. Variation among SPAM alleles is
           the result of deletions and amino acid substitutions in
           non-repetitive sequences within and flanking the alanine
           heptad-repeat domain. Heptad repeats in which the a and
           d position contain hydrophobic residues generate
           amphipathic alpha-helices which give rise to helical
           bundles or coiled-coil structures in proteins. SPAM is
           an example of a P. falciparum antigen in which a
           repetitive sequence has features characteristic of a
           well-defined structural element.
          Length = 164

 Score = 34.4 bits (79), Expect = 0.077
 Identities = 18/82 (21%), Positives = 32/82 (39%), Gaps = 3/82 (3%)

Query: 57  DKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           DKE   KE E    EK +     +E++++++   E    E +  E   E+E+ +E     
Sbjct: 32  DKEDIIKENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEEDEEDNVDL 91

Query: 114 EKSHKHKDKDRERDKDEKKEQK 135
           +   K    D      +   Q 
Sbjct: 92  KDIEKKNINDIFNSTQDDNAQN 113



 Score = 31.0 bits (70), Expect = 1.1
 Identities = 22/97 (22%), Positives = 51/97 (52%), Gaps = 3/97 (3%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
            + +  +D+++E +++++E+D+  +   E  +D+    E E +E +  E  +   K  EK
Sbjct: 38  KENEDVKDEKQEDDEEEEEEDEEEIEEPEDIEDEEEIVEDEEEEEE-DEEDNVDLKDIEK 96

Query: 110 KDKKEKSHKHKDKDRER--DKDEKKEQKESKSSSKIV 144
           K+  +  +  +D + +    K+ KK +K  K++  IV
Sbjct: 97  KNINDIFNSTQDDNAQNLISKNYKKNEKSKKTAEDIV 133


>gnl|CDD|185618 PTZ00438, PTZ00438, gamete antigen 27/25-like protein; Provisional.
          Length = 374

 Score = 35.4 bits (81), Expect = 0.077
 Identities = 19/78 (24%), Positives = 37/78 (47%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           +K   + D E +   +  K +E+     +E+E ++ +  E   E E  +E+ D  E S K
Sbjct: 80  DKSDNENDVELEGLNIIVKNEEERGTQKEEEEDEDVEEIEEVEEVEVVEEEYDDDEDSEK 139

Query: 119 HKDKDRERDKDEKKEQKE 136
             +K+ + + DE +   E
Sbjct: 140 DDEKESDAEGDENELAGE 157


>gnl|CDD|223880 COG0810, TonB, Periplasmic protein TonB, links inner and outer
           membranes [Cell envelope biogenesis, outer membrane].
          Length = 244

 Score = 34.8 bits (80), Expect = 0.082
 Identities = 24/147 (16%), Positives = 47/147 (31%), Gaps = 5/147 (3%)

Query: 24  KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
           +D   I     +          +  ++ + +   +  E++ K   + ++       +  +
Sbjct: 29  EDFVGIELVPLAVFLLAAKVLEAPTEEPQPEP--EPPEEQPKPPTEPETPPEPTPPKPKE 86

Query: 84  VSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK-DKDRERDKDEKKEQKESKSSSK 142
               EK+ K+ KPK     K K K K   K K    K         ++      + S+S 
Sbjct: 87  KPKPEKKPKKPKPKPKPKPKPKPKVKPQPKPKKPPSKTAAKAPAAPNQPARPPSAASASG 146

Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPT 169
             +    S    SG +      P  P 
Sbjct: 147 AATG--PSASYLSGLRRAIRRAPRYPA 171



 Score = 31.7 bits (72), Expect = 0.80
 Identities = 21/67 (31%), Positives = 22/67 (32%)

Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
              P PP    T   P   K KEK K          K K K K K   K  PK K   SK
Sbjct: 64  EEQPKPPTEPETPPEPTPPKPKEKPKPEKKPKKPKPKPKPKPKPKPKVKPQPKPKKPPSK 123

Query: 220 EKESHKS 226
                 +
Sbjct: 124 TAAKAPA 130



 Score = 28.6 bits (64), Expect = 7.4
 Identities = 22/66 (33%), Positives = 29/66 (43%), Gaps = 8/66 (12%)

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
           PP    TP + +P K KEK K ++      K  K K K K K      PK K     +K 
Sbjct: 69  PPTEPETPPEPTPPKPKEKPKPEKKP----KKPKPKPKPKPK----PKPKVKPQPKPKKP 120

Query: 223 SHKSSA 228
             K++A
Sbjct: 121 PSKTAA 126


>gnl|CDD|165563 PHA03308, PHA03308, transcriptional regulator ICP4; Provisional.
          Length = 1463

 Score = 35.6 bits (81), Expect = 0.082
 Identities = 18/50 (36%), Positives = 33/50 (66%)

Query: 18   PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
            P  +  + SS + S+S+SSS+S  ++SSSS     ++D D++ EKE +++
Sbjct: 1240 PCPDLSESSSTMHSSSSSSSSSCSSSSSSSDSSSSEEDGDEKNEKEDRER 1289



 Score = 34.0 bits (77), Expect = 0.26
 Identities = 20/67 (29%), Positives = 35/67 (52%), Gaps = 4/67 (5%)

Query: 3    YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
            Y  +    + +  P P    D   S+    S+SSS+S+  +SSSS  D    + D +++ 
Sbjct: 1227 YMAQPHVGAGAMPPCP----DLSESSSTMHSSSSSSSSSCSSSSSSSDSSSSEEDGDEKN 1282

Query: 63   EKKDKEK 69
            EK+D+E+
Sbjct: 1283 EKEDRER 1289



 Score = 31.7 bits (71), Expect = 1.5
 Identities = 20/70 (28%), Positives = 38/70 (54%), Gaps = 8/70 (11%)

Query: 8    SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDK 67
            S SSS+ H S        SS+  S+ +SSS+S+ ++SS    D+K++  D+E+    K +
Sbjct: 1245 SESSSTMHSS--------SSSSSSSCSSSSSSSDSSSSEEDGDEKNEKEDRERAGGGKRR 1296

Query: 68   EKDKSAVSSK 77
             + +  +  +
Sbjct: 1297 GRQRLPIRDR 1306


>gnl|CDD|205448 pfam13268, DUF4059, Protein of unknown function (DUF4059).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are approximately 70 amino acids in length.
           There is a conserved DKT sequence motif.
          Length = 72

 Score = 32.3 bits (74), Expect = 0.084
 Identities = 14/34 (41%), Positives = 19/34 (55%), Gaps = 7/34 (20%)

Query: 236 VGGIYILLR--SKKNKTVQERLAEQFKDELFDRL 267
           +  I+IL R   KK+KT +ER A      L+D L
Sbjct: 23  ISLIWILWRAIRKKDKTAKERQA-----FLYDVL 51


>gnl|CDD|221733 pfam12720, DUF3807, Protein of unknown function (DUF3807).  This is
           a family of conserved fungal proteins of unknown
           function.
          Length = 169

 Score = 34.3 bits (79), Expect = 0.087
 Identities = 15/77 (19%), Positives = 30/77 (38%), Gaps = 1/77 (1%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
           E+E K +E +       +   D  +   +   + K  E     +K+K  +DK+ KS K  
Sbjct: 68  ERELK-EEAEAEEEGEVDASPDAGAVAGESSADRKEAEQQGAAQKRKSCRDKERKSAKDP 126

Query: 121 DKDRERDKDEKKEQKES 137
               +   D+ +   + 
Sbjct: 127 RGGTQDVVDKSQASLDY 143



 Score = 32.8 bits (75), Expect = 0.28
 Identities = 20/99 (20%), Positives = 44/99 (44%), Gaps = 8/99 (8%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           ++  +E E E++ +        +   E      KE E++ +  K  S   +++K  KD +
Sbjct: 69  RELKEEAEAEEEGEVDASPDAGAVAGESSA-DRKEAEQQGAAQKRKSCRDKERKSAKDPR 127

Query: 114 EKSHKHKDKDRERDK--DEKKEQKESKSSS-----KIVS 145
             +    DK +      +E+ +Q+E++S       +I+S
Sbjct: 128 GGTQDVVDKSQASLDYGEEETQQQEAQSGPNNFGRRIIS 166



 Score = 28.9 bits (65), Expect = 5.1
 Identities = 14/66 (21%), Positives = 24/66 (36%), Gaps = 7/66 (10%)

Query: 88  EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
           E+E KE    E   E +   +      +S      DR+  + +   QK      K     
Sbjct: 68  ERELKEEAEAEEEGEVDASPDAGAVAGESSA----DRKEAEQQGAAQKRKSCRDK---ER 120

Query: 148 HNSKEP 153
            ++K+P
Sbjct: 121 KSAKDP 126


>gnl|CDD|221952 pfam13166, AAA_13, AAA domain.  This family of domains contains a
           P-loop motif that is characteristic of the AAA
           superfamily. Many of the proteins in this family are
           conjugative transfer proteins. This family includes the
           PrrC protein that is thought to be the active component
           of the anticodon nuclease.
          Length = 713

 Score = 35.4 bits (82), Expect = 0.088
 Identities = 25/94 (26%), Positives = 45/94 (47%), Gaps = 1/94 (1%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           K  ++K    E E EKK++E +K+     +K   K+ +K+ +   S+  +  + K+  KE
Sbjct: 105 KKLEEKIEQLEAEIEKKEEELEKAKNKFLDKAWKKL-AKKYDSNLSEALKGLNYKKNFKE 163

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           K  K+ KS           ++ K + K   SS+K
Sbjct: 164 KLLKELKSVILNASSLLSLEELKAKIKTLFSSNK 197



 Score = 30.7 bits (70), Expect = 2.8
 Identities = 20/130 (15%), Positives = 45/130 (34%), Gaps = 12/130 (9%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            +++ + + + +E +KE K  E+    + ++ ++K++    EK + +        +  KK
Sbjct: 87  GEENIEIEAQIEELKKELKKLEEKIEQLEAEIEKKEE--ELEKAKNKFL-----DKAWKK 139

Query: 107 KEKKDKKE-----KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
             KK         K   +K   +E+   E K    + SS   +       +    S    
Sbjct: 140 LAKKYDSNLSEALKGLNYKKNFKEKLLKELKSVILNASSLLSLEELKAKIKTLFSSNKPE 199

Query: 162 HPPPPAPTPT 171
                     
Sbjct: 200 LALLTLSVID 209


>gnl|CDD|218312 pfam04889, Cwf_Cwc_15, Cwf15/Cwc15 cell cycle control protein.
           This family represents Cwf15/Cwc15 (from
           Schizosaccharomyces pombe and Saccharomyces cerevisiae
           respectively) and their homologues. The function of
           these proteins is unknown, but they form part of the
           spliceosome and are thus thought to be involved in mRNA
           splicing.
          Length = 241

 Score = 34.7 bits (80), Expect = 0.090
 Identities = 30/120 (25%), Positives = 59/120 (49%), Gaps = 10/120 (8%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
            HK+K ++  AI   +  S+  + +N      D++D+ + K  E++ ++ + D S  SS 
Sbjct: 69  AHKSKKENKLAI-EDADKSTNLDASNEGDEDDDEEDEIKRKRIEEDARNSDADDSDSSSD 127

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEK-KKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
               D  S  +    E+       E EK KKE+ ++KE+      ++ E+  +E+K ++E
Sbjct: 128 SDSSDDDSDDDDSEDETA--ALLRELEKIKKERAEEKER------EEEEKAAEEEKAREE 179



 Score = 28.5 bits (64), Expect = 9.8
 Identities = 19/102 (18%), Positives = 39/102 (38%), Gaps = 4/102 (3%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD 66
           +  S++    +     D +   I              +S +       D D   +    D
Sbjct: 83  ADKSTNLDASNEGDEDDDEEDEIKRKRIEE----DARNSDADDSDSSSDSDSSDDDSDDD 138

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
             +D++A   +E EK K    E++ +E + K +  EK +++E
Sbjct: 139 DSEDETAALLRELEKIKKERAEEKEREEEEKAAEEEKAREEE 180


>gnl|CDD|221641 pfam12569, NARP1, NMDA receptor-regulated protein 1.  This domain
           family is found in eukaryotes, and is approximately 40
           amino acids in length. The family is found in
           association with pfam07719, pfam00515. There is a single
           completely conserved residue L that may be functionally
           important. NARP1 is the mammalian homologue of a yeast
           N-terminal acetyltransferase that regulates entry into
           the G(0) phase of the cell cycle.
          Length = 516

 Score = 35.3 bits (82), Expect = 0.095
 Identities = 21/81 (25%), Positives = 38/81 (46%), Gaps = 1/81 (1%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           DK    E +++E +   +S  E++K +   ++ E+K  K +   +  +KK E   KK K 
Sbjct: 389 DKPLLAEGEEEEGENGNLSPAERKKLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKG 448

Query: 117 HKHKDKDRERDKD-EKKEQKE 136
              + K  + D   EK  + E
Sbjct: 449 PDGETKKVDPDPLGEKLARTE 469



 Score = 34.9 bits (81), Expect = 0.13
 Identities = 19/89 (21%), Positives = 40/89 (44%), Gaps = 2/89 (2%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
           P  +         N + S  ++K K R K+++ EKK ++++    ++K+K +      + 
Sbjct: 391 PLLAEGEE-EEGENGNLSPAERK-KLRKKQRKAEKKAEKEEAEKAAAKKKAEAAAKKAKG 448

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHK 118
              E+K  +     EK    +D  E++ K
Sbjct: 449 PDGETKKVDPDPLGEKLARTEDPLEEAMK 477



 Score = 32.6 bits (75), Expect = 0.62
 Identities = 14/71 (19%), Positives = 31/71 (43%), Gaps = 5/71 (7%)

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
           +   +    +E E   +S  E+++   K +     K +KK +K++ EK+   K  +    
Sbjct: 390 KPLLAEGEEEEGENGNLSPAERKKLRKKQR-----KAEKKAEKEEAEKAAAKKKAEAAAK 444

Query: 128 KDEKKEQKESK 138
           K +  + +  K
Sbjct: 445 KAKGPDGETKK 455



 Score = 29.5 bits (67), Expect = 5.3
 Identities = 20/99 (20%), Positives = 36/99 (36%), Gaps = 17/99 (17%)

Query: 82  DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           DK    E E +E +    S  + KK  KK +K +            K  +KE+ E  ++ 
Sbjct: 389 DKPLLAEGEEEEGENGNLSPAERKKLRKKQRKAE------------KKAEKEEAEKAAAK 436

Query: 142 KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
           K   ++    +   G           P P  +   +T++
Sbjct: 437 KKAEAAAKKAKGPDG-----ETKKVDPDPLGEKLARTED 470


>gnl|CDD|227463 COG5134, COG5134, Uncharacterized conserved protein [Function
           unknown].
          Length = 272

 Score = 34.7 bits (79), Expect = 0.098
 Identities = 25/139 (17%), Positives = 53/139 (38%), Gaps = 10/139 (7%)

Query: 22  KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKD--------RDKEKEKEKKDKEKDKSA 73
            DK          S      +N    + D K+ +        R  E +   +D  K ++ 
Sbjct: 66  GDKSYYTTKIYRFSIKCHLCSNPIDVRTDPKNTEYVVESGGRRKIEPQDINEDPAKAENV 125

Query: 74  VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
              K  E D + + EK+  + K ++ +S      ++ +K+  S       R R +  +++
Sbjct: 126 --EKVPESDAIEALEKQLTQQKSEKHNSSAINFIDELNKRLWSDPFVSSQRLRKQFRERK 183

Query: 134 QKESKSSSKIVSSSHNSKE 152
           + E K  +K +S  + +  
Sbjct: 184 KIEKKQEAKDLSLKNRAAL 202


>gnl|CDD|237867 PRK14953, PRK14953, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 486

 Score = 35.2 bits (81), Expect = 0.098
 Identities = 26/95 (27%), Positives = 47/95 (49%), Gaps = 2/95 (2%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
              SK++ K++   KE+EKE+ D +K    +   E +      K  E KE + K +   +
Sbjct: 365 KEGSKQETKEQPEKKEEEKEELDIDKIILQIIKNEGKIISAILKNAEIKEEEGKITIKVE 424

Query: 104 EKKKEKKDKKEKSHKHK--DKDRERDKDEKKEQKE 136
           + +++  D + KS K      + E  K EK+++KE
Sbjct: 425 KSEEDTLDLEIKSIKKYFPFIEFEEVKKEKEKEKE 459


>gnl|CDD|235219 PRK04098, PRK04098, sec-independent translocase; Provisional.
          Length = 158

 Score = 33.9 bits (78), Expect = 0.098
 Identities = 15/69 (21%), Positives = 28/69 (40%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
               ++  K  +   +D +K     +      ++VS++ K  +ES   ES  E +   + 
Sbjct: 90  KITAENEIKSIQDLLQDYKKSLEEDTIPNHLNEEVSNETKLTQESSSDESPKEVKLATKN 149

Query: 110 KDKKEKSHK 118
           K KK    K
Sbjct: 150 KTKKHDKEK 158



 Score = 31.6 bits (72), Expect = 0.54
 Identities = 19/87 (21%), Positives = 34/87 (39%), Gaps = 3/87 (3%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK---ESK 95
            +   S   K   ++ D  K   + +    +D      K  E+D + +   E        
Sbjct: 71  ESAVESLKKKLKFEELDDLKITAENEIKSIQDLLQDYKKSLEEDTIPNHLNEEVSNETKL 130

Query: 96  PKESSSEKEKKKEKKDKKEKSHKHKDK 122
            +ESSS++  K+ K   K K+ KH  +
Sbjct: 131 TQESSSDESPKEVKLATKNKTKKHDKE 157


>gnl|CDD|221275 pfam11861, DUF3381, Domain of unknown function (DUF3381).  This
           domain is functionally uncharacterized. This domain is
           found in eukaryotes. This presumed domain is typically
           between 156 and 174 amino acids in length. This domain is
           found associated with pfam07780, pfam01728.
          Length = 154

 Score = 33.8 bits (78), Expect = 0.099
 Identities = 12/64 (18%), Positives = 33/64 (51%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           R      +K+ +E+++  V  +E ++++   +  E++ +K K     + ++K+K+  KE+
Sbjct: 91  RKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERKQKEILKEQ 150

Query: 116 SHKH 119
               
Sbjct: 151 MKML 154



 Score = 31.5 bits (72), Expect = 0.60
 Identities = 14/47 (29%), Positives = 27/47 (57%)

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
            + K KE   E+E + E+ D++E+  +  +K+  + K EK+ + E K
Sbjct: 96  LDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERK 142



 Score = 29.6 bits (67), Expect = 2.6
 Identities = 16/71 (22%), Positives = 35/71 (49%), Gaps = 9/71 (12%)

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKES-SSEKEKKKEKKDKKEKSHKHKDKDRE 125
           ++K +  +   +KEK++   +E E +E   +E      EK+  K  ++++        RE
Sbjct: 87  RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKR--------RE 138

Query: 126 RDKDEKKEQKE 136
            ++ +K+  KE
Sbjct: 139 NERKQKEILKE 149



 Score = 28.8 bits (65), Expect = 4.4
 Identities = 14/62 (22%), Positives = 36/62 (58%)

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           ++K +  +   +KE++E + +E   E+  ++E+ D+  +    K K  +R ++E+K+++ 
Sbjct: 87  RKKVRKLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKELAKLKREKRRENERKQKEI 146

Query: 137 SK 138
            K
Sbjct: 147 LK 148



 Score = 28.0 bits (63), Expect = 7.6
 Identities = 14/67 (20%), Positives = 27/67 (40%), Gaps = 6/67 (8%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
           K      KEK++      E E+     +  E  E +        + K+EK+ + E+  K 
Sbjct: 92  KLLGLDKKEKEEEEEEEVEVEELDEEEQIDELLEKE------LAKLKREKRRENERKQKE 145

Query: 120 KDKDRER 126
             K++ +
Sbjct: 146 ILKEQMK 152


>gnl|CDD|233830 TIGR02350, prok_dnaK, chaperone protein DnaK.  Members of this
           family are the chaperone DnaK, of the DnaK-DnaJ-GrpE
           chaperone system. All members of the seed alignment were
           taken from completely sequenced bacterial or archaeal
           genomes and (except for Mycoplasma sequence) found
           clustered with other genes of this system. This model
           excludes DnaK homologs that are not DnaK itself, such as
           the heat shock cognate protein HscA (TIGR01991).
           However, it is not designed to distinguish among DnaK
           paralogs in eukaryotes. Note that a number of dnaK genes
           have shadow ORFs in the same reverse (relative to dnaK)
           reading frame, a few of which have been assigned
           glutamate dehydrogenase activity. The significance of
           this observation is unclear; lengths of such shadow ORFs
           are highly variable as if the presumptive protein
           product is not conserved [Protein fate, Protein folding
           and stabilization].
          Length = 595

 Score = 35.4 bits (82), Expect = 0.10
 Identities = 26/121 (21%), Positives = 54/121 (44%), Gaps = 8/121 (6%)

Query: 22  KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
           KDK +        S + +  +  S  + ++  K+ +   E++KK KE     + ++    
Sbjct: 480 KDKGTG----KEQSITITASSGLSEEEIERMVKEAEANAEEDKKRKE----EIEARNNAD 531

Query: 82  DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
                 EK  KE+  K  + EKEK ++   + +++ K +D +  + K E+ +Q   K + 
Sbjct: 532 SLAYQAEKTLKEAGDKLPAEEKEKIEKAVAELKEALKGEDVEEIKAKTEELQQALQKLAE 591

Query: 142 K 142
            
Sbjct: 592 A 592


>gnl|CDD|225606 COG3064, TolA, Membrane protein involved in colicin uptake [Cell
           envelope biogenesis, outer membrane].
          Length = 387

 Score = 34.9 bits (80), Expect = 0.10
 Identities = 22/117 (18%), Positives = 48/117 (41%), Gaps = 8/117 (6%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKE----KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
           + K K   +  K+ E+  K  E    K ++A + K+ E +  ++ EK + E++ K  + +
Sbjct: 160 AAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEK 219

Query: 103 KEK----KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
           K +    +K   +KK+ + K K           + +  + +   I     + K    
Sbjct: 220 KAEAAAEEKAAAEKKKAAAKAKADKAAAAAKAAERKAAAAALDDIFGGLSSGKNAPK 276



 Score = 33.8 bits (77), Expect = 0.24
 Identities = 23/100 (23%), Positives = 46/100 (46%), Gaps = 5/100 (5%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---- 103
           +K ++++ R    E++KK +     A +   K K    +K+K  + +K  E +  K    
Sbjct: 131 QKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAA 190

Query: 104 -EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
             KKK + + K  + K K +   + K EKK +  ++  + 
Sbjct: 191 AAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEAAAEEKAA 230



 Score = 33.4 bits (76), Expect = 0.34
 Identities = 30/181 (16%), Positives = 85/181 (46%), Gaps = 1/181 (0%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-KESKPKESSSEKEKKK 107
           + ++   +  E++++KK+++  +     +  E++++   EKER K  + ++ + E EK+ 
Sbjct: 68  QSQQSSAKKGEQQRKKKEEQVAEELKPKQAAEQERLKQLEKERLKAQEQQKQAEEAEKQA 127

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           + + K+++    K    ++ K E  + K +  ++K+ +++   K+    ++        A
Sbjct: 128 QLEQKQQEEQARKAAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKA 187

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
                K   + + K   +++    +  +K + K +    +K   ++K A +K K    ++
Sbjct: 188 EAAAAKKKAEAEAKAAAEKAKAEAEAKAKAEKKAEAAAEEKAAAEKKKAAAKAKADKAAA 247

Query: 228 A 228
           A
Sbjct: 248 A 248



 Score = 32.6 bits (74), Expect = 0.66
 Identities = 30/181 (16%), Positives = 79/181 (43%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           +K K+++  ++ K K+  ++E+ K     + K +++    E+  K+++ ++   E++ +K
Sbjct: 81  RKKKEEQVAEELKPKQAAEQERLKQLEKERLKAQEQQKQAEEAEKQAQLEQKQQEEQARK 140

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
              ++K+K+   K K        K   +  K + +   ++  +K  A  +         A
Sbjct: 141 AAAEQKKKAEAAKAKAAAEAAKLKAAAEAKKKAEEAAKAAEEAKAKAEAAAAKKKAEAEA 200

Query: 168 PTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
               +K+  + + K K ++ +    +      KKK     K +     AK+ E+++  ++
Sbjct: 201 KAAAEKAKAEAEAKAKAEKKAEAAAEEKAAAEKKKAAAKAKADKAAAAAKAAERKAAAAA 260

Query: 228 A 228
            
Sbjct: 261 L 261


>gnl|CDD|227447 COG5117, NOC3, Protein involved in the nuclear export of
           pre-ribosomes [Translation, ribosomal structure and
           biogenesis / Intracellular trafficking and secretion].
          Length = 657

 Score = 35.4 bits (81), Expect = 0.11
 Identities = 23/106 (21%), Positives = 44/106 (41%), Gaps = 11/106 (10%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSK-KDKKDKDRDKEKE 61
           Y +K SSS           +++D    P  S SS  +   N    K KD    D +  +E
Sbjct: 44  YDLKKSSSDE---------EEQDYELRPRVS-SSWNNESYNRLPIKTKDNVVADVNNGEE 93

Query: 62  KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
              + + +    + S  K++ + S +E++     P +   + EK++
Sbjct: 94  FLSESESEASLEIDSDIKDEKQKSLEEQKIAPEIPVKQQIDSEKER 139


>gnl|CDD|213844 TIGR03657, IsdB, heme uptake protein IsdB.  Isd proteins are
           iron-regulated surface proteins found in Bacillus,
           Staphylococcus and Listeria species and are responsible
           for heme scavenging from hemoproteins. The IsdB protein
           is only observed in Staphylococcus and consists of an
           N-terminal hydrophobic signal sequence, a pair of tandem
           NEAT (NEAr Transporter, pfam05031) domains which confer
           the ability to bind heme and a C-terminal sortase
           processing signal which targets the protein to the cell
           wall. IsdB is believed to make a direct contact with
           methemoglobin facilitating transfer of heme to IsdB. The
           heme is then transferred to other cell wall-bound NEAT
           domain proteins such as IsdA and IsdC.
          Length = 644

 Score = 34.9 bits (79), Expect = 0.12
 Identities = 35/169 (20%), Positives = 73/169 (43%), Gaps = 4/169 (2%)

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKE---KKDKKEKSHK 118
           +K+   K  +  ++K++++D  + KE      SKP     EKE +K+   K D K+    
Sbjct: 449 DKEAFTKANADKTNKKEQQDNSAKKETTPATPSKPTTPPVEKESQKQDSQKDDNKQSPSV 508

Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
            K+ D   +  + K      +  ++ SSS    +  S +Q ++ P   +   T+     +
Sbjct: 509 EKENDASSESGKDKTPATKPAKGEVESSSTTPTKVVSTTQNVAKPTTASSETTKDVVQTS 568

Query: 179 KEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSS 227
               + K+S+     + K+ +    +  +  N +E  AKS  +   +S+
Sbjct: 569 AGSSEAKDSAPLQKANIKNTNDGHTQSQNNKNTQENKAKSLPQTGEESN 617



 Score = 31.4 bits (70), Expect = 1.4
 Identities = 28/141 (19%), Positives = 57/141 (40%), Gaps = 1/141 (0%)

Query: 19  HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
           +K + +D+SA   T+ ++ +   T     +  K+D  +D  K+    +KE D S+ S K+
Sbjct: 462 NKKEQQDNSAKKETTPATPSKPTTPPVEKESQKQDSQKDDNKQSPSVEKENDASSESGKD 521

Query: 79  KE-KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
           K    K +  E E   + P +  S  +   +      ++ K   +      + K      
Sbjct: 522 KTPATKPAKGEVESSSTTPTKVVSTTQNVAKPTTASSETTKDVVQTSAGSSEAKDSAPLQ 581

Query: 138 KSSSKIVSSSHNSKEPASGSQ 158
           K++ K  +  H   +    +Q
Sbjct: 582 KANIKNTNDGHTQSQNNKNTQ 602


>gnl|CDD|206039 pfam13868, Trichoplein, Tumour suppressor, Mitostatin.  Trichoplein
           or mitostatin was first defined as a meiosis-specific
           nuclear structural protein. It has since been linked
           with mitochondrial movement. It is associated with the
           mitochondrial outer membrane, and over-expression leads
           to reduction in mitochondrial motility whereas lack of
           it enhances mitochondrial movement. The activity appears
           to be mediated through binding the mitochondria to the
           actin intermediate filaments (IFs).
          Length = 349

 Score = 34.5 bits (80), Expect = 0.13
 Identities = 20/95 (21%), Positives = 55/95 (57%), Gaps = 8/95 (8%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
            +K +++++R+ E+ + K++KE++ + + ++++E +    + +E  E +      E E+K
Sbjct: 160 REKAEREEEREAERRERKEEKEREVARLRAQQEEAED---EREELDELRADLYQEEYERK 216

Query: 107 KEKKDKKEKSHKHK-----DKDRERDKDEKKEQKE 136
           + +K+K+E   + +      + RE   +EK+E+ +
Sbjct: 217 ERQKEKEEAEKRRRQKQELQRAREEQIEEKEERLQ 251



 Score = 33.7 bits (78), Expect = 0.23
 Identities = 19/99 (19%), Positives = 60/99 (60%), Gaps = 11/99 (11%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
            +++ + +++EKE+E++++ K       K + +++  ++ +ERKE K +E +  + +++E
Sbjct: 134 NEERIERKEEEKEREREEELKILEYQREKAEREEEREAERRERKEEKEREVARLRAQQEE 193

Query: 109 KKDKKEKS-----------HKHKDKDRERDKDEKKEQKE 136
            +D++E+            ++ K++ +E+++ EK+ +++
Sbjct: 194 AEDEREELDELRADLYQEEYERKERQKEKEEAEKRRRQK 232



 Score = 32.6 bits (75), Expect = 0.62
 Identities = 21/100 (21%), Positives = 58/100 (58%), Gaps = 13/100 (13%)

Query: 52  KDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKP----KESSSEKEKK 106
           +++++ +++E E++ +E+++   +  + +E+D+  ++EK  K+ K      E + E+ ++
Sbjct: 81  EEREKRRQEEYEERLQEREQMDEIIERIQEEDEAEAQEKREKQKKLREEIDEFNEERIER 140

Query: 107 KEKKDKKEKS--------HKHKDKDRERDKDEKKEQKESK 138
           KE++ ++E+          + K +  E  + E++E+KE K
Sbjct: 141 KEEEKEREREEELKILEYQREKAEREEEREAERRERKEEK 180


>gnl|CDD|218328 pfam04921, XAP5, XAP5, circadian clock regulator.  This protein is
           found in a wide range of eukaryotes. It is a nuclear
           protein and is suggested to be DNA binding. In plants,
           this family is essential for correct circadian clock
           functioning by acting as a light-quality regulator
           coordinating the activities of blue and red light
           signalling pathways during plant growth - inhibiting
           growth in red light but promoting growth in blue light.
          Length = 233

 Score = 34.2 bits (79), Expect = 0.13
 Identities = 23/76 (30%), Positives = 38/76 (50%), Gaps = 5/76 (6%)

Query: 64  KKDKEKDKSAVS-----SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           KK K+K KS +S      +E E +    K+  ++ S+P E++    KKK  K+    +  
Sbjct: 1   KKKKKKKKSKLSFGDDDEEEDEDEGEDEKKVPKESSEPDEANVNPNKKKIGKNPSVDTSF 60

Query: 119 HKDKDRERDKDEKKEQ 134
             DK RE  + E +E+
Sbjct: 61  LPDKAREEKEAELREE 76


>gnl|CDD|227880 COG5593, COG5593, Nucleic-acid-binding protein possibly involved in
           ribosomal biogenesis [Translation, ribosomal structure
           and biogenesis].
          Length = 821

 Score = 35.0 bits (80), Expect = 0.14
 Identities = 24/112 (21%), Positives = 41/112 (36%), Gaps = 5/112 (4%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
           P    D D S +       S S  T+    K D  D +  K +  ++ D+E+    +   
Sbjct: 699 PDVEDDSDDSELDFAEDDFSDS--TSDDEPKLDAIDDEDAKSEGSQESDQEEGLDEIFYS 756

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
              +   S    E  E      SSE+EK++E+  +       K + +   K 
Sbjct: 757 FDGEQDNSDSFAESSEEDE---SSEEEKEEEENKEVSAKRAKKKQRKNMLKS 805



 Score = 31.6 bits (71), Expect = 1.6
 Identities = 34/178 (19%), Positives = 66/178 (37%), Gaps = 27/178 (15%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
           T  ++  K KK      + + E  + E   + V S+   +D     E +  E    +S+S
Sbjct: 663 TKKTADGKGKKSNKASFDSDDEMDENEIWSALVKSRPDVEDDSDDSELDFAEDDFSDSTS 722

Query: 102 EKEKKKEKKDKKE-KSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
           + E K +  D ++ KS   ++ D+E   DE     + +  +    +  + ++ +S     
Sbjct: 723 DDEPKLDAIDDEDAKSEGSQESDQEEGLDEIFYSFDGEQDNSDSFAESSEEDESS----- 777

Query: 161 SHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKS 218
                                E+EKE     +  +K   KK+ K+  K+ P    A  
Sbjct: 778 ---------------------EEEKEEEENKEVSAKRAKKKQRKNMLKSLPVFASADD 814


>gnl|CDD|240402 PTZ00399, PTZ00399, cysteinyl-tRNA-synthetase; Provisional.
          Length = 651

 Score = 34.6 bits (80), Expect = 0.16
 Identities = 18/51 (35%), Positives = 27/51 (52%), Gaps = 3/51 (5%)

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           DK    S     +KE+ +   +EKE KE+  ++    K KK+E+K KKE  
Sbjct: 537 DKPDGPSVWKLDDKEELQ---REKEEKEALKEQKRLRKLKKQEEKKKKELE 584



 Score = 29.6 bits (67), Expect = 5.3
 Identities = 22/98 (22%), Positives = 42/98 (42%), Gaps = 9/98 (9%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---------E 104
           +D  K + K     +K K  +   +K +D+       R E KP   S  K         E
Sbjct: 497 RDAAKAEMKLISLDKKKKQLLQLCDKLRDEWLPNLGIRIEDKPDGPSVWKLDDKEELQRE 556

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           K++++  K++K  +   K  E+ K E ++ +++K    
Sbjct: 557 KEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIPPA 594



 Score = 29.6 bits (67), Expect = 5.8
 Identities = 21/79 (26%), Positives = 37/79 (46%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           +EKE+++  KE+ +     K++EK K   ++ E+ +  P E    +E K    D+     
Sbjct: 555 REKEEKEALKEQKRLRKLKKQEEKKKKELEKLEKAKIPPAEFFKRQEDKYSAFDETGLPT 614

Query: 118 KHKDKDRERDKDEKKEQKE 136
              D +    K+ KK  KE
Sbjct: 615 HDADGEEISKKERKKLSKE 633


>gnl|CDD|227492 COG5163, NOP7, Protein required for biogenesis of the 60S ribosomal
           subunit [Translation, ribosomal structure and
           biogenesis].
          Length = 591

 Score = 34.7 bits (79), Expect = 0.16
 Identities = 22/76 (28%), Positives = 43/76 (56%), Gaps = 4/76 (5%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           S + K   K K++ ++ ++E+++K+     +S+K+K   K+  K K     K +++ + K
Sbjct: 519 SEADKDVNKSKNKKRKVDEEEEEKKLKMIMMSNKQK---KLYKKMKYSNAKKEEQAENLK 575

Query: 104 EKKKE-KKDKKEKSHK 118
           +KKK+  K KK  S K
Sbjct: 576 KKKKQIAKQKKLDSKK 591



 Score = 30.4 bits (68), Expect = 3.5
 Identities = 24/92 (26%), Positives = 44/92 (47%), Gaps = 9/92 (9%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS-------SEKEK 105
           D D + + +KE + + +      + E +KD   SK K+RK  + +E         S K+K
Sbjct: 495 DDDEELQAQKELELEAQGIKYSETSEADKDVNKSKNKKRKVDEEEEEKKLKMIMMSNKQK 554

Query: 106 KKEKKDKKEKSHKHKDKD--RERDKDEKKEQK 135
           K  KK K   + K +  +  +++ K   K++K
Sbjct: 555 KLYKKMKYSNAKKEEQAENLKKKKKQIAKQKK 586



 Score = 28.9 bits (64), Expect = 8.4
 Identities = 17/82 (20%), Positives = 40/82 (48%), Gaps = 4/82 (4%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
             E+  +  ++  V+  E  + +  + E++  + +  ++  E E + +     E S   K
Sbjct: 464 TMEETQRHSEEDLVNRFEDVRYEHVAGEEDDDDDEELQAQKELELEAQGIKYSETSEADK 523

Query: 121 D----KDRERDKDEKKEQKESK 138
           D    K+++R  DE++E+K+ K
Sbjct: 524 DVNKSKNKKRKVDEEEEEKKLK 545


>gnl|CDD|218115 pfam04502, DUF572, Family of unknown function (DUF572).  Family of
           eukaryotic proteins with undetermined function.
          Length = 321

 Score = 34.3 bits (79), Expect = 0.16
 Identities = 30/195 (15%), Positives = 66/195 (33%), Gaps = 25/195 (12%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK---E 104
           ++++   D  K+ E    D +++   +   E+ K+      + R+      S  E     
Sbjct: 125 REEELAGDAMKKLENRTADSKREMEVLERLEELKEL-----QSRRADVDVNSMLEALFRR 179

Query: 105 KKKEKKDKKEKSHK---------HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
           +KKE+++++E+              ++DR R  DE  E  E  + +     S +S     
Sbjct: 180 EKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSPKSGSSSPAKP 239

Query: 156 GSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKD 215
            S L       +  P+     K      +   + +     K   K         +  +  
Sbjct: 240 TSILKKSAAKRSEAPSSSKAKKNSRGIPKPRDALSSLVVRK---KAAP-----ESTSQSP 291

Query: 216 AKSKEKESHKSSAGP 230
           + ++       +AG 
Sbjct: 292 SSAEPTSESPQTAGN 306



 Score = 33.6 bits (77), Expect = 0.24
 Identities = 21/140 (15%), Positives = 52/140 (37%), Gaps = 7/140 (5%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
               +++ +E+E+E++D+   KS     E E+D+  + +++ ++ +    ++   K    
Sbjct: 175 ALFRREKKEEEEEEEEDEALIKSLSFGPETEEDRRRADDEDSEDDEEDNDNTPSPKSGSS 234

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
              K            +    K+ +  S S +K  S        A  S ++     P  T
Sbjct: 235 SPAKP-------TSILKKSAAKRSEAPSSSKAKKNSRGIPKPRDALSSLVVRKKAAPEST 287

Query: 170 PTQKSPVKTKEKEKEKESST 189
               S  +   +  +   ++
Sbjct: 288 SQSPSSAEPTSESPQTAGNS 307


>gnl|CDD|221175 pfam11705, RNA_pol_3_Rpc31, DNA-directed RNA polymerase III subunit
           Rpc31.  RNA polymerase III contains seventeen subunits
           in yeasts and in human cells. Twelve of these are akin
           to RNA polymerase I or II and the other five are RNA pol
           III-specific, and form the functionally distinct groups
           (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31,
           Rpc34 and Rpc82 form a cluster of enzyme-specific
           subunits that contribute to transcription initiation in
           S.cerevisiae and H.sapiens. There is evidence that these
           subunits are anchored at or near the N-terminal Zn-fold
           of Rpc1, itself prolonged by a highly conserved but RNA
           polymerase III-specific domain.
          Length = 221

 Score = 34.0 bits (78), Expect = 0.17
 Identities = 16/74 (21%), Positives = 38/74 (51%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
            K  +K    K K K  + ++E+E         E+K  + +    ++E +K++++++E+ 
Sbjct: 124 KKAGKKLALSKFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEE 183

Query: 117 HKHKDKDRERDKDE 130
            + +D D + D D+
Sbjct: 184 EEDEDFDDDDDDDD 197



 Score = 29.3 bits (66), Expect = 4.8
 Identities = 28/113 (24%), Positives = 48/113 (42%), Gaps = 16/113 (14%)

Query: 41  PTNSSSSKKDKKDKDR---DKEKEKEKKDKEKD-------------KSAVSSKEKEKDKV 84
            T S S  +D KD      DK ++K K                    S +   +K   K+
Sbjct: 71  YTGSESLSQDPKDGIERYSDKYQKKRKIGPSIKEHPFDLELFPKELYSVMGINKKAGKKL 130

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
           +  + +RK     E   + ++K    +KK K  + +D D E +KDE++E++E 
Sbjct: 131 ALSKFKRKVGLFTEEEEDIDEKLSMLEKKLKELEAEDVDEEDEKDEEEEEEEE 183



 Score = 28.6 bits (64), Expect = 7.7
 Identities = 17/70 (24%), Positives = 34/70 (48%), Gaps = 3/70 (4%)

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
             +K+  K    SK K K  + ++E+E  + K       ++K KE + +       KD++
Sbjct: 121 GINKKAGKKLALSKFKRKVGLFTEEEEDIDEKLSM---LEKKLKELEAEDVDEEDEKDEE 177

Query: 124 RERDKDEKKE 133
            E +++E+ E
Sbjct: 178 EEEEEEEEDE 187


>gnl|CDD|223039 PHA03307, PHA03307, transcriptional regulator ICP4; Provisional.
          Length = 1352

 Score = 34.8 bits (80), Expect = 0.17
 Identities = 27/176 (15%), Positives = 50/176 (28%), Gaps = 14/176 (7%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
             +    S  P+    +    S+  S S SS    P  S++        D    +     
Sbjct: 189 PPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGASSSDSSSSESSGCG 248

Query: 66  DKEKDK------SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
              +++      + ++   +  +         +      SSS +E+          S   
Sbjct: 249 WGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPA 308

Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
                              SSS   SSS +S+  A         P P+ +P+   P
Sbjct: 309 ---PSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSP-----GPSPSRSPSPSRP 356



 Score = 33.2 bits (76), Expect = 0.48
 Identities = 17/45 (37%), Positives = 20/45 (44%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSK 48
              SSSS     PSP  +      A  S   SSS+S+   SSSS 
Sbjct: 284 PASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSS 328


>gnl|CDD|215146 PLN02260, PLN02260, probable rhamnose biosynthetic enzyme.
          Length = 668

 Score = 34.7 bits (80), Expect = 0.17
 Identities = 43/156 (27%), Positives = 58/156 (37%), Gaps = 45/156 (28%)

Query: 301 IQHHIHVIIHAAASLRFDELIQDAFTL---NIQATRELLDLATRCSQLKAILHVSTLYTH 357
           I   I  I+H AA    D    ++F     NI  T  LL+      Q++  +HVST    
Sbjct: 77  ITEGIDTIMHFAAQTHVDNSFGNSFEFTKNNIYGTHVLLEACKVTGQIRRFIHVST---- 132

Query: 358 SYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFGGIYNNSYSFTKAIGESVVE 417
                  +E Y    + ED        N E  ++L +        N YS TKA  E +V 
Sbjct: 133 -------DEVYGE--TDEDAD----VGNHEASQLLPT--------NPYSATKAGAEMLVM 171

Query: 418 KY--LYKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
            Y   Y LP+   R                NN+YGP
Sbjct: 172 AYGRSYGLPVITTR---------------GNNVYGP 192


>gnl|CDD|220383 pfam09756, DDRGK, DDRGK domain.  This is a family of proteins of
           approximately 300 residues, found in plants and
           vertebrates. They contain a highly conserved DDRGK
           motif.
          Length = 189

 Score = 33.5 bits (77), Expect = 0.17
 Identities = 23/83 (27%), Positives = 50/83 (60%), Gaps = 2/83 (2%)

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           KK   K ++ +  K+  + +  ++E+ER+E K  E   E E+K E+++ +E+  K K+++
Sbjct: 1   KKIGAKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERK-EEEELEEEREKKKEEE 59

Query: 124 RERDKDEKKEQKESKSSSKIVSS 146
             +++ E++ +KE +   K+ SS
Sbjct: 60  ERKER-EEQARKEQEEYEKLKSS 81



 Score = 32.7 bits (75), Expect = 0.35
 Identities = 18/104 (17%), Positives = 52/104 (50%), Gaps = 7/104 (6%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEK---EKDKVSSKEKERKESKPKESSSEKEKKK 107
            K + + +EK+  ++ +E ++     ++K   +++    +E+E +E + K+   E+ K++
Sbjct: 5   AKKRAKLEEKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKER 64

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
           E++ +KE     ++ ++ +     +E+   K S+   S+     
Sbjct: 65  EEQARKE----QEEYEKLKSSFVVEEEGTDKLSADEESNELLED 104



 Score = 31.2 bits (71), Expect = 0.98
 Identities = 19/72 (26%), Positives = 42/72 (58%), Gaps = 5/72 (6%)

Query: 48  KKDKKDKDRDKE--KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
           +K  + + R+ E  + +E+K  E+ +     +E+E ++   K+KE +E K +E   E+ +
Sbjct: 13  EKQARRQQREAEEEEREERKKLEEKREGERKEEEELEEEREKKKEEEERKERE---EQAR 69

Query: 106 KKEKKDKKEKSH 117
           K++++ +K KS 
Sbjct: 70  KEQEEYEKLKSS 81


>gnl|CDD|219405 pfam07418, PCEMA1, Acidic phosphoprotein precursor PCEMA1.  This
           family consists of several acidic phosphoprotein
           precursor PCEMA1 sequences which appear to be found
           exclusively in Plasmodium chabaudi. PCEMA1 is an antigen
           that is associated with the membrane of the infected
           erythrocyte throughout the entire intraerythrocytic
           cycle. The exact function of this family is unclear.
          Length = 286

 Score = 34.1 bits (78), Expect = 0.18
 Identities = 20/87 (22%), Positives = 41/87 (47%), Gaps = 4/87 (4%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSS-KEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           D +      + E E+ D E +   +S  K+ E D    KEK R+E +  +       + E
Sbjct: 203 DDEVTSYFNDGENEENDDELEAEVISYLKDGENDNE-VKEKIRREYREWKGDKANTNETE 261

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQK 135
            +D+ E   +++++  E  ++E K ++
Sbjct: 262 IEDESED--EYEEEAGEEQENEDKGEE 286


>gnl|CDD|227448 COG5118, BDP1, Transcription initiation factor TFIIIB, Bdp1 subunit
           [Transcription].
          Length = 507

 Score = 34.3 bits (78), Expect = 0.18
 Identities = 43/223 (19%), Positives = 77/223 (34%), Gaps = 22/223 (9%)

Query: 30  PSTSTSSSTSNPT----NSSSSKKDKK-------DKDRDKEKEKEKKDKEKDKSAVSSKE 78
           PS   SSS SN T    ++ S+K  KK       + D + +  K  +   K  SA    +
Sbjct: 111 PSFLDSSSNSNGTARRLSTISNKLPKKIRLGSITENDMNLKTFKRHRVLGKPSSAKKPAK 170

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
                  +   +R  S    +S E ++ +     K K    K +D E  K    E+  + 
Sbjct: 171 ISPPTAMTDSLDRNFSSETSTSREADENENYVISKVKDIPKKVRDGESAKYFIDEENFTM 230

Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHK 198
           +     +      E    S++             K+ ++ +   K  E S TH+     K
Sbjct: 231 AELCKPNFPIQISENFEKSKMAK-----------KAKLEKRRHVKFLEGSNTHEMDQLLK 279

Query: 199 HKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYI 241
           H   + +  +     K   S  ++    +A      + G I +
Sbjct: 280 HFLDNSNFRQDRRSRKKKASASRDISDQNAEEILMIKNGHIVV 322


>gnl|CDD|240578 cd12951, RRP7_Rrp7A, RRP7 domain ribosomal RNA-processing protein 7
           homolog A (Rrp7A) and similar proteins.  The family
           corresponds to the RRP7 domain of Rrp7A, also termed
           gastric cancer antigen Zg14, and similar proteins which
           are yeast ribosomal RNA-processing protein 7 (Rrp7p)
           homologs mainly found in Metazoans. The cellular
           function of Rrp7A remains unclear currently. Rrp7A
           harbors an N-terminal RNA recognition motif (RRM), also
           termed RBD (RNA binding domain) or RNP
           (ribonucleoprotein domain), and a C-terminal RRP7
           domain.
          Length = 129

 Score = 32.6 bits (75), Expect = 0.18
 Identities = 26/66 (39%), Positives = 37/66 (56%), Gaps = 12/66 (18%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKD----KVSSKEKERKESKPKESSSEKEKKKEKKDK 112
           DKE+E+EK++KEK        E E D       +K+  R ++  KES + K  +KEKK K
Sbjct: 31  DKEEEEEKEEKEK--------EAEPDEDGWVTVTKKGRRPKTARKESVAAKAAEKEKKKK 82

Query: 113 KEKSHK 118
           K+K  K
Sbjct: 83  KKKELK 88


>gnl|CDD|227458 COG5129, MAK16, Nuclear protein with HMG-like acidic region
           [General function prediction only].
          Length = 303

 Score = 33.9 bits (77), Expect = 0.18
 Identities = 29/114 (25%), Positives = 54/114 (47%), Gaps = 2/114 (1%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS-SEKE 104
           + ++ ++D+     +E+E+ D E +     S EKEK K    EK     +  E+S SE+E
Sbjct: 191 TEREKRQDEKERYVEEEEESDTELEAVTDDS-EKEKTKKKDLEKWLGSDQSMETSESEEE 249

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           +  E +  +++   +K K R+R  D+ K+ ++     +      N K PA    
Sbjct: 250 ESSESESDEDEDEDNKGKIRKRKTDDAKKSRKPHIHIEYEQERENEKIPAVQHS 303



 Score = 30.8 bits (69), Expect = 1.8
 Identities = 24/109 (22%), Positives = 49/109 (44%), Gaps = 5/109 (4%)

Query: 43  NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
                K+   +++ + + E E    + +K     K+ EK   S +  E  ES+ +ESS  
Sbjct: 195 KRQDEKERYVEEEEESDTELEAVTDDSEKEKTKKKDLEKWLGSDQSMETSESEEEESSES 254

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKD-----EKKEQKESKSSSKIVSS 146
           +  + E +D K K  K K  D ++ +      E ++++E++    +  S
Sbjct: 255 ESDEDEDEDNKGKIRKRKTDDAKKSRKPHIHIEYEQERENEKIPAVQHS 303


>gnl|CDD|218883 pfam06075, DUF936, Plant protein of unknown function (DUF936).
           This family consists of several hypothetical proteins
           from Arabidopsis thaliana and Oryza sativa. The function
           of this family is unknown.
          Length = 564

 Score = 34.4 bits (79), Expect = 0.19
 Identities = 27/122 (22%), Positives = 40/122 (32%), Gaps = 8/122 (6%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD--------KKDKDRDK 58
            SS S    PSP       SS+    S+  S     ++S  KK            +  D 
Sbjct: 183 RSSRSELGAPSPSGGTSCPSSSGGRRSSIGSRRLRGSASLRKKVAVLSAPRKPGSRSSDC 242

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           +     +         SS +++  K  SK   R   K    SS+ E    KK + +    
Sbjct: 243 KSSPRARSSSAKSPFKSSIQRKATKALSKLSLRASPKDTSKSSKSEVAPPKKSEAKVPSS 302

Query: 119 HK 120
            K
Sbjct: 303 SK 304


>gnl|CDD|224012 COG1087, GalE, UDP-glucose 4-epimerase [Cell envelope biogenesis,
           outer membrane].
          Length = 329

 Score = 34.1 bits (79), Expect = 0.19
 Identities = 22/95 (23%), Positives = 39/95 (41%), Gaps = 11/95 (11%)

Query: 264 FDRLKNEQADILQRK-VHIISGDIS-QPSLGISSHDQQFIQHHIHVIIHAAASLRFDELI 321
            D L N     L +       GD+  +  L        F ++ I  ++H AAS+   E +
Sbjct: 30  LDNLSNGHKIALLKLQFKFYEGDLLDRALL-----TAVFEENKIDAVVHFAASISVGESV 84

Query: 322 QDA---FTLNIQATRELLDLATRCSQLKAILHVST 353
           Q+    +  N+  T  L++ A   + +K  +  ST
Sbjct: 85  QNPLKYYDNNVVGTLNLIE-AMLQTGVKKFIFSST 118


>gnl|CDD|114268 pfam05537, DUF759, Borrelia burgdorferi protein of unknown function
           (DUF759).  This family consists of several
           uncharacterized proteins from the Lyme disease
           spirochete Borrelia burgdorferi.
          Length = 439

 Score = 34.3 bits (78), Expect = 0.19
 Identities = 39/113 (34%), Positives = 59/113 (52%), Gaps = 16/113 (14%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK-EKKKEK 109
           KK  ++D  K  EK  K K  S  S+K+  K+ +S K+KE  +    ES  E+ EK +  
Sbjct: 19  KKAIEQDISK-MEKYLKPKKSSLGSTKDIVKNNLSDKKKELSKQSKFESLRERVEKYRLT 77

Query: 110 KDKK--------EKSHKHKDK-----DRERDKDEKKE-QKESKSSSKIVSSSH 148
           + KK        EK+ K   K     DR++ + E KE  KESK+ SK++++S 
Sbjct: 78  QTKKLIKQGMGFEKARKEAFKRSLMSDRDKRRLEYKELAKESKAKSKMLAASQ 130


>gnl|CDD|113413 pfam04642, DUF601, Protein of unknown function, DUF601.  This
           family represents a conserved region found in several
           uncharacterized plant proteins.
          Length = 311

 Score = 33.9 bits (77), Expect = 0.19
 Identities = 30/128 (23%), Positives = 56/128 (43%), Gaps = 10/128 (7%)

Query: 49  KDKKDKDRDKEKE--KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS-EKEK 105
           +D K +  +K K+   E+    + K +  + +      S      K +  K+ S    +K
Sbjct: 6   EDSKLRAAEKAKQPQAEEDSGSRQKPSTLAGKNPDAPTSESRTPSKATSSKDPSKRYADK 65

Query: 106 KKEKKDKKEKSHKHKDKDRERDKD-----EKKEQKESKSSSKIVSSSHNSKEPASGSQLI 160
           K+++ +K  +S     + R  +KD     +K++ K+  S   +V SS  S+     S+  
Sbjct: 66  KRKQSEKDARSPPRSSRPRTEEKDAGPSQQKEKGKKGDSQDLVVLSSRESE--RRTSERR 123

Query: 161 SHPPPPAP 168
           S  P PAP
Sbjct: 124 STGPLPAP 131



 Score = 32.7 bits (74), Expect = 0.48
 Identities = 27/124 (21%), Positives = 52/124 (41%), Gaps = 8/124 (6%)

Query: 9   SSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE 68
           S   +A  +     ++DS +    ST +  +    +S S+   K     K+  K   DK+
Sbjct: 8   SKLRAAEKAKQPQAEEDSGSRQKPSTLAGKNPDAPTSESRTPSKAT-SSKDPSKRYADKK 66

Query: 69  KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
           +       K+ EKD  S     R  ++ K++   ++K+K KK   +       ++ ER  
Sbjct: 67  R-------KQSEKDARSPPRSSRPRTEEKDAGPSQQKEKGKKGDSQDLVVLSSRESERRT 119

Query: 129 DEKK 132
            E++
Sbjct: 120 SERR 123


>gnl|CDD|234525 TIGR04259, oxa_formateAnti, oxalate/formate antiporter.  This model
           represents a subgroup of the more broadly defined model
           TIGR00890, which in turn belongs to the Major
           Facilitator transporter family. Seed members for this
           family include the known oxalate/formate antiporter of
           Oxalobacter formigenes, as well as transporter subunits
           co-clustered with the two genes of a system that
           decarboxylates oxalate into formate. In many of these
           cassettes, two subunits are found rather than one,
           suggesting the antiporter is sometimes homodimeric,
           sometimes heterodimeric.
          Length = 405

 Score = 34.0 bits (78), Expect = 0.23
 Identities = 13/35 (37%), Positives = 15/35 (42%), Gaps = 6/35 (17%)

Query: 431 SIVVSTWKEPIVGWSNNLYGP------GGAAAGAA 459
            +V  TW  PI GW  + YGP      GG   G  
Sbjct: 47  FVVTETWLVPIEGWFVDKYGPRIVVMFGGIMCGLG 81


>gnl|CDD|217348 pfam03064, U79_P34, HSV U79 / HCMV P34.  This family represents
           herpes virus protein U79 and cytomegalovirus early
           phosphoprotein P34 (UL112).
          Length = 238

 Score = 33.3 bits (76), Expect = 0.24
 Identities = 20/106 (18%), Positives = 42/106 (39%), Gaps = 1/106 (0%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK-KDKEKDKSAVSSKE 78
              + +   +     +        + S KK   +  +   K+KEK + ++  K     ++
Sbjct: 125 VAHEAEIRNLGDVKNAEKFEKECRALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRK 184

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
           K+++K  + E +R       S  +     + +  KEK  KH D +R
Sbjct: 185 KQEEKRRNDEDKRPGGGGGSSGGQSGLSTKDEPPKEKRQKHHDPER 230



 Score = 33.3 bits (76), Expect = 0.28
 Identities = 20/101 (19%), Positives = 38/101 (37%), Gaps = 11/101 (10%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH 119
           K  EK +KE    +    + E  K S K+KE++  +  +   E  +KK+++ ++    K 
Sbjct: 138 KNAEKFEKECRALSRKKSDDEHRKRSGKQKEKRRVEDSQKHKEDRRKKQEEKRRNDEDKR 197

Query: 120 KDKDRERDK-----------DEKKEQKESKSSSKIVSSSHN 149
                                ++K QK      ++   SH 
Sbjct: 198 PGGGGGSSGGQSGLSTKDEPPKEKRQKHHDPERRLEPQSHE 238



 Score = 32.5 bits (74), Expect = 0.50
 Identities = 21/92 (22%), Positives = 37/92 (40%), Gaps = 3/92 (3%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSSE 102
            +  + K D +  K   K+K +K + + +   KE  + K   K +  E K       SS 
Sbjct: 148 RALSRKKSDDEHRKRSGKQK-EKRRVEDSQKHKEDRRKKQEEKRRNDEDKRPGGGGGSSG 206

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
            +     KD+  K  + K  D ER  + +  +
Sbjct: 207 GQSGLSTKDEPPKEKRQKHHDPERRLEPQSHE 238



 Score = 28.7 bits (64), Expect = 7.5
 Identities = 19/83 (22%), Positives = 32/83 (38%), Gaps = 3/83 (3%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
            + E E   +   +   K  K   + S   +KK   + +++S K K+K R  D  + KE 
Sbjct: 125 VAHEAEIRNLGDVKNAEKFEKECRALS---RKKSDDEHRKRSGKQKEKRRVEDSQKHKED 181

Query: 135 KESKSSSKIVSSSHNSKEPASGS 157
           +  K   K  +          GS
Sbjct: 182 RRKKQEEKRRNDEDKRPGGGGGS 204


>gnl|CDD|227693 COG5406, COG5406, Nucleosome binding factor SPN, SPT16 subunit
           [Transcription / DNA replication, recombination, and
           repair / Chromatin structure and dynamics].
          Length = 1001

 Score = 34.2 bits (78), Expect = 0.24
 Identities = 35/160 (21%), Positives = 60/160 (37%), Gaps = 11/160 (6%)

Query: 37  STSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
           S SNP   + S K + D      ++ E  +    +        +K   S + K R E++ 
Sbjct: 408 SLSNPIVFTDSPKAQGDISFLFGEDDETPEYLTLQDKAPDF-LDKTISSHRSKFRDETRE 466

Query: 97  KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASG 156
            E ++ K++ + +K+  +K  +   +      D   +  E KS  +I S S +S+ P   
Sbjct: 467 HELNARKKRVEHQKELLDKIIEEGLERFRNASDAGPDSIEEKSEKRIESYSRDSQLPRQI 526

Query: 157 ----------SQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
                      Q I  P    P P   S +K   K  E  
Sbjct: 527 GELRIIVDFARQSIILPIGGRPVPFHISSIKNASKNDEGN 566


>gnl|CDD|234750 PRK00409, PRK00409, recombination and DNA strand exchange inhibitor
           protein; Reviewed.
          Length = 782

 Score = 34.0 bits (79), Expect = 0.24
 Identities = 21/89 (23%), Positives = 49/89 (55%), Gaps = 13/89 (14%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV-------------SSKEKERKESK 95
           ++KK+K +++E +  ++ +++ + A+   +KE D++             S K  E  E++
Sbjct: 554 EEKKEKLQEEEDKLLEEAEKEAQQAIKEAKKEADEIIKELRQLQKGGYASVKAHELIEAR 613

Query: 96  PKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
            + + + ++K+K+KK +KEK  + K  D 
Sbjct: 614 KRLNKANEKKEKKKKKQKEKQEELKVGDE 642



 Score = 29.0 bits (66), Expect = 9.5
 Identities = 33/147 (22%), Positives = 65/147 (44%), Gaps = 30/147 (20%)

Query: 80  EKDKV-----SSKEKERK-ESKPKESSSEKEK----KKEKKDKKEKSHKHKDKDRERDKD 129
           +K+K+     S +E ER+ E K +E+ +  ++    K+E ++KKEK  + +DK  E  + 
Sbjct: 514 DKEKLNELIASLEELERELEQKAEEAEALLKEAEKLKEELEEKKEKLQEEEDKLLEEAEK 573

Query: 130 E-KKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESS 188
           E ++  KE+K  +  +       +    + + +H                +  E  K  +
Sbjct: 574 EAQQAIKEAKKEADEIIKELRQLQKGGYASVKAH----------------ELIEARKRLN 617

Query: 189 TTHDKHSKHKHKKKDKHGDKTNPKEKD 215
             ++K  K K K+K+K   +   K  D
Sbjct: 618 KANEKKEKKKKKQKEK---QEELKVGD 641


>gnl|CDD|237177 PRK12704, PRK12704, phosphodiesterase; Provisional.
          Length = 520

 Score = 34.0 bits (79), Expect = 0.24
 Identities = 15/81 (18%), Positives = 40/81 (49%), Gaps = 7/81 (8%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEK 115
           KE   E K++         + + + ++  +  E  + E +  +     ++K E  +K+E+
Sbjct: 56  KEALLEAKEEIHKL-----RNEFEKELRERRNELQKLEKRLLQKEENLDRKLELLEKREE 110

Query: 116 SHKHKDKDRERDKDEKKEQKE 136
             + K+K+ E+ + E ++++E
Sbjct: 111 ELEKKEKELEQKQQELEKKEE 131



 Score = 30.5 bits (70), Expect = 3.2
 Identities = 21/97 (21%), Positives = 46/97 (47%), Gaps = 12/97 (12%)

Query: 47  SKKDKKDKDRDKE---KEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSS 101
           +KK+ +   ++     KE+  K + + +  +  +  E   +   EK   +KE        
Sbjct: 47  AKKEAEAIKKEALLEAKEEIHKLRNEFEKELRERRNE---LQKLEKRLLQKEENLDRKLE 103

Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
             EK++E+ +KKEK  + K    +++ ++K+E+ E  
Sbjct: 104 LLEKREEELEKKEKELEQK----QQELEKKEEELEEL 136



 Score = 29.4 bits (67), Expect = 7.0
 Identities = 22/94 (23%), Positives = 52/94 (55%), Gaps = 4/94 (4%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE-KE- 104
            +K +  + R++E EK++K+ E+ +  +  KE+E +++  ++ +  E     ++ E KE 
Sbjct: 99  DRKLELLEKREEELEKKEKELEQKQQELEKKEEELEELIEEQLQELERISGLTAEEAKEI 158

Query: 105 --KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
             +K E++ + E +   K+ + E  ++  K+ KE
Sbjct: 159 LLEKVEEEARHEAAVLIKEIEEEAKEEADKKAKE 192


>gnl|CDD|191179 pfam05053, Menin, Menin.  MEN1, the gene responsible for multiple
           endocrine neoplasia type 1, is a tumour suppressor gene
           that encodes a protein called Menin which may be an
           atypical GTPase stimulated by nm23.
          Length = 618

 Score = 33.8 bits (77), Expect = 0.25
 Identities = 26/118 (22%), Positives = 46/118 (38%), Gaps = 13/118 (11%)

Query: 79  KEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           ++K  +   EKE KESK       +E ++    ++ KS   +    E    E      + 
Sbjct: 452 RQKVVIKLPEKEAKESKEAAGEEAREGRRRGPRRESKS--QEPSGGESPNPELPANNNNS 509

Query: 139 SSSKIVSSSHNSKEPAS------GSQLISHPPPPAPTPT-----QKSPVKTKEKEKEK 185
           +S+   ++  + KE A+       +   S    P P  +     ++ PV T   EK K
Sbjct: 510 NSNNNNNNGADRKEAAATTGNATTTSNGSGTSVPLPVSSEPPQHKEGPVITFYSEKMK 567


>gnl|CDD|227594 COG5269, ZUO1, Ribosome-associated chaperone zuotin [Translation,
           ribosomal structure and biogenesis / Posttranslational
           modification, protein turnover, chaperones].
          Length = 379

 Score = 33.9 bits (77), Expect = 0.25
 Identities = 28/112 (25%), Positives = 52/112 (46%), Gaps = 2/112 (1%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
           K K++D++ +      +   +P   S  +++K+ K   K  E+E   + K  +A+  K +
Sbjct: 209 KLKNQDNARLKRLVQIAKKRDPRIKSFKEQEKEMKKIRKW-EREAGARLKALAALKGKAE 267

Query: 80  EKDKVSSK-EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
            K+K   + E     +  K+ + E  KK  K +KK   +  KD D   D D+
Sbjct: 268 AKNKAEIEAEALASATAVKKKAKEVMKKALKMEKKAIKNAAKDADYFGDADK 319


>gnl|CDD|227615 COG5296, COG5296, Transcription factor involved in TATA site
           selection and in elongation by RNA polymerase II
           [Transcription].
          Length = 521

 Score = 33.9 bits (77), Expect = 0.25
 Identities = 15/71 (21%), Positives = 32/71 (45%), Gaps = 2/71 (2%)

Query: 68  EKDKSAVSSKEKEKDKVSS--KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
            + +   S+ E++K K+    K +ER+E    E   E ++ K+ K+ +E     +++   
Sbjct: 132 TRYEPLTSAAEEKKKKLLELKKTREREERLYSERHIELQRFKDYKELEESEQGLQEEYTP 191

Query: 126 RDKDEKKEQKE 136
              +E  E   
Sbjct: 192 SYAEEAVEDIS 202


>gnl|CDD|111859 pfam03015, Sterile, Male sterility protein.  This family represents
           the C-terminal region of the male sterility protein in a
           number of Arabidopsis and Drosophila. A sequence-related
           jojoba acyl CoA reductase is also included.
          Length = 94

 Score = 31.5 bits (72), Expect = 0.26
 Identities = 8/22 (36%), Positives = 12/22 (54%)

Query: 574 FLHMIPGMIMDTVLRCLNKPPR 595
           F H +P   +D +LR   + PR
Sbjct: 1   FYHTLPAYFLDLLLRLYGQKPR 22


>gnl|CDD|218737 pfam05764, YL1, YL1 nuclear protein.  The proteins in this family
           are designated YL1. These proteins have been shown to be
           DNA-binding and may be a transcription factor.
          Length = 238

 Score = 33.1 bits (76), Expect = 0.26
 Identities = 36/152 (23%), Positives = 54/152 (35%), Gaps = 27/152 (17%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
           D D  ++ E E  D+E         E EK+    +  ++K+    ++  E  KKK+KKD 
Sbjct: 57  DFDDSEDDEPESDDEE---------EGEKELQREERLKKKKRVKTKAYKEPTKKKKKKDP 107

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
                      R + K E+     +   S    SS +S                  T   
Sbjct: 108 TAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSS------------------TVQN 149

Query: 173 KSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
           K     + KE+E        K  K K KKK+K
Sbjct: 150 KEATHERLKEREIRRKKIQAKARKRKEKKKEK 181



 Score = 29.7 bits (67), Expect = 4.1
 Identities = 34/123 (27%), Positives = 59/123 (47%), Gaps = 14/123 (11%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSS--SKKDKKDKDRDKEKEKE 63
            ++ S  +A P P K K +  S  P+        +P   SS  S    K+   ++ KE+E
Sbjct: 108 TAAKSPKAAAPRP-KKKSERISWAPTL-----LDSPRRKSSRSSTVQNKEATHERLKERE 161

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK------EKKKEKKDKKEKSH 117
            + K+    A   KEK+K+K  ++E+   E+K  E  + K      E+++EKK  K ++ 
Sbjct: 162 IRRKKIQAKARKRKEKKKEKELTQEERLAEAKETERINLKSLERYEEQEEEKKKAKIQAL 221

Query: 118 KHK 120
           K +
Sbjct: 222 KKR 224



 Score = 28.9 bits (65), Expect = 6.4
 Identities = 21/77 (27%), Positives = 37/77 (48%), Gaps = 7/77 (9%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKD------KSAVSSKEKEKDKVSSKEKERKESKPK 97
            +++K  K    R K+K +              KS+ SS  + K+    + KER+  + K
Sbjct: 107 PTAAKSPKAAAPRPKKKSERISWAPTLLDSPRRKSSRSSTVQNKEATHERLKEREI-RRK 165

Query: 98  ESSSEKEKKKEKKDKKE 114
           +  ++  K+KEKK +KE
Sbjct: 166 KIQAKARKRKEKKKEKE 182


>gnl|CDD|202096 pfam02029, Caldesmon, Caldesmon. 
          Length = 431

 Score = 33.9 bits (77), Expect = 0.28
 Identities = 27/184 (14%), Positives = 79/184 (42%), Gaps = 3/184 (1%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSS 101
           T+ S S+  ++ ++    + +  +++EK++S    +E E+ +  +K +++ + +  E   
Sbjct: 89  TDQSLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQ 148

Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS 161
           ++EK+ E +++++      +++       K +  E+  S     +     E     + + 
Sbjct: 149 KEEKEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSRG--GAEGAQVEAGKEFEKLK 206

Query: 162 HPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                A    ++   K +E+ K  E      +  +   +K  +  +K   KE+  + + +
Sbjct: 207 QKQQEAALELEELKKKREERRKVLEEE-EQRRKQEEADRKSREEEEKRRLKEEIERRRAE 265

Query: 222 ESHK 225
            + K
Sbjct: 266 AAEK 269



 Score = 31.5 bits (71), Expect = 1.4
 Identities = 16/107 (14%), Positives = 50/107 (46%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
            + K K  ++       E    E  K     K+K+++     E+ +K+ + +    E+E+
Sbjct: 175 MTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEE 234

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
           ++ K+++ ++  + +++ R   ++ ++ + E+    + V     S++
Sbjct: 235 QRRKQEEADRKSREEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSED 281



 Score = 30.0 bits (67), Expect = 4.2
 Identities = 18/97 (18%), Positives = 42/97 (43%)

Query: 17  SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
           S    +     A                   +  KK ++R K  E+E++ ++++++   S
Sbjct: 187 SRGGAEGAQVEAGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKS 246

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           +E+E+ +   +E ER+ ++  E   +  +    +DKK
Sbjct: 247 REEEEKRRLKEEIERRRAEAAEKRQKVPEDGLSEDKK 283



 Score = 30.0 bits (67), Expect = 4.5
 Identities = 21/137 (15%), Positives = 57/137 (41%), Gaps = 1/137 (0%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE 61
           + S  S      +       ++++             +     S  K D +D +  +++E
Sbjct: 92  SLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQKNDWRDAEECQKEE 151

Query: 62  KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
           KE + +E++K    S E+   +  + + +  E+      +E  + +  K+ ++   K ++
Sbjct: 152 KEPEPEEEEKPKRGSLEENNGEFMTHKLKHTENTFSRGGAEGAQVEAGKEFEKLKQKQQE 211

Query: 122 KDRERDKDEKKEQKESK 138
              E + + KK+++E +
Sbjct: 212 AALELE-ELKKKREERR 227



 Score = 28.8 bits (64), Expect = 7.8
 Identities = 30/169 (17%), Positives = 68/169 (40%), Gaps = 6/169 (3%)

Query: 63  EKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
           E++ + K  S   S  +   ++        E+  +E   E  +++E+ ++ E   K + K
Sbjct: 79  ERQKEFKPTSTDQSLSEPSRRMQEDSGAENETVEEEEKEESREEREEVEETEGVTKSEQK 138

Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
           +  RD +E +++++     +       S E  +G + ++H          +   +  + E
Sbjct: 139 NDWRDAEECQKEEKEPEPEEEEKPKRGSLEENNG-EFMTHKLKHTENTFSRGGAEGAQVE 197

Query: 183 KEKESSTTHDKHSK-----HKHKKKDKHGDKTNPKEKDAKSKEKESHKS 226
             KE      K  +      + KKK +   K   +E+  + +E+   KS
Sbjct: 198 AGKEFEKLKQKQQEAALELEELKKKREERRKVLEEEEQRRKQEEADRKS 246


>gnl|CDD|215406 PLN02761, PLN02761, lipase class 3 family protein.
          Length = 527

 Score = 33.9 bits (77), Expect = 0.28
 Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 12/69 (17%)

Query: 16 PSPHK------------NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
           SP+K             K K  S I S+S +S +S+ T    S K     D  +E+E E
Sbjct: 18 SSPNKIFKTQPQTLILTTKFKTCSIICSSSCTSISSSTTQQKQSNKQTHVSDNKREEEPE 77

Query: 64 KKDKEKDKS 72
          ++ +EK+ S
Sbjct: 78 EELEEKEVS 86


>gnl|CDD|217884 pfam04086, SRP-alpha_N, Signal recognition particle, alpha subunit,
           N-terminal.  SRP is a complex of six distinct
           polypeptides and a 7S RNA that is essential for
           transferring nascent polypeptide chains that are
           destined for export from the cell to the translocation
           apparatus of the endoplasmic reticulum (ER) membrane.
           SRP binds hydrophobic signal sequences as they emerge
           from the ribosome, and arrests translation.
          Length = 272

 Score = 33.2 bits (76), Expect = 0.29
 Identities = 22/95 (23%), Positives = 35/95 (36%), Gaps = 3/95 (3%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
           S K+  S     P     K K       ++TS  +S  +  +SS+       + KE    
Sbjct: 120 SKKTVDSMIERKPKEPGLKRKQRKKAQESATSPESSPSSTPNSSRPSTPHLLKAKEGPSR 179

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
           +  K    S+ +S   EK   S K K   +   K+
Sbjct: 180 RAKKAAKLSSTASSGDEK---SPKSKAAPKKAGKK 211



 Score = 32.0 bits (73), Expect = 0.79
 Identities = 26/151 (17%), Positives = 47/151 (31%), Gaps = 3/151 (1%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKE 68
            S     SP   +  + S     T  S     P      +K +K         +      
Sbjct: 100 ESKKQAKSPKAMRTFEESKKSKKTVDSMIERKPKEPGLKRKQRKKAQESATSPESSPSST 159

Query: 69  KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
            + S  S+    K K     + +K +K   SS+     ++    K    K   K R+ D 
Sbjct: 160 PNSSRPSTPHLLKAKEGPSRRAKKAAK--LSSTASSGDEKSPKSKAAPKKAGKKMRKWDL 217

Query: 129 DEKKEQKESKSSSKIVSSSHNSKEPASGSQL 159
           D  ++       S   ++  N+  P    ++
Sbjct: 218 DGDEDDDAVLDYSAPDANDENADAPEDVEEV 248



 Score = 29.7 bits (67), Expect = 4.5
 Identities = 33/145 (22%), Positives = 58/145 (40%), Gaps = 10/145 (6%)

Query: 65  KDKEKDKSAVSSKEKEKDK-----VSSKEKERKES--KPKESSSEKEKKKEKKDKKEKSH 117
           K++ + + A ++ ++E D+     +   EKE K+    PK   + +E KK KK       
Sbjct: 70  KNQLRQEKARTTYDEEFDEYFDQQLRELEKESKKQAKSPKAMRTFEESKKSKKTVDSMIE 129

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVK 177
           +   +   + K  KK Q+ + S     SS+ NS  P++   L        P+   K   K
Sbjct: 130 RKPKEPGLKRKQRKKAQESATSPESSPSSTPNSSRPSTPHLL---KAKEGPSRRAKKAAK 186

Query: 178 TKEKEKEKESSTTHDKHSKHKHKKK 202
                   +  +   K +  K  KK
Sbjct: 187 LSSTASSGDEKSPKSKAAPKKAGKK 211


>gnl|CDD|215591 PLN03124, PLN03124, poly [ADP-ribose] polymerase; Provisional.
          Length = 643

 Score = 33.7 bits (77), Expect = 0.29
 Identities = 23/110 (20%), Positives = 49/110 (44%), Gaps = 5/110 (4%)

Query: 27  SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK----DKEKDKSAVSSKEKEKD 82
            AI   + ++S S   +S+  KK ++ +D   ++    K    D+ K  +    +E   +
Sbjct: 34  DAIAEDAKTASKSGTKSSAGRKKRRERQDDGDDEPVSPKRIAIDEVKGMTVRELREAASE 93

Query: 83  KVSSKEKERKESKPK-ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
           +  +    +K+   +  ++ E + K    +   +  K K  D ER+K+EK
Sbjct: 94  RGLATTGRKKDLLERLCAALESDVKVGSANGTGEDEKEKGGDEEREKEEK 143


>gnl|CDD|226889 COG4487, COG4487, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 438

 Score = 33.6 bits (77), Expect = 0.29
 Identities = 13/98 (13%), Positives = 41/98 (41%), Gaps = 2/98 (2%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK--ESKPKESSSEKEK 105
           KK+    +     +K+++    ++     +   +D+++  E        K KE    +++
Sbjct: 63  KKELSQLEEQLINQKKEQKNLFNEQIKQFELALQDEIAKLEALELLNLEKDKELELLEKE 122

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
             E   + +K  ++  +  E+ ++  K ++  K  ++ 
Sbjct: 123 LDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEK 160



 Score = 33.6 bits (77), Expect = 0.31
 Identities = 19/105 (18%), Positives = 48/105 (45%), Gaps = 3/105 (2%)

Query: 50  DKKDKDRDK--EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           +++D+ R     +E EK+  EK     S+K+KE  ++  +   +K+ +    + + ++ +
Sbjct: 33  EQEDQSRILNTLEEFEKEANEKRAQYRSAKKKELSQLEEQLINQKKEQKNLFNEQIKQFE 92

Query: 108 EKKDKKEKSHKHKDK-DRERDKDEKKEQKESKSSSKIVSSSHNSK 151
                +    +  +  + E+DK+ +  +KE    SK +     + 
Sbjct: 93  LALQDEIAKLEALELLNLEKDKELELLEKELDELSKELQKQLQNT 137



 Score = 29.8 bits (67), Expect = 4.4
 Identities = 21/96 (21%), Positives = 43/96 (44%), Gaps = 3/96 (3%)

Query: 43  NSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
                  +K  +    EKE ++  KE  K   ++ E  + K  + + E +     E   E
Sbjct: 104 ALELLNLEKDKELELLEKELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLE 163

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           +  + E++  +E+ H   + + + +  E +EQ+ESK
Sbjct: 164 ESLELEREKFEEQLH---EANLDLEFKENEEQRESK 196



 Score = 29.0 bits (65), Expect = 7.6
 Identities = 19/101 (18%), Positives = 49/101 (48%), Gaps = 7/101 (6%)

Query: 48  KKDKKDKDRDK------EKEKEKKDKEKDKSAVSSKEKEKDKVSSK-EKERKESKPKESS 100
           + D+  K+  K      E  ++K++  K++  +  + ++K + S + E+E+ E +  E++
Sbjct: 122 ELDELSKELQKQLQNTAEIIEKKRENNKNEERLKFENEKKLEESLELEREKFEEQLHEAN 181

Query: 101 SEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
            + E K+ ++ ++ K    K   R  +   ++ Q E+    
Sbjct: 182 LDLEFKENEEQRESKWAILKKLKRRAELGSQQVQGEALELP 222


>gnl|CDD|224415 COG1498, SIK1, Protein implicated in ribosomal biogenesis, Nop56p
           homolog [Translation, ribosomal structure and
           biogenesis].
          Length = 395

 Score = 33.5 bits (77), Expect = 0.30
 Identities = 12/63 (19%), Positives = 28/63 (44%), Gaps = 2/63 (3%)

Query: 69  KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
            +   +S +E+ + ++  ++ + K  KP   +  +  KKE+  +  +  K K    ER  
Sbjct: 335 GEPDGISLREELEKRI--EKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRG 392

Query: 129 DEK 131
            + 
Sbjct: 393 LQN 395



 Score = 32.4 bits (74), Expect = 0.62
 Identities = 16/61 (26%), Positives = 34/61 (55%), Gaps = 2/61 (3%)

Query: 79  KEKDKVSSKEK-ERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
            E D +S +E+ E++  K KE   +   K K ++DKKE+  +++ K +E+    ++   +
Sbjct: 335 GEPDGISLREELEKRIEKLKEKPPKPPTKAKPERDKKERPGRYRRKKKEKKAKSERRGLQ 394

Query: 137 S 137
           +
Sbjct: 395 N 395


>gnl|CDD|219569 pfam07777, MFMR, G-box binding protein MFMR.  This region is found
           to the N-terminus of the pfam00170 transcription factor
           domain. It is between 150 and 200 amino acids in length.
           The N-terminal half is rather rich in proline residues
           and has been termed the PRD (proline rich domain),
           whereas the C-terminal half is more polar and has been
           called the MFMR (multifunctional mosaic region). It has
           been suggested that this family is composed of three
           sub-families called A, B and C, classified according to
           motif composition. It has been suggested that some of
           these motifs may be involved in mediating
           protein-protein interactions. The MFMR region contains a
           nuclear localisation signal in bZIP opaque and GBF-2.
           The MFMR also contains a transregulatory activity in
           TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention
           signals.
          Length = 189

 Score = 32.9 bits (75), Expect = 0.31
 Identities = 29/93 (31%), Positives = 38/93 (40%), Gaps = 10/93 (10%)

Query: 14  AHPS-PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS 72
           AHPS P  +      A+PS     ST  P +  +  K   +KD+   K    K K  D S
Sbjct: 89  AHPSMPPGSHPFSPYAMPSAEVPGST--PLSMETDAKSSDNKDKGSIK----KSKGSDGS 142

Query: 73  ---AVSSKEKEKDKVSSKEKERKESKPKESSSE 102
              A+S K  E  K S        S+  ES S+
Sbjct: 143 LGLAMSGKNGESGKASGSSANGGSSQSSESGSD 175


>gnl|CDD|145900 pfam02994, Transposase_22, L1 transposable element. 
          Length = 370

 Score = 33.5 bits (76), Expect = 0.31
 Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 1/91 (1%)

Query: 20  KNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEK 79
           +N+D  SS+ P   TSSS + P    ++  D+K      E+  +K +    +  + +K K
Sbjct: 10  RNQDTQSSSEPPKPTSSSPATPNTWENNDLDEKSYLIMMEEGFKKDNYSSLREDIETKGK 69

Query: 80  EKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           E      KE E   +K  E+  E  +K  K+
Sbjct: 70  EVQNF-LKELEECITKQVEAHIENTEKCLKE 99


>gnl|CDD|220376 pfam09745, DUF2040, Coiled-coil domain-containing protein 55
           (DUF2040).  This entry is a conserved domain of
           approximately 130 residues of proteins conserved from
           fungi to humans. The proteins do contain a coiled-coil
           domain, but the function is unknown.
          Length = 128

 Score = 32.0 bits (73), Expect = 0.32
 Identities = 19/100 (19%), Positives = 53/100 (53%), Gaps = 8/100 (8%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           + K++K + +  E+++ +K K       +++ +++++  +  +ERK  K +E   ++   
Sbjct: 29  AAKEEKKQAKLSERKENRKPKYIGSLLEAAERRKREREIA--EERKLQKEREKEGDEFAD 86

Query: 107 KEK------KDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
           KEK      K + E++ K +++++ER++ E++        
Sbjct: 87  KEKFVTSAYKKQLEENRKLEEEEKEREELEEENDVTKGKD 126


>gnl|CDD|219868 pfam08496, Peptidase_S49_N, Peptidase family S49 N-terminal.  This
           domain is found to the N-terminus of bacterial signal
           peptidases of the S49 family (pfam01343).
          Length = 154

 Score = 32.1 bits (74), Expect = 0.34
 Identities = 20/78 (25%), Positives = 37/78 (47%), Gaps = 15/78 (19%)

Query: 73  AVSSKEKEK------DKVSSKEKERKESKPKESSSEKEKKK-EKKDKKEKSHKHKDKDRE 125
           A++ K+K K        ++ + K+ KES       +KE K  EK +KK         ++ 
Sbjct: 30  ALAQKKKGKKGELEITDLNEEYKDLKESLEAALLDKKELKAWEKAEKK--------AEKA 81

Query: 126 RDKDEKKEQKESKSSSKI 143
           + K EKK+ K+ +   ++
Sbjct: 82  KAKAEKKKAKKEEPKPRL 99


>gnl|CDD|235943 PRK07133, PRK07133, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 725

 Score = 33.6 bits (77), Expect = 0.35
 Identities = 16/96 (16%), Positives = 38/96 (39%), Gaps = 6/96 (6%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
              + + + E E + K   K +         K+K   K +   + + K  +   E+  E 
Sbjct: 354 ALSELEEEDENEIKFK---KIEENSIDNLDIKEK---KIENENDIEGKSDTKNLEEGFET 407

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
           KD K K+    +K      +   + +  + +++I++
Sbjct: 408 KDNKNKNSSFINKTENILTNSPLKDELLEKTTEIIN 443


>gnl|CDD|130249 TIGR01181, dTDP_gluc_dehyt, dTDP-glucose 4,6-dehydratase.  This
           protein is related to UDP-glucose 4-epimerase (GalE) and
           likewise has an NAD cofactor [Cell envelope,
           Biosynthesis and degradation of surface polysaccharides
           and lipopolysaccharides].
          Length = 317

 Score = 33.1 bits (76), Expect = 0.36
 Identities = 42/175 (24%), Positives = 58/175 (33%), Gaps = 51/175 (29%)

Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQD--AFT-LNIQATRELLDL 338
           + GDI    L      + F +H    ++H AA    D  I    AF   N+  T  LL+ 
Sbjct: 55  VKGDIGDREL----VSRLFTEHQPDAVVHFAAESHVDRSISGPAAFIETNVVGTYTLLEA 110

Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
             +        H+ST           +E       Y DL      T    L         
Sbjct: 111 VRKYWHEFRFHHIST-----------DEV------YGDLEKGDAFTETTPLAP------- 146

Query: 399 GIYNNSYSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
              ++ YS +KA  + +V  Y   Y LP  + R               SNN YGP
Sbjct: 147 ---SSPYSASKAASDHLVRAYHRTYGLPALITRC--------------SNN-YGP 183


>gnl|CDD|114629 pfam05917, DUF874, Helicobacter pylori protein of unknown function
           (DUF874).  This family consists of several hypothetical
           proteins specific to Helicobacter pylori. The function
           of this family is unknown.
          Length = 417

 Score = 33.3 bits (75), Expect = 0.37
 Identities = 30/168 (17%), Positives = 78/168 (46%), Gaps = 3/168 (1%)

Query: 21  NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
           ++DK      +   + +  +  N S  + +++++  ++EK+K +K+  +  ++    E+E
Sbjct: 123 DQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKAEQE 182

Query: 81  KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD--EKKEQKESK 138
           K K   ++++ ++ K K S+   +   E + +K+K+   K    +  KD  ++ EQ   +
Sbjct: 183 KQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQDLIKEQKDFIKEAEQNCQE 242

Query: 139 SSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKE 186
           + ++        K      ++ +    P P  T ++P++ K     K+
Sbjct: 243 NHNQFFIKKLGIK-AGIAIEIEAECKTPKPAKTNQTPIQPKHLPNSKQ 289



 Score = 32.2 bits (72), Expect = 0.84
 Identities = 21/105 (20%), Positives = 48/105 (45%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
            +  +DKK +    +KE E      +KS +  +++E+     K+K  KE     +S  K 
Sbjct: 120 FADDQDKKIELAQAKKEAENARDRANKSGIELEQEEQKTEQEKQKTEKEGIELANSQIKA 179

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
           +++++K ++EK    ++K +  +   K   +  +   K  +   +
Sbjct: 180 EQEKQKTEQEKQKTEQEKQKTSNIANKNAIELEQEKQKTENEKQD 224


>gnl|CDD|118064 pfam09528, Ehrlichia_rpt, Ehrlichia tandem repeat (Ehrlichia_rpt). 
           This entry represents 77 residues of an 80 amino acid
           (240 nucleotide) tandem repeat, found in a variable
           number of copies in an immunodominant outer membrane
           protein of Ehrlichia chaffeensis, a tick-borne obligate
           intracellular pathogen.
          Length = 707

 Score = 33.5 bits (75), Expect = 0.37
 Identities = 32/275 (11%), Positives = 94/275 (34%), Gaps = 8/275 (2%)

Query: 5   VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS-KKDKKDKDRDKEKEKE 63
           ++     +       ++  +D            +S P  +     + +K+++  +   ++
Sbjct: 274 IEEHQGETEKEEGIPESHAEDLQPAVDDIVEHPSSEPFVAEEEVSETEKEENNPEVLAED 333

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
            +D    +S VS +  +  +    E E  + + ++     E   E  +            
Sbjct: 334 LQDAADGESGVSDQPAQVVEERESEIEEHQGETEKEEGIPESHAEDDEIASDPSIEHFSA 393

Query: 124 RERDKDEKKEQKESKSSSKIVS-SSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
               +  + E++ES    K          + A     +   P       +   ++ ++ E
Sbjct: 394 EVGKEVSETEKEESNPEVKAEDLQPAVDGDVAHHESEVGDKPAETSKEEESPEIEAEDGE 453

Query: 183 KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYIL 242
             K+            H+++D+   + + +E  A+ K ++   +  G   +        +
Sbjct: 454 PAKDGGIEE------SHQEEDEIVSEPSKEEFTAEVKAEDLQPAVDGSVEHSSSEVGEEV 507

Query: 243 LRSKKNKTVQERLAEQFKDELFDRLKNEQADILQR 277
             ++K ++  E  AE     + D L++   ++ ++
Sbjct: 508 SETEKEESNPEIKAEDLPPAVDDSLEHSIPEVGEK 542



 Score = 33.1 bits (74), Expect = 0.50
 Identities = 39/192 (20%), Positives = 73/192 (38%), Gaps = 12/192 (6%)

Query: 7   SSSSSSSAHPSPHKNKDK---DSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
             +      P  H   D+   D S    ++      + T    S  + K +D     + +
Sbjct: 364 GETEKEEGIPESHAEDDEIASDPSIEHFSAEVGKEVSETEKEESNPEVKAEDLQPAVDGD 423

Query: 64  KKDKEK---DKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKEKKDKKEKSHKH 119
               E    DK A +SKE+E  ++ +++ E  K+   +ES  E+++   +  K+E + + 
Sbjct: 424 VAHHESEVGDKPAETSKEEESPEIEAEDGEPAKDGGIEESHQEEDEIVSEPSKEEFTAEV 483

Query: 120 KDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTK 179
           K +D +   D   E   S+   ++   S   KE ++        PP      + S  +  
Sbjct: 484 KAEDLQPAVDGSVEHSSSEVGEEV---SETEKEESNPEIKAEDLPPAVDDSLEHSIPEVG 540

Query: 180 EKEKE--KESST 189
           EK  E   E   
Sbjct: 541 EKVDEMFAEEFN 552



 Score = 32.7 bits (73), Expect = 0.71
 Identities = 31/163 (19%), Positives = 69/163 (42%), Gaps = 10/163 (6%)

Query: 70  DKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDK 128
           DK A +SKE+E  ++ +++ E  K+   +ES  E+++   +  K+E + + K +D +   
Sbjct: 132 DKPAKTSKEEENPEIEAEDGEPAKDDGIEESHQEEDEIVSESSKEEFTAEVKAEDLQPAV 191

Query: 129 DEKKEQKESKSSSKIVSSSHNSK---------EPASGSQLISHPPPPAPTPTQKSPVKTK 179
           D   E   S+   ++  +              +PA    +  H       P + S  +  
Sbjct: 192 DGSIEHSSSEVGEEVSKTEKEESNPEVKAEDLQPAVDDDVAHHESEVGDKPAETSKEEET 251

Query: 180 EKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
            + K ++     D   +H   + ++H  +T  +E   +S  ++
Sbjct: 252 PEVKAEDLQPAVDGSVEHSSSEIEEHQGETEKEEGIPESHAED 294


>gnl|CDD|214661 smart00435, TOPEUc, DNA Topoisomerase I (eukaryota).  DNA
           Topoisomerase I (eukaryota), DNA topoisomerase V,
           Vaccinia virus topoisomerase, Variola virus
           topoisomerase, Shope fibroma virus topoisomerase.
          Length = 391

 Score = 33.1 bits (76), Expect = 0.39
 Identities = 23/91 (25%), Positives = 40/91 (43%), Gaps = 13/91 (14%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
           K +++ K  + + K  +K           K K+ SK +   E       +E ++KK++K 
Sbjct: 281 KLQEKIKALKYQLKRLKKMILLFEMISDLKRKLKSKFERDNEKL----DAEVKEKKKEKK 336

Query: 112 KKEKSHKHKDKDRER---------DKDEKKE 133
           K+EK  K  ++  ER         DK+E K 
Sbjct: 337 KEEKKKKQIERLEERIEKLEVQATDKEENKT 367



 Score = 30.0 bits (68), Expect = 4.0
 Identities = 16/55 (29%), Positives = 30/55 (54%)

Query: 41  PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK 95
              S   + ++K     KEK+KEKK +EK K  +   E+  +K+  +  +++E+K
Sbjct: 312 KLKSKFERDNEKLDAEVKEKKKEKKKEEKKKKQIERLEERIEKLEVQATDKEENK 366



 Score = 29.6 bits (67), Expect = 5.1
 Identities = 21/67 (31%), Positives = 33/67 (49%), Gaps = 8/67 (11%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK--ERKESKPKESSSEKE 104
           S   +K K +  E++ EK D E  +     K+KEK K   K+K  ER E + ++   +  
Sbjct: 307 SDLKRKLKSKF-ERDNEKLDAEVKE-----KKKEKKKEEKKKKQIERLEERIEKLEVQAT 360

Query: 105 KKKEKKD 111
            K+E K 
Sbjct: 361 DKEENKT 367


>gnl|CDD|217840 pfam04006, Mpp10, Mpp10 protein.  This family includes proteins
           related to Mpp10 (M phase phosphoprotein 10). The U3
           small nucleolar ribonucleoprotein (snoRNP) is required
           for three cleavage events that generate the mature 18S
           rRNA from the pre-rRNA. In Saccharomyces cerevisiae,
           depletion of Mpp10, a U3 snoRNP-specific protein, halts
           18S rRNA production and impairs cleavage at the three U3
           snoRNP-dependent sites.
          Length = 613

 Score = 33.4 bits (76), Expect = 0.40
 Identities = 11/102 (10%), Positives = 43/102 (42%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           S +D++D +    + ++  D ++      + +  + +    +KE  + +      E++++
Sbjct: 240 SGEDEEDDEEGNIEYEDFFDPKEKDKKKDAGDDAELEDDEPDKEAVKKEADSKPEEEDEE 299

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
            ++++  +   +  +   ++ K ++   +     S    SS 
Sbjct: 300 DDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPKELSSF 341



 Score = 32.7 bits (74), Expect = 0.56
 Identities = 27/134 (20%), Positives = 52/134 (38%), Gaps = 4/134 (2%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
           D+D  ++  ++  +  KD     S E E+D     E+   E +      EK+KKK+  D 
Sbjct: 217 DEDDFEDYFQDDSEDGKDDEDFGSGEDEEDD----EEGNIEYEDFFDPKEKDKKKDAGDD 272

Query: 113 KEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
            E      DK+  + + + K ++E +   +        + P +    +    P       
Sbjct: 273 AELEDDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDL 332

Query: 173 KSPVKTKEKEKEKE 186
           +SP +    EK + 
Sbjct: 333 ESPKELSSFEKRQA 346



 Score = 31.5 bits (71), Expect = 1.4
 Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 9/125 (7%)

Query: 47  SKKDKKDKDRDK-------EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER--KESKPK 97
            + D +D  +D        E     +D+E D+      E   D     +K+    +++ +
Sbjct: 217 DEDDFEDYFQDDSEDGKDDEDFGSGEDEEDDEEGNIEYEDFFDPKEKDKKKDAGDDAELE 276

Query: 98  ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGS 157
           +   +KE  K++ D K +    +D ++E D+DE++  + +    K+        +  S  
Sbjct: 277 DDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPK 336

Query: 158 QLISH 162
           +L S 
Sbjct: 337 ELSSF 341



 Score = 30.7 bits (69), Expect = 2.8
 Identities = 24/95 (25%), Positives = 43/95 (45%), Gaps = 6/95 (6%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSEK 103
             +KDK +D   + E +D E DK AV  +   K +   +E + +E      +P E++ +K
Sbjct: 260 PKEKDKKKDAGDDAELEDDEPDKEAVKKEADSKPEEEDEEDDEQEDDQDEEEPPEAAMDK 319

Query: 104 EKKKEKKDKKEKSHKHKDKDR-ERDKDEKKEQKES 137
            K  E   +       K+    E+ + + K+Q E 
Sbjct: 320 VKLDEPVLEGVDLESPKELSSFEKRQAKLKQQIEQ 354



 Score = 30.0 bits (67), Expect = 4.2
 Identities = 16/63 (25%), Positives = 35/63 (55%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           + D K ++ D+E ++++ D+++++   ++ +K K      E    ES  + SS EK + K
Sbjct: 288 EADSKPEEEDEEDDEQEDDQDEEEPPEAAMDKVKLDEPVLEGVDLESPKELSSFEKRQAK 347

Query: 108 EKK 110
            K+
Sbjct: 348 LKQ 350


>gnl|CDD|115579 pfam06933, SSP160, Special lobe-specific silk protein SSP160.  This
           family consists of several special lobe-specific silk
           protein SSP160 sequences which appear to be specific to
           Chironomus (Midge) species.
          Length = 758

 Score = 33.2 bits (75), Expect = 0.41
 Identities = 15/41 (36%), Positives = 30/41 (73%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS 47
           S +SSSSA+ + + N    +++  S++++++TSN T+SS+S
Sbjct: 106 SGNSSSSANSTSNSNSTTSNNSTTSSNSTTTTSNSTSSSNS 146


>gnl|CDD|219668 pfam07964, Red1, Rec10 / Red1.  Rec10 / Red1 is involved in meiotic
           recombination and chromosome segregation during
           homologous chromosome formation. This protein localises
           to the synaptonemal complex in S. cerevisiae and the
           analogous structures (linear elements) in S. pombe. This
           family is currently only found in fungi.
          Length = 706

 Score = 33.3 bits (76), Expect = 0.41
 Identities = 36/208 (17%), Positives = 66/208 (31%), Gaps = 20/208 (9%)

Query: 22  KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEK 81
           KD    A        ++    N     K  K K    EK+++    +   ++  SKE  K
Sbjct: 405 KDPTIIAGKKLMNKLTSEKINNPVKVVKVSKYKGNKSEKKRDINVLDTIFASPVSKELRK 464

Query: 82  DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
               SK+ + K  KP  + S+K+             K K   + +  ++   Q   + SS
Sbjct: 465 KVGKSKQTKLKNFKPVPNKSKKQLANNNSQNI----KSKKVVKAKTNNKANLQDVGECSS 520

Query: 142 KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK----EKEKESSTTHDKHSKH 197
              +   N K+ ++ S               KS   + E        K+   T       
Sbjct: 521 PPNNKEKNDKQTSTSSS------------VLKSDRSSIEVRNPNANVKKLEDTTYNAKFP 568

Query: 198 KHKKKDKHGDKTNPKEKDAKSKEKESHK 225
              K + +        +DA +   ++  
Sbjct: 569 TVSKNNAYTLVDISTSEDAVNSADDTRS 596



 Score = 29.5 bits (66), Expect = 6.3
 Identities = 36/255 (14%), Positives = 80/255 (31%), Gaps = 32/255 (12%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPT-----NSSSSKKDKKDKDRDKEKEKEK 64
            SS       +  +  +  I + +  S   +       N+     ++  K +    + ++
Sbjct: 250 ESSVQDEECSREANVPTQDIEANTKDSLHMSAQDNHYDNTQLQTPERSTKRKSPIWDLKE 309

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKEKER-----KESKP----------KESSSEKEKKKEK 109
             KE    + ++ +   +K S  E        +E  P          + +S   E  KE 
Sbjct: 310 DQKESKIKSGTNLKLSSEKESIPETSYVNVLEEEQSPLVRLQKRKLARSTSKTLESLKEV 369

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
            + +  S K+K    E + +E  +   + +    +            ++L S        
Sbjct: 370 FEDQASSVKNKQAQSEENLNESPKTPIAVTGDPHLKDPTIIAGKKLMNKLTSEKINNPVK 429

Query: 170 PTQKSPVKTKEKEKEKE---------SSTTHDKHSKHKHKKKDKHGDKTNPKEKD---AK 217
             + S  K  + EK+++         S  + +   K    K+ K  +      K      
Sbjct: 430 VVKVSKYKGNKSEKKRDINVLDTIFASPVSKELRKKVGKSKQTKLKNFKPVPNKSKKQLA 489

Query: 218 SKEKESHKSSAGPKC 232
           +   ++ KS    K 
Sbjct: 490 NNNSQNIKSKKVVKA 504


>gnl|CDD|225887 COG3351, FlaD, Putative archaeal flagellar protein D/E [Cell
           motility and secretion].
          Length = 214

 Score = 32.5 bits (74), Expect = 0.41
 Identities = 22/110 (20%), Positives = 46/110 (41%), Gaps = 15/110 (13%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
           K   +EK +K +  +A + +E ++       + +K        S+ E  +E  ++ E++ 
Sbjct: 2   KPYLEEKIEKAEKVTAFALEELKEKIEELPIQAKK--------SDDELVEELPERYEQTK 53

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPA 167
           ++   ++    +E+  +KE   S K+       KEPA  S         A
Sbjct: 54  ENSLIEKVDSIEEEISEKEKVMSEKL-------KEPAQMSSTSEEEEKKA 96


>gnl|CDD|220431 pfam09831, DUF2058, Uncharacterized protein conserved in bacteria
           (DUF2058).  This domain, found in various prokaryotic
           proteins, has no known function.
          Length = 177

 Score = 32.2 bits (74), Expect = 0.41
 Identities = 19/71 (26%), Positives = 34/71 (47%), Gaps = 11/71 (15%)

Query: 66  DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
           DK+K K A   KEK K +     + RK +   +   ++  ++ K +K E     +D++  
Sbjct: 13  DKKKAKKA--KKEKRKQRK----QARKGADDGDDELKQAAEEAKAEKAE-----RDRELN 61

Query: 126 RDKDEKKEQKE 136
           R +  + EQK 
Sbjct: 62  RQRQAEAEQKA 72



 Score = 28.3 bits (64), Expect = 7.7
 Identities = 14/63 (22%), Positives = 35/63 (55%), Gaps = 9/63 (14%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           KK KK+K + +++ ++  D   D+   +++E + +K    E++R      E + +++ + 
Sbjct: 18  KKAKKEKRKQRKQARKGADDGDDELKQAAEEAKAEK---AERDR------ELNRQRQAEA 68

Query: 108 EKK 110
           E+K
Sbjct: 69  EQK 71


>gnl|CDD|238356 cd00660, Topoisomer_IB_N, Topoisomer_IB_N: N-terminal DNA binding
           fragment found in eukaryotic DNA topoisomerase (topo) IB
           proteins similar to the monomeric yeast and human topo I
           and heterodimeric topo I from Leishmania donovani. Topo
           I enzymes are divided into:  topo type IA (bacterial)
           and type IB (eukaryotic). Topo I relaxes superhelical
           tension in duplex DNA by creating a single-strand nick;
           the broken strand can then rotate around the unbroken
           strand to remove DNA supercoils, and the nick is
           religated, liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit re-ligation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  In addition to differences in
           structure and some biochemical properties,
           Trypanosomatid parasite topo I differ from human topo I
           in their sensitivity to CPTs and other classical topo I
           inhibitors. Trypanosomatid topos I play putative roles
           in organizing the kinetoplast DNA network unique to
           these parasites.  This family may represent more than
           one structural domain.
          Length = 215

 Score = 32.6 bits (75), Expect = 0.42
 Identities = 13/32 (40%), Positives = 22/32 (68%), Gaps = 3/32 (9%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           +EKE+K++  KE   EK+  KE+K+K E+ + 
Sbjct: 96  EEKEKKKAMSKE---EKKAIKEEKEKLEEPYG 124


>gnl|CDD|235549 PRK05658, PRK05658, RNA polymerase sigma factor RpoD; Validated.
          Length = 619

 Score = 33.2 bits (77), Expect = 0.43
 Identities = 21/103 (20%), Positives = 37/103 (35%), Gaps = 2/103 (1%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
               D + E++      E ++      E+E++  +       ES+  E   EK K    K
Sbjct: 171 DGFVDPNAEEDPAHVGSELEELDDDEDEEEEEDENDDSLAADESELPEKVLEKFKALA-K 229

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH-NSKE 152
             K+     + K   R    KK  K  +   + + S    SK+
Sbjct: 230 QYKKLRKAQEKKVEGRLAQHKKYAKLREKLKEELKSLRLTSKQ 272


>gnl|CDD|227925 COG5638, COG5638, Uncharacterized conserved protein [Function
           unknown].
          Length = 622

 Score = 33.2 bits (75), Expect = 0.45
 Identities = 38/209 (18%), Positives = 67/209 (32%), Gaps = 30/209 (14%)

Query: 42  TNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE----------- 90
           T S  S +D       K  +K   +KE D    S      D   + E E           
Sbjct: 364 TASKLSDEDDDSVMESK-MQKLFSEKEIDFGLNSELVDMSDDGENGEMEDTFTSHLPASN 422

Query: 91  RKESKPKESSS-------EKEKKKEKKDKKEKSHKH------KDKDRERDKDEKKEQKES 137
             ES  K  ++        +E+++ +K+++ K  K       KDK    +K  KK +   
Sbjct: 423 ESESDDKLETTIEKLDRKLRERQENRKERQLKKTKDDSDVDLKDKKESINKKNKKGKHAI 482

Query: 138 KSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT-----HD 192
           + ++         K      + + H    +    +K     K K+K             D
Sbjct: 483 ERTAASKEELELIKADDEDDEQLDHFDMKSILKAEKFKKNRKLKKKASNLEEGFVFDPKD 542

Query: 193 KHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                  +  +   D T+P+ K     +K
Sbjct: 543 PRFVAIFEDHNFAIDPTHPEFKKTGGMKK 571



 Score = 32.4 bits (73), Expect = 0.81
 Identities = 30/142 (21%), Positives = 53/142 (37%), Gaps = 7/142 (4%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
                D   +     + +S       S S  K +   ++   K +E+++  K++      
Sbjct: 398 LVDMSDDGENGEMEDTFTSHLPASNESESDDKLETTIEKLDRKLRERQENRKERQL---- 453

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
           +K KD      K++KES  K++   K   +     KE+    K  D   + DE+ +  + 
Sbjct: 454 KKTKDDSDVDLKDKKESINKKNKKGKHAIERTAASKEELELIKADD---EDDEQLDHFDM 510

Query: 138 KSSSKIVSSSHNSKEPASGSQL 159
           KS  K      N K     S L
Sbjct: 511 KSILKAEKFKKNRKLKKKASNL 532


>gnl|CDD|218581 pfam05416, Peptidase_C37, Southampton virus-type processing
           peptidase.  Corresponds to Merops family C37.
           Norwalk-like viruses (NLVs), including the Southampton
           virus, cause acute non-bacterial gastroenteritis in
           humans. The NLV genome encodes three open reading frames
           (ORFs). ORF1 encodes a polyprotein, which is processed
           by the viral protease into six proteins.
          Length = 535

 Score = 32.9 bits (75), Expect = 0.46
 Identities = 26/139 (18%), Positives = 52/139 (37%), Gaps = 17/139 (12%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDK-----------DRDK 58
               A P   K K+K        + S    +       KK ++++           DR++
Sbjct: 224 EPQDATPEGKKGKNKKGRGKKHNAFSRRGLSDEEYDEYKKIREERGGKYSIQEYLEDRER 283

Query: 59  -EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES-----SSEKEKKKEKKDK 112
            E+E  ++   +       + K + ++    K RK+ K + +     +    +K++  D 
Sbjct: 284 YEEELAERQATEADFCEEEEAKIRQRIFGLRKTRKQRKEERAKLGLVTGSDIRKRKPIDW 343

Query: 113 KEKSHKHKDKDRERDKDEK 131
             K     D DR+ D +EK
Sbjct: 344 NPKGPLWADDDRQVDYNEK 362


>gnl|CDD|219124 pfam06658, DUF1168, Protein of unknown function (DUF1168).  This
           family consists of several hypothetical eukaryotic
           proteins of unknown function.
          Length = 142

 Score = 31.6 bits (72), Expect = 0.46
 Identities = 21/95 (22%), Positives = 50/95 (52%), Gaps = 8/95 (8%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           D++ +KE +D+E  +     K K+++K ++K++ +++ K       K+KKK+KK  K+ +
Sbjct: 56  DEKWKKETEDEEFQQKREEKKRKDEEK-TAKKRAKRQKK-------KQKKKKKKKAKKGN 107

Query: 117 HKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSK 151
            K + +  +  ++   E++E +   +        K
Sbjct: 108 KKEEKEGSKSSEESSDEEEEGEEDKQEEPVEIMEK 142



 Score = 29.2 bits (66), Expect = 3.2
 Identities = 22/75 (29%), Positives = 41/75 (54%), Gaps = 4/75 (5%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           +++KK KD +K  +K  K ++K K     K+K+K      +KE KE       S  E+++
Sbjct: 72  REEKKRKDEEKTAKKRAK-RQKKKQK---KKKKKKAKKGNKKEEKEGSKSSEESSDEEEE 127

Query: 108 EKKDKKEKSHKHKDK 122
            ++DK+E+  +  +K
Sbjct: 128 GEEDKQEEPVEIMEK 142


>gnl|CDD|221121 pfam11489, DUF3210, Protein of unknown function (DUF3210).  This is
           a family of proteins conserved in yeasts. The function
           is not known. The Schizosaccharomyces pombe member is
           SPBC18E5.07 and the Saccharomyces cerevisiae member is
           AIM21.
          Length = 671

 Score = 33.0 bits (75), Expect = 0.49
 Identities = 30/150 (20%), Positives = 58/150 (38%), Gaps = 5/150 (3%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE-KKDKEKDKSAVSSKEKEKDKVSSKE 88
           P   +  +   PT  SS + +++      E   E      +D + VS+      + +S+ 
Sbjct: 384 PEDESEIAVKPPTEESSRRPEEEKHRFPSEDVWEDSPSSLQDTATVSTPSNPPPR-ASET 442

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
            E++ S+     S    + E K +K+K+     K R   +D  ++  ES+       +  
Sbjct: 443 PEQETSRSSSEVSLDPHQSELKSEKKKARPEVSKQRFPSRDVWEDAPESQELVTTEETPE 502

Query: 149 NSKEPASGSQ---LISHPPPPAPTPTQKSP 175
             K  + G     + S P    PT  ++ P
Sbjct: 503 EVKSSSPGVTKPAIPSRPKKGKPTSEKRKP 532



 Score = 29.9 bits (67), Expect = 5.0
 Identities = 34/201 (16%), Positives = 71/201 (35%), Gaps = 15/201 (7%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
           S++ +++ S+    P +  +        +S+  S     +   S+K K   +  K++   
Sbjct: 422 SLQDTATVSTPSNPPPRASETPEQETSRSSSEVSLDPHQSELKSEKKKARPEVSKQRFPS 481

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           +   E    +      E+     K      +KP   S  K+ K   + +K      K K 
Sbjct: 482 RDVWEDAPESQELVTTEETPEEVKSSSPGVTKPAIPSRPKKGKPTSEKRKPPPVPKKPKP 541

Query: 124 R---------ERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLIS------HPPPPAP 168
           +         ++   E+      K   ++ +    SK  A  +   S         P AP
Sbjct: 542 QIPARPAKLQKQQAGEEANSSAFKPKPRVPARPGGSKIAALKAGFASDLNGRLALGPQAP 601

Query: 169 TPTQKSPVKTKEKEKEKESST 189
               +SP +  +++KE++  T
Sbjct: 602 KKVLESPKEPSKEKKEEDEDT 622



 Score = 29.1 bits (65), Expect = 9.1
 Identities = 44/228 (19%), Positives = 75/228 (32%), Gaps = 20/228 (8%)

Query: 19  HKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
            K++     + P +  +S  ++   S          +  +E E    + E  + AV    
Sbjct: 338 EKSEKSRHESDPKSRENSKPASIYGSVPDLIRHTPLEDVEEYEPLFPEDES-EIAVKPPT 396

Query: 79  KEKDKVSSKEKERK------ESKPKES---------SSEKEKKKEKKDKKEKSHKHKDKD 123
           +E  +   +EK R       E  P            S+   +  E  +++      +   
Sbjct: 397 EESSRRPEEEKHRFPSEDVWEDSPSSLQDTATVSTPSNPPPRASETPEQETSRSSSEVSL 456

Query: 124 RERDKDEKKEQKESKSS-SKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEK- 181
                + K E+K+++   SK    S +  E A  SQ +             SP  TK   
Sbjct: 457 DPHQSELKSEKKKARPEVSKQRFPSRDVWEDAPESQELVTTEETPEEVKSSSPGVTKPAI 516

Query: 182 -EKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSA 228
             + K+   T +K       KK K      P  K  K +  E   SSA
Sbjct: 517 PSRPKKGKPTSEKRKPPPVPKKPKPQIPARP-AKLQKQQAGEEANSSA 563


>gnl|CDD|222011 pfam13257, DUF4048, Domain of unknown function (DUF4048).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in eukaryotes, and is typically
           between 228 and 257 amino acids in length.
          Length = 242

 Score = 32.4 bits (74), Expect = 0.49
 Identities = 17/51 (33%), Positives = 21/51 (41%), Gaps = 3/51 (5%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD 57
            S  S S   S  + + +  S   S S+ SSTS    S    KD K  D D
Sbjct: 126 RSRRSGSRSTSRSRLRLQGGSLSSSRSSRSSTSKGATSG---KDSKSADID 173


>gnl|CDD|217502 pfam03343, SART-1, SART-1 family.  SART-1 is a protein involved in
           cell cycle arrest and pre-mRNA splicing. It has been
           shown to be a component of U4/U6 x U5 tri-snRNP complex
           in human, Schizosaccharomyces pombe and Saccharomyces
           cerevisiae. SART-1 is a known tumour antigen in a range
           of cancers recognised by T cells.
          Length = 603

 Score = 32.8 bits (75), Expect = 0.50
 Identities = 34/155 (21%), Positives = 56/155 (36%), Gaps = 7/155 (4%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           K KK  +          D E +K     K+K K    S + +  E+   +     E  K 
Sbjct: 216 KKKKSDNLFTLDSGGSTDDEAEKKRQEVKKKLKINNVSLDDDSTETPASDYYDVSEMVKF 275

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
           KK KK+K  K   K R +D DE + + E++      S S    E     +       P  
Sbjct: 276 KKPKKKKKKK---KKRRKDLDEDELEPEAEGLGSSDSGSRKDVE----EENARLEDSPKK 328

Query: 169 TPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKD 203
              ++      E + + ++S    +    K +KK 
Sbjct: 329 RKEEQEDDDFVEDDDDLQASLAKQRRLAQKKRKKL 363



 Score = 32.4 bits (74), Expect = 0.78
 Identities = 46/209 (22%), Positives = 81/209 (38%), Gaps = 39/209 (18%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAV-SSKEKEKDKVSSKEKERKESKP-----KESS 100
           SKK +K K+ +++K     +KEK+++A  +S++    KV  K +E +E +      K++ 
Sbjct: 98  SKKRQKKKEAERKKALLLDEKEKERAAEYTSEDLAGLKVGHKVEEFEEGEDVILTLKDTG 157

Query: 101 ----------------SEKEKKKEK-KDKKEKSHKHKDKDRER---------DKDEKKEQ 134
                            EKEK K+  + KK+K     D D +          D++ + ++
Sbjct: 158 VLEDEDEGDELENVELVEKEKDKKNLELKKKKPDYDPDDDDKFNKRSILSKYDEEIEGKK 217

Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKH 194
           K+S +   + S      E     Q             + + V   +   E  +S  +D  
Sbjct: 218 KKSDNLFTLDSGGSTDDEAEKKRQ-------EVKKKLKINNVSLDDDSTETPASDYYDVS 270

Query: 195 SKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
              K KK  K   K   + KD    E E 
Sbjct: 271 EMVKFKKPKKKKKKKKKRRKDLDEDELEP 299



 Score = 31.7 bits (72), Expect = 1.4
 Identities = 21/82 (25%), Positives = 42/82 (51%)

Query: 31  STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE 90
             ST +  S+  + S   K KK K + K+K+K +KD ++D+    ++        S++  
Sbjct: 256 DDSTETPASDYYDVSEMVKFKKPKKKKKKKKKRRKDLDEDELEPEAEGLGSSDSGSRKDV 315

Query: 91  RKESKPKESSSEKEKKKEKKDK 112
            +E+   E S +K K++++ D 
Sbjct: 316 EEENARLEDSPKKRKEEQEDDD 337



 Score = 30.5 bits (69), Expect = 2.7
 Identities = 24/108 (22%), Positives = 50/108 (46%), Gaps = 2/108 (1%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           K +++ + K K+++ ++K A + +++E++      K   E    +  ++   KK KK +K
Sbjct: 44  KRQEEAEAKRKREELREKIAKAREKRERNSKLGGIKTLGEDDDDDDDTKAWLKKSKKRQK 103

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSS-SKIVSSSHNSKEPASGSQLI 160
           +K    + K    D+ EK+   E  S     +   H  +E   G  +I
Sbjct: 104 KKE-AERKKALLLDEKEKERAAEYTSEDLAGLKVGHKVEEFEEGEDVI 150


>gnl|CDD|218332 pfam04929, Herpes_DNAp_acc, Herpes DNA replication accessory
           factor.  Replicative DNA polymerases are capable of
           polymerising tens of thousands of nucleotides without
           dissociating from their DNA templates. The high
           processivity of these polymerases is dependent upon
           accessory proteins that bind to the catalytic subunit of
           the polymerase or to the substrate. The Epstein-Barr
           virus (EBV) BMRF1 protein is an essential component of
           the viral DNA polymerase and is absolutely required for
           lytic virus replication. BMRF1 is also a transactivator.
           This family is predicted to have a UL42 like structure.
          Length = 381

 Score = 32.7 bits (75), Expect = 0.51
 Identities = 20/95 (21%), Positives = 29/95 (30%), Gaps = 12/95 (12%)

Query: 23  DKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD 82
           + +      T + S      +S SS       D D   E          S       E+ 
Sbjct: 294 EANGVEPEPTGSVSDRPRHLSSDSSPSPPDTSDSDPSTETPP---PASLSHSPPAAFERP 350

Query: 83  KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
              S  K ++E   K        +K+KK KK K  
Sbjct: 351 LALS-PKRKREGDKK--------QKKKKSKKLKLT 376



 Score = 31.9 bits (73), Expect = 0.89
 Identities = 20/63 (31%), Positives = 28/63 (44%), Gaps = 6/63 (9%)

Query: 9   SSSSSAHPSPHKNKDKDSS------AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
             SS + PSP    D D S      A  S S  ++   P   S  +K + DK + K+K K
Sbjct: 312 HLSSDSSPSPPDTSDSDPSTETPPPASLSHSPPAAFERPLALSPKRKREGDKKQKKKKSK 371

Query: 63  EKK 65
           + K
Sbjct: 372 KLK 374


>gnl|CDD|146486 pfam03879, Cgr1, Cgr1 family.  Members of this family are
           coiled-coil proteins that are involved in pre-rRNA
           processing.
          Length = 105

 Score = 30.8 bits (70), Expect = 0.52
 Identities = 14/47 (29%), Positives = 26/47 (55%), Gaps = 1/47 (2%)

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E++  K  E  + K ++KE KD+KE   + +     +++   KE+KE
Sbjct: 31  EKRMEKRLEQQAIKAREKELKDEKEAERQRR-IQAIKERRAAKEEKE 76


>gnl|CDD|237753 PRK14552, PRK14552, C/D box methylation guide ribonucleoprotein
           complex aNOP56 subunit; Provisional.
          Length = 414

 Score = 32.6 bits (75), Expect = 0.53
 Identities = 17/53 (32%), Positives = 33/53 (62%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
           ++ K++++ + +E KE  PK    ++E+KK +K KK+K  K K K R++   +
Sbjct: 362 DELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414



 Score = 31.1 bits (71), Expect = 1.9
 Identities = 15/53 (28%), Positives = 30/53 (56%), Gaps = 1/53 (1%)

Query: 82  DKVSSKEKER-KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
           D++  +  +R +E K K     K+K++EKK +K K  K + K  ++ K + ++
Sbjct: 362 DELKEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414



 Score = 30.3 bits (69), Expect = 3.4
 Identities = 11/50 (22%), Positives = 27/50 (54%)

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           KE+   ++   +++  +   K+   +K +K++KK K++K  K + K   +
Sbjct: 365 KEELNKRIEEIKEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRKKKGRK 414



 Score = 29.6 bits (67), Expect = 5.5
 Identities = 12/34 (35%), Positives = 20/34 (58%)

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           K+K  K  K+K  + K + R++ K  KK+ K+ K
Sbjct: 376 KEKYPKPPKKKREEKKPQKRKKKKKRKKKGKKRK 409


>gnl|CDD|233787 TIGR02223, ftsN, cell division protein FtsN.  FtsN is a poorly
           conserved protein active in cell division in a number of
           Proteobacteria. The N-terminal 30 residue region tends
           to be Lys/Arg-rich, and is followed by a
           membrane-spanning region. This is followed by an acidic
           low-complexity region of variable length and a
           well-conserved C-terminal domain of two tandem regions
           matched by pfam05036 (Sporulation related repeat), found
           in several cell division and sporulation proteins. The
           role of FtsN as a suppressor for other cell division
           mutations is poorly understood; it may involve cell wall
           hydrolysis [Cellular processes, Cell division].
          Length = 298

 Score = 32.4 bits (73), Expect = 0.54
 Identities = 29/181 (16%), Positives = 64/181 (35%), Gaps = 24/181 (13%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKK 65
             + +  +A   P K +++ S      +     ++P   S+    ++      E+ +  +
Sbjct: 64  NQTENGETAADLPPKPEERWSYIEELEAREVLINDPEEPSNGGGVEESAQLTAEQRQLLE 123

Query: 66  DKEKDKSAVSSKEKE---KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDK 122
             + D  A          +  V+ + +++   K  + +   E +K   +  EK      +
Sbjct: 124 QMQADMRAAEKVLATAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPVET-EKIASKVKE 182

Query: 123 DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKE 182
            +++ K   K+  E++S+SK                    P   AP   +    K K KE
Sbjct: 183 AKQKQKALPKQTAETQSNSK--------------------PIETAPKADKADKTKPKPKE 222

Query: 183 K 183
           K
Sbjct: 223 K 223



 Score = 30.0 bits (67), Expect = 3.7
 Identities = 27/127 (21%), Positives = 53/127 (41%), Gaps = 8/127 (6%)

Query: 32  TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV-SSKEKEKDKVSSKEKE 90
           T+ S  T        + + K  K R  E +K   + EK  S V  +K+K+K       + 
Sbjct: 138 TAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPVETEKIASKVKEAKQKQKALPKQTAET 197

Query: 91  RKESKPKESSSEKEKKKEKKDKKEKSHKHKD-------KDRERDKDEKKEQKESKSSSKI 143
           +  SKP E++ + +K  + K K ++  +           ++E+ +  + +      SSKI
Sbjct: 198 QSNSKPIETAPKADKADKTKPKPKEKAERAAALQCGAYANKEQAESVRAKLAFLGISSKI 257

Query: 144 VSSSHNS 150
            ++    
Sbjct: 258 TTTDGGK 264



 Score = 29.7 bits (66), Expect = 4.5
 Identities = 25/171 (14%), Positives = 52/171 (30%), Gaps = 15/171 (8%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK-----------PKESSSEKEKKKE 108
           K+  + +  + K+   + E   D     E+     +           P+E S+    ++ 
Sbjct: 52  KQANEPETLQPKNQTENGETAADLPPKPEERWSYIEELEAREVLINDPEEPSNGGGVEES 111

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH--PPPP 166
            +   E+    +    +    EK         +  V +   + E        +     P 
Sbjct: 112 AQLTAEQRQLLEQMQADMRAAEKVLATAPSEQTVAVEARKQTAEKKPQKARTAEAQKTPV 171

Query: 167 APTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKK--KDKHGDKTNPKEKD 215
                     + K+K+K     T   + +    +   K    DKT PK K+
Sbjct: 172 ETEKIASKVKEAKQKQKALPKQTAETQSNSKPIETAPKADKADKTKPKPKE 222


>gnl|CDD|233044 TIGR00600, rad2, DNA excision repair protein (rad2).  All proteins
           in this family for which functions are known are flap
           endonucleases that generate the 3' incision next to DNA
           damage as part of nucleotide excision repair. This
           family is related to many other flap endonuclease
           families including the fen1 family. This family is based
           on the phylogenomic analysis of JA Eisen (1999, Ph.D.
           Thesis, Stanford University) [DNA metabolism, DNA
           replication, recombination, and repair].
          Length = 1034

 Score = 32.9 bits (75), Expect = 0.54
 Identities = 17/111 (15%), Positives = 43/111 (38%), Gaps = 1/111 (0%)

Query: 21  NKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE 80
               +  ++ ST      S    +  S+++ ++K    E E  K+ ++            
Sbjct: 658 GSFIEVDSVSSTLELQVPSKSQPTDESEENAENKVASIEGEHRKEIEDLLFDESEEDNIV 717

Query: 81  KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK-SHKHKDKDRERDKDE 130
                 K+ +  +++ ++ S E+ +  E     E+ S K + + ++R   E
Sbjct: 718 GMIEEEKDADDFKNEWQDISLEELEALEANLLAEQNSLKAQKQQQKRIAAE 768



 Score = 31.8 bits (72), Expect = 1.2
 Identities = 20/85 (23%), Positives = 34/85 (40%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
              E E  + EK++S       E D VSS  + +  SK + +   +E  + K    E  H
Sbjct: 640 NPMEVEPMESEKEESESDGSFIEVDSVSSTLELQVPSKSQPTDESEENAENKVASIEGEH 699

Query: 118 KHKDKDRERDKDEKKEQKESKSSSK 142
           + + +D   D+ E+          K
Sbjct: 700 RKEIEDLLFDESEEDNIVGMIEEEK 724


>gnl|CDD|217434 pfam03224, V-ATPase_H_N, V-ATPase subunit H.  The yeast
           Saccharomyces cerevisiae vacuolar H+-ATPase (V-ATPase)
           is a multisubunit complex responsible for acidifying
           organelles. It functions as an ATP dependent proton pump
           that transports protons across a lipid bilayer. This
           domain corresponds to the N terminal domain of the H
           subunit of V-ATPase. The N-terminal domain is required
           for the activation of the complex whereas the C-terminal
           domain is required for coupling ATP hydrolysis to proton
           translocation.
          Length = 312

 Score = 32.6 bits (75), Expect = 0.54
 Identities = 20/91 (21%), Positives = 36/91 (39%), Gaps = 5/91 (5%)

Query: 288 QPSLGISSHDQQFIQH---HIHVIIHAAASLRFDELIQDAFTLNIQATRELLDLATRCSQ 344
            P L + S+   FI      I   + A +  + ++L+++A  L +     LL  +T   Q
Sbjct: 108 SPFLKLLSNQDDFIVLLALFILAKLLAYSKKKSNKLVEEALPLLLSLLSSLLQSSTLGLQ 167

Query: 345 LKAILHVSTLYTHS-YREDIQE-EFYPPLFS 373
             A+  +  L     YR+   E +    L  
Sbjct: 168 YIAVRCLQELLRVKEYRKLFWESDGVSTLID 198


>gnl|CDD|237258 PRK12903, secA, preprotein translocase subunit SecA; Reviewed.
          Length = 925

 Score = 32.7 bits (75), Expect = 0.57
 Identities = 16/96 (16%), Positives = 41/96 (42%), Gaps = 8/96 (8%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVS-SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
            + ++     E+ +   E+ K+     K +  +   ++E+ +  ++ K    E +   +K
Sbjct: 824 IQREEMLMRPEELELINEEQKNLKQEIKLELSEIQEAEEEIQNINENKNEFVEFKNDPKK 883

Query: 110 -------KDKKEKSHKHKDKDRERDKDEKKEQKESK 138
                  KD   K     D+ ++ +K  KK++K+ +
Sbjct: 884 LNKLIIAKDVLIKLVISSDEIKQDEKTTKKKKKDLE 919



 Score = 28.9 bits (65), Expect = 8.9
 Identities = 13/98 (13%), Positives = 42/98 (42%), Gaps = 5/98 (5%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK 77
           P + +  +              +    +  +    ++++++  E +   K+ +K  + +K
Sbjct: 833 PEELELINEEQKNLKQEIKLELSEIQEAEEEIQNINENKNEFVEFKNDPKKLNK-LIIAK 891

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           +     V S ++ +++    E +++K+KK  +K  +E 
Sbjct: 892 DVLIKLVISSDEIKQD----EKTTKKKKKDLEKTDEEA 925


>gnl|CDD|237799 PRK14715, PRK14715, DNA polymerase II large subunit; Provisional.
          Length = 1627

 Score = 32.9 bits (75), Expect = 0.59
 Identities = 19/61 (31%), Positives = 33/61 (54%)

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
           KEKK+EK ++K +  K ++ D E +++EK    E      I ++    KE  +G  + +H
Sbjct: 280 KEKKEEKDEEKSEEVKTEEVDEEFEEEEKGFYYELYEKVNIEANKKFIKEVIAGRPVFAH 339

Query: 163 P 163
           P
Sbjct: 340 P 340


>gnl|CDD|220252 pfam09468, RNase_H2-Ydr279, Ydr279p protein family (RNase H2
           complex component).  RNases H are enzymes that
           specifically hydrolyse RNA when annealed to a
           complementary DNA and are present in all living
           organisms. In yeast RNase H2 is composed of a complex of
           three proteins (Rnh2Ap, Ydr279p and Ylr154p); this
           family represents the homologues of Ydr279p. It is not
           known whether non-yeast proteins in this family fulfil
           the same function.
          Length = 287

 Score = 32.3 bits (74), Expect = 0.61
 Identities = 19/99 (19%), Positives = 30/99 (30%), Gaps = 3/99 (3%)

Query: 12  SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
               P    N  +   A        S   P       K     +     +  K+ K+K +
Sbjct: 179 ELDIPDDILNLLRLRYAC---DLLCSYLPPDLYKELLKSLLIPEFKPLDKYLKESKKKKR 235

Query: 72  SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
                 E  + +   K K ++E K K+    K  K  KK
Sbjct: 236 ETEEDVEAAESRAEKKRKSKEEIKKKKPKESKGVKALKK 274



 Score = 31.9 bits (73), Expect = 0.85
 Identities = 21/69 (30%), Positives = 34/69 (49%), Gaps = 5/69 (7%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKE-KDKVSSKEKERKESKPKESSSEKE 104
             +    DK   + K+K+++ +E  ++A S  EK+ K K   K+K+ KESK  +      
Sbjct: 217 IPEFKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKEEIKKKKPKESKGVK----AL 272

Query: 105 KKKEKKDKK 113
           KK   K  K
Sbjct: 273 KKVVAKGMK 281



 Score = 31.1 bits (71), Expect = 1.5
 Identities = 16/67 (23%), Positives = 31/67 (46%), Gaps = 1/67 (1%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           ++  K     +  K       + K+K + + ++ E  ES+ ++    KE+ K+KK K+ K
Sbjct: 209 KELLKSLLIPE-FKPLDKYLKESKKKKRETEEDVEAAESRAEKKRKSKEEIKKKKPKESK 267

Query: 116 SHKHKDK 122
             K   K
Sbjct: 268 GVKALKK 274



 Score = 29.2 bits (66), Expect = 6.1
 Identities = 17/79 (21%), Positives = 32/79 (40%), Gaps = 2/79 (2%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           S       K+  +     + K   +  K +   K + ++ V + E   +  K ++S  E 
Sbjct: 201 SYLPPDLYKELLKSLLIPEFKPLDKYLKESKKKKRETEEDVEAAES--RAEKKRKSKEEI 258

Query: 104 EKKKEKKDKKEKSHKHKDK 122
           +KKK K+ K  K+ K    
Sbjct: 259 KKKKPKESKGVKALKKVVA 277


>gnl|CDD|218597 pfam05466, BASP1, Brain acid soluble protein 1 (BASP1 protein).
           This family consists of several brain acid soluble
           protein 1 (BASP1) or neuronal axonal membrane protein
           NAP-22. The BASP1 is a neuron enriched Ca(2+)-dependent
           calmodulin-binding protein of unknown function.
          Length = 233

 Score = 32.1 bits (72), Expect = 0.61
 Identities = 38/189 (20%), Positives = 84/189 (44%), Gaps = 6/189 (3%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDK 112
           +K +DK+K+ E    E++ +   ++E +    +++ KE KE KP + + +   K E+K+ 
Sbjct: 16  EKAKDKDKKAEGAATEEEGTPKENEEAQAAAETTEVKEAKEEKPDKDAQDTANKTEEKEG 75

Query: 113 KEKSHKHKDK----DRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAP 168
           ++++   K++    + E+ +   + + E   +S        +  PA+G +        + 
Sbjct: 76  EKEAAAAKEEAPKAEPEKTEGAAEAKAEPPKASDPEQEPAAAPGPAAGGEAPKASEASSQ 135

Query: 169 TPTQKSPVKTKEKEKEK-ESSTTHDKHSKHKHKKKD-KHGDKTNPKEKDAKSKEKESHKS 226
                +P K +EK KE+ E+  T    +  +  K D      + P   +A    KE+  +
Sbjct: 136 PAESAAPAKEEEKSKEEGEAKKTEAPAAAAQETKSDAAPASDSKPSSSEAAPSSKETPAA 195

Query: 227 SAGPKCYPE 235
           +  P    +
Sbjct: 196 TEAPSSTAK 204



 Score = 31.0 bits (69), Expect = 1.3
 Identities = 36/197 (18%), Positives = 85/197 (43%), Gaps = 13/197 (6%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPT---NSSSSKKDKKDKDRDKEKEK 62
           K+  +++    +P +N++  ++A  +    +    P      +++K ++K+ +++    K
Sbjct: 24  KAEGAATEEEGTPKENEEAQAAAETTEVKEAKEEKPDKDAQDTANKTEEKEGEKEAAAAK 83

Query: 63  EKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
           E+  K E +K+  +++ K +   +S  ++   + P  ++  +  K  +   +        
Sbjct: 84  EEAPKAEPEKTEGAAEAKAEPPKASDPEQEPAAAPGPAAGGEAPKASEASSQPAESAAPA 143

Query: 122 KDRERDKDEKKEQK---------ESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
           K+ E+ K+E + +K         E+KS +   S S  S   A+ S   +     AP+ T 
Sbjct: 144 KEEEKSKEEGEAKKTEAPAAAAQETKSDAAPASDSKPSSSEAAPSSKETPAATEAPSSTA 203

Query: 173 KSPVKTKEKEKEKESST 189
           K+       E+ K S  
Sbjct: 204 KASAPAAPAEEVKPSEA 220


>gnl|CDD|129661 TIGR00570, cdk7, CDK-activating kinase assembly factor MAT1.  All
           proteins in this family for which functions are known
           are cyclin dependent protein kinases that are components
           of TFIIH, a complex that is involved in nucleotide
           excision repair and transcription initiation. Also known
           as MAT1 (menage a trois 1). This family is based on the
           phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis,
           Stanford University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 309

 Score = 32.5 bits (74), Expect = 0.62
 Identities = 24/131 (18%), Positives = 54/131 (41%), Gaps = 14/131 (10%)

Query: 53  DKDRDKEKEKEKKDKE---KDKSAVSSKEKEKDKVSSKEKERKESKPK--ESSSEKEKKK 107
           +  + K +  +K++K+   K+K   + +++E ++    EKE +E +    +   E+++  
Sbjct: 116 ENTKKKIETYQKENKDVIQKNKEKSTREQEELEEALEFEKEEEEQRRLLLQKEEEEQQMN 175

Query: 108 EKKDKK----EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
           ++K+K+    E             + +K   K      K        ++P + S  I   
Sbjct: 176 KRKNKQALLDELETSTLPAAELIAQHKKNSVKLEMQVEKP-----KPEKPNTFSTGIKMG 230

Query: 164 PPPAPTPTQKS 174
              +  P QKS
Sbjct: 231 YQISLVPVQKS 241


>gnl|CDD|184860 PRK14858, tatA, twin arginine translocase protein A; Provisional.
          Length = 108

 Score = 30.6 bits (69), Expect = 0.63
 Identities = 14/62 (22%), Positives = 26/62 (41%), Gaps = 2/62 (3%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKD--KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           K    ++    +EKEK +K  +  K A + + K ++  + K K   E     +S   +  
Sbjct: 47  KQSMQEESRTAEEKEKAEKLAETKKEAEAPEAKAEEDQAPKPKGAGEPPATVASKAGDGA 106

Query: 107 KE 108
           K 
Sbjct: 107 KA 108


>gnl|CDD|221821 pfam12871, PRP38_assoc, Pre-mRNA-splicing factor 38-associated
           hydrophilic C-term.  This domain is a hydrophilic region
           found at the C-terminus of plant and metazoan
           pre-mRNA-splicing factor 38 proteins. The function is
           not known.
          Length = 97

 Score = 30.5 bits (69), Expect = 0.63
 Identities = 11/68 (16%), Positives = 37/68 (54%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
              ++E  + + ++ +R    P+  +  + +++++  K+ +  + +D+ R RD+D++   
Sbjct: 19  EEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDDRDRD 78

Query: 135 KESKSSSK 142
           +  +S S+
Sbjct: 79  RYDRSRSR 86



 Score = 30.1 bits (68), Expect = 0.71
 Identities = 7/83 (8%), Positives = 31/83 (37%), Gaps = 7/83 (8%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
            + ++ +E+E  ++ + K+        +       +  +  K       + + +++   +
Sbjct: 11  DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYR 70

Query: 114 EKSH-------KHKDKDRERDKD 129
           ++         + + + R R +D
Sbjct: 71  DRDDRDRDRYDRSRSRSRSRSRD 93



 Score = 29.8 bits (67), Expect = 1.2
 Identities = 15/81 (18%), Positives = 34/81 (41%), Gaps = 1/81 (1%)

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
           E+D       E+E+D    + K  ++      S  +  ++  + +K  S K + + R+RD
Sbjct: 7   EEDLDEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKR-SRKRRRRRRDRD 65

Query: 128 KDEKKEQKESKSSSKIVSSSH 148
           +   +++ +        S S 
Sbjct: 66  RARYRDRDDRDRDRYDRSRSR 86



 Score = 29.4 bits (66), Expect = 1.4
 Identities = 9/74 (12%), Positives = 32/74 (43%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH 117
            E+E+ +++++ ++    ++        S  +  +    +   S K +++ +   + +  
Sbjct: 11  DEEEESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYR 70

Query: 118 KHKDKDRERDKDEK 131
              D+DR+R    +
Sbjct: 71  DRDDRDRDRYDRSR 84



 Score = 29.0 bits (65), Expect = 1.9
 Identities = 10/84 (11%), Positives = 32/84 (38%), Gaps = 1/84 (1%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
            S  ++D ++  R  E++ ++  +   +       + K     + + R   + +    + 
Sbjct: 15  ESEEEEDDEEIRRKAERDVDRGRRSPRRRTRRRSRRRKRSRKRRRRRRDRDRARYRDRDD 74

Query: 104 EKKKEKKDKKEKSHKHKDKDRERD 127
             +      + +S + + +DR R 
Sbjct: 75  RDRDRYDRSRSRS-RSRSRDRRRR 97


>gnl|CDD|221643 pfam12572, DUF3752, Protein of unknown function (DUF3752).  This
           domain family is found in eukaryotes, and is typically
           between 140 and 163 amino acids in length.
          Length = 148

 Score = 31.2 bits (71), Expect = 0.63
 Identities = 28/123 (22%), Positives = 50/123 (40%), Gaps = 6/123 (4%)

Query: 6   KSSSSSSSAHPSPHKN-KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
           + S  +S   P+  +N K    +       SS T  P       K K+ +D     E   
Sbjct: 3   ERSDLNSRVDPTKLRNRKFSTGTKSARGDDSSWTETPEE-----KAKRLQDEVLGVEAGA 57

Query: 65  KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
                  +  S ++KE  +   +  E+K  K      +K++KK+KK+++      +  DR
Sbjct: 58  SAPAAASAKASKRDKEMARKVKEYNEKKRGKSLVEQHQKKQKKKKKEEENDDPSRRPFDR 117

Query: 125 ERD 127
           E+D
Sbjct: 118 EKD 120


>gnl|CDD|218215 pfam04696, Pinin_SDK_memA, pinin/SDK/memA/ protein conserved
           region.  Members of this family have very varied
           localisations within the eukaryotic cell. pinin is known
           to localise at the desmosomes and is implicated in
           anchoring intermediate filaments to the desmosomal
           plaque. SDK2/3 is a dynamically localised nuclear
           protein thought to be involved in modulation of
           alternative pre-mRNA splicing. memA is a tumour marker
           preferentially expressed in human melanoma cell lines. A
           common feature of the members of this family is that
           they may all participate in regulating protein-protein
           interactions.
          Length = 131

 Score = 30.9 bits (70), Expect = 0.64
 Identities = 21/75 (28%), Positives = 40/75 (53%), Gaps = 2/75 (2%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
               +E+ +++SKEK R E + K    E+EK++ ++ +KEK    +++ R++ +  K EQ
Sbjct: 21  QKFSQEESRLTSKEKRRAEIEQKLE--EQEKQEREELRKEKRELFEERRRKQLELRKLEQ 78

Query: 135 KESKSSSKIVSSSHN 149
           K      +     HN
Sbjct: 79  KMEDEKLQETWHEHN 93


>gnl|CDD|217286 pfam02919, Topoisom_I_N, Eukaryotic DNA topoisomerase I, DNA
           binding fragment.  Topoisomerase I promotes the
           relaxation of DNA superhelical tension by introducing a
           transient single-stranded break in duplex DNA and is
           vital for the processes of replication, transcription,
           and recombination. This family may be more than one
           structural domain.
          Length = 215

 Score = 31.8 bits (73), Expect = 0.64
 Identities = 14/32 (43%), Positives = 21/32 (65%), Gaps = 3/32 (9%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
            EKE+K++  KE   EK+  KE+KDK E+ + 
Sbjct: 97  AEKEKKKAMSKE---EKKAIKEEKDKLEEPYG 125


>gnl|CDD|217450 pfam03247, Prothymosin, Prothymosin/parathymosin family.
           Prothymosin alpha and parathymosin are two ubiquitous
           small acidic nuclear proteins that are thought to be
           involved in cell cycle progression, proliferation, and
           cell differentiation.
          Length = 106

 Score = 30.7 bits (69), Expect = 0.67
 Identities = 19/96 (19%), Positives = 47/96 (48%), Gaps = 1/96 (1%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKD-KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           ++++   KD    KE  +EK++ K    +   ++E    +   + +E +E    +   E 
Sbjct: 7   AAAELSAKDLKEKKEVVEEKENGKNAPANGNENEENGAQEGDDEMEEEEEVDEDDEEEEG 66

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS 139
           E ++E+ +++E++     K    D+++  E K+ K+
Sbjct: 67  EGEEEEGEEEEETEGATGKRAAEDEEDDAETKKQKT 102



 Score = 29.5 bits (66), Expect = 1.6
 Identities = 18/85 (21%), Positives = 37/85 (43%), Gaps = 1/85 (1%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
            D   D   E   KD ++ K  V  KE  K+   +   E +E+  +E   E E+++E  +
Sbjct: 1   SDTKVDAAAELSAKDLKEKKEVVEEKENGKNA-PANGNENEENGAQEGDDEMEEEEEVDE 59

Query: 112 KKEKSHKHKDKDRERDKDEKKEQKE 136
             E+     +++   +++E +    
Sbjct: 60  DDEEEEGEGEEEEGEEEEETEGATG 84


>gnl|CDD|220369 pfam09731, Mitofilin, Mitochondrial inner membrane protein.
           Mitofilin controls mitochondrial cristae morphology.
           Mitofilin is enriched in the narrow space between the
           inner boundary and the outer membranes, where it forms a
           homotypic interaction and assembles into a large
           multimeric protein complex. The first 78 amino acids
           contain a typical amino-terminal-cleavable mitochondrial
           presequence rich in positively charged and hydroxylated
           residues and a membrane anchor domain. In addition, it
           has three centrally located coiled-coil domains.
          Length = 493

 Score = 32.7 bits (75), Expect = 0.67
 Identities = 28/153 (18%), Positives = 60/153 (39%), Gaps = 20/153 (13%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEK 60
               ++ ++S +A  +  K+  +   A+            T S    ++      D  + 
Sbjct: 94  VAEAEAKATSVAAEATTPKSIQELVEALEELLEELLKE--TASDPVVQELVSIFNDLIDS 151

Query: 61  EKEKKDKEKDKSAVSS---------------KEKEKDKVSSKEKERKESKPKESSSEKEK 105
            KE   K+  +S ++S               K +E++++    KE++E     S  E+E 
Sbjct: 152 IKEDNLKDDLESLIASAKEELDQLSKKLAELKAEEEEELERALKEKREE--LLSKLEEEL 209

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
               + K+    K    + ER+K+E +++ E K
Sbjct: 210 LARLESKEAALEKQLRLEFEREKEELRKKYEEK 242


>gnl|CDD|224013 COG1088, RfbB, dTDP-D-glucose 4,6-dehydratase [Cell envelope
           biogenesis, outer membrane].
          Length = 340

 Score = 32.2 bits (74), Expect = 0.69
 Identities = 45/177 (25%), Positives = 63/177 (35%), Gaps = 54/177 (30%)

Query: 282 ISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQD--AFT-LNIQATRELLDL 338
           + GDI    L     D+ F ++    ++H AA    D  I     F   N+  T  LL+ 
Sbjct: 56  VQGDICDREL----VDRLFKEYQPDAVVHFAAESHVDRSIDGPAPFIQTNVVGTYTLLEA 111

Query: 339 ATRCSQLKAILHVSTLYTHSYREDIQEEFYPPLFSYEDLAHVMQTTNQEELEILSSMLFG 398
           A +        H+ST           +E Y  L   +D     +TT              
Sbjct: 112 ARKYWGKFRFHHIST-----------DEVYGDLGLDDDA--FTETTP------------- 145

Query: 399 GIYNNS--YSFTKAIGESVVEKYL--YKLPLAMVRPSIVVSTWKEPIVGWSNNLYGP 451
             YN S  YS +KA  + +V  Y+  Y LP  + R               SNN YGP
Sbjct: 146 --YNPSSPYSASKAASDLLVRAYVRTYGLPATITRC--------------SNN-YGP 185


>gnl|CDD|233223 TIGR00990, 3a0801s09, mitochondrial precursor proteins import
           receptor (72 kDa mitochondrial outer membrane protein)
           (mitochondrial import receptor for the ADP/ATP carrier)
           (translocase of outer membrane tom70).  [Transport and
           binding proteins, Amino acids, peptides and amines].
          Length = 615

 Score = 32.7 bits (74), Expect = 0.71
 Identities = 21/81 (25%), Positives = 42/81 (51%), Gaps = 1/81 (1%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S  K  KK++ + K+ EKE + K ++K + + K    +      +  + S    S  E++
Sbjct: 65  SKPKISKKERRKRKQAEKETEGKTEEKKSTAPKNAPVEPADELPEIDESSVANLSEEERK 124

Query: 105 KKKEK-KDKKEKSHKHKDKDR 124
           K   K K+K  K++++KD ++
Sbjct: 125 KYAAKLKEKGNKAYRNKDFNK 145



 Score = 30.7 bits (69), Expect = 2.2
 Identities = 17/85 (20%), Positives = 34/85 (40%)

Query: 118 KHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVK 177
           K + +   + K  KKE+++ K + K        K+  +       P    P   + S   
Sbjct: 58  KGQQQRESKPKISKKERRKRKQAEKETEGKTEEKKSTAPKNAPVEPADELPEIDESSVAN 117

Query: 178 TKEKEKEKESSTTHDKHSKHKHKKK 202
             E+E++K ++   +K +K    K 
Sbjct: 118 LSEEERKKYAAKLKEKGNKAYRNKD 142


>gnl|CDD|235322 PRK04950, PRK04950, ProP expression regulator; Provisional.
          Length = 213

 Score = 31.8 bits (73), Expect = 0.72
 Identities = 13/62 (20%), Positives = 27/62 (43%)

Query: 62  KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKD 121
           +E K K + + A    +K +     ++  R+E KPK  +  K++K   +  + +     D
Sbjct: 104 EEAKAKVQAQRAEQQAKKREAAGEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSD 163

Query: 122 KD 123
             
Sbjct: 164 IS 165



 Score = 28.7 bits (65), Expect = 6.2
 Identities = 10/59 (16%), Positives = 27/59 (45%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           K K + ++ +++ K   ++ EKEK     ++ + K  + K     ++ + +     + S
Sbjct: 107 KAKVQAQRAEQQAKKREAAGEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSDIS 165



 Score = 28.7 bits (65), Expect = 7.0
 Identities = 16/72 (22%), Positives = 33/72 (45%), Gaps = 3/72 (4%)

Query: 81  KDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER-DKDEKKEQKESKS 139
           K KV ++  E++  K + +   +++K  ++++K K    + K + R  K E +    S  
Sbjct: 107 KAKVQAQRAEQQAKKREAA--GEKEKAPRRERKPKPKAPRKKRKPRAQKPEPQHTPVSDI 164

Query: 140 SSKIVSSSHNSK 151
           S   V  +   K
Sbjct: 165 SELTVGQAVKVK 176


>gnl|CDD|227371 COG5038, COG5038, Ca2+-dependent lipid-binding protein, contains C2
           domain [General function prediction only].
          Length = 1227

 Score = 32.8 bits (75), Expect = 0.72
 Identities = 17/139 (12%), Positives = 40/139 (28%), Gaps = 27/139 (19%)

Query: 11  SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKD 70
           S+              S+   +      + P    S  +  + K+ ++E + E +     
Sbjct: 2   STKQQHYR--------SSDNYSGNRPIPTIPKFFRSRGQRAEKKEEEQEMQPEDEKLFAP 53

Query: 71  KSAVSS-------------------KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
            +  +                      K+  + SS EK   ++     S  +E  K+   
Sbjct: 54  IAQRTVQIADVNFQGAKGIDDLSFTVPKQSIESSSPEKSDVDTSNTRPSVSRELHKDDYV 113

Query: 112 KKEKSHKHKDKDRERDKDE 130
             ++    + K     + E
Sbjct: 114 GPDQDGGWQRKVELSSEQE 132



 Score = 30.5 bits (69), Expect = 3.0
 Identities = 15/88 (17%), Positives = 33/88 (37%), Gaps = 9/88 (10%)

Query: 153 PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD----- 207
             SG++ I   P    +  Q++  K +E+E + E        ++   +  D +       
Sbjct: 13  NYSGNRPIPTIPKFFRSRGQRAEKKEEEQEMQPEDEKLFAPIAQRTVQIADVNFQGAKGI 72

Query: 208 ----KTNPKEKDAKSKEKESHKSSAGPK 231
                T PK+    S  ++S   ++  +
Sbjct: 73  DDLSFTVPKQSIESSSPEKSDVDTSNTR 100


>gnl|CDD|225381 COG2825, HlpA, Outer membrane protein [Cell envelope biogenesis,
           outer membrane].
          Length = 170

 Score = 31.2 bits (71), Expect = 0.74
 Identities = 23/95 (24%), Positives = 39/95 (41%), Gaps = 6/95 (6%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
              K    D E E +K+ KE  K     K KE       + +        S   K + + 
Sbjct: 40  PQAKKVSADLESEFKKRQKELQKMQKELKAKE------AKLQDDGKMEALSDRAKAEAEI 93

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
           KK+K   +   K ++ E+D + ++ ++E K   KI
Sbjct: 94  KKEKLVNAFNKKQQEYEKDLNRREAEEEQKLLEKI 128


>gnl|CDD|227468 COG5139, COG5139, Uncharacterized conserved protein [Function
           unknown].
          Length = 397

 Score = 32.4 bits (73), Expect = 0.80
 Identities = 36/181 (19%), Positives = 70/181 (38%), Gaps = 13/181 (7%)

Query: 99  SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE--KKEQKESKSSSKIVSSSHNSKEPASG 156
           S++++E+ K  +   E       K     ++E  K+ Q      +    +S+++K+P +G
Sbjct: 2   STADQEQPKVVEATPEDGTASSQKSTINAENENTKQNQSMEPQETSK-GTSNDTKDPDNG 60

Query: 157 SQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
            +             + S V+  E+ K K  ST     S  + +K D+      P  +  
Sbjct: 61  EK------NEEAAIDENSNVEAAER-KRKHISTDFSDMSLLRKRKNDQ---SLQPTREPM 110

Query: 217 KSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQADILQ 276
            S++     + A      + G   +   +      +E L EQ  DE+  RLK    D  +
Sbjct: 111 DSRDSGQDFTEAQSGELGDTGDRQLKAPAASRARRKEDLLEQTVDEISLRLKKRMQDAAK 170

Query: 277 R 277
           +
Sbjct: 171 K 171


>gnl|CDD|235582 PRK05729, valS, valyl-tRNA synthetase; Reviewed.
          Length = 874

 Score = 32.4 bits (75), Expect = 0.80
 Identities = 18/62 (29%), Positives = 28/62 (45%), Gaps = 8/62 (12%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK---EKERKESKPKESSSEKEKKKEKKDKK 113
           D E E  + +KE  K      EKE ++V  K   E    ++  +    E+EK  E ++K 
Sbjct: 808 DVEAELARLEKELAKL-----EKEIERVEKKLSNEGFVAKAPEEVVEKEREKLAEYEEKL 862

Query: 114 EK 115
            K
Sbjct: 863 AK 864


>gnl|CDD|149343 pfam08229, SHR3_chaperone, ER membrane protein SHR3.  Proteins
           in this family are membrane-localised chaperones that
           are required for correct plasma membrane localisation of
           amino acid permeases (AAPs). SHR3 prevents AAPs from
           aggregating and assists in their correct folding. In the
           absence of SHR3, AAPs are retained in the ER.
          Length = 196

 Score = 31.5 bits (72), Expect = 0.81
 Identities = 10/38 (26%), Positives = 21/38 (55%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
               K  E+   E+ ++A ++KE+E  +   KE ++K+
Sbjct: 159 WKDAKLLEEFAAEEAEAAAAAKEEESAEGEKKESKKKK 196



 Score = 30.4 bits (69), Expect = 2.1
 Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 10/49 (20%)

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           E K++K  E  + +E +     K+E+S +           EKKE K+ K
Sbjct: 158 EWKDAKLLEEFAAEEAEAAAAAKEEESAE----------GEKKESKKKK 196


>gnl|CDD|216269 pfam01056, Myc_N, Myc amino-terminal region.  The myc family
           belongs to the basic helix-loop-helix leucine zipper
           class of transcription factors, see pfam00010. Myc forms
           a heterodimer with Max, and this complex regulates cell
           growth through direct activation of genes involved in
           cell replication. Mutations in the C-terminal 20
           residues of this domain cause unique changes in the
           induction of apoptosis, transformation, and G2 arrest.
          Length = 329

 Score = 32.2 bits (73), Expect = 0.81
 Identities = 26/91 (28%), Positives = 42/91 (46%), Gaps = 13/91 (14%)

Query: 12  SSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
           S   P P   + K S     T      + P +SSSS  D + ++ ++E+E+E++++E D 
Sbjct: 187 SVVFPYPLNERSKSSKVASPTPRLGLRTPPNSSSSSGSDSESEEDEEEEEEEEEEEEID- 245

Query: 72  SAVSSKEKEKDKVSSKEKERKESKPKESSSE 102
                       V + EK R  S  K S+SE
Sbjct: 246 ------------VVTVEKRRSSSNRKASTSE 264


>gnl|CDD|150392 pfam09710, Trep_dent_lipo, Treponema clustered lipoprotein
           (Trep_dent_lipo).  This entry represents a family of six
           predicted lipoproteins from a region of about 20
           tandemly arranged genes in the Treponema denticola
           genome. Two other neighboring genes share the
           lipoprotein signal peptide region but do not show more
           extensive homology. The function of this locus is
           unknown.
          Length = 394

 Score = 32.2 bits (73), Expect = 0.82
 Identities = 22/93 (23%), Positives = 41/93 (44%), Gaps = 5/93 (5%)

Query: 46  SSKKDKKDKDRDKEKEKEKKDKEKDK-SAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S  K+ K++ R+   + E   K + K +   SK +  + V + E + KE + K++  E  
Sbjct: 17  SCSKEVKEQ-REMRIKVESSMKIEPKENEFLSKPEYDEHVKTPE-QIKELEEKKAYEESL 74

Query: 105 KKKEKK-DKKEKSHKHKDKDRER-DKDEKKEQK 135
           K+ + + DK +       K       D   +QK
Sbjct: 75  KQLQFELDKYDLVLIQAYKTPTNIGIDNLAQQK 107


>gnl|CDD|173965 cd08045, TAF4, TATA Binding Protein (TBP) Associated Factor 4
           (TAF4) is one of several TAFs that bind TBP and is
           involved in forming Transcription Factor IID (TFIID)
           complex.  The TATA Binding Protein (TBP) Associated
           Factor 4 (TAF4) is one of several TAFs that bind TBP and
           are involved in forming the Transcription Factor IID
           (TFIID) complex. TFIID is one of six General
           Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE,
           TFIIF, and TFIIH) that are involved in accurate
           initiation of transcription by RNA polymerase II in
           eukaryotes. TFIID plays an important role in the
           recognition of promoter DNA and assembly of the
           pre-initiation complex. The TFIID complex is composed of
           TBP and at least 13 TAFs. TAFs from various species were
           originally named by their predicted molecular weight or
           their electrophoretic mobility in polyacrylamide gels. A
           new, unified nomenclature for the pol II TAFs has been
           suggested to show the relationship between TAF orthologs
           and paralogs. Several functions have been proposed for
           TAFs, such as serving as activator-binding sites,
           core-promoter recognition, or a role in essential
           catalytic activity. Each TAF, with the help of a
           specific activator, is required only for the expression
           of a subset of genes and is not universally involved in
           transcription, as the GTFs are. In yeast and human cells,
           TAFs have been found as components of other complexes
           besides TFIID. Several TAFs interact via histone-fold
           (HFD) motifs; the HFD is the interaction motif involved in
           heterodimerization of the core histones and their
           assembly into nucleosome octamers. The minimal HFD
           contains three alpha-helices linked by two loops and is
           found in core histones, TAFs, and many other
           transcription factors. TFIID has a histone octamer-like
           substructure. The TAF4 domain interacts with TAF12 to form
           a novel histone-like heterodimer that binds DNA and has
           a core promoter function for a subset of genes.
          Length = 212

 Score = 31.5 bits (72), Expect = 0.83
 Identities = 9/45 (20%), Positives = 27/45 (60%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
           + +++++++  E+E+E+  +     +  S+ K+K K   KE++ +
Sbjct: 121 QLEREEEEKRDEEERERLLRAAKSRSEQSRLKQKAKEMQKEEDEE 165


>gnl|CDD|217476 pfam03286, Pox_Ag35, Pox virus Ag35 surface protein. 
          Length = 198

 Score = 31.7 bits (72), Expect = 0.83
 Identities = 21/78 (26%), Positives = 37/78 (47%), Gaps = 3/78 (3%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
           S K+ +K + ++  K     K K+   EK     +++KK +S   K ++ E D D  +E 
Sbjct: 47  SPKQPKKKRPTTPRKPATTKKSKKKDKEKL---TEEEKKPESDDDKTEENENDPDNNEES 103

Query: 135 KESKSSSKIVSSSHNSKE 152
            +S+ S+   S S    E
Sbjct: 104 GDSQESASANSLSDIDNE 121



 Score = 29.0 bits (65), Expect = 5.0
 Identities = 16/65 (24%), Positives = 27/65 (41%), Gaps = 5/65 (7%)

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
           P    PT  +K     K K+K+KE  T  +     K  + D    + N  + D   +  +
Sbjct: 51  PKKKRPTTPRKPATTKKSKKKDKEKLTEEE-----KKPESDDDKTEENENDPDNNEESGD 105

Query: 223 SHKSS 227
           S +S+
Sbjct: 106 SQESA 110



 Score = 29.0 bits (65), Expect = 5.3
 Identities = 14/65 (21%), Positives = 25/65 (38%), Gaps = 4/65 (6%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
               KK +    R     K+ K K+K+K     K+ E D     +K  +     +++ E 
Sbjct: 48  PKQPKKKRPTTPRKPATTKKSKKKDKEKLTEEEKKPESD----DDKTEENENDPDNNEES 103

Query: 104 EKKKE 108
              +E
Sbjct: 104 GDSQE 108


>gnl|CDD|219621 pfam07890, Rrp15p, Rrp15p.  Rrp15p is required for the formation of
           60S ribosomal subunits.
          Length = 132

 Score = 30.8 bits (70), Expect = 0.83
 Identities = 16/64 (25%), Positives = 27/64 (42%)

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           K    K K  + S+ K+  K   K K  K  K  +     EK++  +  + K    +D +
Sbjct: 13  KLPASKRKDPILSRSKKLLKAKKKLKSEKLEKKAKRQLRAEKRQALEKGRVKPVLPEDLE 72

Query: 124 RERD 127
           +ER 
Sbjct: 73  KERR 76



 Score = 28.1 bits (63), Expect = 7.0
 Identities = 18/73 (24%), Positives = 31/73 (42%), Gaps = 10/73 (13%)

Query: 27 SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK--- 83
          + I ++   +S       S SKK  K K   K+ + EK +K+  +   + K +  +K   
Sbjct: 7  AKILASKLPASKRKDPILSRSKKLLKAK---KKLKSEKLEKKAKRQLRAEKRQALEKGRV 63

Query: 84 ----VSSKEKERK 92
                  EKER+
Sbjct: 64 KPVLPEDLEKERR 76


>gnl|CDD|240254 PTZ00069, PTZ00069, 60S ribosomal protein L5; Provisional.
          Length = 300

 Score = 32.0 bits (73), Expect = 0.87
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 6/90 (6%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE------SKPKESSSEKEKKKEKKDKK 113
           K+ +++D +K K   S   K      S E   K+      + P +   +K+KKK+   KK
Sbjct: 210 KQLKEEDPDKYKKQFSKYIKAGVGPDSLEDMYKKAHAAIRANPSKVKKKKKKKKKVVHKK 269

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
            K+ K   K R+     KK Q+  +   KI
Sbjct: 270 YKTKKLTGKQRKARVKAKKAQRRERLQKKI 299


>gnl|CDD|215579 PLN03106, TCP2, Protein TCP2; Provisional.
          Length = 447

 Score = 32.0 bits (72), Expect = 0.92
 Identities = 17/58 (29%), Positives = 35/58 (60%), Gaps = 1/58 (1%)

Query: 15  HPSPHKNKDKDSSAIPST-STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
           + +  +N+ + +S   S  S++S TS  +  S S+ + +DK R++ +E+  K+KEK+ 
Sbjct: 169 NQTLTQNQAQHNSLSKSACSSTSDTSKGSGLSLSRSELRDKARERARERTAKEKEKED 226


>gnl|CDD|221185 pfam11719, Drc1-Sld2, DNA replication and checkpoint protein.
           Genome duplication is precisely regulated by
           cyclin-dependent kinases (CDKs), which bring about the
           onset of S phase by activating replication origins and
           then prevent relicensing of origins until mitosis is
           completed. The optimum sequence motif for CDK
           phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found
           to have at least 11 potential phosphorylation sites.
           Drc1 is required for DNA synthesis and S-M replication
           checkpoint control. Drc1 associates with Cdc2 and is
           phosphorylated at the onset of S phase when Cdc2 is
           activated. Thus Cdc2 promotes DNA replication by
           phosphorylating Drc1 and regulating its association with
           Cut5. Sld2 and Sld3 represent the minimal set of S-CDK
           substrates required for DNA replication.
          Length = 397

 Score = 32.1 bits (73), Expect = 0.95
 Identities = 12/70 (17%), Positives = 28/70 (40%), Gaps = 4/70 (5%)

Query: 38  TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
            + P++  S  +    ++  K  EK   +             E D+    E+ ++E + K
Sbjct: 302 RAKPSDEPSLPESDIHEEIPKLDEKSLSEFLGY----MGGIDEDDEDEDDEESKEEVEKK 357

Query: 98  ESSSEKEKKK 107
           +   +K +K+
Sbjct: 358 QKVKKKPRKR 367



 Score = 29.4 bits (66), Expect = 5.3
 Identities = 18/113 (15%), Positives = 41/113 (36%), Gaps = 13/113 (11%)

Query: 37  STSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
           +   P      K  K+   R K +    K  ++     S   +E  K+   EK   E   
Sbjct: 276 ANDEPRRVFKKKGQKRTTRRVKMRPVRAKPSDEPSLPESDIHEEIPKL--DEKSLSEFLG 333

Query: 97  KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
                +++ + E  ++ ++           + ++K++ K+     K+   S+N
Sbjct: 334 YMGGIDEDDEDEDDEESKE-----------EVEKKQKVKKKPRKRKVNPVSNN 375


>gnl|CDD|222571 pfam14153, Spore_coat_CotO, Spore coat protein CotO.  Bacillus
           spores are protected by a protein shell consisting of
           over 50 different polypeptides, known as the coat. This
           family of proteins has an important morphogenetic role
           in coat assembly; it is involved in the assembly of at
           least 5 different coat proteins including CotB, CotG,
           CotS, CotSA and CotW. It is likely to act at a late
           stage of coat assembly.
          Length = 185

 Score = 31.3 bits (71), Expect = 0.96
 Identities = 19/80 (23%), Positives = 37/80 (46%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
             K  +KE+EKE  D+         K + ++    KE    E +      EKE+  ++++
Sbjct: 33  IKKADEKEEEKENSDEHVKSKEEEQKIEYEEAEKEKEAGEPEREDIAEQQEKEEIAQEEE 92

Query: 112 KKEKSHKHKDKDRERDKDEK 131
           K+E++   K ++    K +K
Sbjct: 93  KEEEAEDVKQQEVFSFKRKK 112


>gnl|CDD|237744 PRK14521, rpsP, 30S ribosomal protein S16; Provisional.
          Length = 186

 Score = 31.3 bits (71), Expect = 1.0
 Identities = 17/78 (21%), Positives = 32/78 (41%), Gaps = 6/78 (7%)

Query: 39  SNPTNSSSSKKDKKDKDRDKEKEK----EKKDKEKDKSAVSSKEK--EKDKVSSKEKERK 92
                  ++KKDK  K +   K+     EKK  E    AV+ K+        + +    +
Sbjct: 108 EEKEGKVNAKKDKLSKAKKAAKKAALEAEKKVNEARAEAVAEKKAAEAAAVAAEEAAAAE 167

Query: 93  ESKPKESSSEKEKKKEKK 110
           E + +E+ +E+   +E  
Sbjct: 168 EEEAEEAPAEEAPAEESA 185


>gnl|CDD|219901 pfam08555, DUF1754, Eukaryotic family of unknown function
           (DUF1754).  This is a eukaryotic protein family of
           unknown function.
          Length = 90

 Score = 29.7 bits (67), Expect = 1.1
 Identities = 19/71 (26%), Positives = 36/71 (50%), Gaps = 1/71 (1%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKE-RKESKPKESSSEKEKKK 107
           K K  K   K+K+K+KK K K K  V ++++E++K S++      E        E+E+  
Sbjct: 11  KLKGKKIDVKKKKKKKKKKNKSKEEVVTEKEEEEKSSAESDLKEGEEDEDNEKIEQEEDG 70

Query: 108 EKKDKKEKSHK 118
               + E++ +
Sbjct: 71  MNLTEAERAFE 81


>gnl|CDD|220223 pfam09405, Btz, CASC3/Barentsz eIF4AIII binding.  This domain is
          found on CASC3 (cancer susceptibility candidate gene 3
          protein) which is also known as Barentsz (Btz). CASC3
          is a component of the EJC (exon junction complex) which
          is a complex that is involved in post-transcriptional
          regulation of mRNA in metazoa. The complex is formed by
          the association of four proteins (eIF4AIII, Barentsz,
          Mago, and Y14), mRNA, and ATP. This domain wraps around
          eIF4AIII and stacks against the 5' nucleotide.
          Length = 116

 Score = 30.1 bits (68), Expect = 1.1
 Identities = 15/40 (37%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 32 TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDK 71
             S   S  T S+   + K+DK+R K +E EK D E D+
Sbjct: 1  KVESERQSGRTPSAEPTEPKEDKER-KRREHEKYDDEDDE 39


>gnl|CDD|147051 pfam04698, MOBP_C-Myrip, Myelin-associated oligodendrocytic basic
           protein (MOBP).  MOBP is abundantly expressed in central
           nervous system myelin, and shares several
           characteristics with myelin basic protein (MBP), in
           terms of regional distribution and function. This family
           represents the middle and C-terminal regions of MOBP, which
           has been shown to be essential for normal arrangement of the
           radial component in central nervous system myelin. Most
           member-proteins carry a FVHE-PHD type zinc-finger at
           their N-terminus.
          Length = 710

 Score = 31.7 bits (71), Expect = 1.2
 Identities = 23/100 (23%), Positives = 41/100 (41%), Gaps = 6/100 (6%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDK-KDKDRDKEKEKEKKDKE 68
            S  AH      +  D++ +    + +  SN      S  D  ++K R++  E   K  E
Sbjct: 393 PSPGAHL-----RALDTAQVSDDLSETDISNEAQDPQSLTDSTEEKLRNRLYELAMKMSE 447

Query: 69  KDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           K+ S+   +E E       +KE   S+      ++E KK+
Sbjct: 448 KETSSGEDQESEPKAEPENQKESLSSEDNNQGVQEELKKK 487


>gnl|CDD|227499 COG5171, YRB1, Ran GTPase-activating protein (Ran-binding protein)
           [Intracellular trafficking and secretion].
          Length = 211

 Score = 31.1 bits (70), Expect = 1.2
 Identities = 12/89 (13%), Positives = 34/89 (38%), Gaps = 1/89 (1%)

Query: 55  DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
           +R K++ K +K++ + K        + D     +   +E K ++S   +    E  + K 
Sbjct: 4   ERKKKQAKIEKEENEQKERSLDVVSKGDAFGDGKAGGEEKKVQQSPFLENAVPEGDEGKG 63

Query: 115 KSHKHKDKDRERDKDEKKEQKESKSSSKI 143
               +   +    + ++   K ++    +
Sbjct: 64  PESPNIHFE-PVVELQRVHLKTNEEDETV 91


>gnl|CDD|222977 PHA03089, PHA03089, late transcription factor VLTF-4; Provisional.
          Length = 191

 Score = 30.9 bits (70), Expect = 1.2
 Identities = 21/101 (20%), Positives = 42/101 (41%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           S++   DK + D   E  +    K   K   + ++K+  K + K+K+ KE  P+ ++ E 
Sbjct: 26  STTESVDKVNDDIFPEDVEIPSKKTSKKKKTTPRKKKTTKKTKKKKKEKEEVPELAAEEL 85

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
              +E ++  +K      K +    +   E     S  K+ 
Sbjct: 86  SDSEENEENDKKVDYELPKVQNTAAEVNHEDVIDLSDLKLA 126



 Score = 30.5 bits (69), Expect = 1.6
 Identities = 16/74 (21%), Positives = 29/74 (39%), Gaps = 2/74 (2%)

Query: 29  IPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKE 88
           IPS  TS           + K  K K ++KE+  E   +E   S     E+   KV  + 
Sbjct: 45  IPSKKTSKKKKTTPRKKKTTKKTKKKKKEKEEVPELAAEELSDS--EENEENDKKVDYEL 102

Query: 89  KERKESKPKESSSE 102
            + + +  + +  +
Sbjct: 103 PKVQNTAAEVNHED 116


>gnl|CDD|147580 pfam05474, Semenogelin, Semenogelin.  This family consists of
           several mammalian semenogelin (I and II) proteins.
           Freshly ejaculated human semen has the appearance of a
           loose gel in which the predominant structural protein
           components are the seminal vesicle secreted semenogelins
           (Sg).
          Length = 450

 Score = 31.6 bits (71), Expect = 1.2
 Identities = 39/251 (15%), Positives = 90/251 (35%), Gaps = 30/251 (11%)

Query: 38  TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
           T NP+    +    K K       KE+              KE+D VS  ++ R +   +
Sbjct: 139 TQNPSQDRGNSTSGKGKSSQDSNTKERL-------LARGLGKEQDSVSGAQRNRTQGGSQ 191

Query: 98  ESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS-----SHNSKE 152
            S   + +      ++E  +  ++K    +  E K++  SK  + +  +      H SK+
Sbjct: 192 SSYVLQTEDLVANKQQETQNSLQNKGSYPNVYEVKQKHSSKVQTSLHPAHQHRLQHGSKD 251

Query: 153 PASGSQ-------------LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKH 199
             + +Q               +H      + T++  +   E   +K+ S         + 
Sbjct: 252 IFTKNQHQTKNLNQDQEHGQKAHKISYQSSSTEERRLNHGENGVQKDVSKGSISRQTEEK 311

Query: 200 KKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQF 259
                    T P ++  +   K S++SS+  +     G   I     +    +   +++ 
Sbjct: 312 IMGKSQKQVTTPSQEQGQKANKISYQSSSTEERRLNHGEKGI-----QKDVSKGSTSKKT 366

Query: 260 KDELFDRLKNE 270
           ++++ D+ +N+
Sbjct: 367 EEKIHDKSQNQ 377


>gnl|CDD|219355 pfam07267, Nucleo_P87, Nucleopolyhedrovirus capsid protein P87.
           This family consists of several Nucleopolyhedrovirus
           capsid protein P87 sequences. P87 is expressed late in
           infection and concentrated in infected cell nuclei.
          Length = 606

 Score = 31.8 bits (72), Expect = 1.2
 Identities = 22/117 (18%), Positives = 37/117 (31%), Gaps = 12/117 (10%)

Query: 16  PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDR---DKEKEKEKKDKEKDKS 72
           P   + K   +        SS    P        ++  ++R   D        D + + S
Sbjct: 291 PMTEEIKSWQTPLQTPAMYSSDYQAPKPEPIYTWEELLRERFPSDLFAISSLPDSDSEAS 350

Query: 73  AVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
                 K K +      E    + ++ S E E   EK+ K+          RE DK+
Sbjct: 351 DSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRR---------REEDKN 398



 Score = 29.9 bits (67), Expect = 5.3
 Identities = 18/103 (17%), Positives = 40/103 (38%), Gaps = 9/103 (8%)

Query: 36  SSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK---DKSAVSSKEKEKDKVSS---KEK 89
           S  +     +    D +    +     E+  +E+   D  A+SS      + S      K
Sbjct: 298 SWQTPLQTPAMYSSDYQAPKPEPIYTWEELLRERFPSDLFAISSLPDSDSEASDSGPTRK 357

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
            ++   P       ++ ++  D+ E  +   +K+R+R ++E K
Sbjct: 358 RKRRRVPPLPEYSSDEDEDDSDEDEVDY---EKERKRRREEDK 397



 Score = 29.1 bits (65), Expect = 7.6
 Identities = 25/86 (29%), Positives = 38/86 (44%), Gaps = 3/86 (3%)

Query: 10  SSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEKEKE-KKDK 67
           S   A  S   +  + S + P+         P    SS +D+ D D D  + EKE K+ +
Sbjct: 334 SDLFAISSLPDSDSEASDSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRRR 393

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKE 93
           E+DK+ +  K  E  K +    ER E
Sbjct: 394 EEDKNFLRLKALELSKYAGV-NERME 418



 Score = 29.1 bits (65), Expect = 7.7
 Identities = 18/80 (22%), Positives = 34/80 (42%), Gaps = 1/80 (1%)

Query: 27  SAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVS 85
           S++P + + +S S PT     ++     +    E E +  + E D      + +E+DK  
Sbjct: 340 SSLPDSDSEASDSGPTRKRKRRRVPPLPEYSSDEDEDDSDEDEVDYEKERKRRREEDKNF 399

Query: 86  SKEKERKESKPKESSSEKEK 105
            + K  + SK    +   EK
Sbjct: 400 LRLKALELSKYAGVNERMEK 419


>gnl|CDD|235971 PRK07219, PRK07219, DNA topoisomerase I; Validated.
          Length = 822

 Score = 31.9 bits (73), Expect = 1.2
 Identities = 13/79 (16%), Positives = 26/79 (32%), Gaps = 8/79 (10%)

Query: 49  KDKKDKDRD-----KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
           K     +       K    EK+ KEK+     S+ +    V +K  E+ +    E+ ++ 
Sbjct: 747 KGGFGDELGCCNNPKCNYTEKQKKEKES---KSELEALKGVGAKTAEKLKDAGVETVTDL 803

Query: 104 EKKKEKKDKKEKSHKHKDK 122
                     +      D+
Sbjct: 804 TAADPDAVAAKVDGVSADR 822



 Score = 30.0 bits (68), Expect = 4.4
 Identities = 11/66 (16%), Positives = 20/66 (30%), Gaps = 6/66 (9%)

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
              EK K   + K   E+     +   EK K+   +              D D    + +
Sbjct: 763 NYTEKQKKEKESKSELEALKGVGAKTAEKLKDAGVET------VTDLTAADPDAVAAKVD 816

Query: 137 SKSSSK 142
             S+ +
Sbjct: 817 GVSADR 822


>gnl|CDD|220710 pfam10351, Apt1, Golgi-body localisation protein domain.  This is
           the C-terminus of a family of proteins conserved from
           plants to humans. The plant members are localised to the
           Golgi and appear to regulate membrane
           trafficking, as they are required for rapid vesicle
           accumulation at the tip of the pollen tube. The
           C-terminus probably contains the Golgi localisation
           signal and it is well-conserved.
          Length = 451

 Score = 31.5 bits (72), Expect = 1.2
 Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 1/68 (1%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD-KEKEKEK 64
           + S  SSS+     ++     S+  S+S  SS+S  + SS   KDK+   +  K +++E 
Sbjct: 312 EDSDISSSSSSGSRRSSSTSRSSSSSSSLLSSSSILSKSSDKSKDKRFSLKLSKSEKEES 371

Query: 65  KDKEKDKS 72
            D E+  S
Sbjct: 372 DDLEEMIS 379



 Score = 30.7 bits (70), Expect = 2.6
 Identities = 18/94 (19%), Positives = 30/94 (31%), Gaps = 1/94 (1%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
            +D    +E     K +E    + SS    +   S+       S    SSS    K   K
Sbjct: 295 GRDSSLSEEDSDSSKREEDSDISSSSSSGSRRSSSTSRSSSSSSSLL-SSSSILSKSSDK 353

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
            K ++      K  + + D+ +E     S     
Sbjct: 354 SKDKRFSLKLSKSEKEESDDLEEMISRSSKYMSF 387


>gnl|CDD|116627 pfam08017, Fibrinogen_BP, Fibrinogen binding protein.  Proteins in
           this family bind to fibrinogen. Members of this family
           include the fibrinogen receptor, FbsA, which mediates
           platelet aggregation.
          Length = 393

 Score = 31.4 bits (70), Expect = 1.3
 Identities = 34/187 (18%), Positives = 83/187 (44%), Gaps = 13/187 (6%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK----ERKESKPKESSSEK 103
           ++D ++K +    E+ ++D E ++S  +  E+ +  V +K +    ER++   +  S   
Sbjct: 147 QRDAENKSQGNVLERRQRDAE-NRSQGNVLERRQRDVENKSQGNVLERRQRDVENKSQGN 205

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
             ++ ++D + +S  +  + R+RD        E+KS   ++       E  S   ++   
Sbjct: 206 VLERRQRDAENRSQGNVLERRQRDV-------ENKSQGNVLERRQRDVENKSQGNVLERR 258

Query: 164 PPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK-EKE 222
              A   +Q + ++ ++++ E +S     +  +   + K + G         +KS   +E
Sbjct: 259 QRDAENRSQGNVLERRQRDVENKSQGNVLERRQRDAENKSQVGQLIGKNPLLSKSIISRE 318

Query: 223 SHKSSAG 229
           ++ SS G
Sbjct: 319 NNHSSQG 325



 Score = 29.4 bits (65), Expect = 5.4
 Identities = 30/208 (14%), Positives = 88/208 (42%), Gaps = 13/208 (6%)

Query: 26  SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD--- 82
           SS + +  +  + S        ++D +++ +    E+ ++D E        + +++D   
Sbjct: 13  SSPVSAMDSVGNQSQGNVLERRQRDAENRSQGNVLERRQRDAENRSQGNVLERRQRDAEN 72

Query: 83  KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE--------- 133
           +      ER++   +  S     ++ ++D + KS  +  + R+RD + K +         
Sbjct: 73  RSQGNVLERRQRDAENRSQGNVLERRQRDVENKSQGNVLERRQRDVENKSQGNVLERRQR 132

Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
             E++S   ++       E  S   ++      A   +Q + ++ ++++ E +S     +
Sbjct: 133 DAENRSQGNVLERRQRDAENKSQGNVLERRQRDAENRSQGNVLERRQRDVENKSQGNVLE 192

Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
             +   + K + G+    +++DA+++ +
Sbjct: 193 RRQRDVENKSQ-GNVLERRQRDAENRSQ 219


>gnl|CDD|152107 pfam11671, Apis_Csd, Complementary sex determiner protein.  This
           family of proteins represents the complementary sex
           determiner in the honeybee. In the honeybee, the
           mechanism of sex determination depends on the csd gene
           which produces an SR-type protein. Males are hemizygous
           or homozygous while females are heterozygous for the csd
           gene. Heterozygosity generates an active protein which
           initiates female development.
          Length = 146

 Score = 30.5 bits (68), Expect = 1.3
 Identities = 14/50 (28%), Positives = 29/50 (58%)

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
           S E+E+K  K +   + ++   ++R RD+ E++  +E K  S + + S+N
Sbjct: 9   SREREQKSYKNENSYREYRETSRERSRDRTERERSREHKIISSLSNLSNN 58


>gnl|CDD|224212 COG1293, COG1293, Predicted RNA-binding protein homologous to
           eukaryotic snRNP [Transcription].
          Length = 564

 Score = 31.6 bits (72), Expect = 1.3
 Identities = 17/91 (18%), Positives = 33/91 (36%), Gaps = 1/91 (1%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
                  E+ K + +K K+   + ++   K   K K  K +  ++ S  KE     +  K
Sbjct: 342 LADFYGNEEIKIELDKSKTPSENAQRYF-KKYKKLKGAKVNLDRQLSELKEAIAYYESAK 400

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
               K + K    +  E+  ++    S K  
Sbjct: 401 TALEKAEGKKAIEEIREELIEEGLLKSKKKK 431



 Score = 30.1 bits (68), Expect = 3.8
 Identities = 22/77 (28%), Positives = 34/77 (44%), Gaps = 5/77 (6%)

Query: 48  KKDKKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           K  K + DR   + KE     E  K+A+   E +K  +    +E  E    +S   K+KK
Sbjct: 376 KGAKVNLDRQLSELKEAIAYYESAKTALEKAEGKKA-IEEIREELIEEGLLKS---KKKK 431

Query: 107 KEKKDKKEKSHKHKDKD 123
           ++KK+  EK       D
Sbjct: 432 RKKKEWFEKFRWFVSSD 448



 Score = 28.9 bits (65), Expect = 9.2
 Identities = 17/75 (22%), Positives = 27/75 (36%), Gaps = 11/75 (14%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKV----SSKEKERKESKPK-------ESSSEKEK 105
            +  +K KK K    +      + K+ +    S+K    K    K       E   E   
Sbjct: 366 QRYFKKYKKLKGAKVNLDRQLSELKEAIAYYESAKTALEKAEGKKAIEEIREELIEEGLL 425

Query: 106 KKEKKDKKEKSHKHK 120
           K +KK +K+K    K
Sbjct: 426 KSKKKKRKKKEWFEK 440


>gnl|CDD|179712 PRK04019, rplP0, acidic ribosomal protein P0; Validated.
          Length = 330

 Score = 31.4 bits (72), Expect = 1.3
 Identities = 7/43 (16%), Positives = 24/43 (55%)

Query: 72  SAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
           +A++ K+   +++      + ++   E   E+E+++E+++  E
Sbjct: 276 AALADKDALDEELKEVLSAQAQAAAAEEEEEEEEEEEEEEPSE 318


>gnl|CDD|218738 pfam05766, NinG, Bacteriophage Lambda NinG protein.  NinG or Rap is
           involved in recombination. Rap (recombination adept with
           plasmid) increases lambda-by-plasmid recombination
           catalyzed by Escherichia coli's RecBCD pathway.
          Length = 188

 Score = 30.8 bits (70), Expect = 1.4
 Identities = 12/55 (21%), Positives = 21/55 (38%), Gaps = 11/55 (20%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKE--KSHKHKDKD---------RERDKDE 130
             K  K  + K  +  + +++E K +KE  K+     K+         R RD   
Sbjct: 33  ALKREKAQEKKRKAEAQAERRELKARKEKLKTRSDWLKEAQAAVNKYIRLRDAGL 87


>gnl|CDD|217834 pfam03998, Utp11, Utp11 protein.  This protein is found to be part
           of a large ribonucleoprotein complex containing the U3
           snoRNA. Depletion of the Utp proteins impedes production
           of the 18S rRNA, indicating that they are part of the
           active pre-rRNA processing complex. This large RNP
           complex has been termed the small subunit (SSU)
           processome.
          Length = 239

 Score = 31.2 bits (71), Expect = 1.4
 Identities = 20/95 (21%), Positives = 40/95 (42%), Gaps = 2/95 (2%)

Query: 34  TSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS--AVSSKEKEKDKVSSKEKER 91
           T+    +   +       +      EK+K+K  K+K K    +  +++ + K+   E+  
Sbjct: 145 TTPELLDRRENRPRISQLEKTSLVDEKQKKKSAKKKRKLYKELKERKEREKKLKKVEQRL 204

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           +  +      + +KKK  KDK  K      K+R+R
Sbjct: 205 ELQRELMKKGKGKKKKIVKDKDGKVVYKWKKERKR 239



 Score = 28.5 bits (64), Expect = 9.9
 Identities = 17/94 (18%), Positives = 41/94 (43%), Gaps = 2/94 (2%)

Query: 53  DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK--K 110
           D + +++     +  +     +  +E        ++    + K K+ S++K++K  K  K
Sbjct: 129 DDEEEQKSFDPAEYFDTTPELLDRRENRPRISQLEKTSLVDEKQKKKSAKKKRKLYKELK 188

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           ++KE+  K K  ++  +   +  +K      KIV
Sbjct: 189 ERKEREKKLKKVEQRLELQRELMKKGKGKKKKIV 222


>gnl|CDD|234428 TIGR03979, His_Ser_Rich, His-Xaa-Ser repeat protein HxsA.  Members
           of this protein family share two defining regions. One is a
           histidine/serine-rich cluster, typically
           H-R-S-H-S-S-H-R-S-H-S-S-H. Members are found always in
           the context of a pair of radical SAM proteins, HxsB and
           HxsC, and a fourth protein HxsD. The system is predicted
           to perform peptide modifications, likely in the
           His-Xaa-Ser region, to produce some uncharacterized
           natural product.
          Length = 186

 Score = 30.6 bits (69), Expect = 1.4
 Identities = 15/63 (23%), Positives = 24/63 (38%), Gaps = 2/63 (3%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
            S  S +  S + PS   +    S  +PS S S S  +   S  S    + +   +    
Sbjct: 66  SSHYSGAGGSYSVPS--GDTSTYSYPVPSPSYSPSPGSSIQSLPSTTGVRPQSSAENANS 123

Query: 63  EKK 65
           EK+
Sbjct: 124 EKR 126



 Score = 29.5 bits (66), Expect = 3.3
 Identities = 14/57 (24%), Positives = 23/57 (40%), Gaps = 4/57 (7%)

Query: 3   YSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE 59
           YSV S  +S+ ++P P        S  P +S  S  S       S  +  + ++ K 
Sbjct: 76  YSVPSGDTSTYSYPVP----SPSYSPSPGSSIQSLPSTTGVRPQSSAENANSEKRKL 128


>gnl|CDD|220624 pfam10187, Nefa_Nip30_N, N-terminal domain of NEFA-interacting
           nuclear protein NIP30.  This is the N-terminal 100
           amino acids of a family of proteins conserved from
           plants to humans. The full-length protein has putatively
           been called NEFA-interacting nuclear protein NIP30,
           however no reference could be found to confirm this.
          Length = 99

 Score = 29.3 bits (66), Expect = 1.5
 Identities = 21/68 (30%), Positives = 33/68 (48%), Gaps = 11/68 (16%)

Query: 74  VSSKEKEKDKVSSKEKERKESKPKESSSEK-------EKKKEKKDKK----EKSHKHKDK 122
           VS  E ++ +   +E+ R    PK    E+       E+ +E KDKK    E+  K K++
Sbjct: 2   VSESELDEARKRRQEEVRAPRDPKAEPEEEYDGRSLYERLQENKDKKQEEFEEKFKLKNQ 61

Query: 123 DRERDKDE 130
            R  D+DE
Sbjct: 62  FRGLDEDE 69


>gnl|CDD|220838 pfam10659, Trypan_glycop_C, Trypanosome variant surface
           glycoprotein C-terminal domain.  The trypanosome
           parasite expresses these proteins to evade the immune
           response.
          Length = 98

 Score = 29.3 bits (66), Expect = 1.5
 Identities = 23/98 (23%), Positives = 31/98 (31%), Gaps = 17/98 (17%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K  K K   KEK  +   KE D      + K K   +   +        E    K+ KK
Sbjct: 2   NKKNKTKTECKEKGCKWDKKEDDGKCKPKEGKAKKNGAPVTQTAGTETTTEKCKGKKDKK 61

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
           + K                 K  K E    K SS +V+
Sbjct: 62  DCK-----------------KGCKWEGNTCKDSSFLVN 82


>gnl|CDD|227612 COG5293, COG5293, Predicted ATPase [General function prediction
           only].
          Length = 591

 Score = 31.4 bits (71), Expect = 1.5
 Identities = 49/304 (16%), Positives = 86/304 (28%), Gaps = 36/304 (11%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESK-----------PKE 98
           DK  +   K+K  E   K    S  S +E E  ++  +++  K+               E
Sbjct: 200 DKIQELESKKKLAELLRKTWIGSLDSLEEIETTELRKQDEVNKKQATLNTFDFHAQDYAE 259

Query: 99  SSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           +        E+  +            +R K   KEQ                  P     
Sbjct: 260 TEELVNTVDERIAELNNRRISMQSHWKRVKTSLKEQIL--------------FCPDEIQV 305

Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGD-KTNPKEKDAK 217
           L        P        K  E       + T ++H   + +  +  GD K    E D  
Sbjct: 306 LYEEVGVLFPGQV----KKDFEHVIAFNRAITEERHDYLQEEIAEIEGDLKEVNAELDDL 361

Query: 218 SKEKESH----KSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQAD 273
            K +       K+    + Y  +    I LR +  +        +    L   +   + +
Sbjct: 362 GKRRAEGLAFLKNRGVFEKYQTLCEEIIALRGELAELEYRIEPLRKLHALDQYIGTLKHE 421

Query: 274 ILQRKVHIISGDISQPSLGISSHDQQFIQHHIHVIIHAAASLRFDELIQDAFTLNIQATR 333
            L  +   I  ++ Q     +S  + F +  I  +     SLR         T   + T 
Sbjct: 422 CLDLE-ERIYTEVQQQCSLFASIGRLF-KEMIREVYDCYGSLRVTTNKNGHLTFGAEITD 479

Query: 334 ELLD 337
              D
Sbjct: 480 AAPD 483


>gnl|CDD|202427 pfam02841, GBP_C, Guanylate-binding protein, C-terminal domain.
           Transcription of the anti-viral guanylate-binding
           protein (GBP) is induced by interferon-gamma during
           macrophage induction. This family contains GBP1 and
           GBP2, both GTPases capable of binding GTP, GDP and GMP.
          Length = 297

 Score = 31.1 bits (71), Expect = 1.5
 Identities = 14/59 (23%), Positives = 31/59 (52%), Gaps = 4/59 (6%)

Query: 80  EKDKVSSKEKERKESKPKESSSEKEKKKE---KKDKKEKSHK-HKDKDRERDKDEKKEQ 134
            K+K    E+ + E+   E    +EK+KE     + +E+S++ H  +  E+ + E+++ 
Sbjct: 201 AKEKAIEAERAKAEAAEAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKL 259



 Score = 28.4 bits (64), Expect = 9.9
 Identities = 19/78 (24%), Positives = 39/78 (50%), Gaps = 2/78 (2%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
            KEK  +  E+ K+  +  E+E  +   KE+E+     + S  E  K+  +K + E+   
Sbjct: 201 AKEKAIE-AERAKAEAAEAEQELLREKQKEEEQMMEAQERSYQEHVKQLIEKMEAEREKL 259

Query: 119 HKDKDRERDKDEKKEQKE 136
             +++R  +  + +EQ+E
Sbjct: 260 LAEQERMLEH-KLQEQEE 276


>gnl|CDD|227931 COG5644, COG5644, Uncharacterized conserved protein [Function
           unknown].
          Length = 869

 Score = 31.6 bits (71), Expect = 1.7
 Identities = 17/115 (14%), Positives = 42/115 (36%)

Query: 17  SPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSS 76
           +P K K+   +   S   S           ++     +   +  E      ++    +  
Sbjct: 587 APRKRKEDFVTPSTSLEKSMDRILHGQKKRAEGAVVFEKPLEATENFNPWLDRKMRRIKR 646

Query: 77  KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEK 131
            +K+  +   ++K  K+  P+E ++++     +K +         K+ E + DEK
Sbjct: 647 IKKKAYRRIRRDKRLKKKMPEEENTQENHLGSEKKRHGGVPDILLKEIEVEDDEK 701



 Score = 30.4 bits (68), Expect = 2.9
 Identities = 25/97 (25%), Positives = 44/97 (45%), Gaps = 6/97 (6%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSEKEKK 106
                 + K   +K KE   +  +S EK  D++   +K+R E      KP E++      
Sbjct: 577 FPVVEQRRKLAPRKRKEDFVTPSTSLEKSMDRILHGQKKRAEGAVVFEKPLEATENFNPW 636

Query: 107 KEKKDKKEKSHKHKDKDR-ERDKDEKKEQKESKSSSK 142
            ++K ++ K  K K   R  RDK  KK+  E +++ +
Sbjct: 637 LDRKMRRIKRIKKKAYRRIRRDKRLKKKMPEEENTQE 673


>gnl|CDD|218049 pfam04373, DUF511, Protein of unknown function (DUF511).  Bacterial
           protein of unknown function.
          Length = 310

 Score = 31.2 bits (71), Expect = 1.7
 Identities = 19/88 (21%), Positives = 37/88 (42%), Gaps = 10/88 (11%)

Query: 50  DKKDKDRDKEKE----------KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
           DKK   R+K K            ++K+  K    + ++EK   +   K +E +       
Sbjct: 27  DKKLNSREKGKTPIAQLGAEIGSDRKELAKKSPFIKTQEKPPRRYYLKSREDELELKALD 86

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERD 127
             + E+ +E+ +  + + K K+   ERD
Sbjct: 87  EIKSEEDEEQSEAPKANKKQKNSFHERD 114


>gnl|CDD|222613 pfam14235, DUF4337, Domain of unknown function (DUF4337).  This
           family of proteins is functionally uncharacterized. This
           family of proteins is found in bacteria. Proteins in
           this family are typically between 187 and 201 amino
           acids in length. There is a single completely conserved
           residue Q that may be functionally important.
          Length = 158

 Score = 30.2 bits (69), Expect = 1.7
 Identities = 12/46 (26%), Positives = 21/46 (45%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKK 132
               R E + K +  +KEK + + + KE   K K+ + E D    +
Sbjct: 65  AAAPRAELQAKIARYKKEKARYRSEAKELEAKAKEAEAESDHALHQ 110


>gnl|CDD|227596 COG5271, MDN1, AAA ATPase containing von Willebrand factor type A
            (vWA) domain [General function prediction only].
          Length = 4600

 Score = 31.5 bits (71), Expect = 1.7
 Identities = 35/209 (16%), Positives = 72/209 (34%), Gaps = 39/209 (18%)

Query: 50   DKKDKD-RDKEKEKEKKDKEKDKSAVSS----KEKEKDKVSSKEKERKES---------- 94
            +++D   +    E E ++ E D + V+      E E      + ++  E           
Sbjct: 3842 NEEDTANQSDLDESEARELESDMNGVTKDSVVSENENSDSEEENQDLDEEVNDIPEDLSN 3901

Query: 95   ---------KPKESSSEKEKKKEKK----DKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
                       +E   E E+K  ++    ++ +   K  D     DKD ++++ E + S 
Sbjct: 3902 SLNEKLWDEPNEEDLLETEQKSNEQSAANNESDLVSKEDDNKALEDKDRQEKEDEEEMSD 3961

Query: 142  KIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESST--------THDK 193
             +     +  +P       S PPP          +K  EKE +    +          D+
Sbjct: 3962 DV--GIDDEIQPDIQENN-SQPPPENEDLDLPEDLKLDEKEGDVSKDSDLEDMDMEAADE 4018

Query: 194  HSKHKHKKKDKHGDKTNPKEKDAKSKEKE 222
            + +    +KD+     +P E++    E  
Sbjct: 4019 NKEEADAEKDEPMQDEDPLEENNTLDEDI 4047



 Score = 29.6 bits (66), Expect = 6.1
 Identities = 21/106 (19%), Positives = 42/106 (39%), Gaps = 6/106 (5%)

Query: 38   TSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
             +N  +    + D  D   D EK  E   +E       ++E  +D V S E+  +   P+
Sbjct: 4039 ENNTLDEDIQQDDFSDLAEDDEKMNEDGFEEN---VQENEESTEDGVKSDEELEQGEVPE 4095

Query: 98   ESSSEKEKKKEKKD---KKEKSHKHKDKDRERDKDEKKEQKESKSS 140
            + + +   K + K      E   ++ DK    + +E  E+   + +
Sbjct: 4096 DQAIDNHPKMDAKSTFASAEADEENTDKGIVGENEELGEEDGVRGN 4141



 Score = 29.2 bits (65), Expect = 9.3
 Identities = 28/164 (17%), Positives = 64/164 (39%), Gaps = 13/164 (7%)

Query: 51   KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVS-SKEKERKESKPKESSSEKEKKKEK 109
             +D D ++   +E    + D     ++E E D    +K+    E++  +S  E +   E+
Sbjct: 3832 NEDDDLEELANEEDTANQSDLDESEARELESDMNGVTKDSVVSENENSDSEEENQDLDEE 3891

Query: 110  KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
             +   +   +   ++  D+  +++  E++  S   S+++N       S L+S        
Sbjct: 3892 VNDIPEDLSNSLNEKLWDEPNEEDLLETEQKSNEQSAANNE------SDLVSKEDDN--- 3942

Query: 170  PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE 213
               K+      +EKE E   + D     + +   +  +   P E
Sbjct: 3943 ---KALEDKDRQEKEDEEEMSDDVGIDDEIQPDIQENNSQPPPE 3983


>gnl|CDD|177433 PHA02608, 67, prohead core protein; Provisional.
          Length = 80

 Score = 28.6 bits (64), Expect = 1.8
 Identities = 7/31 (22%), Positives = 15/31 (48%)

Query: 52 KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKD 82
          + +D D +++ +  D + DK      + E D
Sbjct: 49 EPEDDDDDEDDDDDDDKDDKDDDDDDDDEDD 79


>gnl|CDD|234767 PRK00448, polC, DNA polymerase III PolC; Validated.
          Length = 1437

 Score = 31.3 bits (72), Expect = 1.9
 Identities = 26/103 (25%), Positives = 44/103 (42%), Gaps = 13/103 (12%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
             D K++    E +KE++D++  K A+ + +K        E E+K+        E   + 
Sbjct: 165 IDDSKEELEKFEAQKEEEDEKLAKEALEAMKK-------LEAEKKKQSKNFDPKEGPVQI 217

Query: 108 EKKDKKEKSHKHKDKDRERDKDE------KKEQKESKSSSKIV 144
            KK  KE+    K+ + E  +        K E KE KS   I+
Sbjct: 218 GKKIDKEEITPMKEINEEERRVVVEGYVFKVEIKELKSGRHIL 260


>gnl|CDD|218899 pfam06102, DUF947, Domain of unknown function (DUF947).  Family of
           eukaryotic proteins with unknown function.
          Length = 168

 Score = 29.9 bits (68), Expect = 2.1
 Identities = 24/92 (26%), Positives = 40/92 (43%), Gaps = 2/92 (2%)

Query: 53  DKDRDKEKEKEKK--DKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           D  R+KE E+ +K   K KD       ++    + S+ K  K    +    ++ KK+EK+
Sbjct: 58  DDYREKEIEELEKALKKTKDSEEKEELKRTLQSMKSRLKTLKNKDREREILKEHKKQEKE 117

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
             KE    +  K  E  K   K++ +    SK
Sbjct: 118 LIKEGKKPYYLKKSEIKKLVLKKKFDELKKSK 149



 Score = 29.9 bits (68), Expect = 2.3
 Identities = 26/97 (26%), Positives = 47/97 (48%), Gaps = 7/97 (7%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES-----KPKESSSE 102
           KK K  +++++ K   +  K + K  + +K++E++ +   +K+ KE      KP      
Sbjct: 73  KKTKDSEEKEELKRTLQSMKSRLK-TLKNKDREREILKEHKKQEKELIKEGKKPYYLKKS 131

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKE-QKESK 138
           + KK   K K ++  K K  D+  +K  KK   KE K
Sbjct: 132 EIKKLVLKKKFDELKKSKQLDKALEKKRKKNAGKEKK 168



 Score = 29.9 bits (68), Expect = 2.5
 Identities = 21/77 (27%), Positives = 35/77 (45%), Gaps = 9/77 (11%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
            D  +EKE ++ EK           K    S+EKE  +   +   S  +  K K  ++E 
Sbjct: 57  LDDYREKEIEELEK---------ALKKTKDSEEKEELKRTLQSMKSRLKTLKNKDREREI 107

Query: 116 SHKHKDKDRERDKDEKK 132
             +HK +++E  K+ KK
Sbjct: 108 LKEHKKQEKELIKEGKK 124


>gnl|CDD|227360 COG5027, SAS2, Histone acetyltransferase (MYST family) [Chromatin
           structure and dynamics].
          Length = 395

 Score = 30.9 bits (70), Expect = 2.1
 Identities = 10/65 (15%), Positives = 25/65 (38%)

Query: 55  DRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKE 114
           +        K+ K+ +K     K K  D++    +  +       S+E+E ++ +    +
Sbjct: 58  NLGAAISIPKRKKQTEKGKKEKKPKVSDRMDLDNENVQLEMLYSISNEREIRQLRFGGSK 117

Query: 115 KSHKH 119
             + H
Sbjct: 118 VQNPH 122


>gnl|CDD|220135 pfam09184, PPP4R2, PPP4R2.  PPP4R2 (protein phosphatase 4 core
           regulatory subunit R2) is the regulatory subunit of the
           histone H2A phosphatase complex. It has been shown to
           confer resistance to the anticancer drug cisplatin in
           yeast, and may confer resistance in higher eukaryotes.
          Length = 285

 Score = 30.6 bits (69), Expect = 2.1
 Identities = 21/112 (18%), Positives = 50/112 (44%), Gaps = 2/112 (1%)

Query: 2   AYSVKSSSSSSSAHPSPHKN-KDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDK-E 59
            +  +  S +    P P  + KD   +   +     S+ +   S     +K+  D    +
Sbjct: 174 PFIERIDSVNGPGEPEPEDDPKDSLGNGSSTNGLPDSSQDKNKSLEEYYEKESSDAAASQ 233

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKD 111
            +  K    K+K +   ++ ++D    +EKE KE + +E + E+E+++++ +
Sbjct: 234 DDGPKGSDVKNKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDEDE 285



 Score = 30.2 bits (68), Expect = 3.1
 Identities = 23/113 (20%), Positives = 44/113 (38%), Gaps = 12/113 (10%)

Query: 22  KDKDSSAIPSTST-------SSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAV 74
           +  DS   P           S    + TN        K+K  ++  EKE  D     +A 
Sbjct: 177 ERIDSVNGPGEPEPEDDPKDSLGNGSSTNGLPDSSQDKNKSLEEYYEKESSD-----AAA 231

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
           S  +  K      +K   E    +     E+K+ K+D++E+  + ++++ + D
Sbjct: 232 SQDDGPKGSDVKNKKSDDEEDDDQDGDYVEEKELKEDEEEEETEEEEEEEDED 284


>gnl|CDD|236081 PRK07735, PRK07735, NADH dehydrogenase subunit C; Validated.
          Length = 430

 Score = 30.7 bits (69), Expect = 2.1
 Identities = 34/182 (18%), Positives = 72/182 (39%), Gaps = 10/182 (5%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
            +KD +  K++   +  +E  K  V+    E  K+  + +E++++ PK      E+ K +
Sbjct: 3   PEKDLEDLKKEAARRAKEEARKRLVAKHGAEISKLEEENREKEKALPKNDDMTIEEAKRR 62

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
                K+       ++R+  E+  ++E   +    +++  +K  A   Q           
Sbjct: 63  AAAAAKAKAAALAKQKREGTEEVTEEEKAKAKAKAAAAAKAKAAALAKQKREGTE----- 117

Query: 170 PTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
                 V  +EK   K  +    K       K+ + G +   +E++   KEK   K++A 
Sbjct: 118 -----EVTEEEKAAAKAKAAAAAKAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAA 172

Query: 230 PK 231
            K
Sbjct: 173 AK 174



 Score = 30.7 bits (69), Expect = 2.2
 Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 5/94 (5%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSH- 117
           E  +E+K   K K+A ++K K       K +  +E   +E  ++KEK K K     K+  
Sbjct: 118 EVTEEEKAAAKAKAAAAAKAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAAAKAKA 177

Query: 118 ----KHKDKDRERDKDEKKEQKESKSSSKIVSSS 147
               K K  +     +E  E++++K+ +K  +++
Sbjct: 178 AALAKQKAAEAGEGTEEVTEEEKAKAKAKAAAAA 211



 Score = 28.8 bits (64), Expect = 9.2
 Identities = 31/132 (23%), Positives = 59/132 (44%), Gaps = 2/132 (1%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK-KKEK 109
           K+ ++  +E  +E+K K K K+A ++K K       K +  +E   +E ++ K K     
Sbjct: 76  KQKREGTEEVTEEEKAKAKAKAAAAAKAKAAALAKQKREGTEEVTEEEKAAAKAKAAAAA 135

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPT 169
           K K     K K +  E   +E++E  + K+ +K  +++  +K  A   Q  +        
Sbjct: 136 KAKAAALAKQKREGTEEVTEEEEETDKEKAKAKAAAAA-KAKAAALAKQKAAEAGEGTEE 194

Query: 170 PTQKSPVKTKEK 181
            T++   K K K
Sbjct: 195 VTEEEKAKAKAK 206


>gnl|CDD|226400 COG3883, COG3883, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 265

 Score = 30.5 bits (69), Expect = 2.2
 Identities = 23/94 (24%), Positives = 46/94 (48%), Gaps = 2/94 (2%)

Query: 32  TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSK-EKEKDKVSSKEKE 90
            ST+  T+      S K   +D     E +KEKK+ + +  ++ ++ E+ + K+   +KE
Sbjct: 16  ISTAFLTTVFAALLSDKIQNQDSKL-SELQKEKKNIQNEIESLDNQIEEIQSKIDELQKE 74

Query: 91  RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDR 124
             +SK +    +KE  + K++  E+    K + R
Sbjct: 75  IDQSKAEIKKLQKEIAELKENIVERQELLKKRAR 108



 Score = 28.9 bits (65), Expect = 6.2
 Identities = 23/112 (20%), Positives = 50/112 (44%), Gaps = 7/112 (6%)

Query: 51  KKDKDRDKEKEKEKKDK-EKDKSAVSSKEKEKDKVSSKEKERK----ESKPKESSSEKEK 105
           K+DK   +EK+   +DK E   +  +  E + + ++S++ E+         KE+S+  EK
Sbjct: 154 KEDKKSLEEKQAALEDKLETLVALQNELETQLNSLNSQKAEKNALIAALAAKEASALGEK 213

Query: 106 KKEKKDKK--EKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPAS 155
              ++ K   E +     K     K   +EQ   ++++     S  ++  ++
Sbjct: 214 AALEEQKALAEAAAAEAAKQEAAAKAAAQEQAALQAAATAAQPSAVTESASA 265


>gnl|CDD|217667 pfam03666, NPR3, Nitrogen Permease regulator of amino acid
           transport activity 3.  This family, also known in yeasts
           as Rmd11, complexes with NPR2, pfam06218. This complex
           heterodimer is responsible for inactivating TORC1, an
           evolutionarily conserved protein complex that controls
           cell size via nutritional input signals, specifically,
           in response to amino acid starvation.
          Length = 424

 Score = 30.8 bits (70), Expect = 2.2
 Identities = 19/52 (36%), Positives = 27/52 (51%)

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           ++KK KK+K     D + +R+     E   SKSSSK  S S  + +PAS   
Sbjct: 45  RKKKKKKQKKSDRADPNDDREPSVDSEDSSSKSSSKSESGSLANSDPASDPS 96


>gnl|CDD|218435 pfam05104, Rib_recp_KP_reg, Ribosome receptor lysine/proline rich
           region.  This highly conserved region is found towards
           the C-terminus of the transmembrane domain. The function
           is unclear.
          Length = 151

 Score = 29.9 bits (67), Expect = 2.2
 Identities = 30/116 (25%), Positives = 47/116 (40%)

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
           K+RKES   +S    +KKKEK  +K+   K K++       E +  +E      I+    
Sbjct: 12  KQRKESGKTQSQKSDKKKKEKVSEKKGKSKKKEEKPNGKIPEHEPNQEVTEVEVIIEKEP 71

Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDK 204
                 +   +    P  AP P +  PV ++EK    + S       + K KK  K
Sbjct: 72  VPAVAVAPVPVAVVAPVVAPKPKKSQPVMSQEKTASPQKSVPAPSPKEKKKKKVAK 127


>gnl|CDD|218336 pfam04935, SURF6, Surfeit locus protein 6.  The surfeit locus
           protein SURF-6 is shown to be a component of the
           nucleolar matrix and has a strong binding capacity for
           nucleic acids.
          Length = 206

 Score = 30.0 bits (68), Expect = 2.3
 Identities = 22/59 (37%), Positives = 35/59 (59%), Gaps = 4/59 (6%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           KEK+K+K  KE  +     KEK + K + ++K+R+E+  K    +K KKK+K  KK + 
Sbjct: 151 KEKQKKKSKKEWKER----KEKVEKKKAERQKKREENLKKRKDDKKNKKKKKAKKKGRI 205



 Score = 29.2 bits (66), Expect = 4.7
 Identities = 32/85 (37%), Positives = 48/85 (56%), Gaps = 7/85 (8%)

Query: 48  KKDKKDKDRDK-EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           +K+K  K   K E  K K D++  K A+  KEK+K K   + KERKE K ++  +E++KK
Sbjct: 121 EKEKWTKALAKAEGVKVKDDEKLLKKALKRKEKQKKKSKKEWKERKE-KVEKKKAERQKK 179

Query: 107 -----KEKKDKKEKSHKHKDKDRER 126
                K++KD K+   K K K + R
Sbjct: 180 REENLKKRKDDKKNKKKKKAKKKGR 204


>gnl|CDD|237869 PRK14962, PRK14962, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 472

 Score = 30.9 bits (70), Expect = 2.4
 Identities = 17/57 (29%), Positives = 28/57 (49%)

Query: 60  KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
              ++ D E+     + ++KEK K  SK KE K+   +     KE  +E K+K + S
Sbjct: 337 PNVQENDVEEKNDNSNVQQKEKKKEESKAKEEKQEDIEFEKRFKELMEELKEKGDLS 393


>gnl|CDD|234941 PRK01315, PRK01315, putative inner membrane protein translocase
           component YidC; Provisional.
          Length = 329

 Score = 30.5 bits (69), Expect = 2.4
 Identities = 12/71 (16%), Positives = 30/71 (42%)

Query: 40  NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
           NPT  S +   ++++   K K+  + + +      +  +  + +  +K +  +  +PK  
Sbjct: 258 NPTPGSPAAIAREERLAKKGKDHGESEGKVVAPEGAVAQTTEVREQTKRQTVQRQQPKRQ 317

Query: 100 SSEKEKKKEKK 110
           S  K +K    
Sbjct: 318 SRAKRQKGGAA 328


>gnl|CDD|237554 PRK13909, PRK13909, putative recombination protein RecB;
           Provisional.
          Length = 910

 Score = 31.1 bits (71), Expect = 2.4
 Identities = 25/101 (24%), Positives = 38/101 (37%), Gaps = 32/101 (31%)

Query: 53  DKD--RDKEKEKEKKDKE--------------------KDKSAVSSKEK------EKDKV 84
           DKD  R  EKEK  K +E                    KD+S+ S  E       E+ ++
Sbjct: 663 DKDYARALEKEKALKYEEEINVLYVAFTRAKNSLIVVKKDESSGSMFEILDLKPLERGEI 722

Query: 85  SSKEKERKESKPKESSSEKEKKK----EKKDKKEKSHKHKD 121
             KE +    K    +S K K      + K+ +E+  +  D
Sbjct: 723 EIKEPKISPKKESLITSVKLKPHGYQEQVKEIEEEPKEDND 763


>gnl|CDD|219913 pfam08576, DUF1764, Eukaryotic protein of unknown function
           (DUF1764).  This is a family of eukaryotic proteins of
           unknown function. This family contains many hypothetical
           proteins.
          Length = 98

 Score = 29.0 bits (65), Expect = 2.4
 Identities = 16/59 (27%), Positives = 33/59 (55%), Gaps = 2/59 (3%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           KE++K +K  ++D   + S  K++ K   K++  K ++PK +   ++K K+K +  E  
Sbjct: 1   KEEKKNEKTDKRDIDDIFSNIKKRKKK--KKRTAKTARPKATKKGQKKDKKKDEFPEFP 57


>gnl|CDD|222010 pfam13254, DUF4045, Domain of unknown function (DUF4045).  This
           presumed domain is functionally uncharacterized. This
           domain family is found in bacteria and eukaryotes, and
           is typically between 384 and 430 amino acids in length.
          Length = 414

 Score = 30.6 bits (69), Expect = 2.4
 Identities = 37/228 (16%), Positives = 73/228 (32%), Gaps = 20/228 (8%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSS--SSKKDKKDKDRDKE 59
           +      S S+S   S   ++  D    PS +      +PT ++   S  +K +  + K 
Sbjct: 109 SLPSHPRSRSASVSNSKDGDRPSDLPPSPSKTMDPRRWSPTKATWLESALNKPESPKHKP 168

Query: 60  KEKE----KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           +  +    KKD  + + + +S +  +   +S ++       +        K   K     
Sbjct: 169 QPPQQPEWKKDLSRLRQSRASVDLGR--TNSFKEVTPVGLMRTPPPGSHSKSPSKSGIPD 226

Query: 116 SHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSP 175
               +D   E+ K EK +Q+ S   +          E +S  +      P +P       
Sbjct: 227 LPSSRDS--EKTKPEKPQQETSSMDT----------EKSSAPKPRETLDPKSPEKAPPID 274

Query: 176 VKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKES 223
              +E +  + S    ++ S  K               K   S  K  
Sbjct: 275 TTEEELKSPEASPKESEEASARKRSPSLLSPSPKAESPKPLASPGKSP 322


>gnl|CDD|240339 PTZ00265, PTZ00265, multidrug resistance protein (mdr1);
           Provisional.
          Length = 1466

 Score = 30.8 bits (69), Expect = 2.4
 Identities = 30/128 (23%), Positives = 51/128 (39%), Gaps = 8/128 (6%)

Query: 40  NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK---ERKESKP 96
           +PT  +    +K +KD +        +K  +  +   ++   D +   +        +  
Sbjct: 669 DPTKDNKENNNKNNKDDNNNNNNNNNNKINNAGSYIIEQGTHDALMKNKNGIYYTMINNQ 728

Query: 97  KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE----KKEQKESKSSSKIVS-SSHNSK 151
           K SS +       KD   KS  +KD +R  D DE     K + ES S+ K    S  N+ 
Sbjct: 729 KVSSKKSSNNDNDKDSDMKSSAYKDSERGYDPDEMNGNSKHENESASNKKSCKMSDENAS 788

Query: 152 EPASGSQL 159
           E  +G +L
Sbjct: 789 ENNAGGKL 796


>gnl|CDD|221818 pfam12868, DUF3824, Domain of unknown function (DUF3824).  This is
           a repeating domain found in fungal proteins. It is
           proline-rich, and the function is not known.
          Length = 135

 Score = 29.5 bits (66), Expect = 2.5
 Identities = 16/69 (23%), Positives = 24/69 (34%), Gaps = 6/69 (8%)

Query: 103 KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISH 162
            ++++ KK ++E+     D D   D  E+         S  VS                 
Sbjct: 24  SQRRERKKAERERERYRHDHDAYSDSYEEPYDPTPYPPSPPVSDPR------YYPNSNYF 77

Query: 163 PPPPAPTPT 171
           PPPP  TP 
Sbjct: 78  PPPPGSTPV 86


>gnl|CDD|233191 TIGR00927, 2A1904, K+-dependent Na+/Ca+ exchanger.  [Transport and
           binding proteins, Cations and iron carrying compounds].
          Length = 1096

 Score = 30.7 bits (69), Expect = 2.5
 Identities = 19/92 (20%), Positives = 39/92 (42%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
           SK D  + +   E+  E+ ++  +    + +E   +     E E K     E     E+K
Sbjct: 631 SKGDVAEAEHTGERTGEEGERPTEAEGENGEESGGEAEQEGETETKGENESEGEIPAERK 690

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
            E++ + E   K  D   E + +E + + E++
Sbjct: 691 GEQEGEGEIEAKEADHKGETEAEEVEHEGETE 722


>gnl|CDD|233758 TIGR02169, SMC_prok_A, chromosome segregation protein SMC,
           primarily archaeal type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. It is found
           in a single copy and is homodimeric in prokaryotes, but
           six paralogs (excluded from this family) are found in
           eukaryotes, where SMC proteins are heterodimeric. This
           family represents the SMC protein of archaea and a few
           bacteria (Aquifex, Synechocystis, etc); the SMC of other
           bacteria is described by TIGR02168. The N- and
           C-terminal domains of this protein are well conserved,
           but the central hinge region is skewed in composition
           and highly divergent [Cellular processes, Cell division,
           DNA metabolism, Chromosome-associated proteins].
          Length = 1164

 Score = 30.8 bits (70), Expect = 2.6
 Identities = 18/102 (17%), Positives = 41/102 (40%), Gaps = 8/102 (7%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK------EKERKESKPKESSSEKEKKK 107
           KD  ++ EK K++  + K  +   ++E  ++S +           E+K  E   EKE K 
Sbjct: 388 KDYREKLEKLKREINELKRELDRLQEELQRLSEELADLNAAIAGIEAKINELEEEKEDKA 447

Query: 108 EKKDKKEKSHKH--KDKDRERDKDEKKEQKESKSSSKIVSSS 147
            +  K+E   +    D  +   +    +++  +   ++    
Sbjct: 448 LEIKKQEWKLEQLAADLSKYEQELYDLKEEYDRVEKELSKLQ 489



 Score = 30.4 bits (69), Expect = 3.7
 Identities = 14/97 (14%), Positives = 47/97 (48%), Gaps = 5/97 (5%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK--EKERKESKPKESSSEKE 104
            K   + ++ ++E E+E+K ++K     +  ++E + + ++  E +++ ++ ++   +  
Sbjct: 332 DKLLAEIEELEREIEEERKRRDKLTEEYAELKEELEDLRAELEEVDKEFAETRDELKDYR 391

Query: 105 KKKEK-KDKKEKSHKHKDK--DRERDKDEKKEQKESK 138
           +K EK K +  +  +  D+  +  +   E+     + 
Sbjct: 392 EKLEKLKREINELKRELDRLQEELQRLSEELADLNAA 428



 Score = 30.0 bits (68), Expect = 4.6
 Identities = 23/153 (15%), Positives = 56/153 (36%), Gaps = 18/153 (11%)

Query: 50  DKKDKDRDKEK---EKEKKDKEKDKSAVSSKEKE--------KDKVSSKEKERKESKPKE 98
           ++K      EK   EKE ++ ++ +  +  + K           K    E+E +E +   
Sbjct: 818 EQKLNRLTLEKEYLEKEIQELQEQRIDLKEQIKSIEKEIENLNGKKEELEEELEELEAAL 877

Query: 99  SSSEKEKKKEKKDKKE-KSHKHKDKDRERDKDEKKEQKES-----KSSSKIVSSSHNSKE 152
              E      KK++ E ++   + + +  + + + E+K       K+  + +    +  E
Sbjct: 878 RDLESRLGDLKKERDELEAQLRELERKIEELEAQIEKKRKRLSELKAKLEALEEELSEIE 937

Query: 153 PASGSQLISHPPPPAPTPTQKSPVKTKEKEKEK 185
              G +    P         ++ ++  E+E   
Sbjct: 938 DPKG-EDEEIPEEELSLEDVQAELQRVEEEIRA 969



 Score = 29.7 bits (67), Expect = 5.3
 Identities = 19/101 (18%), Positives = 47/101 (46%), Gaps = 12/101 (11%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK----------ESKPKES 99
           +++    + E +K   + E+ +  +  + K +DK++ +  E K          E   KE 
Sbjct: 321 EERLAKLEAEIDKLLAEIEELEREIEEERKRRDKLTEEYAELKEELEDLRAELEEVDKEF 380

Query: 100 SSEKEKKKEKKDKKEK-SHKHKDKDRERDK-DEKKEQKESK 138
           +  +++ K+ ++K EK   +  +  RE D+  E+ ++   +
Sbjct: 381 AETRDELKDYREKLEKLKREINELKRELDRLQEELQRLSEE 421


>gnl|CDD|218148 pfam04557, tRNA_synt_1c_R2, Glutaminyl-tRNA synthetase,
          non-specific RNA binding region part 2.  This is a
          region found N terminal to the catalytic domain of
          glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes
          but not in Escherichia coli. This region is thought to
          bind RNA in a non-specific manner, enhancing
          interactions between the tRNA and enzyme, but is not
          essential for enzyme function.
          Length = 83

 Score = 28.5 bits (64), Expect = 2.6
 Identities = 10/40 (25%), Positives = 16/40 (40%), Gaps = 1/40 (2%)

Query: 54 KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
           + D  K K+KK K+K     ++  K K   +    E   
Sbjct: 18 TEADLVK-KKKKKKKKKAEDTAATAKAKKATAEDVSEGAM 56


>gnl|CDD|114337 pfam05609, LAP1C, Lamina-associated polypeptide 1C (LAP1C).  This
           family contains rat LAP1C proteins and several
           uncharacterized highly related sequences from both mice
           and humans. LAP1s (lamina-associated polypeptide 1s) are
           type 2 integral membrane proteins with a single
           membrane-spanning region of the inner nuclear membrane.
           LAP1s bind to both A- and B-type lamins and have a
           putative role in the membrane attachment and assembly of
           the nuclear lamina.
          Length = 465

 Score = 30.4 bits (68), Expect = 2.7
 Identities = 32/149 (21%), Positives = 58/149 (38%), Gaps = 4/149 (2%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEK-DKSAVSSKEKEKDKVSSKE 88
           P+            S   ++D+     +  +   KK     D++ VS   K+K +     
Sbjct: 22  PAEEARGLRDACGLSKDHQEDETSSQPESSQTGSKKTVRSPDEANVSEDPKDKLRRPPLR 81

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKH--KDKDRERDKDEKKEQKESKSSSKIVSS 146
             R E+   ++     ++ E +D    S  +  K+  R RD  E  + K  ++ + + SS
Sbjct: 82  YPRYEATEVQNKQSFLEEGETEDDHHSSSSNVTKEPLRSRDSHESSD-KVGRADAHLGSS 140

Query: 147 SHNSKEPASGSQLISHPPPPAPTPTQKSP 175
           S    + AS     S  P    T +QK+P
Sbjct: 141 SWALPKSASDFTAHSQQPSVLTTGSQKAP 169


>gnl|CDD|237496 PRK13766, PRK13766, Hef nuclease; Provisional.
          Length = 773

 Score = 30.6 bits (70), Expect = 2.7
 Identities = 16/73 (21%), Positives = 36/73 (49%), Gaps = 5/73 (6%)

Query: 75  SSKEKEKD-----KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKD 129
           SS+ KEK      K       +K  +  E    +E++K+++   +   K K K+ E +++
Sbjct: 488 SSRRKEKKMKEELKNLKGILNKKLQELDEEQKGEEEEKDEQLSLDDFVKSKGKEEEEEEE 547

Query: 130 EKKEQKESKSSSK 142
           ++++ KE++    
Sbjct: 548 KEEKDKETEEDEP 560


>gnl|CDD|215581 PLN03109, PLN03109, ETHYLENE-INSENSITIVE3-like3 protein;
           Provisional.
          Length = 599

 Score = 30.6 bits (69), Expect = 2.8
 Identities = 26/150 (17%), Positives = 59/150 (39%), Gaps = 1/150 (0%)

Query: 24  KDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKS-AVSSKEKEKD 82
           ++ S I   S+ + TS  T +     + ++KD          D  +D   +VSSK+  ++
Sbjct: 284 REESLIRQPSSDNGTSGITETPRGGHEDRNKDAISSDSDYDVDGLEDAPGSVSSKDDRRN 343

Query: 83  KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
                ++  +      +    +K+K KK +K K  + +    E++ +  +E   ++S + 
Sbjct: 344 LQPVAQEPERARDDAPNQVVPDKEKTKKPRKRKRPRGRSTVAEQEVEVTQEHPPAESRNA 403

Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQ 172
           +   +H   +        +       T  Q
Sbjct: 404 LPDMNHVDAQGMEYQITGTSHENDTVTALQ 433


>gnl|CDD|227891 COG5604, COG5604, Uncharacterized conserved protein [Function
           unknown].
          Length = 523

 Score = 30.6 bits (69), Expect = 2.8
 Identities = 23/83 (27%), Positives = 38/83 (45%), Gaps = 6/83 (7%)

Query: 83  KVSSKEKERKESKPKES-SSEKEKKKEKK---DKKEKSHKHKDKDRERDKDEKKEQKESK 138
           K S   K+  ++  K +    K+  + KK    K   SH     +   + + K ++K SK
Sbjct: 3   KASKATKKFTKNHLKNTIDRRKQLARSKKVYGTKNRNSHTENKMESGTNDNNKNKEKLSK 62

Query: 139 SSSKIVSSSHNSKEPASGSQLIS 161
             S + SS  +S+E   GS+ IS
Sbjct: 63  LYSDVDSS--SSEEEEDGSESIS 83



 Score = 30.2 bits (68), Expect = 3.1
 Identities = 19/71 (26%), Positives = 28/71 (39%), Gaps = 1/71 (1%)

Query: 40  NPTNSSSSKKDKKDKDRDKE-KEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
              N +S  ++K +   +   K KEK  K       SS E+E+D   S  K    SK   
Sbjct: 34  GTKNRNSHTENKMESGTNDNNKNKEKLSKLYSDVDSSSSEEEEDGSESISKLNVNSKKIS 93

Query: 99  SSSEKEKKKEK 109
            +    +K  K
Sbjct: 94  LNQVSTQKWRK 104


>gnl|CDD|216095 pfam00748, Calpain_inhib, Calpain inhibitor.  This region is found
           multiple times in calpain inhibitor proteins.
          Length = 131

 Score = 29.4 bits (66), Expect = 2.8
 Identities = 30/123 (24%), Positives = 50/123 (40%), Gaps = 12/123 (9%)

Query: 9   SSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKE------- 61
           + S+S  PSP   K K+ +   + S    ++    S  S     +K RDK  +       
Sbjct: 9   TCSASPPPSPTAKKKKEEAEKTAASGEVVSAQSAPSVRSAAPPPEKKRDKMSDDALDALS 68

Query: 62  ----KEKKDKEKDKSAVS-SKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
               + + D E+ K      KEK K++   K  ER+++ P E    + K K+ K    K 
Sbjct: 69  DSLGQREPDPEEKKPVEDKVKEKAKEEKLEKLGEREDTIPPEYRLLEAKDKDGKPLLPKP 128

Query: 117 HKH 119
            + 
Sbjct: 129 EEE 131


>gnl|CDD|218274 pfam04801, Sin_N, Sin-like protein conserved region.  Family of
           higher eukaryotic proteins. SIN was identified as a
           protein that interacts specifically with SXL (sex
           lethal) in a yeast two-hybrid assay. The interaction is
           mediated by one of the SXL RNA binding domains.
          Length = 422

 Score = 30.5 bits (69), Expect = 2.9
 Identities = 24/83 (28%), Positives = 41/83 (49%), Gaps = 6/83 (7%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
           DKKDK + KE++   +D++ D+ A    ++   K S  E E++  + +E S    +KK  
Sbjct: 140 DKKDKRK-KEEDTADEDEDPDEEAEEELKQVTVKFSRPETEKQRKR-REQSYNFLQKKIA 197

Query: 110 KDKKEKSHKHKDKD----RERDK 128
           ++   +   H  KD     ER K
Sbjct: 198 EEPWIELKYHGKKDSESELERQK 220


>gnl|CDD|233757 TIGR02168, SMC_prok_B, chromosome segregation protein SMC, common
           bacterial type.  SMC (structural maintenance of
           chromosomes) proteins bind DNA and act in organizing and
           segregating chromosomes for partition. SMC proteins are
           found in bacteria, archaea, and eukaryotes. This family
           represents the SMC protein of most bacteria. The smc
           gene is often associated with scpB (TIGR00281) and scpA
           genes, where scp stands for segregation and condensation
           protein. SMC was shown (in Caulobacter crescentus) to be
           induced early in S phase but present and bound to DNA
           throughout the cell cycle [Cellular processes, Cell
           division, DNA metabolism, Chromosome-associated
           proteins].
          Length = 1179

 Score = 30.8 bits (70), Expect = 3.0
 Identities = 16/88 (18%), Positives = 41/88 (46%), Gaps = 3/88 (3%)

Query: 51  KKDKDRDKEKEKEKKDKEKD--KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
            + +  + E+E E+  KE     + +S  E++K ++  +     E + +E  ++ E+ + 
Sbjct: 272 LRLEVSELEEEIEELQKELYALANEISRLEQQK-QILRERLANLERQLEELEAQLEELES 330

Query: 109 KKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           K D+  +     ++  E  K+E +  + 
Sbjct: 331 KLDELAEELAELEEKLEELKEELESLEA 358


>gnl|CDD|237855 PRK14900, valS, valyl-tRNA synthetase; Provisional.
          Length = 1052

 Score = 30.7 bits (69), Expect = 3.1
 Identities = 26/173 (15%), Positives = 59/173 (34%), Gaps = 11/173 (6%)

Query: 40   NPTNSSSSKKDKKDKDRDKEKE-KEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKE 98
            NP+   ++     +KDR + +E +EK+ K +   A+ S  +         + + E KP +
Sbjct: 867  NPSFVQNAPPAVVEKDRARAEELREKRGKLEAHRAMLSGSEANSARRDTMEIQNEQKPTQ 926

Query: 99   SSSEKEKKKE---------KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHN 149
                 E +           +K     S   +          +K  +  + + +       
Sbjct: 927  DGPAAEAQPAQENTVVESAEKAVAAVSEAAQQAATAVASGIEKVAEAVRKTVRRSVKKAA 986

Query: 150  SKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK 202
            +   A+  + ++   P      +K+  K    +K+        K ++    KK
Sbjct: 987  AT-RAAMKKKVAKKAPAKKAAAKKAAAKKAAAKKKVAKKAPAKKVARKPAAKK 1038


>gnl|CDD|222918 PHA02687, PHA02687, ORF061 late transcription factor VLTF-4;
           Provisional.
          Length = 231

 Score = 30.0 bits (67), Expect = 3.2
 Identities = 25/80 (31%), Positives = 39/80 (48%), Gaps = 1/80 (1%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE-RDKDEKKEQKE 136
           E+E  +      E     P ++   + KKK K DK EKS K  +K     D+D+K E+KE
Sbjct: 64  EQECQQEQLPVPESVPPAPVKTPKRRTKKKAKADKPEKSPKAVEKLCPPDDRDDKNEEKE 123

Query: 137 SKSSSKIVSSSHNSKEPASG 156
               ++    S +++  ASG
Sbjct: 124 PTEEAQRNEESGDAEGGASG 143


>gnl|CDD|197310 cd09076, L1-EN, Endonuclease domain (L1-EN) of the non-LTR
           retrotransposon LINE-1 (L1), and related domains.  This
           family contains the endonuclease domain (L1-EN) of the
           non-LTR retrotransposon LINE-1 (L1), and related
           domains, including the endonuclease of Xenopus laevis
           Tx1. These retrotransposons belong to the subtype 2,
           L1-clade. LINES can be classified into two subtypes.
           Subtype 2 has two ORFs: the second (ORF2) encodes a
           modular protein consisting of an N-terminal
           apurine/apyrimidine endonuclease domain (EN), a central
           reverse transcriptase, and a zinc-finger-like domain at
           the C-terminus. LINE-1/L1 elements (full length and
           truncated) comprise about 17% of the human genome. This
           endonuclease nicks the genomic DNA at the consensus
           target sequence 5'TTTT-AA3' producing a ribose
           3'-hydroxyl end as a primer for reverse transcription of
           associated template RNA. This subgroup also includes the
           endonuclease of Xenopus laevis Tx1, another member of
           the L1-clade. This family belongs to the large EEP
           (exonuclease/endonuclease/phosphatase) superfamily that
           contains functionally diverse enzymes that share a
           common catalytic mechanism of cleaving phosphodiester
           bonds.
          Length = 236

 Score = 30.0 bits (68), Expect = 3.2
 Identities = 9/30 (30%), Positives = 17/30 (56%)

Query: 256 AEQFKDELFDRLKNEQADILQRKVHIISGD 285
            E+ K+E +D+L++    + +    II GD
Sbjct: 112 DEEEKEEFYDQLQDVLDKVPRHDTLIIGGD 141


>gnl|CDD|218439 pfam05109, Herpes_BLLF1, Herpes virus major outer envelope
           glycoprotein (BLLF1).  This family consists of the BLLF1
           viral late glycoprotein, also termed gp350/220. It is
           the most abundantly expressed glycoprotein in the viral
           envelope of the Herpesviruses and is the major antigen
           responsible for stimulating the production of
           neutralising antibodies in vivo.
          Length = 830

 Score = 30.5 bits (68), Expect = 3.2
 Identities = 25/221 (11%), Positives = 59/221 (26%), Gaps = 22/221 (9%)

Query: 8   SSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE---- 63
            ++ ++  P+  K  D  ++  P+      T+  T+  +      +    +  E+     
Sbjct: 506 VTTPNATSPTTQKTSDTPNATSPTPIVIGVTTTATSPPTGTTSVPNATSPQVTEESPVNN 565

Query: 64  ---KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHK 120
                         S+    +    S    ++   P  S S       + +    +    
Sbjct: 566 TNTPVVTSAPSVLTSAVTTGQHGTGSSPTSQQPGIPSSSHSTP-----RSNSTSTTPLLT 620

Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
                  ++  +E     S++ + + S     P  G    S    P  + T + P +   
Sbjct: 621 SAHPTGGENITEETPSVPSTTHVSTLS-----PGPGPGTTSQVSGPGNSSTSRYPGEVHV 675

Query: 181 KE-----KEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDA 216
            E          S    + +             +  KE   
Sbjct: 676 TEGMPNPNATSPSAPSGQKTAVPTVTSTGGKANSTTKETSG 716



 Score = 29.8 bits (66), Expect = 4.7
 Identities = 19/64 (29%), Positives = 34/64 (53%), Gaps = 17/64 (26%)

Query: 3   YSVKSSSSS-----SSAHPSPHKNKDKDSSAIPST------------STSSSTSNPTNSS 45
            + +S+S+S     +SAHP+  +N  +++ ++PST             T+S  S P NSS
Sbjct: 606 STPRSNSTSTTPLLTSAHPTGGENITEETPSVPSTTHVSTLSPGPGPGTTSQVSGPGNSS 665

Query: 46  SSKK 49
           +S+ 
Sbjct: 666 TSRY 669


>gnl|CDD|225288 COG2433, COG2433, Uncharacterized conserved protein [Function
           unknown].
          Length = 652

 Score = 30.4 bits (69), Expect = 3.3
 Identities = 20/80 (25%), Positives = 40/80 (50%), Gaps = 1/80 (1%)

Query: 58  KEKEKEK-KDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           K KE+E+ ++KE  +     +    +K   K +E  E   +E+S  K + +E K + EK 
Sbjct: 396 KVKEEERPREKEGTEEEERREITVYEKRIKKLEETVERLEEENSELKRELEELKREIEKL 455

Query: 117 HKHKDKDRERDKDEKKEQKE 136
               ++ R   +D+ ++ +E
Sbjct: 456 ESELERFRREVRDKVRKDRE 475


>gnl|CDD|114172 pfam05432, BSP_II, Bone sialoprotein II (BSP-II).  Bone
           sialoprotein (BSP) is a major structural protein of the
           bone matrix that is specifically expressed by
           fully-differentiated osteoblasts. The expression of bone
           sialoprotein (BSP) is normally restricted to mineralised
           connective tissues of bones and teeth where it has been
           associated with mineral crystal formation. However, it
           has been found that ectopic expression of BSP occurs in
           various lesions, including oral and extraoral
           carcinomas, in which it has been associated with the
           formation of microcrystalline deposits and the
           metastasis of cancer cells to bone.
          Length = 291

 Score = 30.0 bits (67), Expect = 3.4
 Identities = 23/129 (17%), Positives = 48/129 (37%), Gaps = 5/129 (3%)

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           K  +    ++KE E D+   +E+E +E + +   +E+       +  E  H +     + 
Sbjct: 121 KAGNAGKKATKEDESDEDEEEEEEEEEEEAEVEENEQGTNGTSTNSTEVDHGNGSSGGDN 180

Query: 127 DKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPP---APTPTQKSPVKTKEKEK 183
            ++ ++E      +     +   +  P  G Q     PP      T      V T E + 
Sbjct: 181 GEEGEEESVTEAEAEGTTVAGPTTTSPNGGFQ--PTTPPQEVYGTTDPPFGKVTTPEYQG 238

Query: 184 EKESSTTHD 192
           E E +  ++
Sbjct: 239 EYEQTGANE 247


>gnl|CDD|237875 PRK14974, PRK14974, cell division protein FtsY; Provisional.
          Length = 336

 Score = 29.9 bits (68), Expect = 3.4
 Identities = 17/59 (28%), Positives = 29/59 (49%)

Query: 67  KEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRE 125
           KEK    V   E++ ++   +E    E + +E   E++K+K     K K  + K+KD E
Sbjct: 6   KEKLSKFVEKVEEKIEEEEEEEAPEAEEEEEEEDEEEKKEKPGFFDKAKITEIKEKDIE 64


>gnl|CDD|237631 PRK14162, PRK14162, heat shock protein GrpE; Provisional.
          Length = 194

 Score = 29.4 bits (66), Expect = 3.5
 Identities = 27/106 (25%), Positives = 48/106 (45%), Gaps = 5/106 (4%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKK 110
           K++   +K+  +E+   E  K       KE+D+      E  E   KE +  K K K+ +
Sbjct: 3   KEEFPSEKDLPQEETTDEAPKKEAKEAPKEEDQEKQNPVEDLE---KEIADLKAKNKDLE 59

Query: 111 DKKEKSHKHKDKDRERDKDEKKE--QKESKSSSKIVSSSHNSKEPA 154
           DK  +S       + R   E+ +  + ES+S +K V  + ++ E A
Sbjct: 60  DKYLRSQAEIQNMQNRYAKERAQLIKYESQSLAKDVLPAMDNLERA 105


>gnl|CDD|234715 PRK00290, dnaK, molecular chaperone DnaK; Provisional.
          Length = 627

 Score = 30.1 bits (69), Expect = 3.6
 Identities = 26/112 (23%), Positives = 48/112 (42%), Gaps = 13/112 (11%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK----------EKERKESKPKESSSEKEKK 106
           D+E E+  KD E +       +K K+ V ++          EK  KE   K  + EKEK 
Sbjct: 502 DEEIERMVKDAEANAEE---DKKRKELVEARNQADSLIYQTEKTLKELGDKVPADEKEKI 558

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ 158
           +    + +++ K +DK+  + K E+  Q   K    +   +  ++  A  + 
Sbjct: 559 EAAIKELKEALKGEDKEAIKAKTEELTQASQKLGEAMYQQAQAAQGAAGAAA 610


>gnl|CDD|148635 pfam07139, DUF1387, Protein of unknown function (DUF1387).  This
           family represents a conserved region approximately 300
           residues long within a number of hypothetical proteins
           of unknown function that seem to be restricted to
           mammals.
          Length = 301

 Score = 30.0 bits (67), Expect = 3.6
 Identities = 28/148 (18%), Positives = 58/148 (39%), Gaps = 6/148 (4%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKS----SSK 142
           K+ ++K+SKPK  +  K   KE+   +E++    +KD            +++S    S  
Sbjct: 7   KKNKKKKSKPKPEAPAKSASKEETTPEEQAAPGDEKDEVNGFHANGSADDTESVDSLSEG 66

Query: 143 IVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKK 202
           + S+S +++EP + +     PP P+ + T        + E +    ++   H      K 
Sbjct: 67  LDSASLDAREPEAVTLDA--PPSPSSSLTNGLSDLQSKLELQSSPHSSAKPHPSSDQHKN 124

Query: 203 DKHGDKTNPKEKDAKSKEKESHKSSAGP 230
            K       +     +       ++ GP
Sbjct: 125 AKKYVSKPSQPVTPNNSAHHDAPAALGP 152


>gnl|CDD|223683 COG0610, COG0610, Type I site-specific restriction-modification
           system, R (restriction) subunit and related helicases
           [Defense mechanisms].
          Length = 962

 Score = 30.5 bits (69), Expect = 3.6
 Identities = 15/88 (17%), Positives = 35/88 (39%), Gaps = 5/88 (5%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
            K+   +   +  E+  K+  +D      ++K+K      E   +    K  ++EK ++ 
Sbjct: 827 DKNGAYESLKELIERIIKEWIED-----LRQKKKLIERLIEAINQYRAKKLDTAEKLEEL 881

Query: 108 EKKDKKEKSHKHKDKDRERDKDEKKEQK 135
               KKE+  K   ++   +++E     
Sbjct: 882 YILAKKEEEFKQFAEEEGLNEEELAFYD 909


>gnl|CDD|206034 pfam13863, DUF4200, Domain of unknown function (DUF4200).  This
           family is found in eukaryotes. It is a coiled-coil
           domain of unknown function.
          Length = 126

 Score = 28.7 bits (65), Expect = 3.7
 Identities = 19/80 (23%), Positives = 48/80 (60%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
           R++ + +E+  K++++     +E+ ++ +   +K  KE++ K   +EK+ ++EKK +KEK
Sbjct: 20  REEFERREELLKQREEELEKKEEELQESLIKFDKFLKENEAKRRRAEKKAEEEKKLRKEK 79

Query: 116 SHKHKDKDRERDKDEKKEQK 135
             + K+   E ++ + + +K
Sbjct: 80  EEEIKELKAELEELKAEIEK 99


>gnl|CDD|215656 pfam00012, HSP70, Hsp70 protein.  Hsp70 chaperones help to fold
           many proteins. Hsp70 assisted folding involves repeated
           cycles of substrate binding and release. Hsp70 activity
           is ATP dependent. Hsp70 proteins are made up of two
           regions: the amino terminus is the ATPase domain and the
           carboxyl terminus is the substrate binding region.
          Length = 598

 Score = 30.3 bits (69), Expect = 3.7
 Identities = 22/104 (21%), Positives = 50/104 (48%), Gaps = 4/104 (3%)

Query: 40  NPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKES 99
             +  S  + ++  KD ++   ++KK KE     + +K + ++ V S EK  KE   K  
Sbjct: 497 ASSGLSDDEIERMVKDAEEYAAEDKKRKE----RIEAKNEAEEYVYSLEKSLKEEGDKLP 552

Query: 100 SSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKI 143
            ++K+K +E  +  ++  + +DK+    K E+ ++       ++
Sbjct: 553 EADKKKVEEAIEWLKEELEGEDKEEIEAKTEELQKVVQPIGERM 596


>gnl|CDD|221028 pfam11208, DUF2992, Protein of unknown function (DUF2992).  This
           bacterial family of proteins has no known function.
           However, the cis-regulatory yjdF motif, just upstream
           from the gene encoding the proteins for this family, is
           a small non-coding RNA, Rfam:RF01764. The yjdF motif is
           found in many Firmicutes, including Bacillus subtilis.
           In most cases, it resides in potential 5' UTRs of
           homologues of the yjdF gene whose function is unknown.
           However, in Streptococcus thermophilus, a yjdF RNA motif
           is associated with an operon whose protein products
           synthesise nicotinamide adenine dinucleotide (NAD+).
           Also, the S. thermophilus yjdF RNA lacks typical yjdF
           motif consensus features downstream of and including the
           P4 stem. Thus, if yjdF RNAs are riboswitch aptamers, the
           S. thermophilus RNAs might sense a distinct compound
           that structurally resembles the ligand bound by other
           yjdF RNAs. On the other hand, perhaps these RNAs have an
           alternative solution forming a similar binding site, as
           is observed with some SAM riboswitches.
          Length = 132

 Score = 28.8 bits (65), Expect = 3.7
 Identities = 17/62 (27%), Positives = 35/62 (56%), Gaps = 9/62 (14%)

Query: 76  SKEKEKDKVSSKEKE--RKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
           +KE +K  +S+K ++  + E        E+ K+++KK  KEK  + K++ R+  + +KK 
Sbjct: 75  AKEVKKPGISTKAQQALKLEH-------ERNKQEKKKRSKEKKEEEKERKRQLKQQKKKA 127

Query: 134 QK 135
           + 
Sbjct: 128 KH 129


>gnl|CDD|185429 PTZ00074, PTZ00074, 60S ribosomal protein L34; Provisional.
          Length = 135

 Score = 28.9 bits (65), Expect = 3.8
 Identities = 12/30 (40%), Positives = 15/30 (50%)

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSK 87
           KEK K+KK K+K K     K  +K     K
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSKKAAKKKK 135



 Score = 28.5 bits (64), Expect = 4.8
 Identities = 11/30 (36%), Positives = 19/30 (63%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           KEK +++ + K+   +K+KK  KK  K+K 
Sbjct: 106 KEKAKQKKQKKKKKKKKKKKTSKKAAKKKK 135


>gnl|CDD|215412 PLN02769, PLN02769, Probable galacturonosyltransferase.
          Length = 629

 Score = 30.0 bits (68), Expect = 3.8
 Identities = 32/155 (20%), Positives = 55/155 (35%), Gaps = 27/155 (17%)

Query: 111 DKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTP 170
                + ++  K  +    E  ++   +S   + SS  +    +S S+L    P P   P
Sbjct: 63  SHVGSARENGTKKTQNQVSEGVDEILKESG--LTSSKPSDIVISSRSKLKKVFPDPKLNP 120

Query: 171 -TQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
              K           K  ST  DK                  + K  K+ E E+ KS   
Sbjct: 121 LPVKPHSVPVPSSDTKNKSTAIDK------------------ENKGQKADEDENEKS--- 159

Query: 230 PKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELF 264
             C  E G  Y L   +  + +++ + ++ KD+LF
Sbjct: 160 --CELEFGS-YCLWSEEHKEVMKDSIVKRLKDQLF 191


>gnl|CDD|241486 cd13332, FERM_C_JAK1, Janus kinase 1 FERM domain C-lobe.  JAK1 is a
           tyrosine kinase protein essential in signaling type I
           and type II cytokines. It interacts with the gamma chain
           of type I cytokine receptors to elicit signals from the
           IL-2 receptor family, the IL-4 receptor family, the
           gp130 receptor family, ciliary neurotrophic factor
           receptor (CNTF-R), neurotrophin-1 receptor (NNT-1R) and
           Leptin-R. It is also involved in transducing a signal
           by type I (IFN-alpha/beta) and type II (IFN-gamma)
           interferons, and members of the IL-10 family via type II
           cytokine receptors. JAK (also called Just Another
           Kinase) is a family of intracellular, non-receptor
           tyrosine kinases that transduce cytokine-mediated
           signals via the JAK-STAT pathway. The JAK family in
           mammals consists of 4 members: JAK1, JAK2, JAK3 and
           TYK2. JAKs are composed of seven JAK homology (JH)
           domains (JH1-JH7). The C-terminal JH1 domain is the
           main catalytic domain, followed by JH2, which is often
           referred to as a pseudokinase domain, followed by
           JH3-JH4 which is homologous to the SH2 domain, and
           lastly JH5-JH7 which is a FERM domain.  Named after
           Janus, the two-faced Roman god of doorways, JAKs possess
           two near-identical phosphate-transferring domains; one
           which displays the kinase activity (JH1), while the
           other negatively regulates the kinase activity of the
           first (JH2). The FERM domain has a cloverleaf tripart
           structure (FERM_N, FERM_M, FERM_C/N, alpha-, and
           C-lobe/A-lobe,A-lobe, B-lobe, C-lobe/F1, F2, F3). The
           C-lobe/F3 within the FERM domain is part of the PH
           domain family. The FERM domain is found in the
           cytoskeletal-associated proteins such as ezrin, moesin,
           radixin, 4.1R, and merlin. These proteins provide a link
           between the membrane and cytoskeleton and are involved
           in signal transduction pathways. The FERM domain is also
           found in protein tyrosine phosphatases (PTPs), the
           tyrosine kinases FAK and JAK, in addition to other
           proteins involved in signaling. This domain is
           structurally similar to the PH and PTB domains and
           consequently is capable of binding to both peptides and
           phospholipids at different sites.
          Length = 198

 Score = 29.4 bits (66), Expect = 3.9
 Identities = 13/33 (39%), Positives = 20/33 (60%)

Query: 95  KPKESSSEKEKKKEKKDKKEKSHKHKDKDRERD 127
           KP  ++ EK+KK + K  K K  K +DK + R+
Sbjct: 83  KPATTAVEKKKKGKSKKNKLKGKKDEDKKKARE 115


>gnl|CDD|215104 PLN00207, PLN00207, polyribonucleotide nucleotidyltransferase;
           Provisional.
          Length = 891

 Score = 30.2 bits (68), Expect = 4.0
 Identities = 15/63 (23%), Positives = 32/63 (50%)

Query: 59  EKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHK 118
           E   EK  +++   +   K  +K  V++  + R+ ++ +++S+E     +KKD K  +  
Sbjct: 828 EANSEKSSQKQQGGSTKDKAPQKKYVNTSSRPRRAAQAEKNSAENAAVPKKKDYKRATSG 887

Query: 119 HKD 121
            KD
Sbjct: 888 SKD 890


>gnl|CDD|203489 pfam06644, ATP11, ATP11 protein.  This family consists of several
           eukaryotic ATP11 proteins. In Saccharomyces cerevisiae,
           expression of functional F1-ATPase requires two proteins
           encoded by the ATP11 and ATP12 genes. Atp11p is a
           molecular chaperone of the mitochondrial matrix that
           participates in the biogenesis pathway to form F1, the
           catalytic unit of the ATP synthase.
          Length = 250

 Score = 29.6 bits (67), Expect = 4.0
 Identities = 16/75 (21%), Positives = 28/75 (37%), Gaps = 3/75 (4%)

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHP 163
           EK + K  +K K    +  +R +   + K +K+  S+ K         + AS  + +  P
Sbjct: 2   EKYRSKLLQKAKESGLEFIERLKKALKDKIEKKEFSAKK---PPTGPSKQASKFKTLKPP 58

Query: 164 PPPAPTPTQKSPVKT 178
            P         P K 
Sbjct: 59  KPADKKKPFDKPFKP 73


>gnl|CDD|239570 cd03488, Topoisomer_IB_N_htopoI_like, Topoisomer_IB_N_htopoI_like :
           N-terminal DNA binding fragment found in eukaryotic DNA
           topoisomerase (topo) IB proteins similar to the
           monomeric yeast and human topo I.  Topo I enzymes are
           divided into:  topo type IA (bacterial) and type IB
           (eukaryotic). Topo I relaxes superhelical tension in
           duplex DNA by creating a single-strand nick; the broken
           strand can then rotate around the unbroken strand to
           remove DNA supercoils, and the nick is religated,
           liberating topo I. These enzymes regulate the
           topological changes that accompany DNA replication,
           transcription and other nuclear processes.  Human topo I
           is the target of a diverse set of anticancer drugs
           including camptothecins (CPTs). CPTs bind to the topo
           I-DNA complex and inhibit religation of the
           single-strand nick, resulting in the accumulation of
           topo I-DNA adducts.  This family may represent more than
           one structural domain.
          Length = 215

 Score = 29.6 bits (67), Expect = 4.0
 Identities = 14/42 (33%), Positives = 23/42 (54%), Gaps = 9/42 (21%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHK------HKDK 122
            +KE K++  KE   EK+  K +K+K E+ +       HK+K
Sbjct: 96  AQKEEKKAMSKE---EKKAIKAEKEKLEEEYGFCILDGHKEK 134


>gnl|CDD|216257 pfam01034, Syndecan, Syndecan domain.  Syndecans are transmembrane
           heparan sulfate proteoglycans which are implicated in
           the binding of extracellular matrix components and
           growth factors.
          Length = 207

 Score = 29.3 bits (66), Expect = 4.0
 Identities = 23/99 (23%), Positives = 35/99 (35%), Gaps = 12/99 (12%)

Query: 7   SSSSSSSAHPSPHKNKDKD--SSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK 64
            S S S A PS  ++ +    S+  P  +T+SS+ +   +++S   K             
Sbjct: 53  YSGSGSGATPSDDEDSEPVTTSATPPKLTTTSSSPSNDTTTASTSTKTSPTVSTTVTTTT 112

Query: 65  KDKEKDKSAV---SSKEKEKDKVSSKEK-------ERKE 93
              E D        S E   +  SS          ERKE
Sbjct: 113 SPSETDTEEATTTVSTETPTEGGSSAATDPSKNLLERKE 151


>gnl|CDD|215544 PLN03029, PLN03029, type-a response regulator protein; Provisional.
          Length = 222

 Score = 29.6 bits (66), Expect = 4.0
 Identities = 11/58 (18%), Positives = 37/58 (63%)

Query: 52  KDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
           K K +++++E ++K ++ ++S + S+++E+     + + + + +P++ ++ K K  E+
Sbjct: 147 KTKSKNQKQENQEKQEKLEESEIQSEKQEQPSQQPQSQPQPQQQPQQPNNNKRKAMEE 204


>gnl|CDD|219500 pfam07655, Secretin_N_2, Secretin N-terminal domain.  This is a
          short domain found in bacterial type II/III secretory
          system proteins. The architecture of these proteins
          suggests that this family may be functionally analogous
          to pfam03958.
          Length = 95

 Score = 28.1 bits (63), Expect = 4.1
 Identities = 17/44 (38%), Positives = 22/44 (50%), Gaps = 6/44 (13%)

Query: 4  SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSS 47
          SV S S SSS       +    SS+  S   SSS+S+  +SSS 
Sbjct: 20 SVTSGSVSSSG------SNSSSSSSNSSNGGSSSSSSSGDSSSG 57


>gnl|CDD|220614 pfam10174, Cast, RIM-binding protein of the cytomatrix active zone.
            This is a family of proteins that form part of the CAZ
           (cytomatrix at the active zone) complex which is
           involved in determining the site of synaptic vesicle
           fusion. The C-terminus is a PDZ-binding motif that binds
           directly to RIM (a small G protein Rab-3A effector). The
           family also contains four coiled-coil domains.
          Length = 774

 Score = 30.0 bits (67), Expect = 4.1
 Identities = 18/88 (20%), Positives = 42/88 (47%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
           D +D+    E++     K+ +    + + KE+     KE+ R       + +  EK ++ 
Sbjct: 382 DMRDRYEKTERKLRVLQKKIENLQETFRRKERRLKEEKERLRSLQTDTNTDTALEKLEKA 441

Query: 110 KDKKEKSHKHKDKDRERDKDEKKEQKES 137
             +KE+  +   + R+RD+  ++E+ E+
Sbjct: 442 LAEKERIIERLKEQRDRDERYEQEEFET 469


>gnl|CDD|223649 COG0576, GrpE, Molecular chaperone GrpE (heat shock protein)
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 193

 Score = 29.2 bits (66), Expect = 4.2
 Identities = 13/78 (16%), Positives = 36/78 (46%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
             DK+ K  + + E+ ++ ++ ++     +E E++    +E++       +    K+K  
Sbjct: 1   MSDKEQKTEEPDAEETEEAEKSEEEEAEEEEPEEENELEEEQQEIAELEAQLEELKDKYL 60

Query: 108 EKKDKKEKSHKHKDKDRE 125
             + + E   K  +++RE
Sbjct: 61  RAQAEFENLRKRTERERE 78


>gnl|CDD|223599 COG0525, ValS, Valyl-tRNA synthetase [Translation, ribosomal
           structure and biogenesis].
          Length = 877

 Score = 29.9 bits (68), Expect = 4.2
 Identities = 18/69 (26%), Positives = 28/69 (40%), Gaps = 8/69 (11%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSK-EKERKESKPKESSSEKEKK 106
                  D   E  + +K+ EK        EKE D++  K   E   +K  E   EKEK+
Sbjct: 804 LPLAGLIDLAAELARLEKELEK-------LEKEIDRIEKKLSNEGFVAKAPEEVVEKEKE 856

Query: 107 KEKKDKKEK 115
           K  + + + 
Sbjct: 857 KLAEYQVKL 865


>gnl|CDD|204935 pfam12474, PKK, Polo kinase kinase.  This domain family is found in
           eukaryotes, and is approximately 140 amino acids in
           length. The family is found in association with
           pfam00069. Polo-like kinase 1 (Plx1) is essential during
           mitosis for the activation of Cdc25C, for spindle
           assembly, and for cyclin B degradation. This family is
           Polo kinase kinase (PKK) which phosphorylates Polo
           kinase and Polo-like kinase to activate them. PKK is a
           serine/threonine kinase.
          Length = 142

 Score = 28.8 bits (65), Expect = 4.3
 Identities = 22/92 (23%), Positives = 50/92 (54%), Gaps = 2/92 (2%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK 106
               + +K+ ++ + ++K+  EK +   + + +   K    E++ +    KE  S K +K
Sbjct: 19  QLLKRHEKELEQLERQQKRTIEKLEQRQTQELRRLPKRIRAEQKTRLKMFKE--SLKIEK 76

Query: 107 KEKKDKKEKSHKHKDKDRERDKDEKKEQKESK 138
           KE K + EK  + ++++++R K EK+EQ++  
Sbjct: 77  KELKQEVEKLPRFQEQEKKRMKAEKEEQEQKH 108


>gnl|CDD|223624 COG0550, TopA, Topoisomerase IA [DNA replication, recombination,
           and repair].
          Length = 570

 Score = 29.9 bits (68), Expect = 4.3
 Identities = 13/44 (29%), Positives = 24/44 (54%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPK 97
            + + +KE   KDK++ +  V+  + +  KV S EK+ K+  P 
Sbjct: 214 TEIEGKKEGRLKDKDEAEEIVNKLKGKPAKVVSVEKKPKKRSPP 257


>gnl|CDD|178744 PLN03205, PLN03205, ATR interacting protein; Provisional.
          Length = 652

 Score = 30.1 bits (67), Expect = 4.5
 Identities = 31/131 (23%), Positives = 63/131 (48%), Gaps = 10/131 (7%)

Query: 31  STSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKV---SSK 87
           S ST  + + P + +SS +   D ++D E ++ KK+ E+    +   E+E  ++    +K
Sbjct: 108 SNSTVVTAAKPISPNSSNR-CCDSEKDLEIDRLKKELERVSKQLLDVEQECSQLKKGKNK 166

Query: 88  EKERK-----ESKPKESSSEKEKKKE-KKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           E E K     ++K + S+    K+ + + D    S  H++ D     D+KK  K +   +
Sbjct: 167 EMESKNLCADDNKGQCSTVHASKRIDLEPDVATSSVIHRENDSRMALDDKKSFKTAGVQA 226

Query: 142 KIVSSSHNSKE 152
            + + +  SK+
Sbjct: 227 DLANHADLSKK 237


>gnl|CDD|240235 PTZ00032, PTZ00032, 60S ribosomal protein L18; Provisional.
          Length = 211

 Score = 29.4 bits (66), Expect = 4.5
 Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 12/92 (13%)

Query: 32  TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER 91
           T+ +    NP +  SS   KK         K K    K           + K+    K R
Sbjct: 19  TNKAVYPPNPLSLFSSPNRKKSAPEQVPTGKNKLLLTK-----------RSKLKGIPKPR 67

Query: 92  KESKPKESSSEKEKKKEKKDKKEKSHKHKDKD 123
           K  K     +E  ++K ++++     K  DKD
Sbjct: 68  KLHKHGF-WAEIFEEKVEREELGNPCKDLDKD 98


>gnl|CDD|181632 PRK09060, PRK09060, dihydroorotase; Validated.
          Length = 444

 Score = 29.9 bits (68), Expect = 4.6
 Identities = 11/25 (44%), Positives = 15/25 (60%)

Query: 329 IQATRELLDLATRCSQLKAILHVST 353
           + ATR L+ LA    +   +LHVST
Sbjct: 213 LLATRRLVRLARETGRRIHVLHVST 237


>gnl|CDD|235132 PRK03577, PRK03577, acid shock protein precursor; Provisional.
          Length = 102

 Score = 28.1 bits (62), Expect = 4.6
 Identities = 22/74 (29%), Positives = 31/74 (41%), Gaps = 2/74 (2%)

Query: 163 PPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK--E 220
              PA T T  +P KT    K++       K    K   K+K   K   ++  A  K  +
Sbjct: 27  TAAPAATTTTAAPAKTTHHHKKQHKKAPEQKAQAAKKHHKNKKEQKAPEQKAQAAKKHAK 86

Query: 221 KESHKSSAGPKCYP 234
           K SHK++A P   P
Sbjct: 87  KHSHKTAAKPAAQP 100


>gnl|CDD|202833 pfam03962, Mnd1, Mnd1 family.  This family of proteins includes
           MND1 from S. cerevisiae. The mnd1 protein forms a
           complex with hop2 to promote homologous chromosome
           pairing and meiotic double-strand break repair.
          Length = 188

 Score = 29.1 bits (66), Expect = 4.8
 Identities = 22/96 (22%), Positives = 49/96 (51%), Gaps = 8/96 (8%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK--E 104
           S+   K K R ++ +KE ++ ++         + + ++   +K R+E++ +    E+  +
Sbjct: 61  SQALNKLKTRLEKLKKELEELKQRI------AELQAQIEKLKKGREETEERTELLEELKQ 114

Query: 105 KKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
            +KE K  K +  K++  D ER +  K+E K +K +
Sbjct: 115 LEKELKKLKAELEKYEKNDPERIEKLKEETKVAKEA 150


>gnl|CDD|215584 PLN03113, PLN03113, DNA ligase 1; Provisional.
          Length = 744

 Score = 30.0 bits (67), Expect = 4.8
 Identities = 17/105 (16%), Positives = 38/105 (36%), Gaps = 6/105 (5%)

Query: 2   AYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSS------TSNPTNSSSSKKDKKDKD 55
           A + K    + S   SP K K  ++       T+ S      T +     S     +   
Sbjct: 17  AAAKKKQPQTQSQSSSPKKRKIGETQDANLGKTNVSEGTLPKTEDTIEPKSDSAKPRSST 76

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS 100
               ++ +   K+    +   K++ K K+   +K+  +  P++ +
Sbjct: 77  SSIAEDSKTGTKKAQTLSKPKKDEMKSKIGLLKKKPNDFDPEKVA 121


>gnl|CDD|152468 pfam12033, DUF3519, Protein of unknown function (DUF3519).  This
           family of proteins is functionally uncharacterized. This
           protein is found in bacteria. Proteins in this family
           are typically between 117 and 1154 amino acids in length.
           This protein has a single completely conserved residue Q
           that may be functionally important.
          Length = 104

 Score = 28.0 bits (62), Expect = 4.9
 Identities = 24/86 (27%), Positives = 35/86 (40%), Gaps = 10/86 (11%)

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPP 165
           K  KK +K K+H +     E+  D  +    S   +K  +S  NS EP          P 
Sbjct: 26  KDNKKGEKLKNH-YVITGFEKRLDNSESLYTSPIITKHETSPLNSNEPN---------PT 75

Query: 166 PAPTPTQKSPVKTKEKEKEKESSTTH 191
           P P  +Q+  +KT E   E     T+
Sbjct: 76  PKPLTSQEDLLKTSENLNETTPEPTN 101


>gnl|CDD|217829 pfam03985, Paf1, Paf1.  Members of this family are components of
           the RNA polymerase II associated Paf1 complex. The Paf1
           complex functions during the elongation phase of
           transcription in conjunction with Spt4-Spt5 and
           Spt16-Pob3i.
          Length = 431

 Score = 29.7 bits (67), Expect = 5.0
 Identities = 20/80 (25%), Positives = 38/80 (47%), Gaps = 4/80 (5%)

Query: 41  PTNSSSSKKDKKDKDR----DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP 96
             ++  SK   K + R    D E+  E +D+E+++ +   +E+E +    +  + +E   
Sbjct: 346 NPSTKESKMRDKRRARLDPIDFEEVDEDEDEEEEQRSDEHEEEEGEDSEEEGSQSREDGS 405

Query: 97  KESSSEKEKKKEKKDKKEKS 116
            ESSS+     E K  KE +
Sbjct: 406 SESSSDVGSDSESKADKESA 425


>gnl|CDD|220178 pfam09321, DUF1978, Domain of unknown function (DUF1978).  Members
           of this family are found in various hypothetical
           proteins produced by the bacterium Chlamydia pneumoniae.
           Their exact function has not, as yet, been identified.
          Length = 241

 Score = 29.4 bits (66), Expect = 5.1
 Identities = 15/66 (22%), Positives = 30/66 (45%)

Query: 71  KSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDE 130
            S V   E+ + K S K  ER   K  +   ++  K+ KK  ++   + + + ++    E
Sbjct: 55  FSEVDRDEQWEKKTSLKHLERTYEKALDRLEKQSSKENKKVLQDAQREFERQSQDFYDKE 114

Query: 131 KKEQKE 136
            +E +E
Sbjct: 115 IEEVEE 120


>gnl|CDD|236267 PRK08451, PRK08451, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 535

 Score = 29.6 bits (67), Expect = 5.1
 Identities = 16/73 (21%), Positives = 34/73 (46%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKE 108
           K  K K +   K  E   +   +   S+++K+ D  S+ E   +E+K  +   ++ + KE
Sbjct: 446 KGAKIKIQKALKSAENPLQSLKEFKPSNEKKKIDTESTAEMLEEEAKKDDEEVQETQLKE 505

Query: 109 KKDKKEKSHKHKD 121
             + +E    ++D
Sbjct: 506 ATELQEFMINNED 518



 Score = 28.8 bits (65), Expect = 8.8
 Identities = 13/67 (19%), Positives = 25/67 (37%)

Query: 160 ISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSK 219
           I      A  P Q         EK+K  + +  +  + + KK D+   +T  KE     +
Sbjct: 452 IQKALKSAENPLQSLKEFKPSNEKKKIDTESTAEMLEEEAKKDDEEVQETQLKEATELQE 511

Query: 220 EKESHKS 226
              +++ 
Sbjct: 512 FMINNED 518


>gnl|CDD|112890 pfam04094, DUF390, Protein of unknown function (DUF390).  This is a
           family of long proteins currently only found in the rice
           genome. They have no known function. However they may be
           some kind of transposable element.
          Length = 843

 Score = 29.8 bits (66), Expect = 5.1
 Identities = 27/166 (16%), Positives = 67/166 (40%), Gaps = 20/166 (12%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
           PS  + S  S   + ++++  +++ DR +  ++ ++ +E  + A  +++ E+   +++E+
Sbjct: 216 PSRHSKSGQSEAEDPAAAEARRREADRREAADRLREAEEAAQDAARARQAEE---AAREE 272

Query: 90  ERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVS---- 145
             +  + +E++ E E      +    S   +D+              + ++S+       
Sbjct: 273 AARARQAEEAAREAEAAFRADEAAATSEAARDEAAGAQLAPDPSGDAAATTSEAAGDEAA 332

Query: 146 --------SSHNSKEPASGS-----QLISHPPPPAPTPTQKSPVKT 178
                   S     EPA G        I  P   AP+P +  P+ +
Sbjct: 333 GALLGPDPSGDAQDEPAPGGAPDSGTSIGGPSRAAPSPRRLFPLPS 378


>gnl|CDD|206228 pfam14058, PcfK, PcfK-like protein.  The PcfK-like protein family
           includes the Enterococcus faecalis PcfK protein, which
           is functionally uncharacterized. This family of proteins
           is found in bacteria and viruses. Proteins in this
           family are typically between 137 and 257 amino acids in
           length. There are two completely conserved residues (D
           and L) that may be functionally important.
          Length = 136

 Score = 28.5 bits (64), Expect = 5.1
 Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 3/60 (5%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           D++  K  K        V+     +     K + RKE+       E  K +++  K +K+
Sbjct: 68  DEDDIKVGKPINCR---VTVNHTVELTEEEKAEARKEALKAYQQEELRKIQKRSKKSKKA 124


>gnl|CDD|220267 pfam09494, Slx4, Slx4 endonuclease.  The Slx4 protein is a
           heteromeric structure-specific endonuclease found in
           fungi. Slx4 with Slx1 acts as a nuclease on branched DNA
           substrates, particularly simple-Y, 5'-flap, or
           replication fork structures by cleaving the strand
           bearing the 5' non-homologous arm at the branch junction
           and thus generating ligatable nicked products from
           5'-flap or replication fork substrates.
          Length = 627

 Score = 29.6 bits (66), Expect = 5.2
 Identities = 29/130 (22%), Positives = 48/130 (36%), Gaps = 13/130 (10%)

Query: 73  AVSSKEKEKDKVSSKEKERKESKPK----ESSSEKEKKKEKKDKKE--------KSHKHK 120
            VS K   K K   K K RK +K K    +S +   ++  + D+          K  K  
Sbjct: 65  NVSGKRVPKKKKIKKPKLRKRTKRKNKKIKSLTAFNEENFETDRAPSLLSYLSGKQSKVN 124

Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKE 180
           D  +  +  +K +   S  S+   S+ ++  E     +LI    P       KS +K   
Sbjct: 125 DILKRLESSKKIKNSRSSESTFETSALYSEDEWIDIVKLIRLRFPKLSESDLKS-LKNYI 183

Query: 181 KEKEKESSTT 190
              EK+  + 
Sbjct: 184 YGAEKQEESE 193


>gnl|CDD|235370 PRK05244, PRK05244, Der GTPase activator; Provisional.
          Length = 177

 Score = 28.7 bits (65), Expect = 5.2
 Identities = 26/114 (22%), Positives = 45/114 (39%), Gaps = 15/114 (13%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS-HKHKDKDRERDKDEKKE 133
             K+    +     K +K+++ +  +  +E+K++KK K  KS  +H          E   
Sbjct: 1   KKKKSSPKRSKGMAKSKKKTREELDAEARERKRKKKHKGLKSGSRHN---------EGNT 51

Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLI--SHPPPPAPTPTQKSPVKTKEKEKEK 185
           Q + K  ++       SK+P     L       P    P  K P  + E+E EK
Sbjct: 52  QSKGKGQAQKKDPRIGSKKPI---PLGVEEKVKPKKKKPKSKKPKLSPEQELEK 102


>gnl|CDD|218391 pfam05029, TIMELESS_C, Timeless protein C terminal region.  The
           timeless (tim) gene is essential for circadian function
           in Drosophila. Putative homologues of Drosophila tim
           have been identified in both mice and humans (mTim and
           hTIM, respectively). Mammalian TIM is not the true
           orthologue of Drosophila TIM, but is the likely
           orthologue of a fly gene, timeout (also called tim-2).
           mTim has been shown to be essential for embryonic
           development, but does not have substantiated circadian
           function. Some family members contain a SANT domain in
           this region.
          Length = 507

 Score = 29.7 bits (66), Expect = 5.3
 Identities = 26/192 (13%), Positives = 71/192 (36%), Gaps = 11/192 (5%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESS---SEKEK 105
           K ++ K + K+   E+  ++  +     +E+  +    + ++R  +   E S   S +  
Sbjct: 227 KKRRKKLKPKQPNGEESGEDDFQEDPEEEEQLPESKPEETEKRVSAFQVEGSTLISAENL 286

Query: 106 KKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQ------- 158
           +++ K +K        +       + +E+ E   +  +V  +  ++E     Q       
Sbjct: 287 RQQLKQEKTSWPLLWLQSCLIRAADDREEDECDQAVPLVPLTEENEEAMENEQFQRLLKA 346

Query: 159 LISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKD-AK 217
           L   PP        + P K    +  + +++   +  + + + KD    +   + +    
Sbjct: 347 LGLRPPRSGQEGFWRIPAKLSSTQLRRRAASLSGEEEEPEDELKDDVDGEQADESEHETL 406

Query: 218 SKEKESHKSSAG 229
           +  K + +  AG
Sbjct: 407 ALRKNARQRKAG 418


>gnl|CDD|234368 TIGR03835, termin_org_DnaJ, terminal organelle assembly protein
           TopJ.  This model describes TopJ (MG_200, CbpA), a DnaJ
           homolog and probable assembly protein of the Mycoplasma
           terminal organelle. The terminal organelle is involved
           in both cytadherence and gliding motility [Cellular
           processes, Chemotaxis and motility].
          Length = 871

 Score = 29.8 bits (66), Expect = 5.5
 Identities = 23/96 (23%), Positives = 35/96 (36%), Gaps = 7/96 (7%)

Query: 177 KTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAGPKCYPEV 236
           +   K  E ++ T  D  SK K KKK K       K++    + +E     A      E 
Sbjct: 89  EEINKSGEFDNITDDDTPSKKKKKKKKKGWFWAKSKQESKTIETEEIIDVGASVNQANET 148

Query: 237 GGIYILLRSKKNKTVQ-------ERLAEQFKDELFD 265
                 L  +  ++V        ERL +Q K+  F 
Sbjct: 149 RLFDDTLDDQLEESVSTQSTDDGERLFDQNKEPSFT 184


>gnl|CDD|220237 pfam09428, DUF2011, Fungal protein of unknown function (DUF2011).
           This is a family of fungal proteins whose function is
           unknown.
          Length = 130

 Score = 28.4 bits (64), Expect = 5.5
 Identities = 13/59 (22%), Positives = 29/59 (49%)

Query: 68  EKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRER 126
           EK+      K+K++ +   K +     + + +    EK+K  +  +EK  K + K++E+
Sbjct: 72  EKELLREKEKKKKRKRPGKKRRIALRLRRERTKERAEKEKRTRKNREKKFKRRQKEKEK 130


>gnl|CDD|178307 PLN02705, PLN02705, beta-amylase.
          Length = 681

 Score = 29.5 bits (66), Expect = 5.5
 Identities = 15/74 (20%), Positives = 34/74 (45%), Gaps = 8/74 (10%)

Query: 11  SSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKD-------KKDKDRDKEKEKE 63
            +     P   + +  +A  + +  + T N  N+ +           K  ++R+KEKE+ 
Sbjct: 29  PNRNRNQPQSRRPRGFAATAAAAAIAPTENDVNNGNISSGGGGGGGGKGKREREKEKERT 88

Query: 64  KKDKEKDKSAVSSK 77
           K  +E+ + A++S+
Sbjct: 89  KL-RERHRRAITSR 101


>gnl|CDD|220774 pfam10477, EIF4E-T, Nucleocytoplasmic shuttling protein for mRNA
           cap-binding EIF4E.  EIF4E-T is the transporter protein
           for shuttling the mRNA cap-binding protein EIF4E
           protein, targeting it for nuclear import. EIF4E-T
           contains several key binding domains including two
           functional leucine-rich NESs (nuclear export signals)
           between residues 438-447 and 613-638 in the human
           protein. The other two binding domains are an
           EIF4E-binding site, between residues 27-42 in Q9EST3,
           and a bipartite NLS (nuclear localisation signals)
           between 194-211, and these lie in family EIF4E-T_N.
           EIF4E is the eukaryotic translation initiation factor 4E
           that is the rate-limiting factor for cap-dependent
           translation initiation.
          Length = 520

 Score = 29.5 bits (65), Expect = 5.6
 Identities = 42/229 (18%), Positives = 76/229 (33%), Gaps = 23/229 (10%)

Query: 16  PSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEK------KDKEK 69
           P  ++    D    P    SS   +  +SS S++ KKD D D+   K +      + KE 
Sbjct: 39  PPSYRRGKSDGVWDPEKWNSSLYPSSGSSSPSERLKKDSDTDRGSLKRRIPDPRERVKED 98

Query: 70  DKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKH---------K 120
           D   V S ++           R  S+    S     ++     +    +          K
Sbjct: 99  DLDVVLSPQRRSFGGGCHVTARASSENDNESLRLLGERRIGSGRIMPSRGFERDFRGPRK 158

Query: 121 DKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT-- 178
           D++ ER +D +++ K+ +   +   S     E            P   +    S ++T  
Sbjct: 159 DRNPERSRDRERDYKDKRFRREFGDSKRVFSESRRNDSYTIEEEPEWFSAGPTSQLETIE 218

Query: 179 ------KEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKEKDAKSKEK 221
                 K  E++ ++S    K    K  KK     K    E +     +
Sbjct: 219 LIGFDDKILEEDDKTSNGDGKQKGRKRTKKRTASVKEGHVECNGGVSLE 267


>gnl|CDD|219461 pfam07543, PGA2, Protein trafficking PGA2.  A Saccharomyces
           cerevisiae member of this family (PGA2) is an ER protein
           which has been implicated in protein trafficking.
          Length = 139

 Score = 28.2 bits (63), Expect = 5.8
 Identities = 17/96 (17%), Positives = 39/96 (40%), Gaps = 13/96 (13%)

Query: 54  KDRDKEKEKEKKDKEKDKSA---VSSKE-----KEKDKVSSKEKERKESKPKESSSE--- 102
           K ++KE EKE+ ++E+ +     +S                 + E  E      S+    
Sbjct: 40  KAQEKEHEKERAEREEAREKKAKISPNALRGGATAGHGEEDTDDEEDEEDFATPSAVPQW 99

Query: 103 --KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
             K +K+++K  ++     +    ++  DE ++ +E
Sbjct: 100 GKKARKRQRKVIRKLLEAEEQLREDQYDDEDEDIEE 135


>gnl|CDD|218941 pfam06217, GAGA_bind, GAGA binding protein-like family.  This
           family includes gbp, a protein from soybean that binds to
           GAGA element dinucleotide repeat DNA. It seems likely
           that this domain mediates DNA binding. This putative
           domain contains several conserved cysteines and a
           histidine suggesting this may be a zinc-binding DNA
           interaction domain.
          Length = 301

 Score = 29.1 bits (65), Expect = 5.9
 Identities = 13/50 (26%), Positives = 20/50 (40%)

Query: 88  EKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKES 137
            KE K+ K  +S    +  K KK KK+ S  ++           K   +S
Sbjct: 144 AKEVKKPKKGQSPKVPKAPKPKKPKKKGSVSNRSVKMPGIDPRSKPDWKS 193


>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
          Length = 663

 Score = 29.6 bits (66), Expect = 6.0
 Identities = 22/99 (22%), Positives = 34/99 (34%), Gaps = 12/99 (12%)

Query: 89  KERKESKPKESSSE--KEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSS 146
            ERK  +  E   E  +  K  KK K+E+    K+ +    K E K+  ES+        
Sbjct: 464 HERKRKRGGELKEELIETLKLVKKLKEEQESLAKELEATAHKSEIKKIAESEFK------ 517

Query: 147 SHNSKEPASGSQLISHPP-PPAPTPTQKSPVKTKEKEKE 184
              +    + +  I       A  P  K      + E E
Sbjct: 518 ---NAGAKTAAANIEPNCSADAAAPATKRARPETKTELE 553


>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator.  In
           eukaryotes, this family of proteins induces
           mitochondrial fission.
          Length = 248

 Score = 28.9 bits (65), Expect = 6.0
 Identities = 20/87 (22%), Positives = 31/87 (35%), Gaps = 17/87 (19%)

Query: 133 EQKESKSSSKIVSSSHNSKEPASGSQLIS----------HPPPPAPTPTQKSPVKTKE-- 180
           EQ  S +S  + SS  +    ++ S  IS           PPPP P P     ++     
Sbjct: 146 EQSNSTTSDLL-SSDESVPSSSTTSFPISPPTEEPVLEVPPPPPPPPPPPPPSLQQSTSA 204

Query: 181 ----KEKEKESSTTHDKHSKHKHKKKD 203
               KE++ + S         K K  +
Sbjct: 205 IDLIKERKGQRSAAGKTLVLSKPKSPE 231


>gnl|CDD|191382 pfam05837, CENP-H, Centromere protein H (CENP-H).  This family
           consists of several eukaryotic centromere protein H
           (CENP-H) sequences. The macromolecular
           centromere-kinetochore complex plays a critical role in
           sister chromatid separation, but its complete protein
           composition as well as its precise dynamic function
           during mitosis has not yet been clearly determined.
           CENP-H contains a coiled-coil structure and a nuclear
           localisation signal. CENP-H is specifically and
           constitutively localised in kinetochores throughout the
           cell cycle. CENP-H may play a role in kinetochore
           organisation and function throughout the cell cycle.
           This is the C-terminus of the region, which is conserved
           from fungi to humans.
          Length = 106

 Score = 27.7 bits (62), Expect = 6.1
 Identities = 15/65 (23%), Positives = 34/65 (52%), Gaps = 2/65 (3%)

Query: 82  DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           +++S  EKER + K K      E  +  K K+  S +   + +E+ +  + + K+SK+  
Sbjct: 17  EELSDLEKERLQLKQKNVELALELLELTKKKE--SWREDMELKEQLEKLEADLKKSKAKW 74

Query: 142 KIVSS 146
           +++ +
Sbjct: 75  EVMKN 79


>gnl|CDD|224510 COG1594, RPB9, DNA-directed RNA polymerase, subunit
          M/Transcription elongation factor TFIIS
          [Transcription].
          Length = 113

 Score = 27.8 bits (62), Expect = 6.4
 Identities = 12/53 (22%), Positives = 17/53 (32%)

Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKE 93
                 +   K   R   KE  +K KE         +  K   ++KEK  K 
Sbjct: 26 RKCGYEEEASNKKVYRYSVKEAVEKKKEVVLVVEDETQGAKTLPTAKEKCPKC 78


>gnl|CDD|185616 PTZ00436, PTZ00436, 60S ribosomal protein L19-like protein;
           Provisional.
          Length = 357

 Score = 29.1 bits (64), Expect = 6.4
 Identities = 21/90 (23%), Positives = 42/90 (46%), Gaps = 1/90 (1%)

Query: 89  KERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH 148
           K + E K +   +E+   K  KD++ +    K + R+R+KD ++ ++E  +++       
Sbjct: 144 KVKNEKKKERQLAEQLAAKRLKDEQHRHKARKQELRKREKDRERARREDAAAAAAAKQKA 203

Query: 149 NSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
            +K+ A+ S   S     AP     +P K 
Sbjct: 204 AAKKAAAPSGKKS-AKAAAPAKAAAAPAKA 232


>gnl|CDD|240413 PTZ00423, PTZ00423, glideosome-associated protein 45; Provisional.
          Length = 193

 Score = 28.9 bits (64), Expect = 6.5
 Identities = 34/120 (28%), Positives = 55/120 (45%), Gaps = 15/120 (12%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
             +  K+ K +D D+  E+EK  KE     V    ++K +   +E E +  +P E   E 
Sbjct: 6   RKNKAKEPKRRDIDELAEREKLKKE-----VEEIPEQKPEDIVEELEDQPEEPPEQEEEN 60

Query: 104 EKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSH----NSKEPASGSQL 159
           E++K K++      ++K        DEK      +S+S I S SH     S +  +GSQL
Sbjct: 61  EEQKPKEEIDYPIQENK------SFDEKNLDDLERSNSDIYSESHKYDNASDKLETGSQL 114


>gnl|CDD|173502 PTZ00266, PTZ00266, NIMA-related protein kinase; Provisional.
          Length = 1021

 Score = 29.3 bits (65), Expect = 6.6
 Identities = 36/174 (20%), Positives = 72/174 (41%), Gaps = 37/174 (21%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKK------------- 52
           +SSS +S    +   N    +S     S +S  S P     SK D+K             
Sbjct: 369 RSSSCASRQSANNVTNITSITSVTSVASVASVASVP-----SKDDRKYPQDGATHCHAVN 423

Query: 53  -------DKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
                  DKD  +    EK++  +   A+  K  EK ++   E+E +E   +E     E+
Sbjct: 424 GHYGGRVDKDHAERARIEKENAHR--KALEMKILEKKRIERLEREERERLERERMERIER 481

Query: 106 KKEKKDKKEKSHKHKDK-DRER---------DKDEKKEQKESKSSSKIVSSSHN 149
           ++ ++++ E+    +D+ +R+R         D+ E+   ++++ +S  +    N
Sbjct: 482 ERLERERLERERLERDRLERDRLDRLERERVDRLERDRLEKARRNSYFLKGMEN 535


>gnl|CDD|215590 PLN03123, PLN03123, poly [ADP-ribose] polymerase; Provisional.
          Length = 981

 Score = 29.4 bits (66), Expect = 6.6
 Identities = 17/64 (26%), Positives = 26/64 (40%)

Query: 30  PSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
                  +      S    K KKD   D + +K K D++   S  +S++K  D  S  E 
Sbjct: 188 SEAKEEKAEERKQESKKGAKRKKDASGDDKSKKAKTDRDVSTSTAASQKKSSDLESKLEA 247

Query: 90  ERKE 93
           + KE
Sbjct: 248 QSKE 251


>gnl|CDD|147982 pfam06112, Herpes_capsid, Gammaherpesvirus capsid protein.  This
           family consists of several Gammaherpesvirus capsid
           proteins. The exact function of this family is unknown.
          Length = 148

 Score = 28.3 bits (63), Expect = 6.6
 Identities = 17/59 (28%), Positives = 26/59 (44%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEK 62
           S  S+SSSS++      N+   SS    +S   S S+ ++ S S     D      K+K
Sbjct: 90  SALSASSSSASGVPGGANQLSGSSGSALSSGPGSLSSSSSLSGSGAGAGDTAPSSSKKK 148


>gnl|CDD|220654 pfam10254, Pacs-1, PACS-1 cytosolic sorting protein.  PACS-1 is a
           cytosolic sorting protein that directs the localisation
           of membrane proteins in the trans-Golgi network
           (TGN)/endosomal system. PACS-1 connects the clathrin
           adaptor AP-1 to acidic cluster sorting motifs contained
           in the cytoplasmic domain of cargo proteins such as
           furin, the cation-independent mannose-6-phosphate
           receptor and in viral proteins such as human
           immunodeficiency virus type 1 Nef.
          Length = 413

 Score = 29.4 bits (66), Expect = 6.6
 Identities = 33/145 (22%), Positives = 50/145 (34%), Gaps = 25/145 (17%)

Query: 7   SSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRD--------- 57
           S    SS+ PS      + S+  P    S S S+  +++ S  D      D         
Sbjct: 228 SLFVLSSSPPSSSGASKEASATPPP---SPSMSSSLSAAGSPVDAIGLQVDYWPAARPGE 284

Query: 58  KEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE-----KKKEKKDK 112
           + KE  K+D        S K   K    S +  R  S  +E+            KEK  K
Sbjct: 285 RRKEGSKRDA-------SGKNTLKSTFRSLQVSRLPSSGQEAQMTNTMSMTVVTKEKNKK 337

Query: 113 KEKSHKHKDKDRERDKDEKKEQKES 137
                  K K +E++ + K +  E 
Sbjct: 338 VPVMFLGK-KPKEKEVESKSQCIEG 361


>gnl|CDD|224016 COG1091, RfbD, dTDP-4-dehydrorhamnose reductase [Cell envelope
           biogenesis, outer membrane].
          Length = 281

 Score = 29.2 bits (66), Expect = 6.6
 Identities = 24/108 (22%), Positives = 40/108 (37%), Gaps = 16/108 (14%)

Query: 267 LKNEQADILQRKVHIISGDISQPSLGISSHD---QQFIQHHIHVIIHAAASLRFDELIQD 323
           L  E    L  +  +I+    +  L I+  D   +   +    V+I+AAA    D+   +
Sbjct: 12  LGTELRRALPGEFEVIA--TDRAELDITDPDAVLEVIRETRPDVVINAAAYTAVDKAESE 69

Query: 324 ---AFTLNIQATRELLDLATRCSQLKAILHVSTLYT------HSYRED 362
              AF +N      L   A        ++H+ST Y         Y+E 
Sbjct: 70  PELAFAVNATGAENLARAAAEVGAR--LVHISTDYVFDGEKGGPYKET 115


>gnl|CDD|167284 PRK01833, tatA, twin arginine translocase protein A; Provisional.
          Length = 74

 Score = 27.2 bits (60), Expect = 6.7
 Identities = 15/38 (39%), Positives = 22/38 (57%), Gaps = 1/38 (2%)

Query: 46 SSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDK 83
          S K  KK    DK K+ E + K + K A S+++K K+K
Sbjct: 35 SVKGFKKAMADDKPKDAEFE-KVEAKEAASTEQKAKEK 71


>gnl|CDD|216860 pfam02063, MARCKS, MARCKS family. 
          Length = 296

 Score = 29.0 bits (64), Expect = 6.7
 Identities = 46/216 (21%), Positives = 92/216 (42%), Gaps = 6/216 (2%)

Query: 14  AHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSA 73
           A P+  +   K+      ++ +  T     +S++  ++K+     E +KE  + E  + A
Sbjct: 45  ASPAAAEAGAKEELQANGSAPAEETGKEEAASAAAAEEKEAAASTEPDKEPAEAEPAEPA 104

Query: 74  VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKE 133
            S  E E +  +S EK    + P   SSE  KKK+K+   +KS K      +++K E  E
Sbjct: 105 -SPAEAEGEAATSTEKAEDGATP-SPSSETPKKKKKRFSFKKSFKLSGFSFKKNKKEAGE 162

Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDK 193
             E++ +    +    +KE A+ +   +     A  P +++     E E  +E +   + 
Sbjct: 163 GAEAEGA---AAEKEGAKEEAAAAAPEAGSGEEAAAPGEEAGAAGAEGEAGEEPAADAEP 219

Query: 194 HSKHKHKKKDKHGDKTNPKEKDAKSKEKESHKSSAG 229
               + K ++   +K   +E  A  ++K   K +  
Sbjct: 220 EQP-EAKPEEAAPEKPQAEEAKAAEEQKAEEKPAEE 254


>gnl|CDD|217443 pfam03234, CDC37_N, Cdc37 N terminal kinase binding.  Cdc37 is a
           molecular chaperone required for the activity of
           numerous eukaryotic protein kinases. This domain
           corresponds to the N terminal domain which binds
           predominantly to protein kinases and is found N terminal
           to the Hsp (heat shock protein) 90-binding domain
           pfam08565. Expression of a construct consisting of only
           the N-terminal domain of Schizosaccharomyces pombe Cdc37
           results in cellular viability. This indicates that
           interactions with the cochaperone Hsp90 may not be
           essential for Cdc37 function.
          Length = 172

 Score = 28.6 bits (64), Expect = 6.8
 Identities = 22/104 (21%), Positives = 45/104 (43%), Gaps = 8/104 (7%)

Query: 50  DKKDKDRDKEKEKEKKDKEKDKSAV--SSKEKEKDKVSSKEKERKESKPKESSSE--KEK 105
           D+  +  DK   + K++      AV  S  E   DK + + ++   ++  E   +  K++
Sbjct: 59  DRLLERVDKLLSELKEESLDSSQAVMKSLNENFTDKENVEPEQPTYNEMVEDLFDQVKDE 118

Query: 106 KKEKKDKKE----KSHKHKDKDRERDKDEKKEQKESKSSSKIVS 145
             EK         + H+ K K  +++  +K ++ E +   KI S
Sbjct: 119 VDEKNGAALIEELQKHRDKLKKEQKELLKKLDELEKEEKKKIWS 162


>gnl|CDD|215180 PLN02316, PLN02316, synthase/transferase.
          Length = 1036

 Score = 29.5 bits (66), Expect = 6.8
 Identities = 13/50 (26%), Positives = 31/50 (62%)

Query: 87  KEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           +EK R+  K  +  +E+E++ E++ ++E+     + DR + K E ++++E
Sbjct: 252 EEKRRELEKLAKEEAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRRE 301



 Score = 29.1 bits (65), Expect = 8.4
 Identities = 15/61 (24%), Positives = 30/61 (49%), Gaps = 9/61 (14%)

Query: 57  DKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKS 116
           +K +E EK  KE         E E+++ + +++ R+E K    +   + K E + ++EK 
Sbjct: 253 EKRRELEKLAKE---------EAERERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKL 303

Query: 117 H 117
            
Sbjct: 304 Q 304


>gnl|CDD|183377 PRK11910, PRK11910, amidase; Provisional.
          Length = 615

 Score = 29.2 bits (65), Expect = 6.8
 Identities = 22/110 (20%), Positives = 39/110 (35%), Gaps = 9/110 (8%)

Query: 83  KVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSK 142
           + ++ E +   SK K+ + +KEK K+  +        ++   E+    K+ Q E      
Sbjct: 29  QTTTSENKSAVSKKKKPTVKKEKPKQSSNNLTLGKNKENFHLEKGFGNKQLQVERIIDRI 88

Query: 143 IVSSSHNSKE---------PASGSQLISHPPPPAPTPTQKSPVKTKEKEK 183
             SS  N  E             +     P P  P   + SP    +K +
Sbjct: 89  FQSSLKNRTEIKVKPKNNPQKKQNIKPVKPIPSKPEKPEDSPSPFYDKAR 138


>gnl|CDD|224259 COG1340, COG1340, Uncharacterized archaeal coiled-coil protein
           [Function unknown].
          Length = 294

 Score = 28.9 bits (65), Expect = 6.9
 Identities = 18/95 (18%), Positives = 43/95 (45%), Gaps = 6/95 (6%)

Query: 47  SKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKER-----KESKPKESSS 101
           +K  +  K+  + KEK  +     +S + S E+E +++  K++       +E +  +   
Sbjct: 83  AKLQELRKEYRELKEKRNEFNLGGRS-IKSLEREIERLEKKQQTSVLTPEEERELVQKIK 141

Query: 102 EKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E  K+ E   K  + ++   + +    + KK+ +E
Sbjct: 142 ELRKELEDAKKALEENEKLKELKAEIDELKKKARE 176


>gnl|CDD|224495 COG1579, COG1579, Zn-ribbon protein, possibly nucleic acid-binding
           [General function prediction only].
          Length = 239

 Score = 28.9 bits (65), Expect = 6.9
 Identities = 19/102 (18%), Positives = 36/102 (35%), Gaps = 15/102 (14%)

Query: 48  KKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKK- 106
            +         E E E  + +     VS  E E  ++  +  +R E K      E+E + 
Sbjct: 40  LEALNKALEALEIELEDLENQ-----VSQLESEIQEIRER-IKRAEEKLSAVKDERELRA 93

Query: 107 --------KEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSS 140
                   KE+ +  E       ++ E+ + E ++ KE    
Sbjct: 94  LNIEIQIAKERINSLEDELAELMEEIEKLEKEIEDLKERLER 135


>gnl|CDD|216420 pfam01298, Lipoprotein_5, Transferrin binding protein-like solute
           binding protein.  This family of proteins is distantly
           related to other families of solute binding proteins.
          Length = 554

 Score = 29.4 bits (66), Expect = 6.9
 Identities = 24/153 (15%), Positives = 50/153 (32%), Gaps = 9/153 (5%)

Query: 33  STSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERK 92
           S  + TS P   +   +D      +K +  +         A +     K    +   E K
Sbjct: 5   SPKTDTSAPKAEAPKYQDVPSAKPEKAELAK-------LDAPALGFAMKLPRRNWGPEEK 57

Query: 93  ESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKE 152
               ++   +     + +++ EK++   +   +  K+   + K  +S    +S     KE
Sbjct: 58  TELSEKDWIKTSSLSKIENEVEKNNGEDETHDKNRKEGAHDFKYVRSGYVYISGGSLEKE 117

Query: 153 PASGSQLISHPP--PPAPTPTQKSPVKTKEKEK 183
              G++             P +  PV  K   K
Sbjct: 118 DNKGAKSGYDGYVYYKGKQPAKNLPVSGKVTYK 150


>gnl|CDD|236048 PRK07561, PRK07561, DNA topoisomerase I subunit omega; Validated.
          Length = 859

 Score = 29.4 bits (67), Expect = 7.0
 Identities = 11/46 (23%), Positives = 19/46 (41%)

Query: 49  KDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
             +K +      +   KD    K+AV    K K +  + EK+ K +
Sbjct: 814 LAEKPEKLRYLADAPAKDPAGKKAAVKFSRKTKQQYVASEKDGKAT 859


>gnl|CDD|236343 PRK08868, PRK08868, flagellar protein FlaG; Provisional.
          Length = 144

 Score = 28.2 bits (63), Expect = 7.0
 Identities = 16/88 (18%), Positives = 34/88 (38%), Gaps = 4/88 (4%)

Query: 5  VKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKE---KE 61
          + S +S+   + S    K    +   ++S   S     + SS K +K +++   E     
Sbjct: 3  ISSYASNIQPYGSNSGTKFASENGNGTSSVLVSDKTR-SVSSEKVEKTEQELSVEAAVAM 61

Query: 62 KEKKDKEKDKSAVSSKEKEKDKVSSKEK 89
           E++ +   +      E+  + V S  K
Sbjct: 62 AEQRQELNREELEKMVEQMNEFVKSINK 89


>gnl|CDD|227862 COG5575, ORC2, Origin recognition complex, subunit 2 [DNA
           replication, recombination, and repair].
          Length = 535

 Score = 29.3 bits (65), Expect = 7.2
 Identities = 23/146 (15%), Positives = 43/146 (29%)

Query: 75  SSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQ 134
                  +    K  E   S PK+   E       +    ++  H+    +    + +  
Sbjct: 24  LVFANSHESNDLKMVENVSSTPKKGVLEDPSTLTPEVVTPRTPGHRIIKAKGAYTKDRSA 83

Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKH 194
           K  +   +I        +   GS  +   P  +      SP    E E  +  + +    
Sbjct: 84  KRRRGRIEIERHLLGEFDDNVGSNSLLDVPLYSLEAEPLSPSVMLEDESMEGINQSPQGI 143

Query: 195 SKHKHKKKDKHGDKTNPKEKDAKSKE 220
           S  K  K+D     + P     +S E
Sbjct: 144 SVEKLGKEDNRSRSSTPASPSLESHE 169


>gnl|CDD|115046 pfam06364, DUF1068, Protein of unknown function (DUF1068).  This
           family consists of several hypothetical plant proteins
           from Arabidopsis thaliana and Oryza sativa. The function
           of this family is unknown.
          Length = 176

 Score = 28.5 bits (63), Expect = 7.4
 Identities = 29/105 (27%), Positives = 48/105 (45%), Gaps = 20/105 (19%)

Query: 23  DKDSSAIPSTSTSSSTSNPTNSSSSKKDKK-DKDRDKE---------KEKEKKDKEKDKS 72
           D D SA P  +     SN +    +K+D + ++D +K          K++E +  EK K 
Sbjct: 48  DCDCSARPLLTIPKELSNASFEDCAKQDPEVNEDTEKNYAELLTEELKQREAESTEKHKR 107

Query: 73  A----------VSSKEKEKDKVSSKEKERKESKPKESSSEKEKKK 107
           A           SS +KE DK +S  +  +E++ K   +  E+KK
Sbjct: 108 ADVGLLEAKKLTSSYQKEADKCNSGMETCEEAREKAEEALVEQKK 152


>gnl|CDD|236978 PRK11778, PRK11778, putative inner membrane peptidase; Provisional.
          Length = 330

 Score = 29.0 bits (66), Expect = 7.4
 Identities = 14/54 (25%), Positives = 22/54 (40%), Gaps = 9/54 (16%)

Query: 83  KVSSKEKERKESKPKESSSEKE-KKKEKKDKKEKSHKHKDKDRERDKDEKKEQK 135
            ++ + KE KE        +KE K   K  KK        K+++  K  K + K
Sbjct: 44  NLNEQYKEMKEELKAALLDKKELKAWHKAQKK--------KEKQEAKAAKAKSK 89


>gnl|CDD|198139 smart01071, CDC37_N, Cdc37 N terminal kinase binding.  Cdc37 is a
           molecular chaperone required for the activity of
           numerous eukaryotic protein kinases. This domain
           corresponds to the N terminal domain which binds
           predominantly to protein kinases and is found N terminal
           to the Hsp (Heat shock protein) 90-binding domain.
           Expression of a construct consisting of only the
           N-terminal domain of Schizosaccharomyces pombe Cdc37 results
           in cellular viability. This indicates that interactions
           with the cochaperone Hsp90 may not be essential for
           Cdc37 function.
          Length = 154

 Score = 28.2 bits (63), Expect = 7.6
 Identities = 21/105 (20%), Positives = 41/105 (39%), Gaps = 16/105 (15%)

Query: 56  RDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK------------ 103
           R +  E E K+ + +        K  DK+    +E + S    + +E             
Sbjct: 41  RVERME-EIKNLKYELIMNDHLNKRIDKLLKGLREEELSPETPTYNEMLAELQDQLKKEL 99

Query: 104 -EKKKEKKDKKEKSHKHKDK--DRERDKDEKKEQKESKSSSKIVS 145
            E   + +   E+  KH+DK    +++  +K ++ E +   KI S
Sbjct: 100 EEANGDSEGLLEELKKHRDKLKKEQKELRKKLDELEKEEKKKIWS 144


>gnl|CDD|240419 PTZ00440, PTZ00440, reticulocyte binding protein 2-like protein;
            Provisional.
          Length = 2722

 Score = 29.4 bits (66), Expect = 7.6
 Identities = 31/96 (32%), Positives = 42/96 (43%), Gaps = 13/96 (13%)

Query: 60   KEKEKKDKEKDKSAVSSKEKEKDKVSSKE-----KERKESKPKESSSEKEKKKEK----- 109
            KEK K+ +EK    +S  EK K K+SS       K+ K  K KE     E+K E      
Sbjct: 1034 KEKGKEIEEKVDQYISLLEKMKTKLSSFHFNIDIKKYKNPKIKEEIKLLEEKVEALLKKI 1093

Query: 110  KDKKEKSHKHKDKDRER---DKDEKKEQKESKSSSK 142
             + K K  + K+K  E       EK +Q E  +  K
Sbjct: 1094 DENKNKLIEIKNKSHEHVVNADKEKNKQTEHYNKKK 1129


>gnl|CDD|221937 pfam13148, DUF3987, Protein of unknown function (DUF3987).  A
           family of uncharacterized proteins found by clustering
           human gut metagenomic sequences.
          Length = 379

 Score = 29.2 bits (66), Expect = 7.7
 Identities = 13/51 (25%), Positives = 24/51 (47%), Gaps = 2/51 (3%)

Query: 88  EKERKESKPKESSSEKEKK--KEKKDKKEKSHKHKDKDRERDKDEKKEQKE 136
           E+ R+E + +    E EK+  + +K   EK  K   K  + ++   +E  E
Sbjct: 68  EELREEYEEELKEYEAEKEIWEAEKKGLEKKAKKAIKKGKDEEALAEELLE 118


>gnl|CDD|227496 COG5167, VID27, Protein involved in vacuole import and degradation
           [Intracellular trafficking and secretion].
          Length = 776

 Score = 29.2 bits (65), Expect = 7.8
 Identities = 21/87 (24%), Positives = 39/87 (44%), Gaps = 5/87 (5%)

Query: 57  DKEKEKEKKDKEKDKSAVSS---KEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           + EK   ++ + KD    SS    EK+ D +   EK   E++  E S  +E+ ++ +D  
Sbjct: 337 NNEKWGNEEAERKDYILDSSSVPLEKQFDDILYFEKMEIENRNPEESEHEEEVEDYED-- 394

Query: 114 EKSHKHKDKDRERDKDEKKEQKESKSS 140
           E  H  +  D +  ++  +   E  S 
Sbjct: 395 ENDHSKRICDDDELENHFRAADEKNSH 421


>gnl|CDD|130680 TIGR01619, hyp_HI0040, TIGR01619 family protein.  This model
           represents a hypothetical equivalog of gamma
           proteobacteria, includes HI0040. These sequences do not
           have any similarity to known proteins by PSI-BLAST.
          Length = 249

 Score = 28.8 bits (64), Expect = 7.8
 Identities = 16/55 (29%), Positives = 26/55 (47%), Gaps = 7/55 (12%)

Query: 317 FDELIQDAFTLNIQATRELLDLATRCSQLKAILHVSTLYT--HSYREDIQEEFYP 369
           FD L+     + I AT ELLDL  +  +      ++ LY   HS+  D + + + 
Sbjct: 127 FDFLLASPLEIKIHATEELLDLLKKKGR-----DLAALYLIEHSFHFDEEAKMFA 176


>gnl|CDD|222466 pfam13945, NST1, Splicing factor, salt tolerance regulator.  NST1
           is a family of proteins that seem to be involved,
           directly or indirectly, in the salt sensitivity of some
           cellular functions in yeast. These proteins also
           interact with the splicing factor Msl1p.
          Length = 189

 Score = 28.4 bits (63), Expect = 7.8
 Identities = 16/79 (20%), Positives = 38/79 (48%), Gaps = 4/79 (5%)

Query: 119 HKDKDRERDKDEKKEQKESKSSSKIVSSSHNSKEPASGSQLISHPP----PPAPTPTQKS 174
           H D    + K +KK++ ++ S S   S    +   ++ S +++ P     P     +Q++
Sbjct: 25  HNDSSSSKSKKKKKKRSKATSPSHNASDQSTNNVMSTPSAILARPQPLSYPFGSQQSQQN 84

Query: 175 PVKTKEKEKEKESSTTHDK 193
            VK  ++++   +ST  ++
Sbjct: 85  AVKNSKEKRIWNTSTQEER 103


>gnl|CDD|221825 pfam12877, DUF3827, Domain of unknown function (DUF3827).  This
           family contains the human KIAA1549 protein which has
           been found to be fused to the BRAF gene in many cases
           of pilocytic astrocytomas. The fusion is due mainly to a
           tandem duplication of 2 Mb at 7q34. Although nothing is
           known about the function of KIAA1549 protein, the BRAF
           protein is a well characterized oncoprotein. It is a
           serine/threonine protein kinase which is implicated in
           MAP/ERK signalling, a critical pathway for the
           regulation of cell division, differentiation and
           secretion.
          Length = 684

 Score = 29.1 bits (65), Expect = 7.9
 Identities = 31/181 (17%), Positives = 58/181 (32%), Gaps = 25/181 (13%)

Query: 18  PHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKEKKD--KEKDKSAVS 75
           P     +D  A  +   +          S  +  ++K R + K K   D    +D +   
Sbjct: 482 PRFEPSRDDRAAENGKVNKEIQVALRHKSEIEHHRNKIRLRAKRKGHYDFPSVEDSNNGH 541

Query: 76  SKEKEKDKVSSKEKERKESKP----KESSSEKEKKKEKKDKKEKSHKHK----------- 120
              KE+++V  + + + +        E S+  E KK  + ++    +             
Sbjct: 542 GDPKEQERVYQRAQMQIDKILLPPDSEPSTFTEPKKSSRGQRSPKARRSRQSLNGPSTEM 601

Query: 121 DKDR--ERDKDEKKEQKESKSSSKIVSSS-----HNSKEPASGSQLISHPPPPAPTPTQK 173
           D DR  ERD+D          +     ++       S  P  G +      P     +Q 
Sbjct: 602 DLDRLIERDRDGTYRSGPGVENEAYEETNDRLPESRSYSPTRGPKGHDPSEPSY-LSSQP 660

Query: 174 S 174
           S
Sbjct: 661 S 661


>gnl|CDD|233605 TIGR01865, cas_Csn1, CRISPR-associated protein Cas9/Csn1, subtype
           II/NMENI.  CRISPR loci appear to be mobile elements with
           a wide host range. This model represents a protein found
           only in CRISPR-containing species, near other
           CRISPR-associated proteins (cas), as part of the NMENI
           subtype of CRISPR/Cas locus. The species range so far
           for this protein is animal pathogens and commensals only
           [Mobile and extrachromosomal element functions, Other].
          Length = 805

 Score = 29.3 bits (66), Expect = 7.9
 Identities = 12/44 (27%), Positives = 19/44 (43%)

Query: 51  KKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKES 94
             +  +   KE+ KK+++K K   S+  KE  K    E   K  
Sbjct: 528 GTNFGKRNSKERYKKNEDKIKEFASALGKEILKEEPTENSSKNI 571


>gnl|CDD|148051 pfam06213, CobT, Cobalamin biosynthesis protein CobT.  This family
           consists of several bacterial cobalamin biosynthesis
           (CobT) proteins. CobT is involved in the transformation
           of precorrin-3 into cobyrinic acid.
          Length = 282

 Score = 28.6 bits (64), Expect = 8.0
 Identities = 17/60 (28%), Positives = 34/60 (56%), Gaps = 1/60 (1%)

Query: 45  SSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKE 104
           S+  +D +D+D  KE E + + +E++  +  S  ++ D  SS+E E  E +  E+S++  
Sbjct: 218 SADSEDNEDEDDPKEDEDDDQGEEEESGSSDSLSEDSDA-SSEEMESGEMEAAEASADDT 276


>gnl|CDD|129694 TIGR00606, rad50, rad50.  All proteins in this family for which
           functions are known are involved in recombination,
           recombinational repair, and/or non-homologous end
           joining. They are components of an exonuclease complex
           with MRE11 homologs. This family is distantly related to
           the SbcC family of bacterial proteins. This family is
           based on the phylogenomic analysis of JA Eisen (1999,
           Ph.D. Thesis, Stanford University).
          Length = 1311

 Score = 29.2 bits (65), Expect = 8.0
 Identities = 23/98 (23%), Positives = 40/98 (40%), Gaps = 6/98 (6%)

Query: 23  DKDSSAIPSTSTSSSTSNPTNSSSSKKDKK-----DKDRDKEKEKEKKDKEKDKSAVSSK 77
           D++ S  P       T        S    K     DK +  E E +KK+K +D+    + 
Sbjct: 674 DENQSCCPVCQRVFQTEAELQEFISDLQSKLRLAPDKLKSTESELKKKEKRRDEMLGLA- 732

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKKEK 115
              +  +  KEKE  E + K     ++ ++ K D +E+
Sbjct: 733 PGRQSIIDLKEKEIPELRNKLQKVNRDIQRLKNDIEEQ 770


>gnl|CDD|130658 TIGR01597, PYST-B, Plasmodium yoelii subtelomeric family PYST-B.
           This model represents a paralogous family of Plasmodium
           yoelii genes preferentially located in the subtelomeric
           regions of the chromosomes. There are no obvious
           homologs to these genes in any other organism.
          Length = 255

 Score = 28.7 bits (64), Expect = 8.2
 Identities = 24/108 (22%), Positives = 45/108 (41%), Gaps = 13/108 (12%)

Query: 19  HKNKDKDSSAIPS-TSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE--------KKDKEK 69
           H  K K+ + +P   +    T    N    + ++  K+ D E   E        K   +K
Sbjct: 91  HIKKHKERNTLPDLNNVDKKTKKLINKLQKELEELKKELDNEMNDELTIQPIHDKIIIKK 150

Query: 70  DKSAVSSKEKE----KDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
           D++   S+ ++    +++ +S   E +E     S + K K+K KK  K
Sbjct: 151 DENNSVSEHEDFKQLENEKNSSVSEHEEFDIASSDNLKIKRKLKKLVK 198


>gnl|CDD|227577 COG5252, COG5252, Uncharacterized conserved protein, contains
           CCCH-type Zn-finger protein [General function prediction
           only].
          Length = 299

 Score = 28.9 bits (64), Expect = 8.2
 Identities = 15/71 (21%), Positives = 34/71 (47%)

Query: 44  SSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEK 103
               K  KK ++  K+  ++ + + +DK+     +    KV +  K+ +    KE   +K
Sbjct: 1   MPPKKMAKKQQESGKKATRDMRKELEDKTFGLKNKNRSTKVQAIIKQIETLNLKEQLEKK 60

Query: 104 EKKKEKKDKKE 114
           EK + ++ ++E
Sbjct: 61  EKMRMEEKRRE 71


>gnl|CDD|146016 pfam03179, V-ATPase_G, Vacuolar (H+)-ATPase G subunit.  This family
           represents the eukaryotic vacuolar (H+)-ATPase
           (V-ATPase) G subunit. V-ATPases generate an acidic
           environment in several intracellular compartments.
           Correspondingly, they are found as membrane-attached
           proteins in several organelles. They are also found in
           the plasma membranes of some specialised cells.
           V-ATPases consist of peripheral (V1) and membrane
           integral (V0) heteromultimeric complexes. The G subunit
           is part of the V1 subunit, but is also thought to be
           strongly attached to the V0 complex. It may be involved
           in the coupling of ATP degradation to H+ translocation.
          Length = 105

 Score = 27.5 bits (62), Expect = 8.3
 Identities = 22/88 (25%), Positives = 45/88 (51%), Gaps = 3/88 (3%)

Query: 78  EKEKDKVSSKEKERKESKPKESSSEKEKKKEK-KDKKEKSHK--HKDKDRERDKDEKKEQ 134
           EKE  ++ ++ ++R+  + K++  E EK+ E+ + ++E   K    +    R + EKK +
Sbjct: 13  EKEAAEIVNEARKRRAKRLKQAKEEAEKEIEEYRAQREAEFKEFEAEHSGSRGELEKKIE 72

Query: 135 KESKSSSKIVSSSHNSKEPASGSQLISH 162
           KE++     +  S N  + A    L+S 
Sbjct: 73  KETEEKIDELKRSFNKNKEAVVQMLLSK 100


>gnl|CDD|237015 PRK11901, PRK11901, hypothetical protein; Reviewed.
          Length = 327

 Score = 28.9 bits (65), Expect = 8.3
 Identities = 13/45 (28%), Positives = 22/45 (48%), Gaps = 1/45 (2%)

Query: 134 QKESKSSSKIVSSSHNSKEPASGSQLISHPPPPAPTPTQKSPVKT 178
              S +++       +  +  +  Q IS  PP +PTPTQ +P +T
Sbjct: 93  SSPSAANNTSDGHDASGVKNTAPPQDIS-APPISPTPTQAAPPQT 136


>gnl|CDD|213592 TIGR01179, galE, UDP-glucose-4-epimerase GalE.  Alternate name:
           UDPgalactose 4-epimerase. This enzyme interconverts
           UDP-glucose and UDP-galactose. A set of related
           proteins, some of which are tentatively identified as
           UDP-glucose-4-epimerase in Thermotoga maritima, Bacillus
           halodurans, and several archaea, but deeply branched
           from this set and lacking experimental evidence, are
           excluded from this model and described by a separate
           model [Energy metabolism, Sugars].
          Length = 328

 Score = 28.8 bits (65), Expect = 8.4
 Identities = 25/97 (25%), Positives = 40/97 (41%), Gaps = 12/97 (12%)

Query: 250 TVQERLAEQFKDELFDRLKNEQADILQRKVHI-----ISGDISQPSLGISSHDQQFIQHH 304
           TV++ L    +  + D L N   + L R   I     + GD+    L     D+ F +H 
Sbjct: 15  TVRQLLESGHEVVILDNLSNGSREALPRGERITPVTFVEGDLRDREL----LDRLFEEHK 70

Query: 305 IHVIIHAAASLRFDELIQDA---FTLNIQATRELLDL 338
           I  +IH A  +   E +Q     +  N+  T  LL+ 
Sbjct: 71  IDAVIHFAGLIAVGESVQKPLKYYRNNVVGTLNLLEA 107


>gnl|CDD|187567 cd05257, Arna_like_SDR_e, Arna decarboxylase_like, extended (e)
           SDRs.  Decarboxylase domain of ArnA. ArnA is an enzyme
           involved in the modification of outer membrane protein
           lipid A of gram-negative bacteria. It is a bifunctional
           enzyme that catalyzes the NAD-dependent decarboxylation
           of UDP-glucuronic acid and
           N-10-formyltetrahydrofolate-dependent formylation of
           UDP-4-amino-4-deoxy-l-arabinose; its NAD-dependent
           decarboxylating activity is in the C-terminal 360
           residues. This subgroup belongs to the extended SDR
           family, however the NAD binding motif is not a perfect
           match and the upstream Asn of the canonical active site
           tetrad is not conserved. Extended SDRs are distinct from
           classical SDRs. In addition to the Rossmann fold
           (alpha/beta folding pattern with a central beta-sheet)
           core region typical of all SDRs, extended SDRs have a
           less conserved C-terminal extension of approximately 100
           amino acids. Extended SDRs are a diverse collection of
           proteins, and include isomerases, epimerases,
           oxidoreductases, and lyases; they typically have a
           TGXXGXXG cofactor binding motif. SDRs are a functionally
           diverse family of oxidoreductases that have a single
           domain with a structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 316

 Score = 28.8 bits (65), Expect = 8.5
 Identities = 17/82 (20%), Positives = 31/82 (37%), Gaps = 16/82 (19%)

Query: 278 KVHIISGDISQPSLGISSHDQQFIQHHI---HVIIHAAASLRFDELIQDAFTL---NIQA 331
           + H ISGD+              +++ +    V+ H AA +          +    N+  
Sbjct: 48  RFHFISGDVRDA---------SEVEYLVKKCDVVFHLAALIAIPYSYTAPLSYVETNVFG 98

Query: 332 TRELLDLATRCSQLKAILHVST 353
           T  +L+ A      K ++H ST
Sbjct: 99  TLNVLE-AACVLYRKRVVHTST 119


>gnl|CDD|220719 pfam10368, YkyA, Putative cell-wall binding lipoprotein.  YkyA is a
           family of proteins containing a lipoprotein signal and a
           hydrolase domain. It is similar to cell wall binding
           proteins and might also be recognisable by a host immune
           defence system. It is thus likely to belong to pathways
           important for pathogenicity.
          Length = 205

 Score = 28.5 bits (64), Expect = 8.5
 Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 1/95 (1%)

Query: 54  KDRDKEKEKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEKKKEKKDKK 113
             R+K  +KEK+  EK +    S +K  +K+  K+ ++K  +  +   E+ K  +K  K 
Sbjct: 79  DKREKLLKKEKESIEKSEEEFKSAKKYIEKIEDKKLKKKAKQLVKVMKERYKSYDKLYKA 138

Query: 114 EKSHKHKDKD-RERDKDEKKEQKESKSSSKIVSSS 147
            K   + +K+  E  KD+    KE     K V+ S
Sbjct: 139 YKKALNLEKELYEYLKDKDLTLKELDEKIKAVNQS 173


>gnl|CDD|219564 pfam07771, TSGP1, Tick salivary peptide group 1.  This contains a
           group of peptides derived from a salivary gland cDNA
           library of the tick Ixodes scapularis. Also present are
           peptides from a related tick species, Ixodes ricinus.
           They are characterized by a putative signal peptide
           indicative of secretion and conserved cysteine residues.
          Length = 120

 Score = 27.5 bits (61), Expect = 8.6
 Identities = 13/41 (31%), Positives = 22/41 (53%), Gaps = 3/41 (7%)

Query: 33  STSSSTSNPTN---SSSSKKDKKDKDRDKEKEKEKKDKEKD 70
           +TSS   +  +      ++K KK K + K+ +K KK  +KD
Sbjct: 80  TTSSGEPSHPDDHPPEPTEKPKKKKKKSKKTKKPKKSSKKD 120


>gnl|CDD|227625 COG5309, COG5309, Exo-beta-1,3-glucanase [Carbohydrate transport
          and metabolism].
          Length = 305

 Score = 28.7 bits (64), Expect = 8.7
 Identities = 16/66 (24%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 1  MAYSVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTS-------NPTNSSSSKKDKKD 53
          M +S  SSS++ +   S        +S + S+S+ +S S        P N   + K    
Sbjct: 5  MQFSSTSSSAALATLSSSSSALSSSASEVSSSSSRASASGFLAFTLGPYNDDGTCKSADQ 64

Query: 54 KDRDKE 59
             D E
Sbjct: 65 VASDLE 70


>gnl|CDD|215541 PLN03020, PLN03020, low-temperature-induced protein; Provisional.
          Length = 556

 Score = 28.9 bits (64), Expect = 8.8
 Identities = 19/80 (23%), Positives = 30/80 (37%), Gaps = 2/80 (2%)

Query: 133 EQKESKSSSKIVSSSHNSKEPASGS--QLISHPPPPAPTPTQKSPVKTKEKEKEKESSTT 190
           E ++  ++  + +  H+     +GS  Q    P  PA T + + P  T   EK     T 
Sbjct: 256 EPEQPFATEDLPTRPHDKSIFPTGSHDQFSPEPLSPAETKSTEDPQSTSNAEKPSNQKTY 315

Query: 191 HDKHSKHKHKKKDKHGDKTN 210
            +K S       DK     N
Sbjct: 316 TEKISSATSAIADKAISAKN 335


>gnl|CDD|187548 cd05237, UDP_invert_4-6DH_SDR_e, UDP-Glcnac (UDP-linked
           N-acetylglucosamine) inverting 4,6-dehydratase, extended
           (e) SDRs.  UDP-Glcnac inverting 4,6-dehydratase was
           identified in Helicobacter pylori as the hexameric flaA1
           gene product (FlaA1). FlaA1 is hexameric, possesses
           UDP-GlcNAc-inverting 4,6-dehydratase activity,  and
           catalyzes the first step in the creation of a
           pseudaminic acid derivative in protein glycosylation.
           Although this subgroup has the NADP-binding motif
           characteristic of extended SDRs, its members tend to
           have a Met substituted for the active site Tyr found in
           most SDR families. Extended SDRs are distinct from
           classical SDRs. In addition to the Rossmann fold
           (alpha/beta folding pattern with a central beta-sheet)
           core region typical of all SDRs, extended SDRs have a
           less conserved C-terminal extension of approximately 100
           amino acids. Extended SDRs are a diverse collection of
           proteins, and include isomerases, epimerases,
           oxidoreductases, and lyases; they typically have a
           TGXXGXXG cofactor binding motif. SDRs are a functionally
           diverse family of oxidoreductases that have a single
           domain with a structurally conserved Rossmann fold, an
           NAD(P)(H)-binding region, and a structurally diverse
           C-terminal region. Sequence identity between different
           SDR enzymes is typically in the 15-30% range; they
           catalyze a wide range of activities including the
           metabolism of steroids, cofactors, carbohydrates,
           lipids, aromatic compounds, and amino acids, and act in
           redox sensing. Classical SDRs have a TGXXX[AG]XG
           cofactor binding motif and a YXXXK active site motif,
           with the Tyr residue of the active site motif serving as
           a critical catalytic residue (Tyr-151, human
           15-hydroxyprostaglandin dehydrogenase numbering). In
           addition to the Tyr and Lys, there is often an upstream
           Ser and/or an Asn, contributing to the active site;
           while substrate binding is in the C-terminal region,
           which determines specificity. The standard reaction
           mechanism is a 4-pro-S hydride transfer and proton relay
           involving the conserved Tyr and Lys, a water molecule
           stabilized by Asn, and nicotinamide. Atypical SDRs
           generally lack the catalytic residues characteristic of
           the SDRs, and their glycine-rich NAD(P)-binding motif is
           often different from the forms normally seen in
           classical or extended SDRs. Complex (multidomain) SDRs
           such as ketoreductase domains of fatty acid synthase
           have a GGXGXXG NAD(P)-binding motif and an altered
           active site motif (YXXXN). Fungal type ketoacyl
           reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
          Length = 287

 Score = 28.7 bits (65), Expect = 8.9
 Identities = 22/106 (20%), Positives = 43/106 (40%), Gaps = 24/106 (22%)

Query: 263 LFDRLKNEQADILQR--------KVHIISGDISQPSLGISSHDQQFIQHHIHVIIHAAA- 313
           +FDR +N+  ++++         K+  I GD+        +    F +    ++ HAAA 
Sbjct: 32  VFDRDENKLHELVRELRSRFPHDKLRFIIGDVRDKERLRRA----FKERGPDIVFHAAAL 87

Query: 314 ------SLRFDELIQDAFTLNIQATRELLDLATRCSQLKAILHVST 353
                     +E I+     N+  T+ ++D A      K +  +ST
Sbjct: 88  KHVPSMEDNPEEAIKT----NVLGTKNVIDAAIENGVEKFVC-IST 128


>gnl|CDD|165564 PHA03309, PHA03309, transcriptional regulator ICP4; Provisional.
          Length = 2033

 Score = 29.1 bits (64), Expect = 8.9
 Identities = 21/77 (27%), Positives = 35/77 (45%), Gaps = 4/77 (5%)

Query: 139  SSSKIVSSSHNSKEPASGSQLISHPP-PPAPTPTQKSPVKTKEKEKEKES---STTHDKH 194
            SSS   SSS +S  P+S     + P   P+P+P +++PV      + +E    S    + 
Sbjct: 1817 SSSSSSSSSSSSSSPSSRPSRSATPSLSPSPSPPRRAPVDRSRSGRRRERDRPSANPFRW 1876

Query: 195  SKHKHKKKDKHGDKTNP 211
            +  +  + D   D T P
Sbjct: 1877 APRQRSRADHSPDGTAP 1893


>gnl|CDD|221144 pfam11595, DUF3245, Protein of unknown function (DUF3245).  This is
           a family of proteins conserved in fungi. The function is
           not known, and there is no S. cerevisiae member.
          Length = 145

 Score = 28.0 bits (62), Expect = 9.0
 Identities = 24/130 (18%), Positives = 45/130 (34%), Gaps = 14/130 (10%)

Query: 35  SSSTSNPTNSSSSKKDKKDKDRDKEKEKEK-------------KDKEKDKSAVSSKEKEK 81
           +S     T S  S   K D++  KE+++                 +  D S   ++    
Sbjct: 14  ASWLPPMTASEQSNPKKTDEELQKEEDEIFTAVPETLGLGAPLPTQAADGSWNRTELDSN 73

Query: 82  DKVSSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           DK+  ++   K  K   +  EK +    K K     K   +  +   D+  +  E +S S
Sbjct: 74  DKLR-RQLLGKNYKKVMAEKEKAEGGPVKRKAAVVAKEAKQSSKGVGDDDDDDDEDESRS 132

Query: 142 KIVSSSHNSK 151
                  ++K
Sbjct: 133 AAFGKKGSNK 142


>gnl|CDD|185272 PRK15374, PRK15374, pathogenicity island 1 effector protein SipB;
           Provisional.
          Length = 593

 Score = 28.8 bits (64), Expect = 9.1
 Identities = 21/104 (20%), Positives = 43/104 (41%), Gaps = 6/104 (5%)

Query: 61  EKEKKDKEKDKSAVSSKEKEKDKVSSKEKERKESKP---KESSSEKEKKKEKKDKKEKSH 117
           E   K  +  KS   + EK+  +  +K +    + P   +  ++ ++  KE  + KE   
Sbjct: 144 EASIKKTDTAKSVYDAAEKKLTQAQNKLQSLDPADPGYAQAEAAVEQAGKEATEAKEALD 203

Query: 118 KHKDKDRERDKDEK-KEQKESKSSSKI--VSSSHNSKEPASGSQ 158
           K  D   +   D K K +K     +K    +++ +  + + G Q
Sbjct: 204 KATDATVKAGTDAKAKAEKADNILTKFQGTANAASQNQVSQGEQ 247


>gnl|CDD|150531 pfam09871, DUF2098, Uncharacterized protein conserved in archaea
          (DUF2098).  This domain, found in various hypothetical
          prokaryotic proteins, has no known function.
          Length = 91

 Score = 26.9 bits (60), Expect = 9.7
 Identities = 8/38 (21%), Positives = 21/38 (55%)

Query: 41 PTNSSSSKKDKKDKDRDKEKEKEKKDKEKDKSAVSSKE 78
           T+    KK+++++D+++  E+ KK++E  +       
Sbjct: 48 VTDKVKEKKEEREEDKEELIERIKKEEETFEDVDLGSA 85


>gnl|CDD|148682 pfam07222, PBP_sp32, Proacrosin binding protein sp32.  This family
           consists of several mammalian specific proacrosin
           binding protein sp32 sequences. sp32 is a sperm specific
           protein which is known to bind with 55- and 53-kDa
           proacrosins and the 49-kDa acrosin intermediate. The
           exact function of sp32 is unclear; it is thought, however,
           that the binding of sp32 to proacrosin may be involved
           in packaging the acrosin zymogen into the acrosomal
           matrix.
          Length = 243

 Score = 28.5 bits (63), Expect = 9.8
 Identities = 14/102 (13%), Positives = 36/102 (35%)

Query: 4   SVKSSSSSSSAHPSPHKNKDKDSSAIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEKEKE 63
           S +   ++ +   + H    ++ S  P      +       SS       + +  + ++E
Sbjct: 141 SAEVQPTTMTLPIAEHPTITENQSFQPWPERLHNNVEELLQSSLSLGGSVQVKAPKPKQE 200

Query: 64  KKDKEKDKSAVSSKEKEKDKVSSKEKERKESKPKESSSEKEK 105
           +   +  +     K +EK     +E+E  E + K+   +   
Sbjct: 201 QLLSKLQEYLQEHKTEEKQPQEEQEEEEVEEEAKQEEGQGTD 242


>gnl|CDD|227578 COG5253, MSS4, Phosphatidylinositol-4-phosphate 5-kinase [Signal
           transduction mechanisms].
          Length = 612

 Score = 28.8 bits (64), Expect = 9.9
 Identities = 49/304 (16%), Positives = 88/304 (28%), Gaps = 46/304 (15%)

Query: 6   KSSSSSSSAHPSPHKNKDKDSS-----AIPSTSTSSSTSNPTNSSSSKKDKKDKDRDKEK 60
            S +   S  P+     +  S      A  +     S +N  +   +  D+      KE 
Sbjct: 14  ISMTHDKSTRPNDRSMSNDSSLCGLNQASDANGNEYSPNNKVSKKDTFSDQLHDALSKEF 73

Query: 61  EKE--------KKDKEKDKSAVSSK---EKEKDKVSSKEKERKESKPKESSSEKEKKKEK 109
             E         K K +     +S    E  K+   + +          + S    K   
Sbjct: 74  TLERERDRLQLNKRKYQAIRLQTSTPIVEIFKNNKDAVDPPNHTRSSGNNLSNANVKTLS 133

Query: 110 ------KDKKEKSHKHKDKDRERDKDEK--------KEQKESKSSSKIVSSSHNSKEPAS 155
                        +  ++ D E +               K   S      +S N K  + 
Sbjct: 134 APVGEHSRSNNPPNLDQNLDTEPESSISQWGELQLNPSGKTLSSQPSRKPTSENPKSESD 193

Query: 156 GSQLISHPPPPAPTPTQKSPVKTKEKEKEKESSTTHDKHSKHKHKKKDKHGDKTNPKE-- 213
            S+L           +  SP+  K   K   S+   +++S +           + P E  
Sbjct: 194 NSKL---------PTSVNSPLPDKSLLKRTLSNFWAERNSYN----WKPLVYPSCPSEHI 240

Query: 214 -KDAKSKEKESHKSSAGPKCYPEVGGIYILLRSKKNKTVQERLAEQFKDELFDRLKNEQA 272
             D+    +E   SS    C         ++R + ++T+ ERL      E   R   E  
Sbjct: 241 FSDSDVIIREDEPSSLIAFCLSTSDYRNKMMRLRDSETMDERLLNGMPLEGGHRNPQESY 300

Query: 273 DILQ 276
           ++L 
Sbjct: 301 NMLT 304


>gnl|CDD|227818 COG5531, COG5531, SWIB-domain-containing proteins implicated in
           chromatin remodeling [Chromatin structure and dynamics].
          Length = 237

 Score = 28.2 bits (63), Expect = 9.9
 Identities = 20/63 (31%), Positives = 27/63 (42%)

Query: 85  SSKEKERKESKPKESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSSKIV 144
           S  E+ R   K K + + K   K    K+E S     K+ E    E KE  + K SS I 
Sbjct: 57  SLAEEPRVLRKEKYNITRKTTGKNDLPKEEDSSLPSSKETENGDTEGKETDKKKKSSTIS 116

Query: 145 SSS 147
            +S
Sbjct: 117 KNS 119


>gnl|CDD|219589 pfam07808, RED_N, RED-like protein N-terminal region.  This family
           contains sequences that are similar to the N-terminal
           region of Red protein. This and related proteins contain
           a RED repeat which consists of a number of RE and RD
           sequence elements. The region in question has several
           conserved NLS sequences and a putative trimeric
           coiled-coil region, suggesting that these proteins are
           expressed in the nucleus. The function of Red protein is
           unknown, but efficient sequestration to nuclear bodies
           suggests that its expression may be tightly regulated or
           that the protein self-aggregates extremely efficiently.
          Length = 238

 Score = 28.3 bits (63), Expect = 10.0
 Identities = 13/45 (28%), Positives = 25/45 (55%)

Query: 97  KESSSEKEKKKEKKDKKEKSHKHKDKDRERDKDEKKEQKESKSSS 141
           K+      +K+E+  +KE + K++D+ RER K   K+   S  ++
Sbjct: 2   KKKKYAYLRKQEENAEKEINPKYRDRARERRKGINKDYDPSSLAA 46


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.309    0.125    0.353 

Gapped
Lambda     K      H
   0.267   0.0677    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 28,624,648
Number of extensions: 2746148
Number of successful extensions: 13800
Number of sequences better than 10.0: 1
Number of HSP's gapped: 8758
Number of HSP's successfully gapped: 1629
Length of query: 596
Length of database: 10,937,602
Length adjustment: 102
Effective length of query: 494
Effective length of database: 6,413,494
Effective search space: 3168266036
Effective search space used: 3168266036
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.7 bits)
S2: 62 (27.8 bits)
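
The figures in the statistics footer above can be cross-checked by hand. The following is a minimal sketch, not part of the RPS-BLAST output: it recomputes the effective lengths and search space from the length adjustment, and converts the raw X1/X2/X3 and S1/S2 values to bits with the standard Karlin-Altschul relation S' = (Lambda*S - ln K) / ln 2. The function names are hypothetical, and the pairing of each threshold with the ungapped or gapped Lambda/K (X1 and S1 ungapped; X2, X3 and S2 gapped) is an assumption that happens to reproduce the printed values to one decimal place. Because the printed Lambda and K are themselves rounded, agreement beyond that precision should not be expected.

```python
import math

LN2 = math.log(2)

# Values copied from the statistics footer of this report.
QUERY_LEN = 596
DB_LETTERS = 10_937_602
DB_SEQS = 44_354
LENGTH_ADJUSTMENT = 102
UNGAPPED = {"lambda": 0.309, "K": 0.125}
GAPPED = {"lambda": 0.267, "K": 0.0677}


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space)."""
    eff_query = query_len - adjustment
    # The adjustment is subtracted once per database sequence.
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def bit_score(raw_score, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / LN2


def dropoff_bits(raw_dropoff, lam):
    """Convert a raw X-dropoff to bits: lambda * X / ln 2 (no K term)."""
    return lam * raw_dropoff / LN2


if __name__ == "__main__":
    print(effective_search_space(QUERY_LEN, DB_LETTERS, DB_SEQS, LENGTH_ADJUSTMENT))
    print("S1:", round(bit_score(42, UNGAPPED["lambda"], UNGAPPED["K"]), 1))
    print("S2:", round(bit_score(62, GAPPED["lambda"], GAPPED["K"]), 1))
    print("X1:", round(dropoff_bits(16, UNGAPPED["lambda"]), 1))
    print("X2:", round(dropoff_bits(38, GAPPED["lambda"]), 1))
    print("X3:", round(dropoff_bits(64, GAPPED["lambda"]), 1))
```

The per-hit Expect values are deliberately not recomputed here: RPS-BLAST scores against position-specific matrices, so applying the footer's Lambda and K to an individual hit's bit score will not necessarily reproduce the reported E-value.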