RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy14977
         (469 letters)



>gnl|CDD|238492 cd00992, PDZ_signaling, PDZ domain found in a variety of Eumetazoan
           signaling molecules, often in tandem arrangements. May
           be responsible for specific protein-protein
           interactions, as most PDZ domains bind C-terminal
           polypeptides, and binding to internal (non-C-terminal)
           polypeptides and even to lipids has been demonstrated.
           In this subfamily of PDZ domains an N-terminal
           beta-strand forms the peptide-binding groove base, a
           circular permutation with respect to PDZ domains found
           in proteases.
          Length = 82

 Score = 89.5 bits (223), Expect = 5e-22
 Identities = 38/95 (40%), Positives = 59/95 (62%), Gaps = 13/95 (13%)

Query: 204 VRTINMNRSQDANHGFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIP 263
           VRT+ + +  D   G G  ++GG ++ G G+++SRVE G  AER GLR GD IL+VNG+ 
Sbjct: 1   VRTVTLRK--DPGGGLGFSLRGGKDSGG-GIFVSRVEPGGPAERGGLRVGDRILEVNGVS 57

Query: 264 FTGISHEEALKMCFFEGYKEGQMLKSNRELSMTVR 298
             G++HEEA+++          +  S  E+++TVR
Sbjct: 58  VEGLTHEEAVEL----------LKNSGDEVTLTVR 82



 Score = 56.8 bits (138), Expect = 2e-10
 Identities = 21/46 (45%), Positives = 27/46 (58%)

Query: 368 VRKVELNIEPGQSLGLMIRGGVEYNLGIFITGVDKDSVAERAGLLV 413
           VR V L  +PG  LG  +RGG +   GIF++ V+    AER GL V
Sbjct: 1   VRTVTLRKDPGGGLGFSLRGGKDSGGGIFVSRVEPGGPAERGGLRV 46


>gnl|CDD|214570 smart00228, PDZ, Domain present in PSD-95, Dlg, and ZO-1/2.  Also
           called DHR (Dlg homologous region) or GLGF (relatively
           well conserved tetrapeptide in these domains). Some PDZs
           have been shown to bind C-terminal polypeptides; others
           appear to bind internal (non-C-terminal) polypeptides.
           Different PDZs possess different binding specificities.
          Length = 85

 Score = 77.8 bits (192), Expect = 8e-18
 Identities = 30/84 (35%), Positives = 45/84 (53%), Gaps = 11/84 (13%)

Query: 218 GFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALKMCF 277
           G G  + GG +  G GV +S V  GS A +AGLR GD IL+VNG    G++H EA+ +  
Sbjct: 13  GLGFSLVGGKDEGG-GVVVSSVVPGSPAAKAGLRVGDVILEVNGTSVEGLTHLEAVDL-- 69

Query: 278 FEGYKEGQMLKSNRELSMTVRSPS 301
                   + K+  ++++TV    
Sbjct: 70  --------LKKAGGKVTLTVLRGG 85



 Score = 48.5 bits (116), Expect = 2e-07
 Identities = 17/46 (36%), Positives = 23/46 (50%), Gaps = 1/46 (2%)

Query: 368 VRKVELNIEPGQSLGLMIRGGVEYNLGIFITGVDKDSVAERAGLLV 413
            R VEL    G  LG  + GG +   G+ ++ V   S A +AGL V
Sbjct: 2   PRLVELEKGGGG-LGFSLVGGKDEGGGVVVSSVVPGSPAAKAGLRV 46


>gnl|CDD|201332 pfam00595, PDZ, PDZ domain (Also known as DHR or GLGF).  PDZ
           domains are found in diverse signaling proteins.
          Length = 80

 Score = 73.4 bits (181), Expect = 2e-16
 Identities = 25/85 (29%), Positives = 41/85 (48%), Gaps = 11/85 (12%)

Query: 214 DANHGFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEAL 273
               G G  + GG ++   G+++S V  G  AE  GL+ GD IL +NG     +SH+EA+
Sbjct: 7   SGRGGLGFSLVGG-SDGDPGIFVSEVLPGGAAEAGGLQEGDRILSINGQDLENLSHDEAV 65

Query: 274 KMCFFEGYKEGQMLKSNRELSMTVR 298
                       +  S  E+++T+ 
Sbjct: 66  LA----------LKGSGGEVTLTIL 80



 Score = 39.1 bits (92), Expect = 3e-04
 Identities = 14/42 (33%), Positives = 20/42 (47%)

Query: 370 KVELNIEPGQSLGLMIRGGVEYNLGIFITGVDKDSVAERAGL 411
           +V L       LG  + GG + + GIF++ V     AE  GL
Sbjct: 1   EVTLEKSGRGGLGFSLVGGSDGDPGIFVSEVLPGGAAEAGGL 42


>gnl|CDD|238080 cd00136, PDZ, PDZ domain, also called DHR (Dlg homologous region)
           or GLGF (after a conserved sequence motif). Many PDZ
           domains bind C-terminal polypeptides, though binding to
           internal (non-C-terminal) polypeptides and even to
           lipids has been demonstrated. Heterodimerization through
           PDZ-PDZ domain interactions adds to the domain's
           versatility, and PDZ domain-mediated interactions may be
           modulated dynamically through target phosphorylation.
           Some PDZ domains play a role in scaffolding
           supramolecular complexes. PDZ domains are found in
           diverse signaling proteins in bacteria, archaebacteria,
           and eukaryotes. This CD contains two distinct structural
           subgroups with either an N- or C-terminal beta-strand
           forming the peptide-binding groove base. The circular
           permutation placing the strand on the N-terminus appears
           to be found in Eumetazoa only, while the C-terminal
           variant is found in all three kingdoms of life, and
           seems to co-occur with protease domains. PDZ domains
           have been named after PSD95 (postsynaptic density
           protein), DlgA (Drosophila disc large tumor suppressor),
           and ZO1, a mammalian tight junction protein.
          Length = 70

 Score = 66.6 bits (163), Expect = 5e-14
 Identities = 24/59 (40%), Positives = 33/59 (55%), Gaps = 3/59 (5%)

Query: 217 HGFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALKM 275
            G G  ++GG      GV +  VE GS AERAGL+ GD IL VNG     ++ E+  ++
Sbjct: 1   GGLGFSIRGGTEG---GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAEL 56



 Score = 33.4 bits (77), Expect = 0.021
 Identities = 16/34 (47%), Positives = 19/34 (55%), Gaps = 2/34 (5%)

Query: 380 SLGLMIRGGVEYNLGIFITGVDKDSVAERAGLLV 413
            LG  IRGG E   G+ +  V+  S AERAGL  
Sbjct: 2   GLGFSIRGGTEG--GVVVLSVEPGSPAERAGLQA 33


>gnl|CDD|238487 cd00987, PDZ_serine_protease, PDZ domain of trypsin-like serine
           proteases, such as DegP/HtrA, which are oligomeric
           proteins involved in heat-shock response, chaperone
           function, and apoptosis. May be responsible for
           substrate recognition and/or binding, as most PDZ
           domains bind C-terminal polypeptides, though binding to
           internal (non-C-terminal) polypeptides and even to
           lipids has been demonstrated. In this subfamily of
           protease-associated PDZ domains a C-terminal beta-strand
           forms the peptide-binding groove base, a circular
           permutation with respect to PDZ domains found in
           Eumetazoan signaling proteins.
          Length = 90

 Score = 53.8 bits (130), Expect = 2e-09
 Identities = 18/36 (50%), Positives = 25/36 (69%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGIS 268
           GV ++ V+ GS A +AGL+PGD IL VNG P   ++
Sbjct: 25  GVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVA 60



 Score = 31.8 bits (73), Expect = 0.14
 Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 386 RGGVEYNLGIFITGVDKDSVAERAGL 411
             G++   G+ +  VD  S A +AGL
Sbjct: 17  ELGLKDTKGVLVASVDPGSPAAKAGL 42


>gnl|CDD|238489 cd00989, PDZ_metalloprotease, PDZ domain of bacterial and plant
           zinc metalloproteases, presumably membrane-associated or
           integral membrane proteases, which may be involved in
           signalling and regulatory mechanisms. May be responsible
           for substrate recognition and/or binding, as most PDZ
           domains bind C-terminal polypeptides, and binding to
           internal (non-C-terminal) polypeptides and even to
           lipids has been demonstrated. In this subfamily of
           protease-associated PDZ domains a C-terminal beta-strand
           forms the peptide-binding groove base, a circular
           permutation with respect to PDZ domains found in
           Eumetazoan signaling proteins.
          Length = 79

 Score = 53.4 bits (129), Expect = 3e-09
 Identities = 15/34 (44%), Positives = 19/34 (55%)

Query: 230 PGVGVYISRVEEGSIAERAGLRPGDSILQVNGIP 263
           P +   I  V  GS A +AGL+ GD IL +NG  
Sbjct: 10  PPIEPVIGEVVPGSPAAKAGLKAGDRILAINGQK 43


>gnl|CDD|223864 COG0793, Prc, Periplasmic protease [Cell envelope biogenesis, outer
           membrane].
          Length = 406

 Score = 56.6 bits (137), Expect = 1e-08
 Identities = 24/86 (27%), Positives = 42/86 (48%), Gaps = 9/86 (10%)

Query: 189 PRHRRLTPPDIDQLPVRTINMNRSQDANHGFGICVKGGANNPGVGVYISRVEEGSIAERA 248
           P    L P   +       + +       G GI ++    + G GV +    +GS A +A
Sbjct: 78  PHSTYLDP---EDAAEFRTDTSGEFG---GIGIELQ--MEDIG-GVKVVSPIDGSPAAKA 128

Query: 249 GLRPGDSILQVNGIPFTGISHEEALK 274
           G++PGD I++++G    G+S +EA+K
Sbjct: 129 GIKPGDVIIKIDGKSVGGVSLDEAVK 154


>gnl|CDD|238488 cd00988, PDZ_CTP_protease, PDZ domain of C-terminal processing-,
           tail-specific-, and tricorn proteases, which function in
           posttranslational protein processing, maturation, and
           disassembly or degradation, in Bacteria, Archaea, and
           plant chloroplasts. May be responsible for substrate
           recognition and/or binding, as most PDZ domains bind
           C-terminal polypeptides, and binding to internal
           (non-C-terminal) polypeptides and even to lipids has
           been demonstrated. In this subfamily of
           protease-associated PDZ domains a C-terminal beta-strand
           forms the peptide-binding groove base, a circular
           permutation with respect to PDZ domains found in
           Eumetazoan signaling proteins.
          Length = 85

 Score = 51.1 bits (123), Expect = 2e-08
 Identities = 17/43 (39%), Positives = 29/43 (67%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALKM 275
           G+ I+ V  GS A +AG++ GD I+ ++G P  G+S E+ +K+
Sbjct: 14  GLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKL 56


>gnl|CDD|221961 pfam13180, PDZ_2, PDZ domain. 
          Length = 81

 Score = 43.0 bits (102), Expect = 1e-05
 Identities = 18/38 (47%), Positives = 24/38 (63%)

Query: 228 NNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFT 265
            N G GV +  V+EGS A +AGL+PGD IL ++G    
Sbjct: 9   QNEGTGVTVVSVKEGSPAAKAGLKPGDIILSIDGKKVN 46


>gnl|CDD|233695 TIGR02037, degP_htrA_DO, periplasmic serine protease, Do/DegQ
           family.  This family consists of a set of proteins variously
           designated DegP, heat shock protein HtrA, and protease
           DO. The ortholog in Pseudomonas aeruginosa is designated
           MucD and is found in an operon that controls mucoid
           phenotype. This family also includes the DegQ (HhoA)
           paralog in E. coli which can rescue a DegP mutant, but
           not the smaller DegS paralog, which cannot. Members of
           this family are located in the periplasm and have
           separable functions as both protease and chaperone.
           Members have a trypsin domain and two copies of a PDZ
           domain. This protein protects bacteria from thermal and
           other stresses and may be important for the survival of
           bacterial pathogens. The chaperone function is
           dominant at low temperatures, whereas the proteolytic
           activity is turned on at elevated temperatures [Protein
           fate, Protein folding and stabilization, Protein fate,
           Degradation of proteins, peptides, and glycopeptides].
          Length = 428

 Score = 46.8 bits (112), Expect = 1e-05
 Identities = 19/42 (45%), Positives = 25/42 (59%), Gaps = 3/42 (7%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALK 274
           G  +++V  GS AE+AGL+ GD I  VNG P   IS    L+
Sbjct: 258 GALVAQVLPGSPAEKAGLKAGDVITSVNGKP---ISSFADLR 296



 Score = 42.2 bits (100), Expect = 4e-04
 Identities = 19/44 (43%), Positives = 24/44 (54%)

Query: 220 GICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIP 263
            I  +        GV +++V  GS A RAGL+PGD IL VN  P
Sbjct: 350 EIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQP 393



 Score = 28.7 bits (65), Expect = 6.3
 Identities = 17/51 (33%), Positives = 24/51 (47%), Gaps = 3/51 (5%)

Query: 361 ARSSKDTVRKVELNIEPGQSLGLMIRGGVEYNLGIFITGVDKDSVAERAGL 411
           A SS   +     N+ P     L ++G V+   G+ +T V   S A RAGL
Sbjct: 333 ASSSNPFLGLTVANLSPEIRKELRLKGDVK---GVVVTKVVSGSPAARAGL 380



 Score = 28.7 bits (65), Expect = 7.5
 Identities = 14/44 (31%), Positives = 20/44 (45%), Gaps = 6/44 (13%)

Query: 368 VRKVELNIEPGQSLGLMIRGGVEYNLGIFITGVDKDSVAERAGL 411
           V   E+  +  +SLGL  + G        +  V   S AE+AGL
Sbjct: 238 VTIQEVTSDLAKSLGLEKQRGA------LVAQVLPGSPAEKAGL 275


>gnl|CDD|238490 cd00990, PDZ_glycyl_aminopeptidase, PDZ domain associated with
           archaeal and bacterial M61 glycyl-aminopeptidases. May
           be responsible for substrate recognition and/or binding,
           as most PDZ domains bind C-terminal polypeptides, and
           binding to internal (non-C-terminal) polypeptides and
           even to lipids has been demonstrated. In this subfamily
           of protease-associated PDZ domains a C-terminal
           beta-strand is presumed to form the peptide-binding
           groove base, a circular permutation with respect to PDZ
           domains found in Eumetazoan signaling proteins.
          Length = 80

 Score = 36.3 bits (84), Expect = 0.003
 Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 229 NPGVGVYISRVEEGSIAERAGLRPGDSILQVNGI 262
                  ++ V + S A++AGL  GD ++ VNG 
Sbjct: 9   KEEGLGKVTFVRDDSPADKAGLVAGDELVAVNGW 42


>gnl|CDD|223343 COG0265, DegQ, Trypsin-like serine proteases, typically
           periplasmic, contain C-terminal PDZ domain
           [Posttranslational modification, protein turnover,
           chaperones].
          Length = 347

 Score = 39.5 bits (92), Expect = 0.003
 Identities = 14/31 (45%), Positives = 18/31 (58%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIP 263
           G  +  V  GS A +AG++ GD I  VNG P
Sbjct: 271 GAVVLGVLPGSPAAKAGIKAGDIITAVNGKP 301


>gnl|CDD|232883 TIGR00225, prc, C-terminal peptidase (prc).  A C-terminal peptidase
           with different substrates in different species including
           processing of D1 protein of the photosystem II reaction
           center in higher plants and cleavage of a peptide of 11
           residues from the precursor form of penicillin-binding
           protein in E. coli. E. coli and H. influenzae have the most
           distal branch of the tree and their proteins have an
           N-terminal 200 amino acids that show no homology to
           other proteins in the database [Protein fate,
           Degradation of proteins, peptides, and glycopeptides,
           Protein fate, Protein modification and repair].
          Length = 334

 Score = 38.9 bits (91), Expect = 0.004
 Identities = 16/34 (47%), Positives = 26/34 (76%)

Query: 241 EGSIAERAGLRPGDSILQVNGIPFTGISHEEALK 274
           EGS AE+AG++PGD I+++NG    G+S ++A+ 
Sbjct: 71  EGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVA 104


>gnl|CDD|182723 PRK10779, PRK10779, zinc metallopeptidase RseP; Provisional.
          Length = 449

 Score = 38.9 bits (91), Expect = 0.005
 Identities = 23/79 (29%), Positives = 38/79 (48%), Gaps = 10/79 (12%)

Query: 194 LTPPDIDQLPVRTINMNR------SQDANHGFGICVKGGANNPGVGVYISRVEEGSIAER 247
           + P   DQ   +T+++         QD     GI  +G    P +   ++ V+  S A +
Sbjct: 181 VAPFGSDQRRDKTLDLRHWAFEPDKQDPVSSLGIRPRG----PQIEPVLAEVQPNSAASK 236

Query: 248 AGLRPGDSILQVNGIPFTG 266
           AGL+ GD I++V+G P T 
Sbjct: 237 AGLQAGDRIVKVDGQPLTQ 255


>gnl|CDD|132322 TIGR03279, cyano_FeS_chp, putative FeS-containing
           Cyanobacterial-specific oxidoreductase.  Members of this
           protein family are predicted FeS-containing
           oxidoreductases of unknown function, apparently
           restricted to and universal across the Cyanobacteria.
           The high trusted cutoff score for this model, 700 bits,
           excludes homologs from other lineages. This exclusion
           seems justified because a significant number of sequence
           positions are simultaneously unique to and invariant
           across the Cyanobacteria, suggesting a specialized,
           conserved function, perhaps related to photosynthesis. A
           distantly related protein family, TIGR03278, is
           universal in and restricted to archaeal methanogens, and
           may be linked to methanogenesis.
          Length = 433

 Score = 38.2 bits (89), Expect = 0.008
 Identities = 14/27 (51%), Positives = 19/27 (70%)

Query: 236 ISRVEEGSIAERAGLRPGDSILQVNGI 262
           IS V  GSIAE  G  PGD+++ +NG+
Sbjct: 2   ISAVLPGSIAEELGFEPGDALVSINGV 28


>gnl|CDD|177681 PLN00049, PLN00049, carboxyl-terminal processing protease;
           Provisional.
          Length = 389

 Score = 37.4 bits (87), Expect = 0.014
 Identities = 19/51 (37%), Positives = 27/51 (52%)

Query: 225 GGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALKM 275
            G++ P  G+ +     G  A RAG+RPGD IL ++G    G+S  EA   
Sbjct: 95  TGSDGPPAGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADR 145


>gnl|CDD|226483 COG3975, COG3975, Predicted protease with the C-terminal PDZ domain
           [General function prediction only].
          Length = 558

 Score = 37.4 bits (87), Expect = 0.016
 Identities = 15/35 (42%), Positives = 20/35 (57%)

Query: 228 NNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGI 262
            + G    I+ V  G  A +AGL PGD I+ +NGI
Sbjct: 458 KSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI 492


>gnl|CDD|238491 cd00991, PDZ_archaeal_metalloprotease, PDZ domain of archaeal zinc
           metalloproteases, presumably membrane-associated or
           integral membrane proteases, which may be involved in
           signalling and regulatory mechanisms. May be responsible
           for substrate recognition and/or binding, as most PDZ
           domains bind C-terminal polypeptides, and binding to
           internal (non-C-terminal) polypeptides and even to
           lipids has been demonstrated. In this subfamily of
           protease-associated PDZ domains a C-terminal beta-strand
           forms the peptide-binding groove base, a circular
           permutation with respect to PDZ domains found in
           Eumetazoan signaling proteins.
          Length = 79

 Score = 32.3 bits (74), Expect = 0.082
 Identities = 17/36 (47%), Positives = 19/36 (52%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGIS 268
           GV I  V  GS AE A L  GD I  +NG P T + 
Sbjct: 11  GVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLE 46


>gnl|CDD|227505 COG5178, PRP8, U5 snRNP spliceosome subunit [RNA processing and
           modification].
          Length = 2365

 Score = 35.0 bits (80), Expect = 0.098
 Identities = 23/65 (35%), Positives = 24/65 (36%), Gaps = 27/65 (41%)

Query: 301 SIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRWNYS 360
           S+PP  P   P PPPP                P   P        P  PPPPPPP  N  
Sbjct: 3   SLPPGNP---PPPPPP----------------PGFEP--------PSQPPPPPPPGVNVK 35

Query: 361 ARSSK 365
            RS K
Sbjct: 36  KRSRK 40



 Score = 31.5 bits (71), Expect = 1.2
 Identities = 25/100 (25%), Positives = 35/100 (35%), Gaps = 19/100 (19%)

Query: 86  PPGTSYTIVERPPPPPPVP------LPQPPKPRGTYLGTNGSSYRTQPSSTEYRSNSPSN 139
           PPG        PPPPPP P       P PP P G  +        +        S +P  
Sbjct: 5   PPGN-------PPPPPPPPGFEPPSQPPPPPPPGVNVKKRSRKQLSIVGDILGHSGNPIY 57

Query: 140 NTSSSYR-----NTSSHSH-GTKKGALSPEQVLKMLTSGG 173
           +   S +     N +   H  T K  + PE + K+ +   
Sbjct: 58  SLRVSDKPVKLGNKAKTLHVLTLKAPIPPEHLRKIQSPCS 97


>gnl|CDD|236802 PRK10942, PRK10942, serine endoprotease; Provisional.
          Length = 473

 Score = 34.7 bits (80), Expect = 0.099
 Identities = 13/35 (37%), Positives = 19/35 (54%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGI 267
           GV +  V+ G+ A + GL+ GD I+  N  P   I
Sbjct: 409 GVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNI 443



 Score = 32.0 bits (73), Expect = 0.60
 Identities = 13/36 (36%), Positives = 22/36 (61%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGIS 268
           G ++S+V   S A +AG++ GD I  +NG P +  +
Sbjct: 312 GAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFA 347


>gnl|CDD|216205 pfam00937, Corona_nucleoca, Coronavirus nucleocapsid protein. 
          Length = 346

 Score = 34.3 bits (79), Expect = 0.11
 Identities = 41/152 (26%), Positives = 57/152 (37%), Gaps = 23/152 (15%)

Query: 96  RPPPPPPVPLPQPPK-PRGTYL-GTNGSSYRTQPSSTEYRSNSPSNNTSSSYRNTSSHSH 153
            P     +PL   P  P+G Y+ G  G S  +  SS+   S  PS  +S   RN S + +
Sbjct: 128 NPNNDEAIPLRFSPGLPKGFYIEGFRGRSRSSSRSSSRSNSRGPSRGSS---RNNSRNRN 184

Query: 154 GTKKGALSPEQVLKMLTSGGGGKKS--------------AEGSEEHHHHPRHRRLTPPDI 199
            +    L    VL  L   G GK+               AE +++  + PR +R TP   
Sbjct: 185 SSSPDDLVA-AVLAALAKLGFGKQKSSSKKPSRVTKKSAAEAAKKQLNKPRWKR-TPN-- 240

Query: 200 DQLPVRTINMNRSQDANHGFGICVKGGANNPG 231
               V      R    N G    VK G  +P 
Sbjct: 241 KGENVTQCFGPRGPGQNFGDSDMVKLGVEDPR 272


>gnl|CDD|223821 COG0750, COG0750, Predicted membrane-associated Zn-dependent
           proteases 1 [Cell envelope biogenesis, outer membrane].
          Length = 375

 Score = 34.4 bits (79), Expect = 0.12
 Identities = 13/32 (40%), Positives = 16/32 (50%)

Query: 235 YISRVEEGSIAERAGLRPGDSILQVNGIPFTG 266
            +  V   S A  AGLRPGD I+ V+G     
Sbjct: 132 VVGEVAPKSAAALAGLRPGDRIVAVDGEKVAS 163


>gnl|CDD|218549 pfam05308, Mito_fiss_reg, Mitochondrial fission regulator.  In
           eukaryotes, this family of proteins induces
           mitochondrial fission.
          Length = 248

 Score = 32.4 bits (74), Expect = 0.40
 Identities = 13/33 (39%), Positives = 16/33 (48%)

Query: 323 AYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
           +         P SPP +     +P PPPPPPPP
Sbjct: 161 SVPSSSTTSFPISPPTEEPVLEVPPPPPPPPPP 193



 Score = 28.9 bits (65), Expect = 4.5
 Identities = 13/26 (50%), Positives = 15/26 (57%)

Query: 85  SPPGTSYTIVERPPPPPPVPLPQPPK 110
             P T   ++E PPPPPP P P PP 
Sbjct: 172 ISPPTEEPVLEVPPPPPPPPPPPPPS 197


>gnl|CDD|232801 TIGR00054, TIGR00054, RIP metalloprotease RseP.  Members of this
           nearly universal bacterial protein family are regulated
           intramembrane proteolysis (RIP) proteases. Older and
           synonymous gene symbols include yaeL in E. coli, mmpA in
           Caulobacter crescentus, etc. This family includes a
           region that hits the PDZ domain, found in a number of
           proteins targeted to the membrane by binding to a
           peptide ligand. The N-terminal region of this family
           contains a perfectly conserved motif HEXGH as found in a
           number of metalloproteinases, where the Glu is the
           active site and the His residues coordinate the metal
           cation [Protein fate, Degradation of proteins, peptides,
           and glycopeptides].
          Length = 419

 Score = 32.5 bits (74), Expect = 0.49
 Identities = 17/38 (44%), Positives = 21/38 (55%)

Query: 225 GGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGI 262
            G     VG  I  +++ SIA  AG+ PGD IL VNG 
Sbjct: 120 IGVPGYEVGPVIELLDKNSIALEAGIEPGDEILSVNGN 157



 Score = 30.6 bits (69), Expect = 1.8
 Identities = 13/30 (43%), Positives = 17/30 (56%)

Query: 236 ISRVEEGSIAERAGLRPGDSILQVNGIPFT 265
           +S V   S AE+AGL+ GD I  +NG    
Sbjct: 206 LSDVTPNSPAEKAGLKEGDYIQSINGEKLR 235


>gnl|CDD|182820 PRK10898, PRK10898, serine endoprotease; Provisional.
          Length = 353

 Score = 31.9 bits (73), Expect = 0.62
 Identities = 10/31 (32%), Positives = 17/31 (54%)

Query: 233 GVYISRVEEGSIAERAGLRPGDSILQVNGIP 263
           G+ ++ V     A +AG++  D I+ VN  P
Sbjct: 280 GIVVNEVSPDGPAAKAGIQVNDLIISVNNKP 310


>gnl|CDD|223021 PHA03247, PHA03247, large tegument protein UL36; Provisional.
          Length = 3151

 Score = 32.6 bits (74), Expect = 0.62
 Identities = 22/73 (30%), Positives = 27/73 (36%), Gaps = 10/73 (13%)

Query: 301  SIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGR----PCSPPLDYARSVIPMPPPP----P 352
            S+PP  PR  P P  PA T R        Q      P     D      P P PP    P
Sbjct: 2567 SVPP--PRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAP 2624

Query: 353  PPPRWNYSARSSK 365
             PP  + S  +++
Sbjct: 2625 DPPPPSPSPAANE 2637



 Score = 31.4 bits (71), Expect = 1.4
 Identities = 16/52 (30%), Positives = 16/52 (30%), Gaps = 4/52 (7%)

Query: 304  PQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
            P  P   P P P A              R  SP L  A    P PP  P  P
Sbjct: 2701 PPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAA----PAPPAVPAGP 2748



 Score = 30.7 bits (69), Expect = 2.0
 Identities = 15/59 (25%), Positives = 19/59 (32%)

Query: 297  VRSPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
            VR  + P  +        PP    R          +P   P    +   P PPPP P P
Sbjct: 2883 VRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQP 2941



 Score = 29.9 bits (67), Expect = 3.8
 Identities = 14/53 (26%), Positives = 20/53 (37%)

Query: 303  PPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
              ++  +  LPP       Q  +    Q +P  PP    +   P PP P PP 
Sbjct: 2891 VSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943



 Score = 29.5 bits (66), Expect = 4.6
 Identities = 17/80 (21%), Positives = 21/80 (26%), Gaps = 5/80 (6%)

Query: 85   SPPGTSYTIVERPPPPPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTEYRSNSPSNNTSSS 144
             P G +      P    P P P  P P       +       P   E   + P+    S 
Sbjct: 2607 DPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPP---ERPRDDPAPGRVSR 2663

Query: 145  YRNTSSHSHGTKKGALSPEQ 164
             R            A SP Q
Sbjct: 2664 PRRARRLGRAA--QASSPPQ 2681



 Score = 29.5 bits (66), Expect = 5.8
 Identities = 12/24 (50%), Positives = 13/24 (54%)

Query: 86   PPGTSYTIVERPPPPPPVPLPQPP 109
            P   S T +  PPPPPP P P P 
Sbjct: 2690 PTVGSLTSLADPPPPPPTPEPAPH 2713



 Score = 29.1 bits (65), Expect = 6.2
 Identities = 18/62 (29%), Positives = 23/62 (37%), Gaps = 7/62 (11%)

Query: 293  LSMTVRSPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPP 352
            +S +  S ++PP  P   P P  P     Q       Q +P  PP        P P PP 
Sbjct: 2891 VSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPP-------PRPQPPL 2943

Query: 353  PP 354
             P
Sbjct: 2944 AP 2945



 Score = 28.8 bits (64), Expect = 8.6
 Identities = 19/79 (24%), Positives = 22/79 (27%), Gaps = 13/79 (16%)

Query: 292  ELSMTVRSPSIPPQAPRNHPLPP---PPAWTMRQAYS---------WIDRQGRPCSPPL- 338
            E          PP+ PR+ P P     P    R   +            R  RP    L 
Sbjct: 2637 EPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLT 2696

Query: 339  DYARSVIPMPPPPPPPPRW 357
              A    P P P P P   
Sbjct: 2697 SLADPPPPPPTPEPAPHAL 2715


>gnl|CDD|218191 pfam04652, DUF605, Vta1 like.  Vta1 (VPS20-associated protein 1) is
           a positive regulator of Vps4. Vps4 is an ATPase that is
           required in the multivesicular body (MVB) sorting
           pathway to dissociate the endosomal sorting complex
           required for transport (ESCRT). Vta1 promotes correct
           assembly of Vps4 and stimulates its ATPase activity
           through its conserved Vta1/SBP1/LIP5 region.
          Length = 315

 Score = 32.0 bits (73), Expect = 0.63
 Identities = 15/96 (15%), Positives = 24/96 (25%), Gaps = 10/96 (10%)

Query: 72  PRASHRSKAGLYYSPPGTSYTIVERPPPPPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTE 131
           P       +    S P    +     PPP P     P  P G             P    
Sbjct: 198 PSPPEDPSSPSDSSLPPAPSSFQSDTPPPSPESPTNPSPPPGPA----------APPPPP 247

Query: 132 YRSNSPSNNTSSSYRNTSSHSHGTKKGALSPEQVLK 167
            +   P +    +  + S+         L  + + K
Sbjct: 248 VQQVPPLSTAKPTPPSASATPAPIGGITLDDDAIAK 283



 Score = 28.9 bits (65), Expect = 6.0
 Identities = 13/74 (17%), Positives = 17/74 (22%), Gaps = 4/74 (5%)

Query: 299 SPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRWN 358
            P        + P PP    +   +         P S   D        P  P PPP   
Sbjct: 186 DPPSSSPGVPSFPSPPEDPSSPSDS----SLPPAPSSFQSDTPPPSPESPTNPSPPPGPA 241

Query: 359 YSARSSKDTVRKVE 372
                    V  + 
Sbjct: 242 APPPPPVQQVPPLS 255


>gnl|CDD|234428 TIGR03979, His_Ser_Rich, His-Xaa-Ser repeat protein HxsA.  Members
           of this protein share two defining regions. One is a
           histidine/serine-rich cluster, typically
           H-R-S-H-S-S-H-R-S-H-S-S-H. Members are found always in
           the context of a pair of radical SAM proteins, HxsB and
           HxsC, and a fourth protein HxsD. The system is predicted
           to perform peptide modifications, likely in the
           His-Xaa-Ser region, to produce some uncharacterized
           natural product.
          Length = 186

 Score = 31.0 bits (70), Expect = 0.75
 Identities = 23/73 (31%), Positives = 30/73 (41%), Gaps = 12/73 (16%)

Query: 74  ASHRSKAGLYYSPPGTSYTIVERPPPPPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTEYR 133
           +SH S AG  YS P    +       P P P   P           GSS ++ PS+T  R
Sbjct: 66  SSHYSGAGGSYSVPSGDTS---TYSYPVPSPSYSPS---------PGSSIQSLPSTTGVR 113

Query: 134 SNSPSNNTSSSYR 146
             S + N +S  R
Sbjct: 114 PQSSAENANSEKR 126


>gnl|CDD|237864 PRK14950, PRK14950, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 585

 Score = 31.7 bits (72), Expect = 0.98
 Identities = 15/59 (25%), Positives = 17/59 (28%), Gaps = 1/59 (1%)

Query: 298 RSPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPR 356
            SP  P  AP   P     A  +       +    P  PP   A  V   P   P   R
Sbjct: 376 PSPVRPTPAPSTRP-KAAAAANIPPKEPVRETATPPPVPPRPVAPPVPHTPESAPKLTR 433



 Score = 29.4 bits (66), Expect = 4.6
 Identities = 13/68 (19%), Positives = 19/68 (27%), Gaps = 9/68 (13%)

Query: 296 TVRSPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMP------P 349
              + +  P          PP    R     +       +P L   R+ IP+       P
Sbjct: 390 KAAAAANIPPKEPVRETATPPPVPPRPVAPPVPHT-PESAPKL--TRAAIPVDEKPKYTP 446

Query: 350 PPPPPPRW 357
           P PP    
Sbjct: 447 PAPPKEEE 454


>gnl|CDD|234035 TIGR02860, spore_IV_B, stage IV sporulation protein B.  SpoIVB, the
           stage IV sporulation protein B of endospore-forming
           bacteria such as Bacillus subtilis, is a serine
           proteinase, expressed in the spore (rather than mother
           cell) compartment, that participates in a proteolytic
           activation cascade for Sigma-K. It appears to be
           universal among endospore-forming bacteria and occurs
           nowhere else [Cellular processes, Sporulation and
           germination].
          Length = 402

 Score = 31.5 bits (72), Expect = 1.1
 Identities = 14/38 (36%), Positives = 21/38 (55%), Gaps = 6/38 (15%)

Query: 240 EEGSI---AERAGLRPGDSILQVNGIPFTGISHEEALK 274
           E+G I    E AG++ GD IL++NG     I + + L 
Sbjct: 118 EKGKIHSPGEEAGIQIGDRILKINGEK---IKNMDDLA 152


>gnl|CDD|235309 PRK04596, minC, septum formation inhibitor; Reviewed.
          Length = 248

 Score = 30.7 bits (69), Expect = 1.3
 Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 343 SVIPMPPPPPPPPRWNYSARSSKDTVRKVELN-IEPGQSL 381
           +V P PPPPPPP R   +   ++    +++   +  GQ L
Sbjct: 116 AVSPPPPPPPPPARAEPAPPVARPAPGRMQRTAVRSGQQL 155


>gnl|CDD|166942 PRK00404, tatB, sec-independent translocase; Provisional.
          Length = 141

 Score = 29.8 bits (67), Expect = 1.3
 Identities = 18/57 (31%), Positives = 21/57 (36%), Gaps = 2/57 (3%)

Query: 299 SPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
           +P  PP  P   P+ PP A +   A         P  PP   A    P   PPP  P
Sbjct: 80  APLTPPAPPE--PVTPPTAQSPAPAVPTPPPTSTPAVPPAPAAAVPAPAAAPPPSDP 134


>gnl|CDD|222997 PHA03132, PHA03132, thymidine kinase; Provisional.
          Length = 580

 Score = 31.3 bits (71), Expect = 1.3
 Identities = 20/135 (14%), Positives = 27/135 (20%), Gaps = 4/135 (2%)

Query: 84  YSPPGTSYTIVERPPPPPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTEYRSNSPSNNTSS 143
           Y P  T             VP P     +      +  + R  P              SS
Sbjct: 55  YPPRETGSGGGVATSTIYTVPRPPRGPEQTLDKPDSLPASRELPPGPTPVPPGGFRGASS 114

Query: 144 SYRNTSSHSHGTKKGALSPEQVLKMLTSGGGGKKSAEGSEEHHHHPRHRRLTPPDIDQLP 203
                 S S         P     +L   G    S+E   E   H R        +    
Sbjct: 115 PRLGADSTSPRFLYQVNFPV----ILAPIGESNSSSEELSEEEEHSRPPPSESLKVKNGG 170

Query: 204 VRTINMNRSQDANHG 218
                       +  
Sbjct: 171 KVYPKGFSKHKTHKR 185


>gnl|CDD|233696 TIGR02038, protease_degS, periplasmic serine peptidase DegS.  This
           family consists of the periplasmic serine protease DegS
           (HhoB), a shorter paralog of protease DO (HtrA, DegP)
           and DegQ (HhoA). It is found in E. coli and several
           other Proteobacteria of the gamma subdivision. It
           contains a trypsin domain and a single copy of PDZ
           domain (in contrast to DegP with two copies). A critical
           role of this DegS is to sense stress in the periplasm
           and partially degrade an inhibitor of sigma(E) [Protein
           fate, Degradation of proteins, peptides, and
           glycopeptides, Regulatory functions, Protein
           interactions].
          Length = 351

 Score = 30.9 bits (70), Expect = 1.4
 Identities = 15/32 (46%), Positives = 19/32 (59%)

Query: 387 GGVEYNLGIFITGVDKDSVAERAGLLVSQLTL 418
            G+    GI ITGVD +  A RAG+LV  + L
Sbjct: 272 LGLPDLRGIVITGVDPNGPAARAGILVRDVIL 303



 Score = 30.2 bits (68), Expect = 2.3
 Identities = 15/48 (31%), Positives = 22/48 (45%), Gaps = 2/48 (4%)

Query: 224 KGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEE 271
           +G       G+ I+ V+    A RAG+   D IL+ +G     I  EE
Sbjct: 270 QGLGLPDLRGIVITGVDPNGPAARAGILVRDVILKYDGKD--VIGAEE 315


>gnl|CDD|217512 pfam03359, GKAP, Guanylate-kinase-associated protein (GKAP)
           protein. 
          Length = 342

 Score = 30.6 bits (69), Expect = 1.8
 Identities = 17/84 (20%), Positives = 24/84 (28%), Gaps = 12/84 (14%)

Query: 285 QMLKSNRELSMTVRS--PSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQG----------R 332
             L   R +S   ++       Q   +   PPP + T       +  QG          R
Sbjct: 73  PGLPVVRHVSTEDKALQFGPSFQRESSEDSPPPSSSTYSAGTRTVSTQGQSAYLSDPKRR 132

Query: 333 PCSPPLDYARSVIPMPPPPPPPPR 356
           P S   +           PPP P 
Sbjct: 133 PSSEASESETVAFDESDLPPPDPW 156


>gnl|CDD|219916 pfam08580, KAR9, Yeast cortical protein KAR9.  The KAR9 protein in
           Saccharomyces cerevisiae is a cytoskeletal protein
           required for karyogamy, correct positioning of the
           mitotic spindle and for orientation of cytoplasmic
           microtubules. KAR9 localises at the shmoo tip in mating
           cells and at the tip of the growing bud in anaphase.
          Length = 626

 Score = 30.6 bits (69), Expect = 2.0
 Identities = 25/121 (20%), Positives = 36/121 (29%), Gaps = 8/121 (6%)

Query: 92  TIVERPPPPP----PVPLPQPP--KPRGTYLGTNGSSYRTQPSSTEYRSNSPSNNTSSSY 145
           T++  PPP         LP  P        L +       Q   T    + P++  SS  
Sbjct: 472 TLLRDPPPKKCGEESGHLPNNPFFNKLKLTLSSIPPLSPRQSIITLPTPSRPASRISSLS 531

Query: 146 RNTSSHSHGTKKGALSPEQVLKMLTSGGGGKKSAEGSEEHHHHPRHRRLTPPDIDQLPVR 205
               S+S         P  V +   +G    +S    E          L P  I  LP +
Sbjct: 532 LRLGSYSGSIVSPPPYPTLVSRKGAAGLSFNRSVSDIEG--ERIGRYNLLPTRIPALPFK 589

Query: 206 T 206
            
Sbjct: 590 A 590


>gnl|CDD|238368 cd00717, URO-D, Uroporphyrinogen decarboxylase (URO-D) is a dimeric
           cytosolic enzyme that decarboxylates the four acetate
           side chains of uroporphyrinogen III (uro-III) to create
           coproporphyrinogen III, without requiring any prosthetic
           groups or cofactors. This reaction is located at the
           branching point of the tetrapyrrole biosynthetic
           pathway, leading to the biosynthesis of heme,
           chlorophyll or bacteriochlorophyll. URO-D deficiency is
           responsible for the human genetic diseases familial
           porphyria cutanea tarda (fPCT) and hepatoerythropoietic
           porphyria (HEP).
          Length = 335

 Score = 30.2 bits (69), Expect = 2.0
 Identities = 10/19 (52%), Positives = 12/19 (63%)

Query: 305 QAPRNHPLPPPPAWTMRQA 323
           +A R  P+  PP W MRQA
Sbjct: 2   RALRGEPVDRPPVWFMRQA 20


>gnl|CDD|184918 PRK14954, PRK14954, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 620

 Score = 30.7 bits (69), Expect = 2.1
 Identities = 24/99 (24%), Positives = 34/99 (34%), Gaps = 10/99 (10%)

Query: 85  SPPGTSYTIVERPPPPPPVPLPQPPK---PRGTYLGTNGSSYRT-QPSSTEYRSNS---- 136
             P ++ T  ++PP     PLP  P+   PR    G  G    + Q     +  N     
Sbjct: 422 PSPASAPTPEQQPPVARSAPLPPSPQASAPRNVASGKPGVDLGSWQGKFMNFTRNGSRKQ 481

Query: 137 PSNNTSSSYRNTSSHSHGTKKGALSPE--QVLKMLTSGG 173
           P   +SS    T       +   L  E  Q L+ L   G
Sbjct: 482 PVQASSSDAAQTGVFEGVAELEKLRMEWNQFLEHLLKKG 520



 Score = 29.1 bits (65), Expect = 5.8
 Identities = 13/76 (17%), Positives = 17/76 (22%), Gaps = 6/76 (7%)

Query: 303 PPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRWNYSAR 362
           P          P PA           R     SP         P      P P       
Sbjct: 395 PEPDLPQPDRHPGPAKPEAPGA----RPAELPSPASAPTPEQQPPVARSAPLPP--SPQA 448

Query: 363 SSKDTVRKVELNIEPG 378
           S+   V   +  ++ G
Sbjct: 449 SAPRNVASGKPGVDLG 464


>gnl|CDD|204614 pfam11221, Med21, Subunit 21 of Mediator complex.  Med21 has been
           known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in
           Drosophila. The heterodimer of the two subunits Med7 and
           Med21 appears to act as a hinge between the middle and
           the tail regions of Mediator.
          Length = 132

 Score = 29.2 bits (66), Expect = 2.3
 Identities = 10/57 (17%), Positives = 16/57 (28%), Gaps = 1/57 (1%)

Query: 320 MRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRWNYSARSSKDTVRKVELNIE 376
                 ++ +   P     D      P    PPP          ++D + K    IE
Sbjct: 19  FCATIGYLQQNHDPSPLSPDEPAVSDPKANAPPPEEFEEGQRELARDIILK-AQQIE 74


>gnl|CDD|182262 PRK10139, PRK10139, serine endoprotease; Provisional.
          Length = 455

 Score = 30.3 bits (68), Expect = 2.5
 Identities = 17/57 (29%), Positives = 26/57 (45%), Gaps = 7/57 (12%)

Query: 209 MNRSQDANHGFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNGIPFT 265
              S D    F + V+ GA       ++S V   S + +AG++ GD I  +NG P  
Sbjct: 274 TEMSADIAKAFNLDVQRGA-------FVSEVLPNSGSAKAGVKAGDIITSLNGKPLN 323


>gnl|CDD|224567 COG1653, UgpB, ABC-type sugar transport system, periplasmic
           component [Carbohydrate transport and metabolism].
          Length = 433

 Score = 30.0 bits (67), Expect = 2.7
 Identities = 13/106 (12%), Positives = 24/106 (22%), Gaps = 5/106 (4%)

Query: 20  LRSGVGALHGNGGGAVWGRSPGSILPISQYRTMYHADQCRAAEAEDIMGHYNPRASHRSK 79
           L    G      GG  +   P ++  +   + +Y         +          A    K
Sbjct: 208 LGGAGGGFLDKDGGEAFLNDPEAVEALEFLKDLYKKGLLPKGASGYGWDDAGALAFGSGK 267

Query: 80  AGLYYSPPGTSYTIVERPPPP-----PPVPLPQPPKPRGTYLGTNG 120
             +            +   P       P+P           +G  G
Sbjct: 268 VAMTIDGTWAIGYFKKAAGPKFDIGVAPLPAGPGGGGAAGGVGGGG 313


>gnl|CDD|181765 PRK09294, PRK09294, acyltransferase PapA5; Provisional.
          Length = 416

 Score = 30.1 bits (68), Expect = 2.7
 Identities = 12/53 (22%), Positives = 15/53 (28%), Gaps = 2/53 (3%)

Query: 304 PQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPR 356
           P   R  P P      +  A   I RQ    +     A     +PP P     
Sbjct: 147 PGPIRPQPAPQSLEAVL--AQRGIRRQALSGAERFMPAMYAYELPPTPTAAVL 197


>gnl|CDD|234644 PRK00115, hemE, uroporphyrinogen decarboxylase; Validated.
          Length = 346

 Score = 29.7 bits (68), Expect = 3.0
 Identities = 9/19 (47%), Positives = 11/19 (57%)

Query: 305 QAPRNHPLPPPPAWTMRQA 323
           +A R  P+   P W MRQA
Sbjct: 10  RALRGEPVDRTPVWMMRQA 28


>gnl|CDD|233732 TIGR02110, PQQ_syn_pqqF, coenzyme PQQ biosynthesis probable
           peptidase PqqF.  In a subset of species that make
           coenzyme PQQ (pyrrolo-quinoline-quinone), this probable
           peptidase is found in the PQQ biosynthesis region and is
           thought to act as a protease on PqqA (TIGR02107), a
           probable peptide precursor of the coenzyme. PQQ is
           required for some glucose dehydrogenases and alcohol
           dehydrogenases [Biosynthesis of cofactors, prosthetic
           groups, and carriers, Other].
          Length = 696

 Score = 30.2 bits (68), Expect = 3.1
 Identities = 13/68 (19%), Positives = 16/68 (23%), Gaps = 15/68 (22%)

Query: 310 HPLPPPPA----W------TMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRWNY 359
             L   PA    W         Q    +  Q  P +  L       P P P      W  
Sbjct: 554 RLLKSLPAQQDDWLAARWGAATQLAQRVALQLSPGTADLA-----RPTPLPARLGRGWVP 608

Query: 360 SARSSKDT 367
            A    + 
Sbjct: 609 LACDGGEQ 616


>gnl|CDD|171499 PRK12438, PRK12438, hypothetical protein; Provisional.
          Length = 991

 Score = 29.8 bits (67), Expect = 3.4
 Identities = 10/29 (34%), Positives = 11/29 (37%)

Query: 86  PPGTSYTIVERPPPPPPVPLPQPPKPRGT 114
           PPG       +  PPP    P    PRG 
Sbjct: 915 PPGAGPPAPPQAVPPPRTTQPPAAPPRGP 943


>gnl|CDD|193593 cd09979, LOTUS_3_Limkain_b1, The third LOTUS domain on Limkain
          b1(LKAP).  The third LOTUS domain on Limkain b1(LKAP):
          Limkain b1 is a novel human autoantigen, localized to
          a subset of ABCD3 and PXF marked peroxisomes. Limkain
          b1 may be a relatively common target of human
          autoantibodies reactive to cytoplasmic vesicle-like
          structures. The protein contains multiple copies of
          LOTUS domains and a conserved RNA recognition motif.
          The exact molecular function of LOTUS domain remains to
          be identified. Its occurrence in proteins associated
          with RNA metabolism suggests that it might be involved
          in RNA binding function. The presence of several basic
          residues and RNA fold recognition motifs support this
          hypothesis. The RNA binding function might be the first
          step of regulating mRNA translation or localization.
          Length = 72

 Score = 27.4 bits (61), Expect = 3.8
 Identities = 9/27 (33%), Positives = 14/27 (51%), Gaps = 3/27 (11%)

Query: 39 SPGSILPISQYRTMYH---ADQCRAAE 62
           P  +LP S++   YH     QCR ++
Sbjct: 15 QPSCLLPFSRFIPAYHHHFGKQCRVSD 41


>gnl|CDD|217469 pfam03276, Gag_spuma, Spumavirus gag protein. 
          Length = 582

 Score = 29.5 bits (66), Expect = 4.1
 Identities = 27/104 (25%), Positives = 35/104 (33%), Gaps = 3/104 (2%)

Query: 11  EINLSHEYLLRSGVGALHGNGGGAVWGRSPGSILPISQYRTMYHADQCRAAEAEDIMGHY 70
           EI    E L+   +G   GN  GA+    P S+  +    +                  +
Sbjct: 162 EIRGLREMLVELQIGGRGGNIPGAIQPPPPSSLPGLPPGSSSLAPSASSTPGNRLPRVSF 221

Query: 71  NPRASHRSKA---GLYYSPPGTSYTIVERPPPPPPVPLPQPPKP 111
           NP     S A       S P      V +   PPPVP PQP  P
Sbjct: 222 NPFLPGPSPAQPSAPPASIPAPPIPPVIQYVAPPPVPPPQPIIP 265


>gnl|CDD|219085 pfam06551, DUF1120, Protein of unknown function (DUF1120).  This
           family consists of several hypothetical bacterial
           proteins of unknown function.
          Length = 116

 Score = 28.3 bits (63), Expect = 4.4
 Identities = 20/108 (18%), Positives = 31/108 (28%), Gaps = 22/108 (20%)

Query: 158 GALSPEQVLKMLTSGGG----GKKSAEGSEEHHHHPRHRRLTPPDIDQLPVRTINMNRSQ 213
           GA +P        SGGG    G  S               L+P D  QL  + I++  + 
Sbjct: 10  GACTPT------LSGGGVVDYGTISV------------SALSPTDYTQLGTKNISLTITC 51

Query: 214 DANHGFGICVKGGANNPGVGVYISRVEEGSIAERAGLRPGDSILQVNG 261
            A     +       +  V +  +     +I     L  G        
Sbjct: 52  TAPTKIAVTATDNRMDTIVTLNDTSYGASTITILNALFGGGISSGTTQ 99


>gnl|CDD|221793 pfam12824, MRP-L20, Mitochondrial ribosomal protein subunit L20.
           This family is the essential mitochondrial ribosomal
           protein subunit L20 of fungi.
          Length = 164

 Score = 28.8 bits (65), Expect = 4.4
 Identities = 22/108 (20%), Positives = 34/108 (31%), Gaps = 24/108 (22%)

Query: 70  YNPR--ASHRSKAGLYYSPPGTSYTIVERPPPP-------PPVPLPQPPKPRGTYLGTNG 120
           Y PR  +S+          P  S+ ++  PP         P   LP    PR   L    
Sbjct: 2   YEPRPKSSYNRTRSRLNIKPDPSFGLIHNPPSSAPSVYHTPYKFLP-ANDPRRELL--AA 58

Query: 121 SSYRTQPSSTEYRSNSPSNNTSSSYRNTSSHSHGTKKGALSPEQVLKM 168
                 P S++     P              +   KK  L+PE + ++
Sbjct: 59  KQSTISPKSSDL----PP--------ILRYKAEKEKKYHLTPEDIAEI 94


>gnl|CDD|215533 PLN02983, PLN02983, biotin carboxyl carrier protein of acetyl-CoA
           carboxylase.
          Length = 274

 Score = 29.0 bits (65), Expect = 5.2
 Identities = 19/56 (33%), Positives = 19/56 (33%), Gaps = 15/56 (26%)

Query: 300 PSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
           P  PP AP     PPPP                P SPP   A    P  P   PPP
Sbjct: 142 PQPPPPAPVVMMQPPPPHAMP------------PASPP---AAQPAPSAPASSPPP 182


>gnl|CDD|216707 pfam01796, DUF35, DUF35 OB-fold domain.  This domain has no known
           function and is found in conserved hypothetical archaeal
           and bacterial proteins. The domain is approximately 70
           amino acids long. The domain is duplicated in a member
           from Mycobacterium tuberculosis. The structure of a
           DUF35 representative reveals two long N-terminal helices
           followed by a rubredoxin-like zinc ribbon domain and a
           C-terminal OB fold domain represented in this entry.
           OB-folds are frequently found to bind nucleic acids
           suggesting this domain might bind to DNA or RNA.
          Length = 66

 Score = 26.8 bits (60), Expect = 5.4
 Identities = 13/26 (50%), Positives = 15/26 (57%), Gaps = 2/26 (7%)

Query: 85  SPPGT--SYTIVERPPPPPPVPLPQP 108
           S  GT  SYT+V RPP P P  +P  
Sbjct: 5   SGRGTVYSYTVVHRPPSPFPDEVPYV 30


>gnl|CDD|216368 pfam01213, CAP_N, Adenylate cyclase associated (CAP) N terminal. 
          Length = 313

 Score = 29.0 bits (65), Expect = 5.5
 Identities = 12/35 (34%), Positives = 15/35 (42%), Gaps = 7/35 (20%)

Query: 331 GRPCSPPLDYARSVIPMPPPPPPPPRWNYSARSSK 365
             P +PP        P PPPPP  P  + S  S+ 
Sbjct: 228 SAPSAPP-------PPPPPPPPSVPTISNSVESAS 255


>gnl|CDD|233423 TIGR01464, hemE, uroporphyrinogen decarboxylase.  This model
           represents uroporphyrinogen decarboxylase (HemE), which
           converts uroporphyrinogen III to coproporphyrinogen III.
           This step takes the pathway toward protoporphyrin IX, a
           common precursor of both heme and chlorophyll, rather
           than toward precorrin 2 and its products [Biosynthesis
           of cofactors, prosthetic groups, and carriers, Heme,
           porphyrin, and cobalamin].
          Length = 338

 Score = 28.8 bits (65), Expect = 5.7
 Identities = 8/18 (44%), Positives = 10/18 (55%)

Query: 306 APRNHPLPPPPAWTMRQA 323
           A +   +  PP W MRQA
Sbjct: 5   AAKGEVVDRPPVWFMRQA 22


>gnl|CDD|236090 PRK07764, PRK07764, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 824

 Score = 29.2 bits (66), Expect = 6.0
 Identities = 15/53 (28%), Positives = 20/53 (37%)

Query: 303 PPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
            P  P   P   PPA       +   +  +  S P   A   +P+PP P  PP
Sbjct: 697 APAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749


>gnl|CDD|223061 PHA03369, PHA03369, capsid maturational protease; Provisional.
          Length = 663

 Score = 29.2 bits (65), Expect = 6.3
 Identities = 20/92 (21%), Positives = 30/92 (32%), Gaps = 5/92 (5%)

Query: 84  YSPPGTSYTIVERPPP-----PPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTEYRSNSPS 138
           YS P  S      P P     P  V    P  P  +Y          QP++      S +
Sbjct: 392 YSVPARSPMTAYPPVPQFCGDPGLVSPYNPQSPGTSYGPEPVGPVPPQPTNPYVMPISMA 451

Query: 139 NNTSSSYRNTSSHSHGTKKGALSPEQVLKMLT 170
           N     +     H    K+G    E++++ L 
Sbjct: 452 NMVYPGHPQEHGHERKRKRGGELKEELIETLK 483


>gnl|CDD|219419 pfam07462, MSP1_C, Merozoite surface protein 1 (MSP1) C-terminus.
           This family represents the C-terminal region of
           merozoite surface protein 1 (MSP1) which are found in a
           number of Plasmodium species. MSP-1 is a 200-kDa protein
           expressed on the surface of the P. vivax merozoite.
           MSP-1 of Plasmodium species is synthesised as a
           high-molecular-weight precursor and then processed into
           several fragments. At the time of red cell invasion by
           the merozoite, only the 19-kDa C-terminal fragment
           (MSP-119), which contains two epidermal growth
           factor-like domains, remains on the surface. Antibodies
           against MSP-119 inhibit merozoite entry into red cells,
           and immunisation with MSP-119 protects monkeys from
           challenging infections. Hence, MSP-119 is considered a
           promising vaccine candidate.
          Length = 574

 Score = 29.1 bits (65), Expect = 6.4
 Identities = 13/50 (26%), Positives = 19/50 (38%), Gaps = 3/50 (6%)

Query: 92  TIVERPPPPPPVPLPQPPKPRGTYLGTNGSSYRTQPSSTEYRSNSPSNNT 141
           T V  PP     P P   +P G+    +GS   TQ  ++       +  T
Sbjct: 269 TTVVTPPQADAAPSPLSVRPAGSSGSASGS---TQIPTSGSVLGPGAAAT 315


>gnl|CDD|223066 PHA03379, PHA03379, EBNA-3A; Provisional.
          Length = 935

 Score = 29.3 bits (65), Expect = 6.5
 Identities = 14/62 (22%), Positives = 18/62 (29%), Gaps = 8/62 (12%)

Query: 304 PQAPRNHPLPPPPAWTMRQAYSWIDRQGR--------PCSPPLDYARSVIPMPPPPPPPP 355
           PQ P   PL P          S   R G         P + P+++          PP   
Sbjct: 712 PQQPMEGPLVPERWMFQGATLSQSVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEG 771

Query: 356 RW 357
            W
Sbjct: 772 PW 773


>gnl|CDD|237871 PRK14965, PRK14965, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 576

 Score = 28.9 bits (65), Expect = 6.5
 Identities = 13/64 (20%), Positives = 15/64 (23%), Gaps = 11/64 (17%)

Query: 311 PLPPPPAWTMRQAYSW-------IDRQGRPCSPPLDYARSVIPMPPPPPPPPRWNYSARS 363
           P PP  AW      +                      AR      PP    P     ARS
Sbjct: 382 PAPPSAAWGAPTPAAPAAPPPAAAPPVPPAAPARPAAARPAPAPAPPAAAAPP----ARS 437

Query: 364 SKDT 367
           +   
Sbjct: 438 ADPA 441


>gnl|CDD|225575 COG3031, PulC, Type II secretory pathway, component PulC
           [Intracellular trafficking and secretion].
          Length = 275

 Score = 28.7 bits (64), Expect = 6.7
 Identities = 16/67 (23%), Positives = 29/67 (43%), Gaps = 10/67 (14%)

Query: 232 VGVYISRVEEGSIAERAGLRPGDSILQVNGIPFTGISHEEALKMCFFEGYKEGQMLKSNR 291
            G      ++GS+  ++GL+ GD  + +N +  T     E +           QML++  
Sbjct: 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLT---DPEDMFRLL-------QMLRNMP 256

Query: 292 ELSMTVR 298
            L +TV 
Sbjct: 257 SLQLTVI 263


>gnl|CDD|235906 PRK07003, PRK07003, DNA polymerase III subunits gamma and tau;
           Validated.
          Length = 830

 Score = 29.0 bits (65), Expect = 7.1
 Identities = 18/80 (22%), Positives = 28/80 (35%), Gaps = 3/80 (3%)

Query: 60  AAEAEDIMGHYNPR-ASHRSKAGLYYSPPGTSYTIVERPPPP-PPVPLPQPPKPRGTYLG 117
           AA A D++ +   R +S R       + P  +     +P  P   V +P P     T   
Sbjct: 549 AAAALDVLRNAGMRVSSDRGARAAAAAKPAAAPAAAPKPAAPRVAVQVPTPRARAATGDA 608

Query: 118 TNGSSYRTQPSSTEYRSNSP 137
               + R    + E R   P
Sbjct: 609 PPNGAARA-EQAAESRGAPP 627


>gnl|CDD|234398 TIGR03921, T7SS_mycosin, type VII secretion-associated serine
           protease mycosin.  Members of this family are
           subtilisin-related serine proteases, found strictly in
           the Actinobacteria and associated with type VII
           secretion operons. The designation mycosin is used for
           members from Mycobacterium [Protein fate, Protein and
           peptide secretion and trafficking, Protein fate, Protein
           modification and repair].
          Length = 350

 Score = 28.4 bits (64), Expect = 8.3
 Identities = 10/30 (33%), Positives = 11/30 (36%)

Query: 86  PPGTSYTIVERPPPPPPVPLPQPPKPRGTY 115
           PP     +   P P  PV  P PP P    
Sbjct: 287 PPEDGRPLRPAPAPARPVAAPAPPPPPDDT 316


>gnl|CDD|220392 pfam09770, PAT1, Topoisomerase II-associated protein PAT1.  Members
           of this family are necessary for accurate chromosome
           transmission during cell division.
          Length = 804

 Score = 28.6 bits (64), Expect = 8.3
 Identities = 19/97 (19%), Positives = 27/97 (27%), Gaps = 20/97 (20%)

Query: 278 FEGYKEGQMLKSNRELSMTVRSPSIPPQ-------------APRNHPLPP------PPAW 318
                  +ML S  E+   ++     PQ              PR    P       PP +
Sbjct: 145 QPQTPAQKML-SLEEVEAQLQQRQQAPQLPQPPQQVLPQGMPPRQAAFPQQGPPEQPPGY 203

Query: 319 TMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPP 355
                      Q +   P    A +  P+PP  P  P
Sbjct: 204 PQPPQGHPEQVQPQQFLPAPSQAPAQPPLPPQLPQQP 240


>gnl|CDD|178744 PLN03205, PLN03205, ATR interacting protein; Provisional.
          Length = 652

 Score = 28.6 bits (63), Expect = 8.7
 Identities = 14/34 (41%), Positives = 16/34 (47%), Gaps = 2/34 (5%)

Query: 96  RPPPPPPVP--LPQPPKPRGTYLGTNGSSYRTQP 127
           R  PPP +P  LP PP    T   T  SS  + P
Sbjct: 30  RLLPPPSLPTFLPAPPVSEMTTPSTKISSSLSHP 63


>gnl|CDD|145736 pfam02741, FTR_C, FTR, proximal lobe.  The FTR
           (Formylmethanofuran--tetrahydromethanopterin
           formyltransferase) enzyme EC:2.3.1.101 is involved in
           archaebacteria in the formation of methane from carbon
           dioxide. C-terminal proximal lobe of alpha+beta
           ferredoxin-like fold. SCOP reports fold duplication with
           N-terminal distal lobe.
          Length = 150

 Score = 27.5 bits (62), Expect = 8.9
 Identities = 12/36 (33%), Positives = 19/36 (52%), Gaps = 2/36 (5%)

Query: 378 GQSLGLMIRGGVEYNLGIFITGVDKDSVAE--RAGL 411
           G+     +  GV     I I G+ +++VAE  RAG+
Sbjct: 84  GKVEDSELPDGVNAVYEIVIDGLSEEAVAEAMRAGI 119


>gnl|CDD|216412 pfam01285, TEA, TEA/ATTS domain family. 
          Length = 424

 Score = 28.6 bits (64), Expect = 9.1
 Identities = 16/75 (21%), Positives = 24/75 (32%), Gaps = 15/75 (20%)

Query: 301 SIPPQAPRNHP---LPPPPAWTMRQAYSWIDRQGRPCSPPLDYARSVIPMPPPPPPPPRW 357
            I P  P             +    +           S   D+ R + P     PPP   
Sbjct: 148 YIIPGGPSWRTSIKPFSSSHYGSHNS-----------SAYSDHLRPLQPYSGELPPPLGP 196

Query: 358 NYSARSSKDTVRKVE 372
           N+ A +SK  +R +E
Sbjct: 197 NWQASNSK-KIRGLE 210


>gnl|CDD|216421 pfam01299, Lamp, Lysosome-associated membrane glycoprotein (Lamp). 
          Length = 305

 Score = 28.2 bits (63), Expect = 9.1
 Identities = 21/83 (25%), Positives = 27/83 (32%), Gaps = 21/83 (25%)

Query: 86  PPGTSYTIVERPPPPPPVPLPQPPKPRGTYLGTNGS--------------SYRTQPSSTE 131
            P T  T    P P P  P        G Y  TNG+              +Y T+   T 
Sbjct: 87  SPTTVATPSPSPTPVPSSP------AVGNYSVTNGNGTCLLASMGLQLNITYETKDGKTA 140

Query: 132 YR-SNSPSNNTSSSYRNTSSHSH 153
            R  N   N T++S    S  + 
Sbjct: 141 TRLFNINPNKTTASGSCGSQTAT 163


>gnl|CDD|235904 PRK06995, flhF, flagellar biosynthesis regulator FlhF; Validated.
          Length = 484

 Score = 28.4 bits (64), Expect = 9.2
 Identities = 12/77 (15%), Positives = 21/77 (27%), Gaps = 9/77 (11%)

Query: 289 SNRELSMTVRSPSIPPQAPRNHPLPPP-----PAWTMRQAYSW----IDRQGRPCSPPLD 339
           ++ +L+      +  P A +  P   P     PA    +   W      R        + 
Sbjct: 44  ADSDLAALAPPAAAAPAAAQPPPAAAPAAVSRPAAPAAEPAPWLVEHAKRLTAQREQLVA 103

Query: 340 YARSVIPMPPPPPPPPR 356
            A +        P  P 
Sbjct: 104 RAAAPAAPEAQAPAAPA 120


>gnl|CDD|237862 PRK14948, PRK14948, DNA polymerase III subunits gamma and tau;
           Provisional.
          Length = 620

 Score = 28.4 bits (64), Expect = 9.6
 Identities = 16/74 (21%), Positives = 20/74 (27%), Gaps = 4/74 (5%)

Query: 287 LKSNRELSMTVRSPSIPPQAPRNHPLPPPPAWTMRQAYSWIDRQGRPCS----PPLDYAR 342
           L+S    +        PPQ     P P PP                P +         A+
Sbjct: 510 LESQSGSASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQ 569

Query: 343 SVIPMPPPPPPPPR 356
                 PPPP P  
Sbjct: 570 IPADSSPPPPIPEE 583


>gnl|CDD|227750 COG5463, COG5463, Predicted integral membrane protein [Function
           unknown].
          Length = 198

 Score = 27.9 bits (62), Expect = 9.9
 Identities = 19/74 (25%), Positives = 30/74 (40%), Gaps = 13/74 (17%)

Query: 119 NGSSYRTQPSSTEYRSNSPSNNTSSSYRNTSSHSHGTK---------KGALSPEQV-LKM 168
            G  Y  QP    Y S + S+   S +R+TS   +G            G  +P++     
Sbjct: 124 GGRGYDEQPL---YSSGNSSSGAYSVWRDTSGDYYGRSGTGKTMTVPSGGNAPKKASTTT 180

Query: 169 LTSGGGGKKSAEGS 182
           ++ GG G  S+  S
Sbjct: 181 VSRGGFGGSSSARS 194


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.314    0.133    0.404 

Gapped
Lambda     K      H
   0.267   0.0647    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 24,694,695
Number of extensions: 2433907
Number of successful extensions: 4546
Number of sequences better than 10.0: 1
Number of HSP's gapped: 4085
Number of HSP's successfully gapped: 242
Length of query: 469
Length of database: 10,937,602
Length adjustment: 100
Effective length of query: 369
Effective length of database: 6,502,202
Effective search space: 2399312538
Effective search space used: 2399312538
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 61 (27.4 bits)
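
The length figures in the statistics block above are mutually consistent, and the arithmetic can be checked with a minimal sketch. It assumes that the reported length adjustment (100) is subtracted once from the query and once per database sequence, and that the effective search space is the product of the two effective lengths. The variable names are illustrative only and are not part of the RPS-BLAST output.

```python
# Figures copied from the statistics block of this report.
query_length = 469            # "Length of query"
db_letters = 10_937_602       # "Length of database"
db_sequences = 44_354         # "Number of Sequences"
length_adjustment = 100       # "Length adjustment"

# Assumption: the adjustment is removed once from the query
# and once from every sequence in the database.
effective_query = query_length - length_adjustment
effective_db = db_letters - db_sequences * length_adjustment

# Assumption: effective search space = effective query length
# multiplied by effective database length.
search_space = effective_query * effective_db

print("Effective length of query:   ", effective_query)
print("Effective length of database:", effective_db)
print("Effective search space:      ", search_space)
```

Under these assumptions the three computed values reproduce the "Effective length of query", "Effective length of database", and "Effective search space" lines reported above.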