RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy12185
         (317 letters)



>gnl|CDD|239068 cd02248, Peptidase_C1A, Peptidase C1A subfamily (MEROPS database
           nomenclature); composed of cysteine peptidases (CPs)
           similar to papain, including the mammalian CPs
           (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain
           is an endopeptidase with specific substrate preferences,
           primarily for bulky hydrophobic or aromatic residues at
           the S2 subsite, a hydrophobic pocket in papain that
           accommodates the P2 sidechain of the substrate (the
           second residue away from the scissile bond). Most
           members of the papain subfamily are endopeptidases. Some
           exceptions to this rule can be explained by specific
           details of the catalytic domains like the occluding loop
           in cathepsin B which confers an additional
           carboxydipeptidyl activity and the mini-chain of
           cathepsin H resulting in an N-terminal exopeptidase
           activity. Papain-like CPs have different functions in
           various organisms. Plant CPs are used to mobilize
           storage proteins in seeds. Parasitic CPs act
           extracellularly to help invade tissues and cells, to
           hatch or to evade the host immune system. Mammalian CPs
           are primarily lysosomal enzymes with the exception of
           cathepsin W, which is retained in the endoplasmic
           reticulum. They are responsible for protein degradation
           in the lysosome. Papain-like CPs are synthesized as
           inactive proenzymes with N-terminal propeptide regions,
           which are removed upon activation. In addition to its
           inhibitory role, the propeptide is required for proper
           folding of the newly synthesized enzyme and its
           stabilization in denaturing pH conditions. Residues
           within the propeptide region also play a role in the
           transport of the proenzyme to lysosomes or acidified
           vesicles. Also included in this subfamily are proteins
           classified as non-peptidase homologs, which lack
           peptidase activity or have missing active site residues.
          Length = 210

 Score =  179 bits (457), Expect = 1e-55
 Identities = 66/169 (39%), Positives = 99/169 (58%), Gaps = 6/169 (3%)

Query: 143 DWREAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQEVIDCAGNGNMGCS 202
           DWRE G +  V++Q +CG+CWAFSTV   E  +A+K G L  LS Q+++DC+ +GN GC+
Sbjct: 5   DWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTSGNNGCN 64

Query: 203 GGDFCALLDWMDVNKVVLEPESEYPLLLKDAACKRKATSPNGVKIKSYTCDTLIPSESSI 262
           GG+     +++    +    ES+YP   KD  CK   +S  G KI  Y  +     E ++
Sbjct: 65  GGNPDNAFEYVKNGGLAS--ESDYPYTGKDGTCKYN-SSKVGAKITGY-SNVPPGDEEAL 120

Query: 263 LTDIATHGPVIAAVNA-LTWQYYLGGVIQYNCDGSLANINHAVQIVGYD 310
              +A +GPV  A++A  ++Q+Y GG+    C  S  N+NHAV +VGY 
Sbjct: 121 KAALANYGPVSVAIDASSSFQFYKGGIYSGPC-CSNTNLNHAVLLVGYG 168


>gnl|CDD|215726 pfam00112, Peptidase_C1, Papain family cysteine protease. 
          Length = 213

 Score =  156 bits (397), Expect = 9e-47
 Identities = 66/176 (37%), Positives = 89/176 (50%), Gaps = 11/176 (6%)

Query: 138 IPVKKDWREAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQEVIDCAGNG 197
           +P   DWRE G +  V++Q  CG+CWAFS V   E  + +K G L  LS Q+++DC   G
Sbjct: 1   LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQLVDCDT-G 59

Query: 198 NMGCSGG-DFCALLDWMDVNKVVLEPESEYPLLLKDAACKRKATSPNGVKIKSYTCDTLI 256
           N GC+GG    A         +V   ES+YP    D  CK K ++    KIK Y      
Sbjct: 60  NNGCNGGLPDNAFEYIKKNGGIVT--ESDYPYTAHDGTCKFKKSNSKYAKIKGYGDVPYN 117

Query: 257 PSESSILTDIATHGPVIAAVNALTW--QYYLGGVIQ-YNCDGSLANINHAVQIVGY 309
             E ++   +A +GPV  A++A     Q Y  GV +   C G L   +HAV IVGY
Sbjct: 118 -DEEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTECSGEL---DHAVLIVGY 169


>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional.
          Length = 348

 Score =  142 bits (359), Expect = 8e-40
 Identities = 96/307 (31%), Positives = 146/307 (47%), Gaps = 33/307 (10%)

Query: 11  VALIALCFLAIPVKVSKPNLEQKLELFSSFQQRYKKSYSK-SEHDIRFKNFEKSLDIIEE 69
           V L A C  A  + V  P       LF  F++ Y+++Y   +E   R  NFE++L+++ E
Sbjct: 16  VVLAAACAPARAIYVGTP----AAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMRE 71

Query: 70  LNKNRQSPESARYGITEFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSIT 129
                 +P  AR+GIT+F DLSE EF  R+L    N     +  K H   H    +  ++
Sbjct: 72  HQAR--NPH-ARFGITKFFDLSEAEFAARYL----NGAAYFAAAKQHAGQHYRKARADLS 124

Query: 130 TGITIPTGIPVKKDWREAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQE 189
                   +P   DWRE G +  V+NQ  CG+CWAFS V   ES  A+    L  LS Q+
Sbjct: 125 A-------VPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQ 177

Query: 190 VIDCAGNGNMGCSGGDFCALLDWM--DVNKVVLEPESEYPLLLKD---AACKRKATSPNG 244
           ++ C    N GC GG      +W+  ++N  V   E  YP +  +     C   +    G
Sbjct: 178 LVSCDHVDN-GCGGGLMLQAFEWVLRNMNGTVFT-EKSYPYVSGNGDVPECSNSSELAPG 235

Query: 245 VKIKSYTCDTLIPSESSILTD-IATHGPVIAAVNALTWQYYLGGVIQYNCDGSLANINHA 303
            +I  Y     + S   ++   +A +GP+  AV+A ++  Y  GV+  +C G    +NH 
Sbjct: 236 ARIDGY---VSMESSERVMAAWLAKNGPISIAVDASSFMSYHSGVLT-SCIGE--QLNHG 289

Query: 304 VQIVGYD 310
           V +VGY+
Sbjct: 290 VLLVGYN 296


>gnl|CDD|214761 smart00645, Pept_C1, Papain family cysteine protease. 
          Length = 175

 Score =  123 bits (310), Expect = 2e-34
 Identities = 50/176 (28%), Positives = 74/176 (42%), Gaps = 44/176 (25%)

Query: 138 IPVKKDWREAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQEVIDCAGNG 197
           +P   DWR+ G +  V++Q  CG+CWAFS     E  + +K G L  LS Q+++DC+G G
Sbjct: 1   LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSGGG 60

Query: 198 NMGCSGGDFCALLDWMDVNKVVLEPESEYPLLLKDAACKRKATSPNGVKIKSYTCDTLIP 257
           N GC+GG      +++      LE ES YP                      YT      
Sbjct: 61  NCGCNGGLPDNAFEYI-KKNGGLETESCYP----------------------YT------ 91

Query: 258 SESSILTDIATHGPVIAAVNALTWQYYLGGVIQYNCDGSLANINHAVQIVGYDNYS 313
                            A++A  +Q+Y  G+   +       ++HAV IVGY    
Sbjct: 92  --------------GSVAIDASDFQFYKSGIY-DHPGCGSGTLDHAVLIVGYGTEV 132


>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional.
          Length = 448

 Score =  112 bits (281), Expect = 6e-28
 Identities = 81/291 (27%), Positives = 127/291 (43%), Gaps = 35/291 (12%)

Query: 37  FSSFQQRYKKSY-SKSEHDIRFKNFEKSLDIIEELNKNRQSPESARYGITEFSDLSEEEF 95
           F  F ++Y + + + +E   RF  F  +   +    K+ +  E     I +FSDL+EEEF
Sbjct: 126 FEEFNKKYNRKHATHAERLNRFLTFRNNYLEV----KSHKGDEPYSKEINKFSDLTEEEF 181

Query: 96  K----------TRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSITTGITIPTGI-PVKKDW 144
           +            +     N      H  +  +  N  K ++    +  P+ I     DW
Sbjct: 182 RKLFPVIKVPPKSNSTSHNNDFKAR-HVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDW 240

Query: 145 REAGIIGKVRNQQT-CGACWAFSTVETAESMHAL-KNGTLSLLSVQEVIDCAGNGNMGCS 202
           R A  + KV++Q   CG+CWAFS+V + ES++ + ++ ++  LS QE+++C    + GCS
Sbjct: 241 RRADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSVD-LSEQELVNCD-TKSQGCS 298

Query: 203 GGDFCALLDWMDVNKVVLEPESEYPLLLKDAACKRKATSPNGVKIKSYTCDTLIPSESSI 262
           GG     L++  V    L   S+ P L KD  C    +S   V I SY     +     +
Sbjct: 299 GGYPDTALEY--VKNKGLSSSSDVPYLAKDGKC--VVSSTKKVYIDSYL----VAKGKDV 350

Query: 263 LTDIATHGPVIAAVNAL-TWQYYLGGVIQYNCDGSLANINHAVQIV--GYD 310
           L       P +  +        Y  GV    C  SL   NHAV +V  GYD
Sbjct: 351 LNKSLVISPTVVYIAVSRELLKYKSGVYNGECGKSL---NHAVLLVGEGYD 398


>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional.
          Length = 489

 Score =  102 bits (257), Expect = 2e-24
 Identities = 90/291 (30%), Positives = 129/291 (44%), Gaps = 33/291 (11%)

Query: 29  NLEQKLELFSSFQQRYKKSYSKS-EHDIRFKNFEKSLDIIEELNKNRQSPESARYGITEF 87
           NLE  +  F  F + + K Y    E   R+ +F ++L  I   N         + G+  F
Sbjct: 162 NLE-NVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENV--LYKKGMNRF 218

Query: 88  SDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNH---VKKRSITTGITIPTGIPVKKDW 144
            DLS EEFK ++L  ++      S+ K      N+   +KK          T    K DW
Sbjct: 219 GDLSFEEFKKKYL--TLKSFDFKSNGKKSPRVINYDDVIKKYKPKDA----TFDHAKYDW 272

Query: 145 REAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQEVIDCAGNGNMGCSGG 204
           R    +  V++Q+ CG+CWAFSTV   ES +A++   L  LS QE++DC+   N GC GG
Sbjct: 273 RLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCSFK-NNGCYGG 331

Query: 205 DFC-ALLDWMDVNKVVLEPESEY----PLLLKDAACKRKATSPNGVKIKSYTCDTLIPSE 259
               A  D +++  +  E +  Y    P L     CK K       KIKSY     IP E
Sbjct: 332 LIPNAFEDMIELGGLCSEDDYPYVSDTPELCNIDRCKEK------YKIKSYVS---IP-E 381

Query: 260 SSILTDIATHGPVIAAVNAL-TWQYYLGGVIQYNCDGSLANINHAVQIVGY 309
                 I   GP+  ++     + +Y GG+    C       NHAV +VGY
Sbjct: 382 DKFKEAIRFLGPISVSIAVSDDFAFYKGGIFDGECGEEP---NHAVILVGY 429


>gnl|CDD|239111 cd02620, Peptidase_C1A_CathepsinB, Cathepsin B group; composed of
           cathepsin B and similar proteins, including
           tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin
           B is a lysosomal papain-like cysteine peptidase which is
           expressed in all tissues and functions primarily as an
           exopeptidase through its carboxydipeptidyl activity.
           Together with other cathepsins, it is involved in the
           degradation of proteins, proenzyme activation, Ag
           processing, metabolism and apoptosis. Cathepsin B has
           been implicated in a number of human diseases such as
           cancer, rheumatoid arthritis, osteoporosis and
           Alzheimer's disease. The unique carboxydipeptidyl
           activity of cathepsin B is attributed to the presence of
           an occluding loop in its active site which favors the
           binding of the C-termini of substrate proteins. Some
           members of this group do not possess the occluding loop.
           TIN-Ag is an extracellular matrix basement protein which
           was originally identified as a target Ag involved in
           anti-tubular basement membrane antibody-mediated
           interstitial nephritis. It plays a role in renal
           tubulogenesis and is defective in hereditary
           tubulointerstitial disorders. TIN-Ag is exclusively
           expressed in kidney tissues.
          Length = 236

 Score = 85.4 bits (212), Expect = 1e-19
 Identities = 52/181 (28%), Positives = 77/181 (42%), Gaps = 24/181 (13%)

Query: 150 IGKVRNQQTCGACWAFSTVETAESMHAL-KNGTLS-LLSVQEVIDCAGNGNMGCSGGDFC 207
           IG++R+Q  CG+CWAFS VE       +  NG  + LLS Q+++ C      GC+GG   
Sbjct: 16  IGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLSCCSGCGDGCNGGYPD 75

Query: 208 ALLDWMDVNKVVLE-------PESEYPLLLKD---------AACKRKATSPNG-VKIKSY 250
           A   ++    VV         P   +                 C+          K K  
Sbjct: 76  AAWKYLTTTGVVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKCQDGCEKTYEEDKHKGK 135

Query: 251 TCDTLIPSESSILTDIATHGPVIAA--VNALTWQYYLGGVIQYNCDGSLANINHAVQIVG 308
           +  ++   E+ I+ +I T+GPV AA  V      YY  GV Q+     L    HAV+I+G
Sbjct: 136 SAYSVPSDETDIMKEIMTNGPVQAAFTVYEDFL-YYKSGVYQHTSGKQLG--GHAVKIIG 192

Query: 309 Y 309
           +
Sbjct: 193 W 193


>gnl|CDD|239149 cd02698, Peptidase_C1A_CathepsinX, Cathepsin X; the only
           papain-like lysosomal cysteine peptidase exhibiting
           carboxymonopeptidase activity. It can also act as a
           carboxydipeptidase, like cathepsin B, but has been shown
           to preferentially cleave substrates through a
           monopeptidyl carboxypeptidase pathway. The propeptide
           region of cathepsin X, the shortest among papain-like
           peptidases, is covalently attached to the active site
           cysteine in the inactive form of the enzyme. Little is
           known about the biological function of cathepsin X. Some
           studies point to a role in early tumorigenesis. A more
           recent study indicates that cathepsin X expression is
           restricted to immune cells suggesting a role in
           phagocytosis and the regulation of the immune response.
          Length = 239

 Score = 70.1 bits (172), Expect = 4e-14
 Identities = 56/194 (28%), Positives = 85/194 (43%), Gaps = 29/194 (14%)

Query: 138 IPVKKDWRE-AGI--IGKVRNQ---QTCGACWAFSTVET-AESMHALKNGTLS--LLSVQ 188
           +P   DWR   G+  +   RNQ   Q CG+CWA  +    A+ ++  + G      LSVQ
Sbjct: 1   LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60

Query: 189 EVIDCAGNGNMGCSGGDFCALLDWMDVNKVVLEPESEYPLLLKDAACKR----KATSPNG 244
            VIDCAG G+  C GGD   + ++   + +  E  + Y    KD  C         +P G
Sbjct: 61  VVIDCAGGGS--CHGGDPGGVYEYAHKHGIPDETCNPY--QAKDGECNPFNRCGTCNPFG 116

Query: 245 V--KIKSYTCDTL-----IPSESSILTDIATHGPVIAAVNALTWQY-YLGGVI-QYNCDG 295
               IK+YT   +     +     ++ +I   GP+   + A      Y GGV  +Y  D 
Sbjct: 117 ECFAIKNYTLYFVSDYGSVSGRDKMMAEIYARGPISCGIMATEALENYTGGVYKEYVQDP 176

Query: 296 SLANINHAVQIVGY 309
               INH + + G+
Sbjct: 177 L---INHIISVAGW 187


>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain
          (I29).  This domain is found at the N-terminus of some
          C1 peptidases such as Cathepsin L where it acts as a
          propeptide. There are also a number of proteins that
          are composed solely of multiple copies of this domain
          such as the peptidase inhibitor salarin. This family is
          classified as I29 by MEROPS.
          Length = 58

 Score = 65.0 bits (159), Expect = 6e-14
 Identities = 24/60 (40%), Positives = 38/60 (63%), Gaps = 3/60 (5%)

Query: 37 FSSFQQRYKKSY-SKSEHDIRFKNFEKSLDIIEELNKNRQSPESARYGITEFSDLSEEEF 95
          F  ++++Y KSY S+ E   RF+ F+++L  IEE NK      S   G+ +F+DL++EEF
Sbjct: 1  FEDWKKKYGKSYYSEEEELYRFQIFKENLRFIEEHNKKGNV--SYTLGLNQFADLTDEEF 58


>gnl|CDD|239110 cd02619, Peptidase_C1, C1 Peptidase family (MEROPS database
           nomenclature), also referred to as the papain family;
           composed of two subfamilies of cysteine peptidases
           (CPs), C1A (papain) and C1B (bleomycin hydrolase).
           Papain-like enzymes are mostly endopeptidases with some
           exceptions like cathepsins B, C, H and X, which are
           exopeptidases. Papain-like CPs have different functions
           in various organisms. Plant CPs are used to mobilize
           storage proteins in seeds while mammalian CPs are
           primarily lysosomal enzymes responsible for protein
           degradation in the lysosome. Papain-like CPs are
           synthesized as inactive proenzymes with N-terminal
           propeptide regions, which are removed upon activation.
           Bleomycin hydrolase (BH) is a CP that detoxifies
           bleomycin by hydrolysis of an amide group. It acts as a
           carboxypeptidase on its C-terminus to convert itself
           into an aminopeptidase and peptide ligase. BH is found
           in all tissues in mammals as well as in many other
           eukaryotes. It forms a hexameric ring barrel structure
           with the active sites imbedded in the central channel.
           Some members of the C1 family are proteins classified as
           non-peptidase homologs which lack peptidase activity or
           have missing active site residues.
          Length = 223

 Score = 69.1 bits (169), Expect = 7e-14
 Identities = 53/188 (28%), Positives = 77/188 (40%), Gaps = 25/188 (13%)

Query: 143 DWREAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSL--LSVQEVIDCAGNGNM- 199
           D R    +  V+NQ + G+CWAF++    ES + +K G      LS Q +  CA +  + 
Sbjct: 3   DLRPL-RLTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECLG 61

Query: 200 ---GCSGGDFCALLDWMDVNKVVLE----PESEYPLLLKDAACKRK---ATSPNGVKIKS 249
               C GG          + K+V      PE +YP   +    + K   A +   VK+K 
Sbjct: 62  INGSCDGG-----GPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKD 116

Query: 250 YTCDTLIPSESSILTDIATHGPVIAAVNALTWQYYLGGVIQYNCDGSLAN-----INHAV 304
           Y    L  +   I   +A  GPV+A  +  +    L   I Y     L         HAV
Sbjct: 117 YR-RVLKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAV 175

Query: 305 QIVGYDNY 312
            IVGYD+ 
Sbjct: 176 VIVGYDDN 183


>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain
          (I29).  This domain is found at the N-terminus of some
          C1 peptidases such as Cathepsin L where it acts as a
          propeptide. There are also a number of proteins that
          are composed solely of multiple copies of this domain
          such as the peptidase inhibitor salarin. This family is
          classified as I29 by MEROPS. Peptide proteinase
          inhibitors can be found as single domain proteins or as
          single or multiple domains within proteins; these are
          referred to as either simple or compound inhibitors,
          respectively. In many cases they are synthesised as
          part of a larger precursor protein, either as a
          prepropeptide or as an N-terminal domain associated
          with an inactive peptidase or zymogen. This domain
          prevents access of the substrate to the active site.
          Removal of the N-terminal inhibitor domain either by
          interaction with a second peptidase or by autocatalytic
          cleavage activates the zymogen. Other inhibitors
          interact directly with proteinases using a simple
          noncovalent lock and key mechanism; while yet others
          use a conformational change-based trapping mechanism
          that depends on their structural and thermodynamic
          properties.
          Length = 57

 Score = 62.6 bits (153), Expect = 4e-13
 Identities = 23/59 (38%), Positives = 37/59 (62%), Gaps = 3/59 (5%)

Query: 37 FSSFQQRYKKSY-SKSEHDIRFKNFEKSLDIIEELNKNRQSPESARYGITEFSDLSEEE 94
          F  +++++ KSY S+ E   RF  F+++L  IEE NK  +   S + G+ +FSDL+ EE
Sbjct: 1  FEQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYEH--SYKLGVNQFSDLTPEE 57


>gnl|CDD|239112 cd02621, Peptidase_C1A_CathepsinC, Cathepsin C; also known as
           Dipeptidyl Peptidase I (DPPI), an atypical papain-like
           cysteine peptidase with chloride dependency and
           dipeptidyl aminopeptidase activity, resulting from its
           tetrameric structure which limits substrate access. Each
           subunit of the tetramer is composed of three peptides:
           the heavy and light chains, which together adopt the
           papain fold and form the catalytic domain; and the
           residual propeptide region, which forms a beta barrel
           and points towards the substrate's N-terminus. The
           subunit composition is the result of the unique
           characteristic of procathepsin C maturation involving
           the cleavage of the catalytic domain and the
           non-autocatalytic excision of an activation peptide
           within its propeptide region. By removing N-terminal
           dipeptide extensions, cathepsin C activates granule
           serine peptidases (granzymes) involved in cell-mediated
           apoptosis, inflammation and tissue remodelling.
           Loss-of-function mutations in cathepsin C are associated
           with Papillon-Lefevre and Haim-Munk syndromes, rare
           diseases characterized by hyperkeratosis and early-onset
           periodontitis. Cathepsin C is widely expressed in many
           tissues with high levels in lung, kidney and placenta.
           It is also highly expressed in cytotoxic lymphocytes and
           mature myeloid cells.
          Length = 243

 Score = 67.4 bits (165), Expect = 4e-13
 Identities = 49/195 (25%), Positives = 81/195 (41%), Gaps = 33/195 (16%)

Query: 143 DWREAGI----IGKVRNQQTCGACWAFSTVETAES--MHAL----KNGTLSLLSVQEVID 192
           DW +       +  VRNQ  CG+C+AF++V   E+  M A       G   +LS Q V+ 
Sbjct: 6   DWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQHVLS 65

Query: 193 CAGNGNMGCSGGDFCALLDWMDVNKVVLE------PESEYPLLLKDAACKRKATSPNGVK 246
           C+   + GC GG    +  + +   +V E       + + P     + C+R   S +   
Sbjct: 66  CS-QYSQGCDGGFPFLVGKFAEDFGIVTEDYFPYTADDDRPCKASPSECRRYYFS-DYNY 123

Query: 247 IKSYTCDTLIPSESSILTDIATHGPVIAAVNALT-WQYYLGGV-----IQYNCDGSLANI 300
           +      T   +E  +  +I  +GP++ A    + + +Y  GV          DG   N 
Sbjct: 124 VGGCYGCT---NEDEMKWEIYRNGPIVVAFEVYSDFDFYKEGVYHHTDNDEVSDGDNDNF 180

Query: 301 ------NHAVQIVGY 309
                 NHAV +VG+
Sbjct: 181 NPFELTNHAVLLVGW 195


>gnl|CDD|240381 PTZ00364, PTZ00364, dipeptidyl-peptidase I precursor; Provisional.
          Length = 548

 Score = 48.7 bits (116), Expect = 2e-06
 Identities = 41/191 (21%), Positives = 65/191 (34%), Gaps = 51/191 (26%)

Query: 159 CGACWAFSTVE--TAESMHALKN----GTLSLLSVQEVIDCAGNGNMGCSGGDFCALLDW 212
           C + +  + +    A  M A       G  + LS + V+DC+  G  GC+GG    +  +
Sbjct: 232 CNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLDCSQYGQ-GCAGGFPEEVGKF 290

Query: 213 MDVNKVVLEPESEYPLLLKDAACKRKATSP--------NGVKIKSYTCDTLIPSESSILT 264
            +   ++       P    D   +   T          N   +  Y      P E  I+ 
Sbjct: 291 AETFGILTTDSYYIPYDSGDGVERACKTRRPSRRYYFTNYGPLGGYYGAVTDPDE--IIW 348

Query: 265 DIATHGPVIAAVNALTWQYYLGGVIQYNCDGSL--------------------------A 298
           +I  HGPV A+V A            YNCD +                           +
Sbjct: 349 EIYRHGPVPASVYA--------NSDWYNCDENSTEDVRYVSLDDYSTASADRPLRHYFAS 400

Query: 299 NINHAVQIVGY 309
           N+NH V I+G+
Sbjct: 401 NVNHTVLIIGW 411


>gnl|CDD|227207 COG4870, COG4870, Cysteine protease [Posttranslational
           modification, protein turnover, chaperones].
          Length = 372

 Score = 39.1 bits (91), Expect = 0.002
 Identities = 41/196 (20%), Positives = 61/196 (31%), Gaps = 33/196 (16%)

Query: 137 GIPVKKDWREAGIIGKVRNQQTCGACWAFSTVETAES-------MHALKNGTLSLLSVQE 189
            +P   D R+ G +  V++Q + G+CWAF+T  + ES           +N   +LL V  
Sbjct: 98  SLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPESAWDFSENNMKNLLGVPY 157

Query: 190 VIDCAGNGNMG-----------CSGGDFCALLDWMDVNKVVLEPESEYPLLLKDAACKRK 238
                   N G              G      D    N       +  P+       +  
Sbjct: 158 EKGFDYTSNDGGNADMSAAYLTEWSGPVYETDDPYSENSYF--SPTNLPVTKHVQEAQII 215

Query: 239 ATSPNGVKIKSYTCDTLIPSESSILTDIATHGPVIAAVNALTWQYYLGGVIQYNCDGSLA 298
            +       K Y  +  I    ++          +           LG  I Y    S  
Sbjct: 216 PS------RKKYLDNGNIK---AMFGFYGAVSSSMYIDAT----NSLGICIPYPYVDSGE 262

Query: 299 NINHAVQIVGYDNYSR 314
           N  HAV IVGYD+   
Sbjct: 263 NWGHAVLIVGYDDSFD 278


>gnl|CDD|193500 cd03879, M28_AAP, M28 Zn-Peptidase Aeromonas (Vibrio)
          proteolytica aminopeptidase.  Peptidase family M28;
          Aeromonas (Vibrio) proteolytica aminopeptidase (AAP;
          leucine aminopeptidase from Vibrio proteolyticus;
          Bacterial leucyl aminopeptidase; E.C. 3.4.11.10)
          subfamily. AAP is a small (32kDa), heat stable leucine
          aminopeptidase and is active as a monomer. Similar
          forms of the enzyme have been isolated from Escherichia
          coli and Staphylococcus thermophilus. Leucine
          aminopeptidases, in general, play important roles in
          many biological processes such as protein catabolism,
          hormone degradation, regulation of migration and cell
          proliferation, as well as HIV infection and
          proliferation. AAP is a broad-specificity enzyme,
          utilizing two zinc(II) ions in its active site to
          remove N-terminal amino acids, with preference for
          large hydrophobic amino acids in the P1 position of the
          substrate, Leu being the most efficiently cleaved. It
          can accommodate all residues, except Pro, Asp and Glu
          in the P1' position.
          Length = 285

 Score = 32.9 bits (76), Expect = 0.18
 Identities = 10/24 (41%), Positives = 13/24 (54%)

Query: 24 KVSKPNLEQKLELFSSFQQRYKKS 47
          K+S  N+   LE  +SF  RY  S
Sbjct: 14 KLSADNIRSTLEKLTSFHNRYYTS 37


>gnl|CDD|233048 TIGR00605, rad4, DNA repair protein rad4.  All proteins in this
           family for which functions are known are involved in
           targeting nucleotide excision repair to specific regions
           of the genome. This family is based on the phylogenomic
           analysis of JA Eisen (1999, Ph.D. Thesis, Stanford
           University) [DNA metabolism, DNA replication,
           recombination, and repair].
          Length = 713

 Score = 32.9 bits (75), Expect = 0.23
 Identities = 34/185 (18%), Positives = 55/185 (29%), Gaps = 16/185 (8%)

Query: 25  VSKPNLEQKLELFSSFQQRYKKSYSKSEHDIRF-KNFEKSLDIIEELNKNRQSPESARYG 83
           VS P+     E   S ++ Y     +  +       FE     I+  +K     ++    
Sbjct: 75  VSVPDSLSVSEEIPSREEDYDSREFEDVYLSNLVAEFETISVEIKPSSKAESDDDAET-- 132

Query: 84  ITEFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSITTGITIPTGI----- 138
               +  S E  K R   H +    LM H    +        +S      IP  +     
Sbjct: 133 -LSRNVCSNEARKDRKYIHILYLLCLMVHLFTRNEWSLSAPLKSAKLSNLIPEKVRLLLH 191

Query: 139 -PVKKDW------REAGIIGKVRNQQTCGACWAFSTVETAESMHALKNGTLSLLSVQEVI 191
             V+K                V   + C   W     +T + +  L NG     S  E I
Sbjct: 192 PSVRKSEELPSRSLRGLRKPLVEKLKKCMETWQKGLRKTTKGLLKLLNGGRYSRSKWEEI 251

Query: 192 DCAGN 196
           + + N
Sbjct: 252 EKSSN 256


>gnl|CDD|113081 pfam04297, UPF0122, Putative helix-turn-helix protein, YlxM / p13
           like.  Members of this family are predicted to contain a
           helix-turn-helix motif, for example residues 37-55 in
           Mycoplasma mycoides p13. Genes encoding family members
           are often part of operons that encode components of the
           SRP pathway, and this protein may regulate the
           expression of an operon related to the SRP pathway.
          Length = 101

 Score = 29.7 bits (67), Expect = 0.68
 Identities = 11/40 (27%), Positives = 23/40 (57%), Gaps = 2/40 (5%)

Query: 29  NLEQKLELFSSFQQRYKKSYSK-SEHDIRFKNFEKSLDII 67
           + E+KL L+  ++ R  + Y K  +  ++ K+ E+ L +I
Sbjct: 63  SYEEKLHLYEKYKLR-NELYEKIKDKQLKDKDLEQLLKLI 101


>gnl|CDD|215711 pfam00092, VWA, von Willebrand factor type A domain. 
          Length = 178

 Score = 30.0 bits (68), Expect = 0.99
 Identities = 15/49 (30%), Positives = 23/49 (46%), Gaps = 4/49 (8%)

Query: 49 SKSEHDIRFKNFEKSLDIIEEL-NKNRQSPESARYGITEFSDLSEEEFK 96
          S S   I   NFEK  + I++L  +    P+  R G+ ++S     EF 
Sbjct: 9  SGS---IGEANFEKVKEFIKKLVERLDIGPDGTRVGLVQYSSDVTTEFS 54


>gnl|CDD|240244 PTZ00049, PTZ00049, cathepsin C-like protein; Provisional.
          Length = 693

 Score = 30.7 bits (69), Expect = 1.0
 Identities = 25/84 (29%), Positives = 34/84 (40%), Gaps = 27/84 (32%)

Query: 137 GIPVKKDWREAGIIGKVRNQQTCGACWAFSTVETAESMHALK----------------NG 180
           G P   + RE      V NQ  CG+C+       A  M+A K                N 
Sbjct: 388 GDPFNNNTREY----DVTNQLLCGSCYI------ASQMYAFKRRIEIALTKNLDKKYLNN 437

Query: 181 TLSLLSVQEVIDCAGNGNMGCSGG 204
              LLS+Q V+ C+   + GC+GG
Sbjct: 438 FDDLLSIQTVLSCSFY-DQGCNGG 460


>gnl|CDD|221138 pfam11573, Med23, Mediator complex subunit 23.  Med23 is one of the
           subunits of the Tail portion of the Mediator complex
           that regulates RNA polymerase II activity. Med23 is
           required for heat-shock-specific gene expression, and
           has been shown to mediate transcriptional activation of
           E1A in mice.
          Length = 1341

 Score = 30.6 bits (69), Expect = 1.2
 Identities = 26/140 (18%), Positives = 46/140 (32%), Gaps = 22/140 (15%)

Query: 31  EQKLELFSSFQQRYKKSYSKSEHDIRFKNFEKSLDIIEELNKNRQSPESARYGITEFSDL 90
            Q +E+     +   KS ++  H  +F N  +S ++ +E    R          T FS +
Sbjct: 5   TQIIEMDEERVKSQIKSLAEENHTRKFPNPLES-NLGDETAILRIKFN------TMFSKM 57

Query: 91  SEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSITTGITIPT-----GIPVKKDWR 145
            +EE     L   + K V     K+       V  R +   I   T     G+  +K + 
Sbjct: 58  EQEE--KESLVRELLKMVHHVAEKNRYERVVDVLLRYVHQKIIPATMLCEEGLISEKLFY 115

Query: 146 EAGIIGKVRNQQTCGACWAF 165
           E          +     +  
Sbjct: 116 E--------CSRFWIEKFKL 127


>gnl|CDD|184161 PRK13580, PRK13580, serine hydroxymethyltransferase; Provisional.
          Length = 493

 Score = 30.4 bits (69), Expect = 1.4
 Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 2/36 (5%)

Query: 75  QSPESARYGITEFSDLSEEEFKTRHLRHSVNKHVLM 110
           +SP   + G    +DL+EE+++   LR  +    L+
Sbjct: 137 ESPALEKLGAKTVNDLTEEDWEA--LRAELGNQRLL 170


>gnl|CDD|227889 COG5602, SIN3, Histone deacetylase complex, SIN3 component
           [Chromatin structure and dynamics].
          Length = 1163

 Score = 30.3 bits (68), Expect = 1.5
 Identities = 14/47 (29%), Positives = 26/47 (55%), Gaps = 6/47 (12%)

Query: 19  LAIPVKVSKPNLEQKLELFSSFQQRYKKSYSKSEHDIRFKNFEKSLD 65
           + IP+ + +  L+ K E +    +  K+ ++K   +I  KN+ KSLD
Sbjct: 635 VTIPIVLKR--LKMKDEEW----RSCKREWNKIWREIEEKNYHKSLD 675


>gnl|CDD|236641 PRK10019, PRK10019, nickel/cobalt efflux protein RcnA; Provisional.
          Length = 279

 Score = 29.8 bits (67), Expect = 1.6
 Identities = 9/25 (36%), Positives = 11/25 (44%)

Query: 98  RHLRHSVNKHVLMSHHKHHDHHHNH 122
            H  H  +      HH  H HHH+H
Sbjct: 121 HHHDHDHDHDHDHEHHHDHGHHHHH 145



 Score = 29.0 bits (65), Expect = 2.9
 Identities = 9/31 (29%), Positives = 12/31 (38%)

Query: 92  EEEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
           E  +      H  +      H  HHDH H+H
Sbjct: 113 ERNWLENMHHHDHDHDHDHDHEHHHDHGHHH 143



 Score = 27.8 bits (62), Expect = 7.9
 Identities = 7/30 (23%), Positives = 8/30 (26%)

Query: 93  EEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
            E       H  +      H   H H H H
Sbjct: 112 GERNWLENMHHHDHDHDHDHDHEHHHDHGH 141



 Score = 27.4 bits (61), Expect = 9.6
 Identities = 8/32 (25%), Positives = 13/32 (40%)

Query: 91  SEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
             E     ++ H  + H     H+HH  H +H
Sbjct: 111 RGERNWLENMHHHDHDHDHDHDHEHHHDHGHH 142


>gnl|CDD|192184 pfam08932, DUF1914, Domain of unknown function (DUF1914).  This
           domain has no known function. It is found in various
           putative receptor proteins from Lactococcus
           bacteriophages.
          Length = 114

 Score = 28.6 bits (64), Expect = 1.7
 Identities = 13/52 (25%), Positives = 18/52 (34%), Gaps = 3/52 (5%)

Query: 265 DIATHGPVIAAVNALT--WQYYLGGVIQYNCDGSLANINHAV-QIVGYDNYS 313
           DI T    +   N LT   Q     ++ Y   GS+ NIN           + 
Sbjct: 13  DIPTQTLTVQTGNGLTGQLQKKNMDLVIYRFSGSITNINSGAIFPWVTFPFR 64


>gnl|CDD|204706 pfam11666, DUF2933, Protein of unknown function (DUF2933).  This
           bacterial family of proteins has no known function.
          Length = 56

 Score = 26.9 bits (60), Expect = 2.2
 Identities = 5/21 (23%), Positives = 7/21 (33%), Gaps = 2/21 (9%)

Query: 107 HVLM--SHHKHHDHHHNHVKK 125
           H+ M   H  H  H  +    
Sbjct: 36  HLFMHGGHGGHGGHDSHDDDP 56


>gnl|CDD|219224 pfam06904, Extensin-like_C, Extensin-like protein C-terminus.  This
           family represents the C-terminus (approx. 120 residues)
           of a number of bacterial extensin-like proteins.
           Extensins are cell wall glycoproteins normally
           associated with plants, where they strengthen the cell
           wall in response to mechanical stress. Note that many
           members of this family are hypothetical.
          Length = 178

 Score = 29.1 bits (66), Expect = 2.3
 Identities = 12/31 (38%), Positives = 12/31 (38%)

Query: 138 IPVKKDWREAGIIGKVRNQQTCGACWAFSTV 168
           I V KDW   G  G         AC  F TV
Sbjct: 120 ISVLKDWNGDGREGAFLRAVRDAACGRFGTV 150


>gnl|CDD|202517 pfam03051, Peptidase_C1_2, Peptidase C1-like family.  This family
           is closely related to the Peptidase_C1 family pfam00112,
           containing several prokaryotic and eukaryotic
           aminopeptidases and bleomycin hydrolases.
          Length = 438

 Score = 29.2 bits (66), Expect = 2.7
 Identities = 11/37 (29%), Positives = 14/37 (37%), Gaps = 4/37 (10%)

Query: 151 GKVRNQQTCGACWAFSTVETAE----SMHALKNGTLS 183
             V NQ+  G CW F+ + T          LK    S
Sbjct: 56  DPVTNQKQSGRCWLFAALNTMRHPFMKKLKLKEFEFS 92


>gnl|CDD|222278 pfam13638, PIN_4, PIN domain.  Members of this family of bacterial
           domains are predicted to be RNases (from similarities to
           5'-exonucleases).
          Length = 129

 Score = 28.3 bits (64), Expect = 2.8
 Identities = 14/80 (17%), Positives = 24/80 (30%), Gaps = 9/80 (11%)

Query: 53  HDIRF-KNFEKSLDII------EEL--NKNRQSPESARYGITEFSDLSEEEFKTRHLRHS 103
           HD     +F +  D++      EEL   K R           E     +E  +    R  
Sbjct: 10  HDPDALFSFAEENDVVIPITVLEELDKLKKRSDLRELGRNAREAIRFLDELLEDGSGRIR 69

Query: 104 VNKHVLMSHHKHHDHHHNHV 123
           V         +  D + + +
Sbjct: 70  VQTLDERLPPEIEDKNDDRI 89


>gnl|CDD|236558 PRK09545, znuA, high-affinity zinc transporter periplasmic
           component; Reviewed.
          Length = 311

 Score = 28.8 bits (65), Expect = 3.3
 Identities = 10/39 (25%), Positives = 15/39 (38%)

Query: 84  ITEFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
           I +  D+     K  H  H  + H    H K  + HH+ 
Sbjct: 103 IAQLPDVKPLLMKGAHDDHHDDDHDHAGHEKSDEDHHHG 141


>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
          Length = 1004

 Score = 29.3 bits (65), Expect = 3.7
 Identities = 22/54 (40%), Positives = 28/54 (51%), Gaps = 6/54 (11%)

Query: 262 ILTDIATHGPVIA---AVNALTWQYYLGGVIQYNCDGSLANINHAVQIVGYDNY 312
           I  +I   G VIA   A N L +++  G  +Q  C    A+  HAV IVGY NY
Sbjct: 683 IKDEIMNKGSVIAYIKAENVLGYEFN-GKKVQNLCGDDTAD--HAVNIVGYGNY 733


>gnl|CDD|234079 TIGR02984, Sig-70_plancto1, RNA polymerase sigma-70 factor,
           Planctomycetaceae-specific subfamily 1.  This group of
           sigma factors are members of the sigma-70 family
           (TIGR02937) and are apparently found only in the
           Planctomycetaceae family including the genera Gemmata
           and Pirellula (in which seven sequences are found).
          Length = 189

 Score = 28.4 bits (64), Expect = 3.8
 Identities = 13/49 (26%), Positives = 19/49 (38%), Gaps = 6/49 (12%)

Query: 80  ARYGITEFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSI 128
           A     +F   +E EF    LR  +  +VL    + H       +KR I
Sbjct: 49  AHRRFDQFRGKTEGEFAG-WLRG-ILSNVLADALRRH----LGAQKRDI 91


>gnl|CDD|217010 pfam02387, IncFII_repA, IncFII RepA protein family.  This protein
           is plasmid encoded and found to be essential for plasmid
           replication.
          Length = 279

 Score = 28.9 bits (65), Expect = 3.8
 Identities = 18/99 (18%), Positives = 41/99 (41%), Gaps = 1/99 (1%)

Query: 10  IVALIALCFLAIPVKVSKPNLEQKLELFSSFQQRYKK-SYSKSEHDIRFKNFEKSLDIIE 68
           I+ L  L F+ + +   K N  +K +L    ++  KK     +  + R +  E  ++   
Sbjct: 150 IITLTPLFFMLLGISEEKLNSARKQQLEWENKKLKKKGLIPLTLDEARRRAKEFHIERAF 209

Query: 69  ELNKNRQSPESARYGITEFSDLSEEEFKTRHLRHSVNKH 107
                R++    R    + + L E++ + + L   V ++
Sbjct: 210 SYRTERKAFGKKRRRARKLAKLDEKDIRKKILNALVKEY 248


>gnl|CDD|184326 PRK13788, PRK13788, adenylosuccinate synthetase; Provisional.
          Length = 404

 Score = 28.6 bits (64), Expect = 4.2
 Identities = 14/46 (30%), Positives = 20/46 (43%), Gaps = 5/46 (10%)

Query: 86  EFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHHHNHVKKRSITTG 131
           E  +L       + L  S   H+++ HHK+ D   N V     TTG
Sbjct: 82  ERENLRAGGLNPK-LLISERAHLVLPHHKYVDGRKNFVG----TTG 122


>gnl|CDD|176542 cd08600, GDPD_EcGlpQ_like, Glycerophosphodiester phosphodiesterase
           domain of Escherichia coli (GlpQ) and similar proteins. 
           This subfamily corresponds to the glycerophosphodiester
           phosphodiesterase domain (GDPD) present in Escherichia
           coli periplasmic glycerophosphodiester phosphodiesterase
           (GP-GDE, EC 3.1.4.46), GlpQ, and similar proteins.
           GP-GDE plays an essential role in the metabolic pathway
           of E. coli. It catalyzes the degradation of
           glycerophosphodiesters to produce
           sn-glycerol-3-phosphate (G3P) and the corresponding
           alcohols, which are major sources of carbon and
           phosphate. E. coli possesses two major G3P uptake
           systems: Glp and Ugp, which contain genes coding for two
           different GP-GDEs. GlpQ gene from the E. coli glp operon
           codes for a periplasmic phosphodiesterase GlpQ, which is
           the prototype of this family. GlpQ is a dimeric enzyme
           that hydrolyzes periplasmic glycerophosphodiesters, such
           as glycerophosphocholine (GPC),
           glycerophosphoethanolamine (GPE),
           glycerophosphoglycerol (GPG), glycerophosphoinositol
           (GPI), and glycerophosphoserine (GPS), to the
           corresponding alcohols and G3P, which is subsequently
           transported into the cell through the GlpT transport
           system. Ca2+ is required for the enzymatic activity of
           GlpQ.  This family also includes a surface-exposed
           lipoprotein, protein D (HPD), from Haemophilus influenzae
           Type b and nontypeable strains, which shows very high
           sequence similarity with E. coli GlpQ. HPD has been
           characterized as a human immunoglobulin D-binding
           protein with glycerophosphodiester phosphodiesterase
           activity. It can hydrolyze phosphatidylcholine from host
           membranes to produce free choline on the
           lipopolysaccharides on the surface of pathogenic
           bacteria.
          Length = 318

 Score = 28.5 bits (64), Expect = 4.3
 Identities = 8/33 (24%), Positives = 17/33 (51%)

Query: 42  QRYKKSYSKSEHDIRFKNFEKSLDIIEELNKNR 74
           Q Y   +   + D +    E+ +++I+ LNK+ 
Sbjct: 100 QVYPNRFPLWKSDFKIHTLEEEIELIQGLNKST 132


>gnl|CDD|218163 pfam04592, SelP_N, Selenoprotein P, N terminal region.  SelP is the
           only known eukaryotic selenoprotein that contains
           multiple selenocysteine (Sec) residues, and accounts for
           more than 50% of the selenium content of rat and human
           plasma. It is thought to be glycosylated. SelP may have
           antioxidant properties. It can attach to epithelial
           cells, and may protect vascular endothelial cells
           against peroxynitrite toxicity. The high selenium
           content of SelP suggests that it may be involved in
           selenium intercellular transport or storage. The
           promoter structure of bovine SelP suggests that it may be
           involved in countering heavy metal intoxication, and may
           also have a developmental function. The N-terminal
           region of SelP can exist independently of the C terminal
           region. Zebrafish selenoprotein Pb lacks the C terminal
           Sec-rich region, and a protein encoded by the rat SelP
           gene and lacking this region has also been reported.
           The N-terminal region contains a conserved SecxxCys motif,
           which is similar to the CysxxCys found in thioredoxins.
           It is speculated that the N terminal region may adopt a
           thioredoxin fold and catalyze redox reactions. The
           N-terminal region also contains a His-rich region, which
           is thought to mediate heparin binding. Binding to
           heparan proteoglycans could account for the membrane
           binding properties of SelP. The function of the
           bacterial members of this family is uncharacterised.
          Length = 238

 Score = 28.3 bits (63), Expect = 4.7
 Identities = 9/30 (30%), Positives = 12/30 (40%)

Query: 93  EEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
           E    +   H  + H    H  HH H H+H
Sbjct: 178 EAEPRQDHPHHHSHHEHQGHAHHHPHGHHH 207


>gnl|CDD|233304 TIGR01180, aman2_put, alpha-1,2-mannosidase, putative.  The
           identification of members of this family as putative
           alpha-1,2-mannosidases is based on an unpublished
           characterization of the aman2 gene in Bacillus sp. M-90
           by Maruyama,Y., Nakajima,M. and Nakajima,T. (Genbank
           accession BAA76709, pid g4587313). Most members of this
           family appear to have signal sequences. Members from the
           dental pathogen Porphyromonas gingivalis have been
           described as immunoreactive with periodontitis patient
           serum [Cell envelope, Biosynthesis and degradation of
           surface polysaccharides and lipopolysaccharides].
          Length = 750

 Score = 28.7 bits (64), Expect = 4.9
 Identities = 5/52 (9%), Positives = 14/52 (26%), Gaps = 13/52 (25%)

Query: 111 SHHKHHDHHHNHVKKRSITTGITIPTGIPVKKDWREAGIIGKVRNQQTCGAC 162
                  +H+ ++                 K+ WR   +I ++  +      
Sbjct: 609 QPINEPSYHYPYL-------------YHYWKQPWRTQKLIRRLYRETFDNYP 647


>gnl|CDD|225125 COG2215, COG2215, ABC-type uncharacterized transport system,
           permease component [General function prediction only].
          Length = 303

 Score = 28.1 bits (63), Expect = 5.3
 Identities = 12/27 (44%), Positives = 14/27 (51%)

Query: 96  KTRHLRHSVNKHVLMSHHKHHDHHHNH 122
             R LRH   KH   + H H DH H+H
Sbjct: 152 TLRRLRHRHPKHPHFAAHPHPDHDHDH 178


>gnl|CDD|219014 pfam06414, Zeta_toxin, Zeta toxin.  This family consists of several
           bacterial zeta toxin proteins. Zeta toxin is thought to
           be part of a postsegregational killing system in
           bacteria. It relies on antitoxin/toxin systems that
           secure stable inheritance of low and medium copy number
           plasmids during cell division and kill cells that have
           lost the plasmid.
          Length = 191

 Score = 27.6 bits (62), Expect = 6.1
 Identities = 7/35 (20%), Positives = 14/35 (40%), Gaps = 6/35 (17%)

Query: 41  QQRYKKSY------SKSEHDIRFKNFEKSLDIIEE 69
             RY++         K  HD  +    +S++ +E 
Sbjct: 136 LDRYEEELAAGRRVPKEVHDAAYNGLPESVEALER 170


>gnl|CDD|238501 cd01019, ZnuA, Zinc binding protein ZnuA. These proteins have been
           shown to function as initial receptors in the ABC uptake
           of Zn2+.  They belong to the TroA superfamily of
           periplasmic metal binding proteins that share a distinct
           fold and ligand binding mechanism.  They are comprised
           of two globular subdomains connected by a single helix
           and bind their specific ligands in the cleft between
           these domains.  A typical TroA protein is comprised of
           two globular subdomains connected by a single helix and
           can bind the metal ion in the cleft between these
           domains. In addition, these proteins sometimes have a
           low complexity region containing a metal-binding
           histidine-rich motif (repetitive HDH sequence).
          Length = 286

 Score = 28.1 bits (63), Expect = 6.4
 Identities = 7/30 (23%), Positives = 10/30 (33%)

Query: 93  EEFKTRHLRHSVNKHVLMSHHKHHDHHHNH 122
            + KT     S   H     H H +H  + 
Sbjct: 86  IDLKTLEDGASHGDHEHDHEHAHGEHDGHE 115


>gnl|CDD|237322 PRK13261, ureE, urease accessory protein UreE; Provisional.
          Length = 159

 Score = 27.2 bits (61), Expect = 8.2
 Identities = 5/12 (41%), Positives = 7/12 (58%)

Query: 111 SHHKHHDHHHNH 122
              +HH H H+H
Sbjct: 146 GAFRHHGHSHDH 157


>gnl|CDD|226107 COG3579, PepC, Aminopeptidase C [Amino acid transport and
           metabolism].
          Length = 444

 Score = 27.8 bits (62), Expect = 8.6
 Identities = 9/21 (42%), Positives = 12/21 (57%)

Query: 151 GKVRNQQTCGACWAFSTVETA 171
            KV NQ+  G CW F+ + T 
Sbjct: 58  DKVTNQKQSGRCWMFAALNTF 78


>gnl|CDD|177648 PHA03420, PHA03420, E4 protein; Provisional.
          Length = 137

 Score = 26.9 bits (59), Expect = 9.4
 Identities = 11/67 (16%), Positives = 21/67 (31%), Gaps = 9/67 (13%)

Query: 60  FEKSLDIIEELNKNRQSPESARYGITEFSDLSEEEFKTRHLRHSVNKHVLMSHHKHHDHH 119
           + + LD   E    ++   + R            ++  R   H  ++H    HH   D  
Sbjct: 20  YRRLLDGRAENQHIQREGGNHR-------TWDPADYLDRP-HHHPHRHQQDDHH-LQDRQ 70

Query: 120 HNHVKKR 126
           H   +  
Sbjct: 71  HLPQQHL 77


>gnl|CDD|217536 pfam03403, PAF-AH_p_II, Platelet-activating factor acetylhydrolase,
           isoform II.  Platelet-activating factor acetylhydrolase
           (PAF-AH) is a subfamily of phospholipases A2,
           responsible for inactivation of platelet-activating
           factor through cleavage of an acetyl group. Three known
           PAF-AHs are the brain heterotrimeric PAF-AH Ib, whose
           catalytic beta and gamma subunits are aligned in
           pfam02266, the extracellular, plasma PAF-AH (pPAF-AH),
           and the intracellular PAF-AH isoform II (PAF-AH II).
           This family aligns pPAF-AH and PAF-AH II, whose
           similarity was previously noted.
          Length = 372

 Score = 27.4 bits (61), Expect = 9.9
 Identities = 10/30 (33%), Positives = 18/30 (60%), Gaps = 4/30 (13%)

Query: 260 SSILTDIATHGPVIAAV----NALTWQYYL 285
           S+I  ++A+HG V+AAV     + +  Y+ 
Sbjct: 117 SAICIELASHGFVVAAVEHRDRSASATYFF 146


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.133    0.406 

Gapped
Lambda     K      H
   0.267   0.0764    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,782,824
Number of extensions: 1465569
Number of successful extensions: 1764
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1657
Number of HSP's successfully gapped: 67
Length of query: 317
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 220
Effective length of database: 6,635,264
Effective search space: 1459758080
Effective search space used: 1459758080
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (26.4 bits)
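
The effective lengths and search space reported in the footer can be checked by hand. The sketch below assumes the usual relation in which the length adjustment is subtracted once from the query and once per database sequence. The function and variable names are illustrative and are not part of the RPS-BLAST output.

```python
# Values copied from the statistics footer of this report.
QUERY_LENGTH = 317
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 97


def effective_lengths(query_len, db_len, n_seqs, adjustment):
    """Return (effective query length, effective database length, search space).

    Assumption: the adjustment is removed once from the query and once
    from every database sequence, and the search space is the product
    of the two effective lengths.
    """
    eff_query = query_len - adjustment
    eff_db = db_len - n_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


eff_query, eff_db, search_space = effective_lengths(
    QUERY_LENGTH, DB_LETTERS, DB_SEQUENCES, LENGTH_ADJUSTMENT
)
print("Effective length of query:   ", eff_query)
print("Effective length of database:", eff_db)
print("Effective search space:      ", search_space)
```

Under that assumption the three printed values agree with the "Effective length of query", "Effective length of database" and "Effective search space" lines above.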