RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy1705
         (309 letters)



>gnl|CDD|239068 cd02248, Peptidase_C1A, Peptidase C1A subfamily (MEROPS database
           nomenclature); composed of cysteine peptidases (CPs)
           similar to papain, including the mammalian CPs
           (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain
           is an endopeptidase with specific substrate preferences,
           primarily for bulky hydrophobic or aromatic residues at
           the S2 subsite, a hydrophobic pocket in papain that
           accommodates the P2 sidechain of the substrate (the
           second residue away from the scissile bond). Most
           members of the papain subfamily are endopeptidases. Some
           exceptions to this rule can be explained by specific
           details of the catalytic domains like the occluding loop
           in cathepsin B which confers an additional
           carboxydipeptidyl activity and the mini-chain of
           cathepsin H resulting in an N-terminal exopeptidase
           activity. Papain-like CPs have different functions in
           various organisms. Plant CPs are used to mobilize
           storage proteins in seeds. Parasitic CPs act
           extracellularly to help invade tissues and cells, to
           hatch or to evade the host immune system. Mammalian CPs
           are primarily lysosomal enzymes with the exception of
           cathepsin W, which is retained in the endoplasmic
           reticulum. They are responsible for protein degradation
           in the lysosome. Papain-like CPs are synthesized as
           inactive proenzymes with N-terminal propeptide regions,
           which are removed upon activation. In addition to its
           inhibitory role, the propeptide is required for proper
           folding of the newly synthesized enzyme and its
           stabilization in denaturing pH conditions. Residues
           within the propeptide region also play a role in the
           transport of the proenzyme to lysosomes or acidified
           vesicles. Also included in this subfamily are proteins
           classified as non-peptidase homologs, which lack
           peptidase activity or have missing active site residues.
          Length = 210

 Score =  257 bits (659), Expect = 2e-86
 Identities = 100/213 (46%), Positives = 134/213 (62%), Gaps = 7/213 (3%)

Query: 99  PDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISG 158
           P+ +DWREKG +TP  +Q  CG+C+AFS   A++G     T ++  LS QQ+VDCS  SG
Sbjct: 1   PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCS-TSG 59

Query: 159 NLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDEH 218
           N GC GG+  N   YV+  GGL  E DYPY GK   CK+    +   I+ +S +PP DE 
Sbjct: 60  NNGCNGGNPDNAFEYVK-NGGLASESDYPYTGKDGTCKYNSSKVGAKITGYSNVPPGDEE 118

Query: 219 ALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTR----NSWI 274
           ALK  LA  GP++V+I+AS  +FQ Y  GIY    C++  +NHA+LLVGY      + WI
Sbjct: 119 ALKAALANYGPVSVAIDAS-SSFQFYKGGIYSGPCCSNTNLNHAVLLVGYGTENGVDYWI 177

Query: 275 LKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYA 307
           +KN W   WG+ GY+ + RG+N CGIA+YA Y 
Sbjct: 178 VKNSWGTSWGEKGYIRIARGSNLCGIASYASYP 210


>gnl|CDD|215726 pfam00112, Peptidase_C1, Papain family cysteine protease. 
          Length = 213

 Score =  242 bits (619), Expect = 3e-80
 Identities = 94/216 (43%), Positives = 130/216 (60%), Gaps = 8/216 (3%)

Query: 98  IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
           +P+  DWREKG +TP  +Q  CG+C+AFS   A++G+    T ++  LS QQ+VDC   +
Sbjct: 1   LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQLVDCD--T 58

Query: 158 GNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNI-VVDISSWSVLPPQD 216
           GN GC GG   N   Y++  GG++ E DYPY      CKFK+ N     I  +  +P  D
Sbjct: 59  GNNGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGTCKFKKSNSKYAKIKGYGDVPYND 118

Query: 217 EHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY-TRNS--- 272
           E AL+  LA  GP++V+I+A    FQLY SG+Y    C+   ++HA+L+VGY T N    
Sbjct: 119 EEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTECSG-ELDHAVLIVGYGTENGVPY 177

Query: 273 WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYAL 308
           WI+KN W   WG+NGY  + RG N CGIA+ A Y +
Sbjct: 178 WIVKNSWGTDWGENGYFRIARGVNECGIASEASYPI 213


>gnl|CDD|214761 smart00645, Pept_C1, Papain family cysteine protease. 
          Length = 175

 Score =  176 bits (449), Expect = 4e-55
 Identities = 80/217 (36%), Positives = 105/217 (48%), Gaps = 50/217 (23%)

Query: 98  IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
           +P+  DWR+KG +TP  +Q  CG+C+AFS   A++G+    T ++  LS QQ+VDCS   
Sbjct: 1   LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCS-GG 59

Query: 158 GNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDE 217
           GN GC GG   N   Y++  GGL  E  YPY G                           
Sbjct: 60  GNCGCNGGLPDNAFEYIKKNGGLETESCYPYTG--------------------------- 92

Query: 218 HALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY---TRNS-- 272
                        +V+I+AS   FQ Y SGIYD   C S  ++HA+L+VGY     N   
Sbjct: 93  -------------SVAIDASD--FQFYKSGIYDHPGCGSGTLDHAVLIVGYGTEVENGKD 137

Query: 273 -WILKNWWSHHWGDNGYMYLKRG-NNRCGIANYAVYA 307
            WI+KN W   WG+NGY  + RG NN CGI       
Sbjct: 138 YWIVKNSWGTDWGENGYFRIARGKNNECGIEASVASY 174


>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional.
          Length = 448

 Score =  146 bits (371), Expect = 9e-41
 Identities = 92/323 (28%), Positives = 144/323 (44%), Gaps = 54/323 (16%)

Query: 15  KKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLRENHLSDL--------- 65
           KKY + +   A    + L +++N+ ++ +H     +G   Y+   N  SDL         
Sbjct: 131 KKYNRKHATHAERLNRFLTFRNNYLEVKSH-----KGDEPYSKEINKFSDLTEEEFRKLF 185

Query: 66  HPRHYIKEMTRL-THSRIRRTLVRSP----------ESNESVLIPDH-----LDWREKGF 109
                  +      ++  +   V +P           ++E V  P       LDWR    
Sbjct: 186 PVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRRADA 245

Query: 110 ITPDWNQ-EDCGACYAFSIASAIQG--QIFKSTSEIEELSIQQVVDCSIISGNLGCAGGS 166
           +T   +Q  +CG+C+AFS   +++   +I++  S   +LS Q++V+C   + + GC+GG 
Sbjct: 246 VTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSV--DLSEQELVNCD--TKSQGCSGGY 301

Query: 167 LRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDEHALKVTLAT 226
               L YV+  G L    D PY  K   C        V I S+ V   +D   L  +L  
Sbjct: 302 PDTALEYVKNKG-LSSSSDVPYLAKDGKCVVSSTK-KVYIDSYLVAKGKD--VLNKSL-V 356

Query: 227 VGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTRNS------WILKNWWS 280
           + P  V I  S      Y SG+Y+ E C    +NHA+LLVG   +       WI+KN W 
Sbjct: 357 ISPTVVYIAVSR-ELLKYKSGVYNGE-C-GKSLNHAVLLVGEGYDEKTKKRYWIIKNSWG 413

Query: 281 HHWGDNGYMYLKR---GNNRCGI 300
             WG+NGYM L+R   G ++CGI
Sbjct: 414 TDWGENGYMRLERTNEGTDKCGI 436


>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional.
          Length = 489

 Score =  139 bits (351), Expect = 1e-37
 Identities = 102/340 (30%), Positives = 149/340 (43%), Gaps = 42/340 (12%)

Query: 1   MTNKEWIIIFIFPQKKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLREN 60
           MTN E +  F    K++ K Y+      ++ L +  N  KI+ HN +       Y    N
Sbjct: 160 MTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVL---YKKGMN 216

Query: 61  HLSDLHPRHYIKEMTRLTHSRIRRTLVRSP-ESNESVLIP---------DHL--DWREKG 108
              DL    + K+   L     +    +SP   N   +I          DH   DWR   
Sbjct: 217 RFGDLSFEEFKKKYLTLKSFDFKSNGKKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHN 276

Query: 109 FITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISGNLGCAGGSLR 168
            +TP  +Q++CG+C+AFS    ++ Q     +E+  LS Q++VDCS    N GC GG + 
Sbjct: 277 GVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCSF--KNNGCYGGLIP 334

Query: 169 NTLNYVQFAGGLMKEEDYPYKG-KQSICKFKRPNIVVDISSWSVLPPQDEHALKVTLATV 227
           N    +   GGL  E+DYPY      +C   R      I S+  +P   E   K  +  +
Sbjct: 335 NAFEDMIELGGLCSEDDYPYVSDTPELCNIDRCKEKYKIKSYVSIP---EDKFKEAIRFL 391

Query: 228 GPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY--------------TRNSW 273
           GPI+VSI  S   F  Y  GI+D E C  +  NHA++LVGY               R  +
Sbjct: 392 GPISVSIAVS-DDFAFYKGGIFDGE-CG-EEPNHAVILVGYGMEEIYNSDTKKMEKRYYY 448

Query: 274 ILKNWWSHHWGDNGYMYLKRGNN----RCGIANYAVYALI 309
           I+KN W   WG+ G++ ++   N     C +   A   LI
Sbjct: 449 IIKNSWGESWGEKGFIRIETDENGLMKTCSLGTEAYVPLI 488


>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional.
          Length = 348

 Score =  126 bits (317), Expect = 9e-34
 Identities = 72/219 (32%), Positives = 105/219 (47%), Gaps = 15/219 (6%)

Query: 98  IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
           +PD +DWREKG +TP  NQ  CG+C+AFS    I+ Q   +  ++  LS QQ+V C  + 
Sbjct: 126 VPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHVD 185

Query: 158 GNLGCAGGSLRNTLNYV--QFAGGLMKEEDYPY---KGKQSICKFKRPNIVVDISSWSVL 212
              GC GG +     +V     G +  E+ YPY    G    C               V 
Sbjct: 186 N--GCGGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVS 243

Query: 213 PPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTRNS 272
               E  +   LA  GPI+++++AS  +F  Y SG+    +C  + +NH +LLVGY    
Sbjct: 244 MESSERVMAAWLAKNGPISIAVDAS--SFMSYHSGVL--TSCIGEQLNHGVLLVGYNMTG 299

Query: 273 ----WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYA 307
               W++KN W   WG+ GY+ +  G N C +  Y V  
Sbjct: 300 EVPYWVIKNSWGEDWGEKGYVRVTMGVNACLLTGYPVSV 338


>gnl|CDD|239112 cd02621, Peptidase_C1A_CathepsinC, Cathepsin C; also known as
           Dipeptidyl Peptidase I (DPPI), an atypical papain-like
           cysteine peptidase with chloride dependency and
           dipeptidyl aminopeptidase activity, resulting from its
           tetrameric structure which limits substrate access. Each
           subunit of the tetramer is composed of three peptides:
           the heavy and light chains, which together adopt the
           papain fold and form the catalytic domain; and the
           residual propeptide region, which forms a beta barrel
           and points towards the substrate's N-terminus. The
           subunit composition is the result of the unique
           characteristic of procathepsin C maturation involving
           the cleavage of the catalytic domain and the
           non-autocatalytic excision of an activation peptide
           within its propeptide region. By removing N-terminal
           dipeptide extensions, cathepsin C activates granule
           serine peptidases (granzymes) involved in cell-mediated
           apoptosis, inflammation and tissue remodelling.
           Loss-of-function mutations in cathepsin C are associated
           with Papillon-Lefevre and Haim-Munk syndromes, rare
           diseases characterized by hyperkeratosis and early-onset
           periodontitis. Cathepsin C is widely expressed in many
           tissues with high levels in lung, kidney and placenta.
           It is also highly expressed in cytotoxic lymphocytes and
           mature myeloid cells.
          Length = 243

 Score =  111 bits (280), Expect = 2e-29
 Identities = 71/247 (28%), Positives = 102/247 (41%), Gaps = 46/247 (18%)

Query: 99  PDHLDWREKG----FITPDWNQEDCGACYAFSIASAIQGQI------FKSTSEIEELSIQ 148
           P   DW +      +++P  NQ  CG+CYAF+   A++ +I           +   LS Q
Sbjct: 2   PKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQ 61

Query: 149 QVVDCSIISGNLGCAGGSLRNTLNYVQ-FAGGLMKEEDYPYKG-KQSICKFKRPNIVVDI 206
            V+ CS  S   GC GG       + + F  G++ E+ +PY       CK          
Sbjct: 62  HVLSCSQYSQ--GCDGGFPFLVGKFAEDF--GIVTEDYFPYTADDDRPCKASPSECRRYY 117

Query: 207 SS--------WSVLPPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYD----DEAC 254
            S        +      +E  +K  +   GPI V+       F  Y  G+Y     DE  
Sbjct: 118 FSDYNYVGGCYGC---TNEDEMKWEIYRNGPIVVAFEVYS-DFDFYKEGVYHHTDNDEVS 173

Query: 255 TSD--------YVNHAMLLVGY------TRNSWILKNWWSHHWGDNGYMYLKRGNNRCGI 300
             D          NHA+LLVG+          WI+KN W   WG+ GY  ++RG N CGI
Sbjct: 174 DGDNDNFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEKGYFKIRRGTNECGI 233

Query: 301 ANYAVYA 307
            + AV+A
Sbjct: 234 ESQAVFA 240


>gnl|CDD|239110 cd02619, Peptidase_C1, C1 Peptidase family (MEROPS database
           nomenclature), also referred to as the papain family;
           composed of two subfamilies of cysteine peptidases
           (CPs), C1A (papain) and C1B (bleomycin hydrolase).
           Papain-like enzymes are mostly endopeptidases with some
           exceptions like cathepsins B, C, H and X, which are
           exopeptidases. Papain-like CPs have different functions
           in various organisms. Plant CPs are used to mobilize
           storage proteins in seeds while mammalian CPs are
           primarily lysosomal enzymes responsible for protein
           degradation in the lysosome. Papain-like CPs are
           synthesized as inactive proenzymes with N-terminal
           propeptide regions, which are removed upon activation.
           Bleomycin hydrolase (BH) is a CP that detoxifies
           bleomycin by hydrolysis of an amide group. It acts as a
           carboxypeptidase on its C-terminus to convert itself
           into an aminopeptidase and peptide ligase. BH is found
           in all tissues in mammals as well as in many other
           eukaryotes. It forms a hexameric ring barrel structure
           with the active sites imbedded in the central channel.
           Some members of the C1 family are proteins classified as
           non-peptidase homologs which lack peptidase activity or
           have missing active site residues.
          Length = 223

 Score = 94.1 bits (234), Expect = 7e-23
 Identities = 54/209 (25%), Positives = 84/209 (40%), Gaps = 20/209 (9%)

Query: 102 LDWREKGFITPDWNQEDCGACYAFSIASAIQGQ--IFKSTSEIEELSIQQVVDCS---II 156
           +D R         NQ   G+C+AF+ A A++    I     E  +LS Q +  C+    +
Sbjct: 2   VDLRPLRLTPVK-NQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECL 60

Query: 157 SGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNI----VVDISSWSVL 212
             N  C GG   + L  +    G+  EEDYPY  +    + K         V +  +  +
Sbjct: 61  GINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRV 120

Query: 213 PPQDEHALKVTLATVGPIAVSINASPHTFQL----YASGIYDDEACTSDYVNHAMLLVGY 268
              +   +K  LA  GP+    +      +L        I        D   HA+++VGY
Sbjct: 121 LKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIVGY 180

Query: 269 TRN------SWILKNWWSHHWGDNGYMYL 291
             N      ++I+KN W   WGDNGY  +
Sbjct: 181 DDNYVEGKGAFIVKNSWGTDWGDNGYGRI 209


>gnl|CDD|239111 cd02620, Peptidase_C1A_CathepsinB, Cathepsin B group; composed of
           cathepsin B and similar proteins, including
           tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin
           B is a lysosomal papain-like cysteine peptidase which is
           expressed in all tissues and functions primarily as an
           exopeptidase through its carboxydipeptidyl activity.
           Together with other cathepsins, it is involved in the
           degradation of proteins, proenzyme activation, Ag
           processing, metabolism and apoptosis. Cathepsin B has
           been implicated in a number of human diseases such as
           cancer, rheumatoid arthritis, osteoporosis and
           Alzheimer's disease. The unique carboxydipeptidyl
           activity of cathepsin B is attributed to the presence of
           an occluding loop in its active site which favors the
           binding of the C-termini of substrate proteins. Some
           members of this group do not possess the occluding loop.
           TIN-Ag is an extracellular matrix basement protein which
           was originally identified as a target Ag involved in
           anti-tubular basement membrane antibody-mediated
           interstitial nephritis. It plays a role in renal
           tubulogenesis and is defective in hereditary
           tubulointerstitial disorders. TIN-Ag is exclusively
           expressed in kidney tissues.
          Length = 236

 Score = 90.4 bits (225), Expect = 2e-21
 Identities = 63/251 (25%), Positives = 97/251 (38%), Gaps = 57/251 (22%)

Query: 99  PDHLDWREKGFITPDW-------NQEDCGACYAFSIASA------IQGQIFKSTSEIEEL 145
           P+  D REK    P+        +Q +CG+C+AFS   A      IQ    ++      L
Sbjct: 1   PESFDAREK---WPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVL----L 53

Query: 146 SIQQVVDCSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVD 205
           S Q ++ C    G+ GC GG       Y+   G ++     PY         + P     
Sbjct: 54  SAQDLLSCCSGCGD-GCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCCG 111

Query: 206 -----------------------ISSWSVLPPQDEHALKVTLATVGPIAVSINASPHT-F 241
                                   S++SV  P DE  +   + T GP+  +     +  F
Sbjct: 112 TPYCTPKCQDGCEKTYEEDKHKGKSAYSV--PSDETDIMKEIMTNGPVQAAFTV--YEDF 167

Query: 242 QLYASGIYDDEACTSDYVN-HAMLLVGY-TRNS---WILKNWWSHHWGDNGYMYLKRGNN 296
             Y SG+Y     +   +  HA+ ++G+   N    W+  N W   WG+NGY  + RG+N
Sbjct: 168 LYYKSGVYQHT--SGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGENGYFRILRGSN 225

Query: 297 RCGIANYAVYA 307
            CGI +  V  
Sbjct: 226 ECGIESEVVAG 236


>gnl|CDD|239149 cd02698, Peptidase_C1A_CathepsinX, Cathepsin X; the only
           papain-like lysosomal cysteine peptidase exhibiting
           carboxymonopeptidase activity. It can also act as a
           carboxydipeptidase, like cathepsin B, but has been shown
           to preferentially cleave substrates through a
           monopeptidyl carboxypeptidase pathway. The propeptide
           region of cathepsin X, the shortest among papain-like
           peptidases, is covalently attached to the active site
           cysteine in the inactive form of the enzyme. Little is
           known about the biological function of cathepsin X. Some
           studies point to a role in early tumorigenesis. A more
           recent study indicates that cathepsin X expression is
           restricted to immune cells suggesting a role in
           phagocytosis and the regulation of the immune response.
          Length = 239

 Score = 81.3 bits (201), Expect = 4e-18
 Identities = 58/243 (23%), Positives = 91/243 (37%), Gaps = 49/243 (20%)

Query: 98  IPDHLDWRE---KGFITPDWNQ---EDCGACYAFSIASAIQGQIF---KSTSEIEELSIQ 148
           +P   DWR      +++P  NQ   + CG+C+A    SA+  +I    K       LS+Q
Sbjct: 1   LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60

Query: 149 QVVDCSIISGNLG-CAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDIS 207
            V+DC       G C GG       Y     G+  E   PY+ K   C            
Sbjct: 61  VVIDC----AGGGSCHGGDPGGVYEYA-HKHGIPDETCNPYQAKDGECNPF-------NR 108

Query: 208 SWSVLPPQDEHALKV-TLATV-------------------GPIAVSINASPHTFQLYASG 247
             +  P  +  A+K  TL  V                   GPI+  I A+    + Y  G
Sbjct: 109 CGTCNPFGECFAIKNYTLYFVSDYGSVSGRDKMMAEIYARGPISCGIMATE-ALENYTGG 167

Query: 248 IYDDEACTSDYVNHAMLLVGY-----TRNSWILKNWWSHHWGDNGYMYLKRGNNRCGIAN 302
           +Y +       +NH + + G+         WI++N W   WG+ G+  +   + +    N
Sbjct: 168 VYKEYVQDP-LINHIISVAGWGVDENGVEYWIVRNSWGEPWGERGWFRIVTSSYKGARYN 226

Query: 303 YAV 305
            A+
Sbjct: 227 LAI 229


>gnl|CDD|240244 PTZ00049, PTZ00049, cathepsin C-like protein; Provisional.
          Length = 693

 Score = 60.0 bits (145), Expect = 5e-10
 Identities = 35/107 (32%), Positives = 47/107 (43%), Gaps = 29/107 (27%)

Query: 228 GPIAVSINASPHTFQLYASGIYDDEA------CTSD--------------YVNHAMLLVG 267
           GPI  S  ASP  +  YA G+Y  E       CT D               VNHA++LVG
Sbjct: 568 GPIVASFEASPDFYD-YADGVYYVEDFPHARRCTVDLPKHNGVYNITGWEKVNHAIVLVG 626

Query: 268 YTRNS--------WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVY 306
           +            WI +N W  +WG  GY  + RG N  GI + +++
Sbjct: 627 WGEEEINGKLYKYWIGRNSWGKNWGKEGYFKIIRGKNFSGIESQSLF 673



 Score = 34.2 bits (78), Expect = 0.094
 Identities = 20/61 (32%), Positives = 29/61 (47%), Gaps = 12/61 (19%)

Query: 115 NQEDCGACYAFSIASAIQGQIFKSTSEI----------EELSIQQVVDCSIISGNLGCAG 164
           NQ  CG+CY  S   A + +I  + ++           + LSIQ V+ CS    + GC G
Sbjct: 402 NQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNFDDLLSIQTVLSCSFY--DQGCNG 459

Query: 165 G 165
           G
Sbjct: 460 G 460


>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain
          (I29).  This domain is found at the N-terminus of some
          C1 peptidases such as Cathepsin L where it acts as a
          propeptide. There are also a number of proteins that
          are composed solely of multiple copies of this domain
          such as the peptidase inhibitor salarin. This family is
          classified as I29 by MEROPS.
          Length = 58

 Score = 49.2 bits (118), Expect = 3e-08
 Identities = 16/54 (29%), Positives = 26/54 (48%), Gaps = 3/54 (5%)

Query: 14 QKKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLRENHLSDLHP 67
          +KKY K Y  +  +  +   ++ N + I  HN   ++G   YTL  N  +DL  
Sbjct: 5  KKKYGKSYYSEEEELYRFQIFKENLRFIEEHN---KKGNVSYTLGLNQFADLTD 55


>gnl|CDD|240381 PTZ00364, PTZ00364, dipeptidyl-peptidase I precursor; Provisional.
          Length = 548

 Score = 52.2 bits (125), Expect = 2e-07
 Identities = 57/307 (18%), Positives = 103/307 (33%), Gaps = 49/307 (15%)

Query: 45  NQEAQQGLHGYTLRENHLSDLHPRHYIKEMTRLTHSRIRRTLVRSPESNESVLIPDHLDW 104
           N    Q       R   +       Y K  +            +S         P    W
Sbjct: 152 NIHYVQRPGPVNPRRLPVLVPTGDPYSKSRSARKAKTASFGFRQSFSHQLGDPPPAAWSW 211

Query: 105 REKG---FIT---PDWNQEDCGACYAFSIASAIQGQIFKSTSEIEE------LSIQQVVD 152
            + G   F+    P      C + Y  +  +A+  ++  +++  +       LS + V+D
Sbjct: 212 GDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLD 271

Query: 153 CSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDY--PYK---GKQSICKFKRPNIVVDIS 207
           CS      GCAGG       + +   G++  + Y  PY    G +  CK +RP+     +
Sbjct: 272 CSQYGQ--GCAGGFPEEVGKFAE-TFGILTTDSYYIPYDSGDGVERACKTRRPSRRYYFT 328

Query: 208 SWSVL-----PPQDEHALKVTLATVGPIAVSI--------NASPHTFQLYASGIYDDEAC 254
           ++  L        D   +   +   GP+  S+             T  +    + D    
Sbjct: 329 NYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDVRYVSLDDYSTA 388

Query: 255 TSD---------YVNHAMLLVGY-----TRNSWILKNWWS--HHWGDNGYMYLKRGNNRC 298
           ++D          VNH +L++G+       + W++ + W     W D G   + RG N  
Sbjct: 389 SADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSWCDGGTRKIARGVNAY 448

Query: 299 GIANYAV 305
            I +  V
Sbjct: 449 NIESEVV 455


>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain
          (I29).  This domain is found at the N-terminus of some
          C1 peptidases such as Cathepsin L where it acts as a
          propeptide. There are also a number of proteins that
          are composed solely of multiple copies of this domain
          such as the peptidase inhibitor salarin. This family is
          classified as I29 by MEROPS. Peptide proteinase
          inhibitors can be found as single domain proteins or as
          single or multiple domains within proteins; these are
          referred to as either simple or compound inhibitors,
          respectively. In many cases they are synthesised as
          part of a larger precursor protein, either as a
          prepropeptide or as an N-terminal domain associated
          with an inactive peptidase or zymogen. This domain
          prevents access of the substrate to the active site.
          Removal of the N-terminal inhibitor domain either by
          interaction with a second peptidase or by autocatalytic
          cleavage activates the zymogen. Other inhibitors
          interact directly with proteinases using a simple
          noncovalent lock and key mechanism; while yet others
          use a conformational change-based trapping mechanism
          that depends on their structural and thermodynamic
          properties.
          Length = 57

 Score = 42.2 bits (100), Expect = 9e-06
 Identities = 17/57 (29%), Positives = 30/57 (52%), Gaps = 7/57 (12%)

Query: 15 KKYKKDYRKKATDSKKKLH----WQSNHKKIHTHNQEAQQGLHGYTLRENHLSDLHP 67
          +++KK + K  +  +++      ++ N KKI  HN++ +   H Y L  N  SDL P
Sbjct: 2  EQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYE---HSYKLGVNQFSDLTP 55


>gnl|CDD|227207 COG4870, COG4870, Cysteine protease [Posttranslational
           modification, protein turnover, chaperones].
          Length = 372

 Score = 46.0 bits (109), Expect = 1e-05
 Identities = 42/225 (18%), Positives = 86/225 (38%), Gaps = 32/225 (14%)

Query: 92  SNESVLIPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKST----SEIEELSI 147
            N S  +P + D R++G ++P  +Q   G+C+AF+   +++  +   +    SE    ++
Sbjct: 93  LNASASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPESAWDFSENNMKNL 152

Query: 148 QQVVDCSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSIC-------KFKRP 200
             V             G +  +     +++G + + +D PY              K  + 
Sbjct: 153 LGVPYEKGFDYTSNDGGNADMSAAYLTEWSGPVYETDD-PYSENSYFSPTNLPVTKHVQE 211

Query: 201 NIVVDISSWSVLPPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVN 260
             ++  S    L   D   +K      G ++ S+                 +  + +   
Sbjct: 212 AQIIP-SRKKYL---DNGNIKAMFGFYGAVSSSMYIDATNSLGICIPYPYVD--SGENWG 265

Query: 261 HAMLLVGY--------------TRNSWILKNWWSHHWGDNGYMYL 291
           HA+L+VGY                 ++I+KN W  +WG+NGY ++
Sbjct: 266 HAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGENGYFWI 310


>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
          Length = 1004

 Score = 37.0 bits (85), Expect = 0.012
 Identities = 16/44 (36%), Positives = 24/44 (54%), Gaps = 9/44 (20%)

Query: 254 CTSDYVNHAMLLVGY---------TRNSWILKNWWSHHWGDNGY 288
           C  D  +HA+ +VGY          ++ WI++N W  +WGD GY
Sbjct: 716 CGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGY 759


>gnl|CDD|201147 pfam00313, CSD, 'Cold-shock' DNA-binding domain. 
          Length = 66

 Score = 31.0 bits (71), Expect = 0.099
 Identities = 20/56 (35%), Positives = 25/56 (44%), Gaps = 11/56 (19%)

Query: 107 KGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISGNLGC 162
            GFITP+   +D      F   SAIQG  F+S      L   Q V+  I+ G  G 
Sbjct: 14  FGFITPEDGDKD-----VFVHFSAIQGDGFRS------LQEGQRVEFDIVEGTKGP 58


>gnl|CDD|224309 COG1391, GlnE, Glutamine synthetase adenylyltransferase
           [Posttranslational modification, protein turnover,
           chaperones / Signal transduction mechanisms].
          Length = 963

 Score = 31.2 bits (71), Expect = 0.84
 Identities = 11/60 (18%), Positives = 18/60 (30%), Gaps = 4/60 (6%)

Query: 47  EAQQGLHGYTLRENHLSDLHPRHYIKEMTRLTHSRIRRTLVRSPESNE--SVLIPDHLDW 104
           E  +           L    P   +  ++   +   RRT+       E    L+P  LD 
Sbjct: 471 EGDEDEEDTLRTLAALGFEDPEQILTHISAFRNGSYRRTI--GERGRERLDELMPRLLDA 528


>gnl|CDD|143430 cd07112, ALDH_GABALDH-PuuC, Escherichia coli NADP+-dependent
           gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase
           PuuC-like.  NADP+-dependent,
           gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase
           (GABALDH) PuuC of  Escherichia coli which catalyzes the
           conversion of putrescine to 4-aminobutanoate and other
           similar sequences are present in this CD.
          Length = 462

 Score = 29.5 bits (67), Expect = 2.3
 Identities = 18/52 (34%), Positives = 25/52 (48%), Gaps = 7/52 (13%)

Query: 215 QDEHALKVTLATVGPI----AVSINASPHTFQLYASGI---YDDEACTSDYV 259
           +DE AL  TL    PI    AV + ++ +TF+ YA  I   Y + A T    
Sbjct: 65  RDELALLETLDMGKPISDALAVDVPSAANTFRWYAEAIDKVYGEVAPTGPDA 116


>gnl|CDD|147930 pfam06035, BTLCP, Bacterial transglutaminase-like cysteine
           proteinase BTLCP.  Members of this family are predicted
           to be bacterial transglutaminase-like cysteine
           proteinases. They contain a conserved Cys-His-Asp
           catalytic triad. Their structure is predicted to be
           similar to that of Salmonella typhimurium
           N-hydroxyarylamine O-acetyltransferase, in pfam00797,
           however they lack the sub-domain which is important for
           arylamine recognition.
          Length = 169

 Score = 28.4 bits (64), Expect = 3.5
 Identities = 15/35 (42%), Positives = 18/35 (51%), Gaps = 2/35 (5%)

Query: 261 HAMLLVGYTRNSWILKN--WWSHHWGDNGYMYLKR 293
           HA+L V   R  ++L N       W D GY YLKR
Sbjct: 113 HAVLTVRTDRGDFVLDNLTDEVLAWSDTGYRYLKR 147


>gnl|CDD|232865 TIGR00189, tesB, acyl-CoA thioesterase II.  Function: hydrolyzes a
           broad range of acyl-CoA thioesters. Physiological
           function is not known. Subunit: homotetramer [Fatty acid
           and phospholipid metabolism, Biosynthesis].
          Length = 271

 Score = 28.5 bits (64), Expect = 4.1
 Identities = 11/37 (29%), Positives = 16/37 (43%)

Query: 100 DHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIF 136
           DH  W  + F   DW    C +  A      ++G+IF
Sbjct: 218 DHSIWFHRPFRADDWLLYKCSSPSAGGSRGLVEGKIF 254


>gnl|CDD|236694 PRK10436, PRK10436, hypothetical protein; Provisional.
          Length = 462

 Score = 28.7 bits (65), Expect = 4.4
 Identities = 29/120 (24%), Positives = 43/120 (35%), Gaps = 36/120 (30%)

Query: 62  LSDLHPRHYIKEMTRLTHSRIRR-------------TLVRS--P----ESNESVLIPDHL 102
           LS LH     + + RL    I R              LVR   P    +++E + +P ++
Sbjct: 316 LSTLHTNSTSETLVRLQQMGIARWMLASALKLVIAQRLVRKLCPHCRQQASEPIHLPPNI 375

Query: 103 DWREKGFITPDWNQEDCGACYA-----------FSIASAIQGQIFKSTS--EIEELSIQQ 149
            W   G + P W    C  CY              I   +Q  I  + S  E+E  + QQ
Sbjct: 376 -WP--GPL-PHWQAVGCEHCYHGYYGRTALFEVLPITPVLQQAIASNASPEELETHARQQ 431


>gnl|CDD|227250 COG4913, COG4913, Uncharacterized protein conserved in bacteria
           [Function unknown].
          Length = 1104

 Score = 28.8 bits (64), Expect = 4.8
 Identities = 12/41 (29%), Positives = 21/41 (51%), Gaps = 2/41 (4%)

Query: 13  PQKKYKKDYRKKATD-SKKKLHWQSNHKKIHTHNQEAQQGL 52
           P  +++KD R+K  D S  +L   +N  K+ T  +  +  L
Sbjct: 590 PTTRWEKDDRRKLGDRSTYRLGS-TNDAKVETLRETVKAML 629


>gnl|CDD|217015 pfam02395, Peptidase_S6, Immunoglobulin A1 protease.  This family
           consists of immunoglobulin A1 protease proteins. The
           immunoglobulin A1 protease cleaves immunoglobulin IgA
           and is found in pathogenic bacteria such as Neisseria
           gonorrhoeae. Not all of the members of this family are
           IgA proteases; espP from E. coli O157:H7 cleaves human
           coagulation factor V and hbp is a hemoglobin protease
           from E. coli EB1.
          Length = 758

 Score = 28.2 bits (63), Expect = 6.7
 Identities = 19/51 (37%), Positives = 25/51 (49%), Gaps = 4/51 (7%)

Query: 155 IISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVD 205
           I +G  G     L+N++N  Q AGGL  E +Y  K   S   +K   I VD
Sbjct: 308 IFTGQNGTI--VLKNSIN--QGAGGLFFEGNYTVKVSASNQTWKGAGIDVD 354


>gnl|CDD|219055 pfam06484, Ten_N, Teneurin Intracellular Region.  This family is
           found in the intracellular N-terminal region of the
           Teneurin family of proteins. These proteins are
           'pair-rule' genes and are involved in tissue patterning,
           specifically probably neural patterning. The
           intracellular domain is cleaved in response to
           homophilic interaction of the extracellular domain, and
           translocates to the nucleus. Here it probably carries
           out some transcriptional regulatory activity. The
           length of this region and the conservation suggests that
           there may be two structural domains here (personal obs:C
           Yeats).
          Length = 370

 Score = 28.1 bits (62), Expect = 6.8
 Identities = 23/95 (24%), Positives = 42/95 (44%), Gaps = 8/95 (8%)

Query: 13  PQKKYKKDYRKKATDSKKKLHWQSNHKKI--HTHNQEAQQGLHGYTLRENHLSDLHPRH- 69
           PQK Y      KA D   ++ + +  K +  H  ++ ++QG   +TLRE    +  P H 
Sbjct: 26  PQKSYSSSETLKAYDHDSRMAYGNRVKDLVHHESDEFSRQGTD-FTLRELGFGEPSPPHR 84

Query: 70  --YIKEMTRLTHSRIRRTLVRSPESN-ESVLIPDH 101
             Y  +M  L H     +     ++  + ++ P+H
Sbjct: 85  SGYRSDMG-LPHRGYSVSTGSDADTETDGIMSPEH 118


>gnl|CDD|182046 PRK09719, PRK09719, hypothetical protein; Provisional.
          Length = 89

 Score = 26.2 bits (57), Expect = 7.9
 Identities = 14/55 (25%), Positives = 25/55 (45%), Gaps = 4/55 (7%)

Query: 41 IHTHNQEAQQGLHGYTLRENHLSDLH----PRHYIKEMTRLTHSRIRRTLVRSPE 91
          IH    + +  +H  T  E+  + +H     RH ++   R  + ++ R   RSPE
Sbjct: 19 IHRRVVDKRTSMHSRTASESTGARIHRPWCARHQVRPAWRCQYDKLHRVPFRSPE 73


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.319    0.134    0.425 

Gapped
Lambda     K      H
   0.267   0.0764    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,784,908
Number of extensions: 1460056
Number of successful extensions: 1182
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1139
Number of HSP's successfully gapped: 37
Length of query: 309
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 212
Effective length of database: 6,635,264
Effective search space: 1406675968
Effective search space used: 1406675968
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (26.4 bits)
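
The length and search-space figures in the statistics block above are consistent with the usual BLAST length-adjustment arithmetic. The sketch below reproduces them, assuming the adjustment of 97 is subtracted once from the query and once for every database sequence. The function name and parameter names are hypothetical and are not part of any BLAST tool or library.

```python
def effective_search_space(query_length, db_letters, db_sequences, length_adjustment):
    """Return (effective query length, effective database length, search space).

    Assumption: the length adjustment is subtracted once from the query
    and once per database sequence, as the footer figures suggest.
    """
    effective_query = query_length - length_adjustment
    effective_db = db_letters - db_sequences * length_adjustment
    return effective_query, effective_db, effective_query * effective_db


# Values copied from the statistics block of this report.
eff_query, eff_db, space = effective_search_space(
    query_length=309,
    db_letters=10_937_602,
    db_sequences=44_354,
    length_adjustment=97,
)

print(eff_query)  # 212
print(eff_db)     # 6635264
print(space)      # 1406675968
```

The three printed values match the "Effective length of query", "Effective length of database" and "Effective search space" lines reported above.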