RPS-BLAST 2.2.26 [Sep-21-2011]

Database: CDD.v3.10 
           44,354 sequences; 10,937,602 total letters

Searching..................................................done

Query= psy11694
         (655 letters)



>gnl|CDD|215726 pfam00112, Peptidase_C1, Papain family cysteine protease. 
          Length = 213

 Score =  232 bits (593), Expect = 2e-72
 Identities = 86/193 (44%), Positives = 114/193 (59%), Gaps = 17/193 (8%)

Query: 296 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSN 355
           LPE+FDWR +G ++ VK+QG+C  CWAFSAVG +E  + I+   L  LS QQLVDCD  N
Sbjct: 1   LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQLVDCDTGN 60

Query: 356 GGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGE 415
            GCNGG  D+A +YI  NGG+V++  YPY A +    C   +      K+K Y  +PY +
Sbjct: 61  NGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGT--CKFKKSNSKYAKIKGYGDVPYND 118

Query: 416 EEEMKKWVATRGPLSVGMNANGLF--YYSGGVID---LNQRL--------YGTS--IPYW 460
           EE ++  +A  GP+SV ++A       Y  GV      +  L        YGT   +PYW
Sbjct: 119 EEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTECSGELDHAVLIVGYGTENGVPYW 178

Query: 461 IVKNSWGSDWGEK 473
           IVKNSWG+DWGE 
Sbjct: 179 IVKNSWGTDWGEN 191



 Score =  168 bits (429), Expect = 1e-48
 Identities = 66/158 (41%), Positives = 92/158 (58%), Gaps = 12/158 (7%)

Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEE 556
           KL  L+ ++LVDCD  N GCNGG  D+A +YI  NGG+V++  YPY A +    C   + 
Sbjct: 44  KLVSLSEQQLVDCDTGNNGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGT--CKFKKS 101

Query: 557 EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF--YYSGGVIDLNQRLCNPK 614
                K+K Y  +PY +EE ++  +A  GP+SV ++A       Y  GV    +  C+ +
Sbjct: 102 NSKYAKIKGYGDVPYNDEEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTE--CSGE 159

Query: 615 AQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
             +HA++IVGYG E        PYWIVKNSWG+DWGE 
Sbjct: 160 L-DHAVLIVGYGTENGV-----PYWIVKNSWGTDWGEN 191



 Score = 85.7 bits (213), Expect = 6e-19
 Identities = 27/52 (51%), Positives = 36/52 (69%)

Query: 148 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
           LPE+FDWR +G ++ VK+QG+C  CWAFSAVG +E  + I+   L  LS Q 
Sbjct: 1   LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQ 52


>gnl|CDD|239068 cd02248, Peptidase_C1A, Peptidase C1A subfamily (MEROPS database
           nomenclature); composed of cysteine peptidases (CPs)
           similar to papain, including the mammalian CPs
           (cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain
           is an endopeptidase with specific substrate preferences,
           primarily for bulky hydrophobic or aromatic residues at
           the S2 subsite, a hydrophobic pocket in papain that
           accommodates the P2 sidechain of the substrate (the
           second residue away from the scissile bond). Most
           members of the papain subfamily are endopeptidases. Some
           exceptions to this rule can be explained by specific
           details of the catalytic domains like the occluding loop
           in cathepsin B which confers an additional
           carboxydipeptidyl activity and the mini-chain of
           cathepsin H resulting in an N-terminal exopeptidase
           activity. Papain-like CPs have different functions in
           various organisms. Plant CPs are used to mobilize
           storage proteins in seeds. Parasitic CPs act
           extracellularly to help invade tissues and cells, to
           hatch or to evade the host immune system. Mammalian CPs
           are primarily lysosomal enzymes with the exception of
           cathepsin W, which is retained in the endoplasmic
           reticulum. They are responsible for protein degradation
           in the lysosome. Papain-like CPs are synthesized as
           inactive proenzymes with N-terminal propeptide regions,
           which are removed upon activation. In addition to its
           inhibitory role, the propeptide is required for proper
           folding of the newly synthesized enzyme and its
           stabilization in denaturing pH conditions. Residues
           within the propeptide region also play a role in the
           transport of the proenzyme to lysosomes or acidified
           vesicles. Also included in this subfamily are proteins
           classified as non-peptidase homologs, which lack
           peptidase activity or have missing active site residues.
          Length = 210

 Score =  223 bits (570), Expect = 5e-69
 Identities = 83/193 (43%), Positives = 111/193 (57%), Gaps = 20/193 (10%)

Query: 297 PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCD-MSN 355
           PE+ DWR +G ++ VK+QG C  CWAFS VG +E  +AI+   L  LS QQLVDC    N
Sbjct: 1   PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCSTSGN 60

Query: 356 GGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGE 415
            GCNGG  D+A +Y + NGG+ S+  YPY   +    C     +    K+  YS +P G+
Sbjct: 61  NGCNGGNPDNAFEY-VKNGGLASESDYPYTGKDGT--CKYNSSKV-GAKITGYSNVPPGD 116

Query: 416 EEEMKKWVATRGPLSVGMNANGLF-YYSGGVID----LNQRL--------YGTS--IPYW 460
           EE +K  +A  GP+SV ++A+  F +Y GG+       N  L        YGT   + YW
Sbjct: 117 EEALKAALANYGPVSVAIDASSSFQFYKGGIYSGPCCSNTNLNHAVLLVGYGTENGVDYW 176

Query: 461 IVKNSWGSDWGEK 473
           IVKNSWG+ WGEK
Sbjct: 177 IVKNSWGTSWGEK 189



 Score =  165 bits (419), Expect = 3e-47
 Identities = 65/158 (41%), Positives = 91/158 (57%), Gaps = 13/158 (8%)

Query: 497 KLSRLATEKLVDCD-MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGE 555
           KL  L+ ++LVDC    N GCNGG  D+A +Y+  NGG+ S+  YPY   +    C    
Sbjct: 43  KLVSLSEQQLVDCSTSGNNGCNGGNPDNAFEYV-KNGGLASESDYPYTGKDGT--CKYNS 99

Query: 556 EEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF-YYSGGVIDLNQRLCNPK 614
            +    K+  YS +P G+EE +K  +A  GP+SV ++A+  F +Y GG+       C+  
Sbjct: 100 SKV-GAKITGYSNVPPGDEEALKAALANYGPVSVAIDASSSFQFYKGGIYSGPC--CSNT 156

Query: 615 AQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
             NHA+++VGYG E   D     YWIVKNSWG+ WGEK
Sbjct: 157 NLNHAVLLVGYGTENGVD-----YWIVKNSWGTSWGEK 189



 Score = 86.9 bits (216), Expect = 2e-19
 Identities = 25/50 (50%), Positives = 33/50 (66%)

Query: 149 PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQ 198
           PE+ DWR +G ++ VK+QG C  CWAFS VG +E  +AI+   L  LS Q
Sbjct: 1   PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQ 50


>gnl|CDD|214761 smart00645, Pept_C1, Papain family cysteine protease. 
          Length = 175

 Score =  181 bits (462), Expect = 1e-53
 Identities = 70/195 (35%), Positives = 92/195 (47%), Gaps = 60/195 (30%)

Query: 296 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCD-MS 354
           LPE+FDWR +G ++ VK+QG+C  CWAFSA G +E  + I+   L  LS QQLVDC    
Sbjct: 1   LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCSGGG 60

Query: 355 NGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYG 414
           N GCNGG  D+A +YI  NGG+ ++  YPY  S                           
Sbjct: 61  NCGCNGGLPDNAFEYIKKNGGLETESCYPYTGS--------------------------- 93

Query: 415 EEEEMKKWVATRGPLSVGMNANGLFYYSGGVID----LNQRL--------YGTSI----P 458
                           V ++A+   +Y  G+ D     +  L        YGT +     
Sbjct: 94  ----------------VAIDASDFQFYKSGIYDHPGCGSGTLDHAVLIVGYGTEVENGKD 137

Query: 459 YWIVKNSWGSDWGEK 473
           YWIVKNSWG+DWGE 
Sbjct: 138 YWIVKNSWGTDWGEN 152



 Score =  118 bits (297), Expect = 1e-30
 Identities = 51/157 (32%), Positives = 69/157 (43%), Gaps = 49/157 (31%)

Query: 497 KLSRLATEKLVDCD-MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGE 555
           KL  L+ ++LVDC    N GCNGG  D+A +YI  NGG+ ++  YPY  S          
Sbjct: 44  KLVSLSEQQLVDCSGGGNCGCNGGLPDNAFEYIKKNGGLETESCYPYTGS---------- 93

Query: 556 EEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCNPKA 615
                                            V ++A+   +Y  G+ D     C    
Sbjct: 94  ---------------------------------VAIDASDFQFYKSGIYD--HPGCGSGT 118

Query: 616 QNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
            +HA++IVGYG E +       YWIVKNSWG+DWGE 
Sbjct: 119 LDHAVLIVGYGTEVENGK---DYWIVKNSWGTDWGEN 152



 Score = 89.9 bits (224), Expect = 7e-21
 Identities = 26/52 (50%), Positives = 35/52 (67%)

Query: 148 LPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
           LPE+FDWR +G ++ VK+QG+C  CWAFSA G +E  + I+   L  LS Q 
Sbjct: 1   LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQ 52


>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional.
          Length = 348

 Score =  166 bits (421), Expect = 6e-46
 Identities = 97/299 (32%), Positives = 149/299 (49%), Gaps = 47/299 (15%)

Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL--QQLTG 255
           + + + Y ++ +  +R  NF  N+E   ++Q+ +   A FG+ KFFDLSE++   + L G
Sbjct: 43  RTYQRAYGTLTEEQQRLANFERNLELMREHQARNP-HARFGITKFFDLSEAEFAARYLNG 101

Query: 256 LNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRH---GDDL---PEAFDWRAEGVIS 309
                          A F++        A Q     +     DL   P+A DWR +G ++
Sbjct: 102 --------------AAYFAA--------AKQHAGQHYRKARADLSAVPDAVDWREKGAVT 139

Query: 310 KVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRMDDALQY 369
            VK QG C  CWAFSAVG +E+  A+ G+ L  LS QQLV CD  + GC GG M  A ++
Sbjct: 140 PVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHVDNGCGGGLMLQAFEW 199

Query: 370 IIDN--GGVVSDQAYPYKASESERG-CLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATR 426
           ++ N  G V ++++YPY +   +   C    E     ++  Y  +    E  M  W+A  
Sbjct: 200 VLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVSME-SSERVMAAWLAKN 258

Query: 427 GPLSVGMNANGLFYYSGGVI---DLNQRLYGT---------SIPYWIVKNSWGSDWGEK 473
           GP+S+ ++A+    Y  GV+      Q  +G           +PYW++KNSWG DWGEK
Sbjct: 259 GPISIAVDASSFMSYHSGVLTSCIGEQLNHGVLLVGYNMTGEVPYWVIKNSWGEDWGEK 317



 Score =  105 bits (263), Expect = 1e-24
 Identities = 54/159 (33%), Positives = 85/159 (53%), Gaps = 13/159 (8%)

Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDN--GGVVSDQAYPYKASESERG-CLV 553
           KL RL+ ++LV CD  + GC GG M  A ++++ N  G V ++++YPY +   +   C  
Sbjct: 169 KLVRLSEQQLVSCDHVDNGCGGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSN 228

Query: 554 GEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCNP 613
             E     ++  Y  +    E  M  W+A  GP+S+ ++A+    Y  GV+      C  
Sbjct: 229 SSELAPGARIDGYVSME-SSERVMAAWLAKNGPISIAVDASSFMSYHSGVLTS----CIG 283

Query: 614 KAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
           +  NH +++VGY       G  +PYW++KNSWG DWGEK
Sbjct: 284 EQLNHGVLLVGY----NMTG-EVPYWVIKNSWGEDWGEK 317



 Score = 79.0 bits (194), Expect = 9e-16
 Identities = 54/172 (31%), Positives = 82/172 (47%), Gaps = 32/172 (18%)

Query: 37  YLNSPVTR-FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFF 95
           Y+ +P    F  F R + + Y ++ +  +R  NF  N+E   ++Q  +   A F + KFF
Sbjct: 29  YVGTPAAALFEEFKRTYQRAYGTLTEEQQRLANFERNLELMREHQARNP-HARFGITKFF 87

Query: 96  DLSDSDL--QQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRH---GDDL-- 148
           DLS+++   + L G               A F++        A Q     +     DL  
Sbjct: 88  DLSEAEFAARYLNG--------------AAYFAA--------AKQHAGQHYRKARADLSA 125

Query: 149 -PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
            P+A DWR +G ++ VK QG C  CWAFSAVG +E+  A+ G+ L  LS Q 
Sbjct: 126 VPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQ 177


>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional.
          Length = 489

 Score =  156 bits (396), Expect = 4e-41
 Identities = 96/308 (31%), Positives = 153/308 (49%), Gaps = 38/308 (12%)

Query: 191 NLTELSVQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL 250
           N   L ++ H K Y + +++ +R+ +FV N+ K   + ++++     G+N+F DLS  + 
Sbjct: 167 NSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVLYKKGMNRFGDLSFEEF 226

Query: 251 QQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHGDDL--PEAFDWRAEGVI 308
           ++   L L  + +      ++P   N  D           +  D       +DWR    +
Sbjct: 227 KK-KYLTL-KSFDFKSNGKKSPRVINYDDV------IKKYKPKDATFDHAKYDWRLHNGV 278

Query: 309 SKVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRMDDALQ 368
           + VK+Q  C  CWAFS VGVVE+ +AI+ N L  LS Q+LVDC   N GC GG + +A +
Sbjct: 279 TPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCSFKNNGCYGGLIPNAFE 338

Query: 369 YIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGP 428
            +I+ GG+ S+  YPY  S++   C +   +  K K+K Y  IP   E++ K+ +   GP
Sbjct: 339 DMIELGGLCSEDDYPY-VSDTPELCNIDRCKE-KYKIKSYVSIP---EDKFKEAIRFLGP 393

Query: 429 LSVGMNANGLF-YYSGGVID------LNQRL----YGTSIP------------YWIVKNS 465
           +SV +  +  F +Y GG+ D       N  +    YG                Y+I+KNS
Sbjct: 394 ISVSIAVSDDFAFYKGGIFDGECGEEPNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNS 453

Query: 466 WGSDWGEK 473
           WG  WGEK
Sbjct: 454 WGESWGEK 461



 Score =  107 bits (269), Expect = 1e-24
 Identities = 60/162 (37%), Positives = 92/162 (56%), Gaps = 15/162 (9%)

Query: 497 KLSRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEE 556
           +L  L+ ++LVDC   N GC GG + +A + +I+ GG+ S+  YPY  S++   C +   
Sbjct: 309 ELVSLSEQELVDCSFKNNGCYGGLIPNAFEDMIELGGLCSEDDYPY-VSDTPELCNIDRC 367

Query: 557 EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLF-YYSGGVIDLNQRLCNPKA 615
           +  K K+K Y  IP   E++ K+ +   GP+SV +  +  F +Y GG+ D     C  + 
Sbjct: 368 KE-KYKIKSYVSIP---EDKFKEAIRFLGPISVSIAVSDDFAFYKGGIFDGE---CG-EE 419

Query: 616 QNHALIIVGYGEEEKKDGTSIP-----YWIVKNSWGSDWGEK 652
            NHA+I+VGYG EE  +  +       Y+I+KNSWG  WGEK
Sbjct: 420 PNHAVILVGYGMEEIYNSDTKKMEKRYYYIIKNSWGESWGEK 461



 Score = 80.6 bits (199), Expect = 6e-16
 Identities = 47/175 (26%), Positives = 83/175 (47%), Gaps = 12/175 (6%)

Query: 28  ESNIFQTRGYLNS--PVTRFLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSG 85
             N   ++  + +   V  F  F+++H K Y + +++ +R+ +FV N+ K   +  +++ 
Sbjct: 150 LINFADSKFLMTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENV 209

Query: 86  TAVFEVNKFFDLSDSDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHG 145
                +N+F DLS  + ++   L L  + +      ++P   N  D           +  
Sbjct: 210 LYKKGMNRFGDLSFEEFKK-KYLTL-KSFDFKSNGKKSPRVINYDDV------IKKYKPK 261

Query: 146 DDL--PEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQ 198
           D       +DWR    ++ VK+Q  C  CWAFS VGVVE+ +AI+ N L  LS Q
Sbjct: 262 DATFDHAKYDWRLHNGVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQ 316


>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional.
          Length = 448

 Score =  144 bits (365), Expect = 2e-37
 Identities = 87/304 (28%), Positives = 140/304 (46%), Gaps = 49/304 (16%)

Query: 200 HDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDLQQLTGLNLD 259
           +++ +++  + L R   F  N  + + ++ ++       +NKF DL+E + ++L      
Sbjct: 133 YNRKHATHAERLNRFLTFRNNYLEVKSHKGDE--PYSKEINKFSDLTEEEFRKL------ 184

Query: 260 STLEDIQPSLQAPFSSNQTDT----EMRAFQFNSLRHGDDL-----------PEAFDWRA 304
                I+   ++  +S+  D             +L+   +             E  DWR 
Sbjct: 185 --FPVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRR 242

Query: 305 EGVISKVKEQGK-CACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMSNGGCNGGRM 363
              ++KVK+QG  C  CWAFS+VG VE+++ I  +   +LS Q+LV+CD  + GC+GG  
Sbjct: 243 ADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSVDLSEQELVNCDTKSQGCSGGYP 302

Query: 364 DDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWV 423
           D AL+Y + N G+ S    PY A + +  C+V      KV +  Y  +  G++   K  V
Sbjct: 303 DTALEY-VKNKGLSSSSDVPYLAKDGK--CVVS--STKKVYIDSYL-VAKGKDVLNKSLV 356

Query: 424 ATRGPLSVGMNA-NGLFYYSGGVID------LNQR--LYG------TSIPYWIVKNSWGS 468
               P  V +     L  Y  GV +      LN    L G      T   YWI+KNSWG+
Sbjct: 357 I--SPTVVYIAVSRELLKYKSGVYNGECGKSLNHAVLLVGEGYDEKTKKRYWIIKNSWGT 414

Query: 469 DWGE 472
           DWGE
Sbjct: 415 DWGE 418



 Score = 99.4 bits (248), Expect = 4e-22
 Identities = 63/186 (33%), Positives = 89/186 (47%), Gaps = 25/186 (13%)

Query: 467 GSDWGEKVEDKVGSSGNRTRDLELTGVLPSKLSRLATEKLVDCDMSNGGCNGGRMDDALQ 526
           GS W       V S     RD  +          L+ ++LV+CD  + GC+GG  D AL+
Sbjct: 257 GSCWAFSSVGSVESLYKIYRDKSV---------DLSEQELVNCDTKSQGCSGGYPDTALE 307

Query: 527 YIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGP 586
           Y + N G+ S    PY A + +  C+V      KV +  Y  +  G++   K  V    P
Sbjct: 308 Y-VKNKGLSSSSDVPYLAKDGK--CVVS--STKKVYIDSYL-VAKGKDVLNKSLVI--SP 359

Query: 587 LSVGMNA-NGLFYYSGGVIDLNQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSW 645
             V +     L  Y  GV +     C  K+ NHA+++VG G +EK   T   YWI+KNSW
Sbjct: 360 TVVYIAVSRELLKYKSGVYNGE---CG-KSLNHAVLLVGEGYDEK---TKKRYWIIKNSW 412

Query: 646 GSDWGE 651
           G+DWGE
Sbjct: 413 GTDWGE 418



 Score = 72.8 bits (179), Expect = 2e-13
 Identities = 40/171 (23%), Positives = 74/171 (43%), Gaps = 26/171 (15%)

Query: 45  FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSDSDLQQ 104
           F  F + +++ +++  + L R   F  N  + +    +       E+NKF DL++ + ++
Sbjct: 126 FEEFNKKYNRKHATHAERLNRFLTFRNNYLEVK--SHKGDEPYSKEINKFSDLTEEEFRK 183

Query: 105 LTGLNLDSTLEDIQPSLQAPFSSNQTDT----EMRAFQFNSLRHGDDL-----------P 149
           L           I+   ++  +S+  D             +L+   +             
Sbjct: 184 L--------FPVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITG 235

Query: 150 EAFDWRAEGVISKVKEQGK-CACCWAFSAVGVVEAMHAIQGNNLTELSVQH 199
           E  DWR    ++KVK+QG  C  CWAFS+VG VE+++ I  +   +LS Q 
Sbjct: 236 EGLDWRRADAVTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSVDLSEQE 286


>gnl|CDD|239111 cd02620, Peptidase_C1A_CathepsinB, Cathepsin B group; composed of
           cathepsin B and similar proteins, including
           tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin
           B is a lysosomal papain-like cysteine peptidase which is
           expressed in all tissues and functions primarily as an
           exopeptidase through its carboxydipeptidyl activity.
           Together with other cathepsins, it is involved in the
           degradation of proteins, proenzyme activation, Ag
           processing, metabolism and apoptosis. Cathepsin B has
           been implicated in a number of human diseases such as
           cancer, rheumatoid arthritis, osteoporosis and
           Alzheimer's disease. The unique carboxydipeptidyl
           activity of cathepsin B is attributed to the presence of
           an occluding loop in its active site which favors the
           binding of the C-termini of substrate proteins. Some
           members of this group do not possess the occluding loop.
           TIN-Ag is an extracellular matrix basement protein which
           was originally identified as a target Ag involved in
           anti-tubular basement membrane antibody-mediated
           interstitial nephritis. It plays a role in renal
           tubulogenesis and is defective in hereditary
           tubulointerstitial disorders. TIN-Ag is exclusively
           expressed in kidney tissues.
          Length = 236

 Score =  126 bits (319), Expect = 4e-33
 Identities = 68/218 (31%), Positives = 91/218 (41%), Gaps = 44/218 (20%)

Query: 297 PEAFDWRAE----GVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSL--TELSVQQLVD 350
           PE+FD R +      I ++++QG C  CWAFSAV        IQ N      LS Q L+ 
Sbjct: 1   PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQDLLS 60

Query: 351 CDMSNG-GCNGGRMDDALQYIIDNGGVVSDQAYPYKASES-------------------- 389
           C    G GCNGG  D A +Y+   G VV+    PY                         
Sbjct: 61  CCSGCGDGCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCCGTPYCTPKC 119

Query: 390 ERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNAN-GLFYYSGGV--- 445
           + GC    EE  K K K    +P  +E ++ K + T GP+           YY  GV   
Sbjct: 120 QDGCEKTYEE-DKHKGKSAYSVP-SDETDIMKEIMTNGPVQAAFTVYEDFLYYKSGVYQH 177

Query: 446 IDLNQ------RL--YGTS--IPYWIVKNSWGSDWGEK 473
               Q      ++  +G    +PYW+  NSWG+DWGE 
Sbjct: 178 TSGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGEN 215



 Score =  101 bits (255), Expect = 2e-24
 Identities = 52/175 (29%), Positives = 72/175 (41%), Gaps = 33/175 (18%)

Query: 500 RLATEKLVDCDMSNG-GCNGGRMDDALQYIIDNGGVVSDQAYPYKASES----------- 547
            L+ + L+ C    G GCNGG  D A +Y+   G VV+    PY                
Sbjct: 52  LLSAQDLLSCCSGCGDGCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCC 110

Query: 548 ---------ERGCLVGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNAN-GLF 597
                    + GC    EE  K K K    +P  +E ++ K + T GP+           
Sbjct: 111 GTPYCTPKCQDGCEKTYEE-DKHKGKSAYSVP-SDETDIMKEIMTNGPVQAAFTVYEDFL 168

Query: 598 YYSGGVIDLNQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
           YY  GV    Q     +   HA+ I+G+G E       +PYW+  NSWG+DWGE 
Sbjct: 169 YYKSGVY---QHTSGKQLGGHAVKIIGWGVENG-----VPYWLAANSWGTDWGEN 215



 Score = 48.0 bits (115), Expect = 4e-06
 Identities = 22/57 (38%), Positives = 28/57 (49%), Gaps = 6/57 (10%)

Query: 149 PEAFDWRAE----GVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNL--TELSVQH 199
           PE+FD R +      I ++++QG C  CWAFSAV        IQ N      LS Q 
Sbjct: 1   PESFDAREKWPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVLLSAQD 57


>gnl|CDD|239112 cd02621, Peptidase_C1A_CathepsinC, Cathepsin C; also known as
           Dipeptidyl Peptidase I (DPPI), an atypical papain-like
           cysteine peptidase with chloride dependency and
           dipeptidyl aminopeptidase activity, resulting from its
           tetrameric structure which limits substrate access. Each
           subunit of the tetramer is composed of three peptides:
           the heavy and light chains, which together adopt the
           papain fold and form the catalytic domain; and the
           residual propeptide region, which forms a beta barrel
           and points towards the substrate's N-terminus. The
           subunit composition is the result of the unique
           characteristic of procathepsin C maturation involving
           the cleavage of the catalytic domain and the
           non-autocatalytic excision of an activation peptide
           within its propeptide region. By removing N-terminal
           dipeptide extensions, cathepsin C activates granule
           serine peptidases (granzymes) involved in cell-mediated
           apoptosis, inflammation and tissue remodelling.
           Loss-of-function mutations in cathepsin C are associated
           with Papillon-Lefevre and Haim-Munk syndromes, rare
           diseases characterized by hyperkeratosis and early-onset
           periodontitis. Cathepsin C is widely expressed in many
           tissues with high levels in lung, kidney and placenta.
           It is also highly expressed in cytotoxic lymphocytes and
           mature myeloid cells.
          Length = 243

 Score =  107 bits (269), Expect = 3e-26
 Identities = 64/221 (28%), Positives = 94/221 (42%), Gaps = 45/221 (20%)

Query: 296 LPEAFDWR----AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNSL------TELSV 345
           LP++FDW         +S V+ QG C  C+AF++V  +EA   I  N          LS 
Sbjct: 1   LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSP 60

Query: 346 QQLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGFKVKV 405
           Q ++ C   + GC+GG      ++  D  G+V++  +PY A + +R C     E  +   
Sbjct: 61  QHVLSCSQYSQGCDGGFPFLVGKFAEDF-GIVTEDYFPYTA-DDDRPCKASPSECRRYYF 118

Query: 406 KEYSRIP--YG--EEEEMKKWVATRGPLSVGMNAN--GLFYYSG--GVIDLNQRL----- 452
            +Y+ +   YG   E+EMK  +   GP+ V         FY  G     D ++       
Sbjct: 119 SDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVYSDFDFYKEGVYHHTDNDEVSDGDND 178

Query: 453 ----------------YGT----SIPYWIVKNSWGSDWGEK 473
                           +G        YWIVKNSWGS WGEK
Sbjct: 179 NFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEK 219



 Score =  104 bits (260), Expect = 6e-25
 Identities = 52/168 (30%), Positives = 80/168 (47%), Gaps = 20/168 (11%)

Query: 500 RLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASESERGCLVGEEEGF 559
            L+ + ++ C   + GC+GG      ++  D G +V++  +PY A + +R C     E  
Sbjct: 57  ILSPQHVLSCSQYSQGCDGGFPFLVGKFAEDFG-IVTEDYFPYTA-DDDRPCKASPSECR 114

Query: 560 KVKVKEYSRIP--YG--EEEEMKKWVATRGPLSVGMNAN--GLFYYSG--GVIDLNQRLC 611
           +    +Y+ +   YG   E+EMK  +   GP+ V         FY  G     D ++   
Sbjct: 115 RYYFSDYNYVGGCYGCTNEDEMKWEIYRNGPIVVAFEVYSDFDFYKEGVYHHTDNDEVSD 174

Query: 612 NPKAQ-------NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
                       NHA+++VG+GE+E K      YWIVKNSWGS WGEK
Sbjct: 175 GDNDNFNPFELTNHAVLLVGWGEDEIKGE---KYWIVKNSWGSSWGEK 219



 Score = 42.0 bits (99), Expect = 4e-04
 Identities = 22/62 (35%), Positives = 30/62 (48%), Gaps = 10/62 (16%)

Query: 148 LPEAFDWR----AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNL------TELSV 197
           LP++FDW         +S V+ QG C  C+AF++V  +EA   I  N          LS 
Sbjct: 1   LPKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSP 60

Query: 198 QH 199
           QH
Sbjct: 61  QH 62


>gnl|CDD|239110 cd02619, Peptidase_C1, C1 Peptidase family (MEROPS database
           nomenclature), also referred to as the papain family;
           composed of two subfamilies of cysteine peptidases
           (CPs), C1A (papain) and C1B (bleomycin hydrolase).
           Papain-like enzymes are mostly endopeptidases with some
           exceptions like cathepsins B, C, H and X, which are
           exopeptidases. Papain-like CPs have different functions
           in various organisms. Plant CPs are used to mobilize
           storage proteins in seeds while mammalian CPs are
           primarily lysosomal enzymes responsible for protein
           degradation in the lysosome. Papain-like CPs are
           synthesized as inactive proenzymes with N-terminal
           propeptide regions, which are removed upon activation.
           Bleomycin hydrolase (BH) is a CP that detoxifies
           bleomycin by hydrolysis of an amide group. It acts as a
           carboxypeptidase on its C-terminus to convert itself
           into an aminopeptidase and peptide ligase. BH is found
           in all tissues in mammals as well as in many other
           eukaryotes. It forms a hexameric ring barrel structure
           with the active sites imbedded in the central channel.
           Some members of the C1 family are proteins classified as
           non-peptidase homologs which lack peptidase activity or
           have missing active site residues.
          Length = 223

 Score =  103 bits (258), Expect = 5e-25
 Identities = 57/204 (27%), Positives = 87/204 (42%), Gaps = 31/204 (15%)

Query: 300 FDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGN--SLTELSVQQLVDCD----- 352
            D R    ++ VK QG    CWAF++   +E+ + I+G      +LS Q L  C      
Sbjct: 2   VDLR-PLRLTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECL 60

Query: 353 MSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKA-SESERGCLVGEEEGFKVKVKEYSRI 411
             NG C+GG    AL  ++   G+  ++ YPY A S+ E           KVK+K+Y R+
Sbjct: 61  GINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRV 120

Query: 412 PYGEEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLYGT---------------- 455
                E++K+ +A  GP+  G +    F      I   + +Y                  
Sbjct: 121 LKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIVGY 180

Query: 456 ------SIPYWIVKNSWGSDWGEK 473
                     +IVKNSWG+DWG+ 
Sbjct: 181 DDNYVEGKGAFIVKNSWGTDWGDN 204



 Score = 91.4 bits (227), Expect = 6e-21
 Identities = 46/149 (30%), Positives = 72/149 (48%), Gaps = 8/149 (5%)

Query: 509 CDMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKA-SESERGCLVGEEEGFKVKVKEYS 567
           C   NG C+GG    AL  ++   G+  ++ YPY A S+ E           KVK+K+Y 
Sbjct: 59  CLGINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYR 118

Query: 568 RIPYGEEEEMKKWVATRGPLSVGMNA-NGLFYYSGGVIDLNQRLCNPKAQ---NHALIIV 623
           R+     E++K+ +A  GP+  G +  +G      G+I         +      HA++IV
Sbjct: 119 RVLKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIV 178

Query: 624 GYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
           GY +   +       +IVKNSWG+DWG+ 
Sbjct: 179 GYDDNYVEGK---GAFIVKNSWGTDWGDN 204



 Score = 41.7 bits (98), Expect = 5e-04
 Identities = 16/51 (31%), Positives = 26/51 (50%), Gaps = 3/51 (5%)

Query: 152 FDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNN--LTELSVQHH 200
            D R    ++ VK QG    CWAF++   +E+ + I+G      +LS Q+ 
Sbjct: 2   VDLR-PLRLTPVKNQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYL 51


>gnl|CDD|239149 cd02698, Peptidase_C1A_CathepsinX, Cathepsin X; the only
           papain-like lysosomal cysteine peptidase exhibiting
           carboxymonopeptidase activity. It can also act as a
           carboxydipeptidase, like cathepsin B, but has been shown
           to preferentially cleave substrates through a
           monopeptidyl carboxypeptidase pathway. The propeptide
           region of cathepsin X, the shortest among papain-like
           peptidases, is covalently attached to the active site
           cysteine in the inactive form of the enzyme. Little is
           known about the biological function of cathepsin X. Some
           studies point to a role in early tumorigenesis. A more
           recent study indicates that cathepsin X expression is
           restricted to immune cells suggesting a role in
           phagocytosis and the regulation of the immune response.
          Length = 239

 Score = 99.0 bits (247), Expect = 3e-23
 Identities = 58/215 (26%), Positives = 88/215 (40%), Gaps = 42/215 (19%)

Query: 296 LPEAFDWR-AEGV--ISKVKEQ---GKCACCWAFSAVGVVEAMHAIQGN---SLTELSVQ 346
           LP+++DWR   GV  +S  + Q     C  CWA  +   +     I          LSVQ
Sbjct: 1   LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60

Query: 347 QLVDCDMSNGG-CNGGRMDDALQYIIDNGGVVSDQAYPYKASESE-------RGCLVGEE 398
            ++DC  + GG C+GG      +Y   +G +  +   PY+A + E         C    E
Sbjct: 61  VVIDC--AGGGSCHGGDPGGVYEYAHKHG-IPDETCNPYQAKDGECNPFNRCGTCNPFGE 117

Query: 399 -----EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDLNQRL 452
                      V +Y  +     ++M   +  RGP+S G+ A      Y+GGV     + 
Sbjct: 118 CFAIKNYTLYFVSDYGSV--SGRDKMMAEIYARGPISCGIMATEALENYTGGVYKEYVQD 175

Query: 453 -----------YGT---SIPYWIVKNSWGSDWGEK 473
                      +G     + YWIV+NSWG  WGE+
Sbjct: 176 PLINHIISVAGWGVDENGVEYWIVRNSWGEPWGER 210



 Score = 84.8 bits (210), Expect = 2e-18
 Identities = 45/166 (27%), Positives = 73/166 (43%), Gaps = 26/166 (15%)

Query: 501 LATEKLVDCDMSNGG-CNGGRMDDALQYIIDNGGVVSDQAYPYKASESE-------RGCL 552
           L+ + ++DC  + GG C+GG      +Y   +G +  +   PY+A + E         C 
Sbjct: 57  LSVQVVIDC--AGGGSCHGGDPGGVYEYAHKHG-IPDETCNPYQAKDGECNPFNRCGTCN 113

Query: 553 VGEE-----EGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDL 606
              E           V +Y  +     ++M   +  RGP+S G+ A      Y+GGV   
Sbjct: 114 PFGECFAIKNYTLYFVSDYGSV--SGRDKMMAEIYARGPISCGIMATEALENYTGGVYKE 171

Query: 607 NQRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
             +       NH + + G+G +E      + YWIV+NSWG  WGE+
Sbjct: 172 YVQDPLI---NHIISVAGWGVDENG----VEYWIVRNSWGEPWGER 210



 Score = 35.9 bits (83), Expect = 0.039
 Identities = 18/61 (29%), Positives = 25/61 (40%), Gaps = 9/61 (14%)

Query: 148 LPEAFDWR-AEGV--ISKVKEQ---GKCACCWAFSAVGVVEAMHAIQGNN---LTELSVQ 198
           LP+++DWR   GV  +S  + Q     C  CWA  +   +     I          LSVQ
Sbjct: 1   LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60

Query: 199 H 199
            
Sbjct: 61  V 61


>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain
          (I29).  This domain is found at the N-terminus of some
          C1 peptidases such as Cathepsin L where it acts as a
          propeptide. There are also a number of proteins that
          are composed solely of multiple copies of this domain
          such as the peptidase inhibitor salarin. This family is
          classified as I29 by MEROPS. Peptide proteinase
          inhibitors can be found as single domain proteins or as
          single or multiple domains within proteins; these are
          referred to as either simple or compound inhibitors,
          respectively. In many cases they are synthesised as
          part of a larger precursor protein, either as a
          prepropeptide or as an N-terminal domain associated
          with an inactive peptidase or zymogen. This domain
          prevents access of the substrate to the active site.
          Removal of the N-terminal inhibitor domain either by
          interaction with a second peptidase or by autocatalytic
          cleavage activates the zymogen. Other inhibitors
          interact directly with proteinases using a simple
          noncovalent lock and key mechanism; while yet others
          use a conformational change-based trapping mechanism
          that depends on their structural and thermodynamic
          properties.
          Length = 57

 Score = 59.9 bits (146), Expect = 1e-11
 Identities = 18/55 (32%), Positives = 30/55 (54%)

Query: 45 FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSD 99
          F  + + H K YSS E+  RR   F  N++K E++ ++   +    VN+F DL+ 
Sbjct: 1  FEQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYEHSYKLGVNQFSDLTP 55



 Score = 54.2 bits (131), Expect = 1e-09
 Identities = 18/50 (36%), Positives = 28/50 (56%)

Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSE 247
           + H K YSS E+  RR   F  N++K E++  +   +   GVN+F DL+ 
Sbjct: 6   KKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYEHSYKLGVNQFSDLTP 55


>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain
           (I29).  This domain is found at the N-terminus of some
           C1 peptidases such as Cathepsin L where it acts as a
           propeptide. There are also a number of proteins that are
           composed solely of multiple copies of this domain such
           as the peptidase inhibitor salarin. This family is
           classified as I29 by MEROPS.
          Length = 58

 Score = 56.1 bits (136), Expect = 2e-10
 Identities = 15/58 (25%), Positives = 32/58 (55%)

Query: 45  FLNFMRDHDKVYSSVEDLLRRHENFVTNVEKAEDYQREDSGTAVFEVNKFFDLSDSDL 102
           F ++ + + K Y S E+ L R + F  N+   E++ ++ + +    +N+F DL+D + 
Sbjct: 1   FEDWKKKYGKSYYSEEEELYRFQIFKENLRFIEEHNKKGNVSYTLGLNQFADLTDEEF 58



 Score = 49.2 bits (118), Expect = 7e-08
 Identities = 14/54 (25%), Positives = 29/54 (53%)

Query: 197 VQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDL 250
            + + K Y S E+ L R + F  N+   E++  + + +   G+N+F DL++ + 
Sbjct: 5   KKKYGKSYYSEEEELYRFQIFKENLRFIEEHNKKGNVSYTLGLNQFADLTDEEF 58


>gnl|CDD|240244 PTZ00049, PTZ00049, cathepsin C-like protein; Provisional.
          Length = 693

 Score = 58.4 bits (141), Expect = 9e-09
 Identities = 33/96 (34%), Positives = 46/96 (47%), Gaps = 20/96 (20%)

Query: 574 EEEMKKWVATRGPLSVGMNANGLFY-YSGGVIDLNQ----RLCN---PKAQ--------- 616
           E+ M   +   GP+     A+  FY Y+ GV  +      R C    PK           
Sbjct: 557 EKIMMNEIYRNGPIVASFEASPDFYDYADGVYYVEDFPHARRCTVDLPKHNGVYNITGWE 616

Query: 617 --NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWG 650
             NHA+++VG+GEEE  +G    YWI +NSWG +WG
Sbjct: 617 KVNHAIVLVGWGEEEI-NGKLYKYWIGRNSWGKNWG 651



 Score = 36.1 bits (83), Expect = 0.060
 Identities = 29/118 (24%), Positives = 45/118 (38%), Gaps = 25/118 (21%)

Query: 294 DDLPEAFDW----RAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQ----------GNS 339
           D+LP+ F W            V  Q  C  C+  S +   +    I            N 
Sbjct: 379 DELPKNFTWGDPFNNNTREYDVTNQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNF 438

Query: 340 LTELSVQQLVDCDMSNGGCNGGRMDDALQYIIDN----GGVVSDQAYPYKASESERGC 393
              LS+Q ++ C   + GCNGG       Y++       G+  D+ +PY A  +E+ C
Sbjct: 439 DDLLSIQTVLSCSFYDQGCNGG-----FPYLVSKMAKLQGIPLDKVFPYTA--TEQTC 489



 Score = 33.8 bits (77), Expect = 0.38
 Identities = 10/18 (55%), Positives = 12/18 (66%)

Query: 454 GTSIPYWIVKNSWGSDWG 471
           G    YWI +NSWG +WG
Sbjct: 634 GKLYKYWIGRNSWGKNWG 651


>gnl|CDD|227207 COG4870, COG4870, Cysteine protease [Posttranslational
           modification, protein turnover, chaperones].
          Length = 372

 Score = 56.0 bits (135), Expect = 3e-08
 Identities = 61/281 (21%), Positives = 87/281 (30%), Gaps = 41/281 (14%)

Query: 228 QSEDSGTAVFGVNKFFDLSESDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQF 287
           Q       +       +L+   +        DS L     SL+    + Q    +     
Sbjct: 31  QLVLLRDKLSTSGIIIELAPKLIDFSEPEEKDSLLPVSLDSLEDCSPTGQVPDPVDLGSC 90

Query: 288 NSLRHGDDLPEAFDWRAEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQ----GNSLTEL 343
            +L     LP  FD R EG +S VK+QG    CWAF+    +E  +         S   +
Sbjct: 91  TALNASASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLE-SYLNPESAWDFSENNM 149

Query: 344 SVQQLVDC--DMSNGGCNGGRMDDALQYIIDNGGVVSDQAYPYKASE--SERGCLVGE-- 397
                V           +GG  D +  Y+ +  G V +   PY  +   S     V +  
Sbjct: 150 KNLLGVPYEKGFDYTSNDGGNADMSAAYLTEWSGPVYETDDPYSENSYFSPTNLPVTKHV 209

Query: 398 -EEGFKVKVKEY---SRIP-----YGEEEEMKKWVATR------GPLSVGMNANGLFYYS 442
            E       K+Y     I      YG         AT           V    N      
Sbjct: 210 QEAQIIPSRKKYLDNGNIKAMFGFYGAVSSSMYIDATNSLGICIPYPYVDSGENW----- 264

Query: 443 G------GVIDLNQRLYGTSIPY----WIVKNSWGSDWGEK 473
           G      G  D          P     +I+KNSWG++WGE 
Sbjct: 265 GHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGEN 305



 Score = 45.6 bits (108), Expect = 6e-05
 Identities = 34/159 (21%), Positives = 55/159 (34%), Gaps = 1/159 (0%)

Query: 96  DLSDSDLQQLTGLNLDSTLEDIQPSLQAPFSSNQTDTEMRAFQFNSLRHGDDLPEAFDWR 155
           +L+   +        DS L     SL+    + Q    +      +L     LP  FD R
Sbjct: 47  ELAPKLIDFSEPEEKDSLLPVSLDSLEDCSPTGQVPDPVDLGSCTALNASASLPSYFDRR 106

Query: 156 AEGVISKVKEQGKCACCWAFSAVGVVEAMHAIQGNNLTELSVQHHDKVYSSVEDLLRRHE 215
            EG +S VK+QG    CWAF+    +E  +    +          + +    E       
Sbjct: 107 DEGKVSPVKDQGSGGSCWAFATTRSLE-SYLNPESAWDFSENNMKNLLGVPYEKGFDYTS 165

Query: 216 NFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESDLQQLT 254
           N   N + +  Y +E SG      + + + S      L 
Sbjct: 166 NDGGNADMSAAYLTEWSGPVYETDDPYSENSYFSPTNLP 204



 Score = 44.4 bits (105), Expect = 1e-04
 Identities = 16/41 (39%), Positives = 24/41 (58%), Gaps = 5/41 (12%)

Query: 617 NHALIIVGYGEEEKKDGT-SIPY----WIVKNSWGSDWGEK 652
            HA++IVGY +    +     P     +I+KNSWG++WGE 
Sbjct: 265 GHAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGEN 305


>gnl|CDD|240381 PTZ00364, PTZ00364, dipeptidyl-peptidase I precursor; Provisional.
          Length = 548

 Score = 54.5 bits (131), Expect = 1e-07
 Identities = 49/236 (20%), Positives = 79/236 (33%), Gaps = 54/236 (22%)

Query: 289 SLRHGDDLPEAFDWRAEGVISKVKE---QG---KCACCWAFSAVGVVEAMHAIQGN---- 338
           S + GD  P A+ W   G  S +           C   +  +A+  + A   +  N    
Sbjct: 198 SHQLGDPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDP 257

Query: 339 --SLTELSVQQLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAY-PYKASESERGCLV 395
               T LS + ++DC     GC GG  ++  ++    G + +D  Y PY + +       
Sbjct: 258 LGQQTFLSARHVLDCSQYGQGCAGGFPEEVGKFAETFGILTTDSYYIPYDSGDGVERACK 317

Query: 396 GEEEGFKVKVKEYSRIP--YG---EEEEMKKWVATRGPLSVGMNANGLFY---------- 440
                 +     Y  +   YG   + +E+   +   GP+   + AN  +Y          
Sbjct: 318 TRRPSRRYYFTNYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDV 377

Query: 441 -------YSGGVIDLNQRLY--------------GT---SIPYWIVKNSWGS--DW 470
                  YS    D   R Y              GT      YW+V + WGS   W
Sbjct: 378 RYVSLDDYSTASADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSW 433



 Score = 49.9 bits (119), Expect = 4e-06
 Identities = 35/176 (19%), Positives = 65/176 (36%), Gaps = 29/176 (16%)

Query: 499 SRLATEKLVDCDMSNGGCNGGRMDDALQYIIDNGGVVSDQAY-PYKASESERGCLVGEEE 557
           + L+   ++DC     GC GG  ++  ++    G + +D  Y PY + +           
Sbjct: 262 TFLSARHVLDCSQYGQGCAGGFPEEVGKFAETFGILTTDSYYIPYDSGDGVERACKTRRP 321

Query: 558 GFKVKVKEYSRIP--YG---EEEEMKKWVATRGPLSVGMNANGLFYYSGGVIDLNQRLCN 612
             +     Y  +   YG   + +E+   +   GP+   + AN  +Y        + R  +
Sbjct: 322 SRRYYFTNYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDVRYVS 381

Query: 613 PKAQ-----------------NHALIIVGYGEEEKKDGTSIPYWIVKNSWGS--DW 649
                                NH ++I+G+G +E        YW+V + WGS   W
Sbjct: 382 LDDYSTASADRPLRHYFASNVNHTVLIIGWGTDE----NGGDYWLVLDPWGSRRSW 433



 Score = 32.9 bits (75), Expect = 0.58
 Identities = 16/71 (22%), Positives = 25/71 (35%), Gaps = 12/71 (16%)

Query: 141 SLRHGDDLPEAFDWRAEGVISKVKE---QG---KCACCWAFSAVGVVEAMHAIQGNNL-- 192
           S + GD  P A+ W   G  S +           C   +  +A+  + A   +  N    
Sbjct: 198 SHQLGDPPPAAWSWGDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDP 257

Query: 193 ----TELSVQH 199
               T LS +H
Sbjct: 258 LGQQTFLSARH 268


>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
          Length = 1004

 Score = 52.4 bits (125), Expect = 7e-07
 Identities = 20/45 (44%), Positives = 26/45 (57%)

Query: 608 QRLCNPKAQNHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
           Q LC     +HA+ IVGYG     +     YWIV+NSWG  WG++
Sbjct: 713 QNLCGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDE 757



 Score = 39.3 bits (91), Expect = 0.007
 Identities = 49/228 (21%), Positives = 82/228 (35%), Gaps = 62/228 (27%)

Query: 310 KVKEQGKCACCWAFSAVGVVEAMHAIQGNSLTELSVQQLVDCDMS--NGGCNGGRMD-DA 366
           ++++QG CA  W F++   +E +  ++G     +S   + +C        C+ G    + 
Sbjct: 546 QIEDQGNCAISWIFASKYHLETIKCMKGYEPHAISALYIANCSKGEHKDRCDEGSNPLEF 605

Query: 367 LQYIIDNGGVVSDQAYPY--------------------------------KASESERGCL 394
           LQ I DNG + +D  Y Y                                  S   +   
Sbjct: 606 LQIIEDNGFLPADSNYLYNYTKVGEDCPDEEDHWMNLLDHGKILNHNKKEPNSLDGKAYR 665

Query: 395 VGEEEGFKVKVKEYSRIPYGEEEEMKKWVATRGPLSVGMNANGLFYY--SGGVID----- 447
             E E F  K+  + +I       +K  +  +G +   + A  +  Y  +G  +      
Sbjct: 666 AYESEHFHDKMDAFIKI-------IKDEIMNKGSVIAYIKAENVLGYEFNGKKVQNLCGD 718

Query: 448 ------LNQRLYGTSI-------PYWIVKNSWGSDWGEKVEDKVGSSG 482
                 +N   YG  I        YWIV+NSWG  WG++   KV   G
Sbjct: 719 DTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGYFKVDMYG 766


>gnl|CDD|202517 pfam03051, Peptidase_C1_2, Peptidase C1-like family.  This family
           is closely related to the Peptidase_C1 family pfam00112,
           containing several prokaryotic and eukaryotic
           aminopeptidases and bleomycin hydrolases.
          Length = 438

 Score = 34.2 bits (79), Expect = 0.23
 Identities = 14/36 (38%), Positives = 21/36 (58%), Gaps = 3/36 (8%)

Query: 617 NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
            HA+++ G  E++    T    W V+NSWG D G+K
Sbjct: 360 THAMVLTGVDEDDDGKPTK---WKVENSWGDDSGKK 392


>gnl|CDD|238328 cd00585, Peptidase_C1B, Peptidase C1B subfamily (MEROPS database
           nomenclature); composed of eukaryotic bleomycin
           hydrolases (BH) and bacterial aminopeptidases C (pepC).
           The proteins of this subfamily contain a large insert
           relative to the C1A peptidase (papain) subfamily. BH is
           a cysteine peptidase that detoxifies bleomycin by
           hydrolysis of an amide group. It acts as a
           carboxypeptidase on its C-terminus to convert itself
           into an aminopeptidase and peptide ligase. BH is found
           in all tissues in mammals as well as in many other
           eukaryotes. Bleomycin, a glycopeptide derived from the
           bacterium Streptomyces verticillus, is an effective
           anticancer drug due to its ability to induce DNA strand
           breaks. Human BH is the major cause of tumor cell
           resistance to bleomycin chemotherapy, and is also
           genetically linked to Alzheimer's disease. In addition
           to its peptidase activity, the yeast BH (Gal6) binds DNA
           and acts as a repressor in the Gal4 regulatory system.
           BH forms a hexameric ring barrel structure with the
           active sites imbedded in the central channel. The
           bacterial homolog of BH, called pepC, is a cysteine
           aminopeptidase possessing broad specificity. Although
           its crystal structure has not been solved, biochemical
           analysis shows that pepC also forms a hexamer.
          Length = 437

 Score = 32.6 bits (75), Expect = 0.61
 Identities = 14/41 (34%), Positives = 20/41 (48%), Gaps = 6/41 (14%)

Query: 617 NHALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK---VM 654
            HA+++ G   +E         W V+NSWG   G+K   VM
Sbjct: 359 THAMVLTGVDLDEDGKPVK---WKVENSWGEKVGKKGYFVM 396


>gnl|CDD|237194 PRK12765, PRK12765, flagellar capping protein; Provisional.
          Length = 595

 Score = 31.4 bits (71), Expect = 1.6
 Identities = 24/68 (35%), Positives = 33/68 (48%), Gaps = 1/68 (1%)

Query: 198 QHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTAVF-GVNKFFDLSESDLQQLTGL 256
           Q  + V  +VEDL+  + N VTN+  A  Y SE      F GV++   +  S L  L   
Sbjct: 292 QDTEGVTKAVEDLVDAYNNLVTNLNAATSYNSETGTKGTFQGVSEITSIRSSILADLFSQ 351

Query: 257 NLDSTLED 264
            +D T ED
Sbjct: 352 VVDGTDED 359


>gnl|CDD|227573 COG5248, TAF19, Transcription initiation factor TFIID, subunit
           TAF13 [Transcription].
          Length = 126

 Score = 29.5 bits (66), Expect = 2.6
 Identities = 18/93 (19%), Positives = 31/93 (33%), Gaps = 7/93 (7%)

Query: 15  GYLHTFMIKVALLESNIFQTRGYLNSPVTRFLNFMRDHDKVYSSVEDLLRRHENFVTNVE 74
            Y+  +M  +     N+ Q R         F   +R   K    VE+LL  +E       
Sbjct: 38  EYVLDYMSILCTNAHNMAQVRNKTK--TEDFKFALRRDPKKLGRVEELLITNEEI---KL 92

Query: 75  KAEDYQREDSGTAVF-EVNKFFDLSDSDLQQLT 106
             + ++ +DS       +N     S   L +  
Sbjct: 93  AKKAFEPKDSRYRKELRINTKMH-SFITLNKFI 124


>gnl|CDD|183636 PRK12631, flgC, flagellar basal body rod protein FlgC; Provisional.
          Length = 138

 Score = 29.5 bits (66), Expect = 2.8
 Identities = 24/75 (32%), Positives = 33/75 (44%), Gaps = 7/75 (9%)

Query: 191 NLTELSVQHHDKVYSSVEDLLR-RHENFVTNVEKAEDYQSEDSGTAVFGVNKFFDLSESD 249
           N T  ++ + D V SSV+   R RH  F   + KA+  Q    G AV G+       ESD
Sbjct: 22  NTTASNIANADSVSSSVDKTYRARHPIFEAEMAKAQSQQQASQGVAVKGI------VESD 75

Query: 250 LQQLTGLNLDSTLED 264
              L   + D  + D
Sbjct: 76  KPLLKEYSPDHPMAD 90


>gnl|CDD|226107 COG3579, PepC, Aminopeptidase C [Amino acid transport and
           metabolism].
          Length = 444

 Score = 30.5 bits (69), Expect = 3.2
 Identities = 13/35 (37%), Positives = 20/35 (57%), Gaps = 3/35 (8%)

Query: 618 HALIIVGYGEEEKKDGTSIPYWIVKNSWGSDWGEK 652
           HA+++ G   +E  +      W V+NSWG D G+K
Sbjct: 363 HAMVLTGVDLDETGNPLR---WKVENSWGKDVGKK 394


>gnl|CDD|184358 PRK13874, PRK13874, conjugal transfer protein TrbJ; Provisional.
          Length = 230

 Score = 29.5 bits (67), Expect = 4.7
 Identities = 13/27 (48%), Positives = 15/27 (55%), Gaps = 2/27 (7%)

Query: 324 SAVGVVEAMHAIQGNSLTELSVQQLVD 350
           SA G ++A  A  GN L  L  QQL D
Sbjct: 170 SATGALQAAQA--GNQLLALQAQQLAD 194


>gnl|CDD|191696 pfam07168, Ureide_permease, Ureide permease.  Heterocyclic nitrogen
           compounds may serve as nitrogen sources or nitrogen
           transport compounds in plants that are not able to fix
           nitrogen. This family represents ureide permease, a
           transporter of a wide spectrum of oxo derivatives of
           heterocyclic nitrogen compounds, including allantoin,
           uric acid and xanthine; it has 10 putative transmembrane
           domains with a large cytosolic central domain containing
           a 'Walker A' motif. Ureide permease is likely to
           transport other purine degradation products when
           nitrogen sources are low. Transport is dependent on
           glucose and a proton gradient. The family is found in
           bacteria, plants and yeast.
          Length = 336

 Score = 29.4 bits (66), Expect = 5.7
 Identities = 14/62 (22%), Positives = 26/62 (41%), Gaps = 2/62 (3%)

Query: 177 AVGVVEAMHAIQG-NNLTELSVQHHDKVYSSVEDLLRRHENFVTNVEKAEDYQSEDSGTA 235
           AV +  A+H+    +N  +L+   + +   S+  L         ++E  E      +GTA
Sbjct: 140 AVFLGSAVHSSNAADNKEKLNAFENYQSEFSISSLELMSRMNSEDLENGEA-DDAKAGTA 198

Query: 236 VF 237
            F
Sbjct: 199 EF 200


>gnl|CDD|219580 pfam07793, DUF1631, Protein of unknown function (DUF1631).  The
           members of this family are sequences derived from a
           group of hypothetical proteins expressed by certain
           bacterial species. The region concerned is approximately
           440 amino acid residues in length.
          Length = 729

 Score = 29.2 bits (66), Expect = 9.6
 Identities = 19/71 (26%), Positives = 32/71 (45%), Gaps = 19/71 (26%)

Query: 14  LGYLHTFMIKVALLESNIFQTRGYLNSPVTRFLNFM--------------RD--HDKVYS 57
           +G L   ++KVALL+ + F +RG    P  R LN +              RD  + K+  
Sbjct: 375 IGRLQIPVLKVALLDKSFF-SRG--EHPARRLLNEIAEAGIGWGGDDDGLRDSLYAKIEE 431

Query: 58  SVEDLLRRHEN 68
            V+ +L   ++
Sbjct: 432 IVQRILNEFDD 442


  Database: CDD.v3.10
    Posted date:  Mar 20, 2013  7:55 AM
  Number of letters in database: 10,937,602
  Number of sequences in database:  44,354
  
Lambda     K      H
   0.315    0.133    0.400 

Gapped
Lambda     K      H
   0.267   0.0764    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 33,042,956
Number of extensions: 3205812
Number of successful extensions: 2058
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1981
Number of HSP's successfully gapped: 63
Length of query: 655
Length of database: 10,937,602
Length adjustment: 103
Effective length of query: 552
Effective length of database: 6,369,140
Effective search space: 3515765280
Effective search space used: 3515765280
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (22.0 bits)
S2: 62 (27.6 bits)
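
The footer arithmetic above can be re-derived from the printed values. The sketch below is a minimal check, not part of the RPS-BLAST output: the function names are hypothetical, the constants are copied from the footer (so they carry its rounding), and the pairing of X1/S1 with the ungapped Lambda/K and of X2/X3/S2 with the gapped Lambda/K is an assumption inferred from the fact that the numbers then agree with the printed bit values.

```python
import math

# Values copied from the statistics footer above (rounded as printed).
QUERY_LENGTH = 655
DB_LETTERS = 10_937_602
DB_SEQUENCES = 44_354
LENGTH_ADJUSTMENT = 103

UNGAPPED = {"lam": 0.315, "k": 0.133}   # first Lambda/K/H row
GAPPED = {"lam": 0.267, "k": 0.0764}    # "Gapped" Lambda/K/H row


def effective_search_space(query_len, db_letters, db_seqs, adjustment):
    """Return (effective query length, effective db length, search space).

    The length adjustment is subtracted once from the query and once
    from every database sequence.
    """
    eff_query = query_len - adjustment
    eff_db = db_letters - db_seqs * adjustment
    return eff_query, eff_db, eff_query * eff_db


def dropoff_bits(raw, lam):
    """Convert a raw X-dropoff value to bits: lambda * X / ln 2."""
    return lam * raw / math.log(2)


def score_bits(raw, lam, k):
    """Convert a raw score to a bit score: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


if __name__ == "__main__":
    print(effective_search_space(QUERY_LENGTH, DB_LETTERS,
                                 DB_SEQUENCES, LENGTH_ADJUSTMENT))
    print(round(dropoff_bits(16, UNGAPPED["lam"]), 1))   # X1
    print(round(dropoff_bits(38, GAPPED["lam"]), 1))     # X2
    print(round(score_bits(42, **UNGAPPED), 1))          # S1
    print(round(score_bits(62, **GAPPED), 1))            # S2
```

Because Lambda and K are printed to only three significant figures, the recomputed bit values should be compared with a small tolerance rather than for exact equality.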