RPS-BLAST 2.2.26 [Sep-21-2011]
Database: CDD.v3.10
44,354 sequences; 10,937,602 total letters
Searching..................................................done
Query= psy1705
(309 letters)
>gnl|CDD|239068 cd02248, Peptidase_C1A, Peptidase C1A subfamily (MEROPS database
nomenclature); composed of cysteine peptidases (CPs)
similar to papain, including the mammalian CPs
(cathepsins B, C, F, H, L, K, O, S, V, X and W). Papain
is an endopeptidase with specific substrate preferences,
primarily for bulky hydrophobic or aromatic residues at
the S2 subsite, a hydrophobic pocket in papain that
accommodates the P2 sidechain of the substrate (the
second residue away from the scissile bond). Most
members of the papain subfamily are endopeptidases. Some
exceptions to this rule can be explained by specific
details of the catalytic domains like the occluding loop
in cathepsin B which confers an additional
carboxydipeptidyl activity and the mini-chain of
cathepsin H resulting in an N-terminal exopeptidase
activity. Papain-like CPs have different functions in
various organisms. Plant CPs are used to mobilize
storage proteins in seeds. Parasitic CPs act
extracellularly to help invade tissues and cells, to
hatch or to evade the host immune system. Mammalian CPs
are primarily lysosomal enzymes with the exception of
cathepsin W, which is retained in the endoplasmic
reticulum. They are responsible for protein degradation
in the lysosome. Papain-like CPs are synthesized as
inactive proenzymes with N-terminal propeptide regions,
which are removed upon activation. In addition to its
inhibitory role, the propeptide is required for proper
folding of the newly synthesized enzyme and its
stabilization in denaturing pH conditions. Residues
within the propeptide region also play a role in the
transport of the proenzyme to lysosomes or acidified
vesicles. Also included in this subfamily are proteins
classified as non-peptidase homologs, which lack
peptidase activity or have missing active site residues.
Length = 210
Score = 257 bits (659), Expect = 2e-86
Identities = 100/213 (46%), Positives = 134/213 (62%), Gaps = 7/213 (3%)
Query: 99 PDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISG 158
P+ +DWREKG +TP +Q CG+C+AFS A++G T ++ LS QQ+VDCS SG
Sbjct: 1 PESVDWREKGAVTPVKDQGSCGSCWAFSTVGALEGAYAIKTGKLVSLSEQQLVDCS-TSG 59
Query: 159 NLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDEH 218
N GC GG+ N YV+ GGL E DYPY GK CK+ + I+ +S +PP DE
Sbjct: 60 NNGCNGGNPDNAFEYVK-NGGLASESDYPYTGKDGTCKYNSSKVGAKITGYSNVPPGDEE 118
Query: 219 ALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTR----NSWI 274
ALK LA GP++V+I+AS +FQ Y GIY C++ +NHA+LLVGY + WI
Sbjct: 119 ALKAALANYGPVSVAIDAS-SSFQFYKGGIYSGPCCSNTNLNHAVLLVGYGTENGVDYWI 177
Query: 275 LKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYA 307
+KN W WG+ GY+ + RG+N CGIA+YA Y
Sbjct: 178 VKNSWGTSWGEKGYIRIARGSNLCGIASYASYP 210
>gnl|CDD|215726 pfam00112, Peptidase_C1, Papain family cysteine protease.
Length = 213
Score = 242 bits (619), Expect = 3e-80
Identities = 94/216 (43%), Positives = 130/216 (60%), Gaps = 8/216 (3%)
Query: 98 IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
+P+ DWREKG +TP +Q CG+C+AFS A++G+ T ++ LS QQ+VDC +
Sbjct: 1 LPESFDWREKGAVTPVKDQGQCGSCWAFSAVGALEGRYCIKTGKLVSLSEQQLVDCD--T 58
Query: 158 GNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNI-VVDISSWSVLPPQD 216
GN GC GG N Y++ GG++ E DYPY CKFK+ N I + +P D
Sbjct: 59 GNNGCNGGLPDNAFEYIKKNGGIVTESDYPYTAHDGTCKFKKSNSKYAKIKGYGDVPYND 118
Query: 217 EHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY-TRNS--- 272
E AL+ LA GP++V+I+A FQLY SG+Y C+ ++HA+L+VGY T N
Sbjct: 119 EEALQAALAKNGPVSVAIDAYEDDFQLYKSGVYKHTECSG-ELDHAVLIVGYGTENGVPY 177
Query: 273 WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYAL 308
WI+KN W WG+NGY + RG N CGIA+ A Y +
Sbjct: 178 WIVKNSWGTDWGENGYFRIARGVNECGIASEASYPI 213
>gnl|CDD|214761 smart00645, Pept_C1, Papain family cysteine protease.
Length = 175
Score = 176 bits (449), Expect = 4e-55
Identities = 80/217 (36%), Positives = 105/217 (48%), Gaps = 50/217 (23%)
Query: 98 IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
+P+ DWR+KG +TP +Q CG+C+AFS A++G+ T ++ LS QQ+VDCS
Sbjct: 1 LPESFDWRKKGAVTPVKDQGQCGSCWAFSATGALEGRYCIKTGKLVSLSEQQLVDCS-GG 59
Query: 158 GNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDE 217
GN GC GG N Y++ GGL E YPY G
Sbjct: 60 GNCGCNGGLPDNAFEYIKKNGGLETESCYPYTG--------------------------- 92
Query: 218 HALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY---TRNS-- 272
+V+I+AS FQ Y SGIYD C S ++HA+L+VGY N
Sbjct: 93 -------------SVAIDASD--FQFYKSGIYDHPGCGSGTLDHAVLIVGYGTEVENGKD 137
Query: 273 -WILKNWWSHHWGDNGYMYLKRG-NNRCGIANYAVYA 307
WI+KN W WG+NGY + RG NN CGI
Sbjct: 138 YWIVKNSWGTDWGENGYFRIARGKNNECGIEASVASY 174
>gnl|CDD|240310 PTZ00200, PTZ00200, cysteine proteinase; Provisional.
Length = 448
Score = 146 bits (371), Expect = 9e-41
Identities = 92/323 (28%), Positives = 144/323 (44%), Gaps = 54/323 (16%)
Query: 15 KKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLRENHLSDL--------- 65
KKY + + A + L +++N+ ++ +H +G Y+ N SDL
Sbjct: 131 KKYNRKHATHAERLNRFLTFRNNYLEVKSH-----KGDEPYSKEINKFSDLTEEEFRKLF 185
Query: 66 HPRHYIKEMTRL-THSRIRRTLVRSP----------ESNESVLIPDH-----LDWREKGF 109
+ ++ + V +P ++E V P LDWR
Sbjct: 186 PVIKVPPKSNSTSHNNDFKARHVSNPTYLKNLKKAKNTDEDVKDPSKITGEGLDWRRADA 245
Query: 110 ITPDWNQ-EDCGACYAFSIASAIQG--QIFKSTSEIEELSIQQVVDCSIISGNLGCAGGS 166
+T +Q +CG+C+AFS +++ +I++ S +LS Q++V+C + + GC+GG
Sbjct: 246 VTKVKDQGLNCGSCWAFSSVGSVESLYKIYRDKSV--DLSEQELVNCD--TKSQGCSGGY 301
Query: 167 LRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDISSWSVLPPQDEHALKVTLAT 226
L YV+ G L D PY K C V I S+ V +D L +L
Sbjct: 302 PDTALEYVKNKG-LSSSSDVPYLAKDGKCVVSSTK-KVYIDSYLVAKGKD--VLNKSL-V 356
Query: 227 VGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTRNS------WILKNWWS 280
+ P V I S Y SG+Y+ E C +NHA+LLVG + WI+KN W
Sbjct: 357 ISPTVVYIAVSR-ELLKYKSGVYNGE-C-GKSLNHAVLLVGEGYDEKTKKRYWIIKNSWG 413
Query: 281 HHWGDNGYMYLKR---GNNRCGI 300
WG+NGYM L+R G ++CGI
Sbjct: 414 TDWGENGYMRLERTNEGTDKCGI 436
>gnl|CDD|240232 PTZ00021, PTZ00021, falcipain-2; Provisional.
Length = 489
Score = 139 bits (351), Expect = 1e-37
Identities = 102/340 (30%), Positives = 149/340 (43%), Gaps = 42/340 (12%)
Query: 1 MTNKEWIIIFIFPQKKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLREN 60
MTN E + F K++ K Y+ ++ L + N KI+ HN + Y N
Sbjct: 160 MTNLENVNSFYLFIKEHGKKYQTPDEMQQRYLSFVENLAKINAHNNKENVL---YKKGMN 216
Query: 61 HLSDLHPRHYIKEMTRLTHSRIRRTLVRSP-ESNESVLIP---------DHL--DWREKG 108
DL + K+ L + +SP N +I DH DWR
Sbjct: 217 RFGDLSFEEFKKKYLTLKSFDFKSNGKKSPRVINYDDVIKKYKPKDATFDHAKYDWRLHN 276
Query: 109 FITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISGNLGCAGGSLR 168
+TP +Q++CG+C+AFS ++ Q +E+ LS Q++VDCS N GC GG +
Sbjct: 277 GVTPVKDQKNCGSCWAFSTVGVVESQYAIRKNELVSLSEQELVDCSF--KNNGCYGGLIP 334
Query: 169 NTLNYVQFAGGLMKEEDYPYKG-KQSICKFKRPNIVVDISSWSVLPPQDEHALKVTLATV 227
N + GGL E+DYPY +C R I S+ +P E K + +
Sbjct: 335 NAFEDMIELGGLCSEDDYPYVSDTPELCNIDRCKEKYKIKSYVSIP---EDKFKEAIRFL 391
Query: 228 GPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGY--------------TRNSW 273
GPI+VSI S F Y GI+D E C + NHA++LVGY R +
Sbjct: 392 GPISVSIAVS-DDFAFYKGGIFDGE-CG-EEPNHAVILVGYGMEEIYNSDTKKMEKRYYY 448
Query: 274 ILKNWWSHHWGDNGYMYLKRGNN----RCGIANYAVYALI 309
I+KN W WG+ G++ ++ N C + A LI
Sbjct: 449 IIKNSWGESWGEKGFIRIETDENGLMKTCSLGTEAYVPLI 488
>gnl|CDD|185513 PTZ00203, PTZ00203, cathepsin L protease; Provisional.
Length = 348
Score = 126 bits (317), Expect = 9e-34
Identities = 72/219 (32%), Positives = 105/219 (47%), Gaps = 15/219 (6%)
Query: 98 IPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIIS 157
+PD +DWREKG +TP NQ CG+C+AFS I+ Q + ++ LS QQ+V C +
Sbjct: 126 VPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWAVAGHKLVRLSEQQLVSCDHVD 185
Query: 158 GNLGCAGGSLRNTLNYV--QFAGGLMKEEDYPY---KGKQSICKFKRPNIVVDISSWSVL 212
GC GG + +V G + E+ YPY G C V
Sbjct: 186 N--GCGGGLMLQAFEWVLRNMNGTVFTEKSYPYVSGNGDVPECSNSSELAPGARIDGYVS 243
Query: 213 PPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVNHAMLLVGYTRNS 272
E + LA GPI+++++AS +F Y SG+ +C + +NH +LLVGY
Sbjct: 244 MESSERVMAAWLAKNGPISIAVDAS--SFMSYHSGVL--TSCIGEQLNHGVLLVGYNMTG 299
Query: 273 ----WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVYA 307
W++KN W WG+ GY+ + G N C + Y V
Sbjct: 300 EVPYWVIKNSWGEDWGEKGYVRVTMGVNACLLTGYPVSV 338
>gnl|CDD|239112 cd02621, Peptidase_C1A_CathepsinC, Cathepsin C; also known as
Dipeptidyl Peptidase I (DPPI), an atypical papain-like
cysteine peptidase with chloride dependency and
dipeptidyl aminopeptidase activity, resulting from its
tetrameric structure which limits substrate access. Each
subunit of the tetramer is composed of three peptides:
the heavy and light chains, which together adopt the
papain fold and form the catalytic domain; and the
residual propeptide region, which forms a beta barrel
and points towards the substrate's N-terminus. The
subunit composition is the result of the unique
characteristic of procathepsin C maturation involving
the cleavage of the catalytic domain and the
non-autocatalytic excision of an activation peptide
within its propeptide region. By removing N-terminal
dipeptide extensions, cathepsin C activates granule
serine peptidases (granzymes) involved in cell-mediated
apoptosis, inflammation and tissue remodelling.
Loss-of-function mutations in cathepsin C are associated
with Papillon-Lefevre and Haim-Munk syndromes, rare
diseases characterized by hyperkeratosis and early-onset
periodontitis. Cathepsin C is widely expressed in many
tissues with high levels in lung, kidney and placenta.
It is also highly expressed in cytotoxic lymphocytes and
mature myeloid cells.
Length = 243
Score = 111 bits (280), Expect = 2e-29
Identities = 71/247 (28%), Positives = 102/247 (41%), Gaps = 46/247 (18%)
Query: 99 PDHLDWREKG----FITPDWNQEDCGACYAFSIASAIQGQI------FKSTSEIEELSIQ 148
P DW + +++P NQ CG+CYAF+ A++ +I + LS Q
Sbjct: 2 PKSFDWGDVNNGFNYVSPVRNQGGCGSCYAFASVYALEARIMIASNKTDPLGQQPILSPQ 61
Query: 149 QVVDCSIISGNLGCAGGSLRNTLNYVQ-FAGGLMKEEDYPYKG-KQSICKFKRPNIVVDI 206
V+ CS S GC GG + + F G++ E+ +PY CK
Sbjct: 62 HVLSCSQYSQ--GCDGGFPFLVGKFAEDF--GIVTEDYFPYTADDDRPCKASPSECRRYY 117
Query: 207 SS--------WSVLPPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYD----DEAC 254
S + +E +K + GPI V+ F Y G+Y DE
Sbjct: 118 FSDYNYVGGCYGC---TNEDEMKWEIYRNGPIVVAFEVYS-DFDFYKEGVYHHTDNDEVS 173
Query: 255 TSD--------YVNHAMLLVGY------TRNSWILKNWWSHHWGDNGYMYLKRGNNRCGI 300
D NHA+LLVG+ WI+KN W WG+ GY ++RG N CGI
Sbjct: 174 DGDNDNFNPFELTNHAVLLVGWGEDEIKGEKYWIVKNSWGSSWGEKGYFKIRRGTNECGI 233
Query: 301 ANYAVYA 307
+ AV+A
Sbjct: 234 ESQAVFA 240
>gnl|CDD|239110 cd02619, Peptidase_C1, C1 Peptidase family (MEROPS database
nomenclature), also referred to as the papain family;
composed of two subfamilies of cysteine peptidases
(CPs), C1A (papain) and C1B (bleomycin hydrolase).
Papain-like enzymes are mostly endopeptidases with some
exceptions like cathepsins B, C, H and X, which are
exopeptidases. Papain-like CPs have different functions
in various organisms. Plant CPs are used to mobilize
storage proteins in seeds while mammalian CPs are
primarily lysosomal enzymes responsible for protein
degradation in the lysosome. Papain-like CPs are
synthesized as inactive proenzymes with N-terminal
propeptide regions, which are removed upon activation.
Bleomycin hydrolase (BH) is a CP that detoxifies
bleomycin by hydrolysis of an amide group. It acts as a
carboxypeptidase on its C-terminus to convert itself
into an aminopeptidase and peptide ligase. BH is found
in all tissues in mammals as well as in many other
eukaryotes. It forms a hexameric ring barrel structure
with the active sites imbedded in the central channel.
Some members of the C1 family are proteins classified as
non-peptidase homologs which lack peptidase activity or
have missing active site residues.
Length = 223
Score = 94.1 bits (234), Expect = 7e-23
Identities = 54/209 (25%), Positives = 84/209 (40%), Gaps = 20/209 (9%)
Query: 102 LDWREKGFITPDWNQEDCGACYAFSIASAIQGQ--IFKSTSEIEELSIQQVVDCS---II 156
+D R NQ G+C+AF+ A A++ I E +LS Q + C+ +
Sbjct: 2 VDLRPLRLTPVK-NQGSRGSCWAFASAYALESAYRIKGGEDEYVDLSPQYLYICANDECL 60
Query: 157 SGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNI----VVDISSWSVL 212
N C GG + L + G+ EEDYPY + + K V + + +
Sbjct: 61 GINGSCDGGGPLSALLKLVALKGIPPEEDYPYGAESDGEEPKSEAALNAAKVKLKDYRRV 120
Query: 213 PPQDEHALKVTLATVGPIAVSINASPHTFQL----YASGIYDDEACTSDYVNHAMLLVGY 268
+ +K LA GP+ + +L I D HA+++VGY
Sbjct: 121 LKNNIEDIKEALAKGGPVVAGFDVYSGFDRLKEGIIYEEIVYLLYEDGDLGGHAVVIVGY 180
Query: 269 TRN------SWILKNWWSHHWGDNGYMYL 291
N ++I+KN W WGDNGY +
Sbjct: 181 DDNYVEGKGAFIVKNSWGTDWGDNGYGRI 209
>gnl|CDD|239111 cd02620, Peptidase_C1A_CathepsinB, Cathepsin B group; composed of
cathepsin B and similar proteins, including
tubulointerstitial nephritis antigen (TIN-Ag). Cathepsin
B is a lysosomal papain-like cysteine peptidase which is
expressed in all tissues and functions primarily as an
exopeptidase through its carboxydipeptidyl activity.
Together with other cathepsins, it is involved in the
degradation of proteins, proenzyme activation, Ag
processing, metabolism and apoptosis. Cathepsin B has
been implicated in a number of human diseases such as
cancer, rheumatoid arthritis, osteoporosis and
Alzheimer's disease. The unique carboxydipeptidyl
activity of cathepsin B is attributed to the presence of
an occluding loop in its active site which favors the
binding of the C-termini of substrate proteins. Some
members of this group do not possess the occluding loop.
TIN-Ag is an extracellular matrix basement protein which
was originally identified as a target Ag involved in
anti-tubular basement membrane antibody-mediated
interstitial nephritis. It plays a role in renal
tubulogenesis and is defective in hereditary
tubulointerstitial disorders. TIN-Ag is exclusively
expressed in kidney tissues.
Length = 236
Score = 90.4 bits (225), Expect = 2e-21
Identities = 63/251 (25%), Positives = 97/251 (38%), Gaps = 57/251 (22%)
Query: 99 PDHLDWREKGFITPDW-------NQEDCGACYAFSIASA------IQGQIFKSTSEIEEL 145
P+ D REK P+ +Q +CG+C+AFS A IQ ++ L
Sbjct: 1 PESFDAREK---WPNCISIGEIRDQGNCGSCWAFSAVEAFSDRLCIQSNGKENVL----L 53
Query: 146 SIQQVVDCSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVD 205
S Q ++ C G+ GC GG Y+ G ++ PY + P
Sbjct: 54 SAQDLLSCCSGCGD-GCNGGYPDAAWKYLTTTG-VVTGGCQPYTIPPCGHHPEGPPPCCG 111
Query: 206 -----------------------ISSWSVLPPQDEHALKVTLATVGPIAVSINASPHT-F 241
S++SV P DE + + T GP+ + + F
Sbjct: 112 TPYCTPKCQDGCEKTYEEDKHKGKSAYSV--PSDETDIMKEIMTNGPVQAAFTV--YEDF 167
Query: 242 QLYASGIYDDEACTSDYVN-HAMLLVGY-TRNS---WILKNWWSHHWGDNGYMYLKRGNN 296
Y SG+Y + + HA+ ++G+ N W+ N W WG+NGY + RG+N
Sbjct: 168 LYYKSGVYQHT--SGKQLGGHAVKIIGWGVENGVPYWLAANSWGTDWGENGYFRILRGSN 225
Query: 297 RCGIANYAVYA 307
CGI + V
Sbjct: 226 ECGIESEVVAG 236
>gnl|CDD|239149 cd02698, Peptidase_C1A_CathepsinX, Cathepsin X; the only
papain-like lysosomal cysteine peptidase exhibiting
carboxymonopeptidase activity. It can also act as a
carboxydipeptidase, like cathepsin B, but has been shown
to preferentially cleave substrates through a
monopeptidyl carboxypeptidase pathway. The propeptide
region of cathepsin X, the shortest among papain-like
peptidases, is covalently attached to the active site
cysteine in the inactive form of the enzyme. Little is
known about the biological function of cathepsin X. Some
studies point to a role in early tumorigenesis. A more
recent study indicates that cathepsin X expression is
restricted to immune cells suggesting a role in
phagocytosis and the regulation of the immune response.
Length = 239
Score = 81.3 bits (201), Expect = 4e-18
Identities = 58/243 (23%), Positives = 91/243 (37%), Gaps = 49/243 (20%)
Query: 98 IPDHLDWRE---KGFITPDWNQ---EDCGACYAFSIASAIQGQIF---KSTSEIEELSIQ 148
+P DWR +++P NQ + CG+C+A SA+ +I K LS+Q
Sbjct: 1 LPKSWDWRNVNGVNYVSPTRNQHIPQYCGSCWAHGSTSALADRINIARKGAWPSVYLSVQ 60
Query: 149 QVVDCSIISGNLG-CAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVDIS 207
V+DC G C GG Y G+ E PY+ K C
Sbjct: 61 VVIDC----AGGGSCHGGDPGGVYEYA-HKHGIPDETCNPYQAKDGECNPF-------NR 108
Query: 208 SWSVLPPQDEHALKV-TLATV-------------------GPIAVSINASPHTFQLYASG 247
+ P + A+K TL V GPI+ I A+ + Y G
Sbjct: 109 CGTCNPFGECFAIKNYTLYFVSDYGSVSGRDKMMAEIYARGPISCGIMATE-ALENYTGG 167
Query: 248 IYDDEACTSDYVNHAMLLVGY-----TRNSWILKNWWSHHWGDNGYMYLKRGNNRCGIAN 302
+Y + +NH + + G+ WI++N W WG+ G+ + + + N
Sbjct: 168 VYKEYVQDP-LINHIISVAGWGVDENGVEYWIVRNSWGEPWGERGWFRIVTSSYKGARYN 226
Query: 303 YAV 305
A+
Sbjct: 227 LAI 229
>gnl|CDD|240244 PTZ00049, PTZ00049, cathepsin C-like protein; Provisional.
Length = 693
Score = 60.0 bits (145), Expect = 5e-10
Identities = 35/107 (32%), Positives = 47/107 (43%), Gaps = 29/107 (27%)
Query: 228 GPIAVSINASPHTFQLYASGIYDDEA------CTSD--------------YVNHAMLLVG 267
GPI S ASP + YA G+Y E CT D VNHA++LVG
Sbjct: 568 GPIVASFEASPDFYD-YADGVYYVEDFPHARRCTVDLPKHNGVYNITGWEKVNHAIVLVG 626
Query: 268 YTRNS--------WILKNWWSHHWGDNGYMYLKRGNNRCGIANYAVY 306
+ WI +N W +WG GY + RG N GI + +++
Sbjct: 627 WGEEEINGKLYKYWIGRNSWGKNWGKEGYFKIIRGKNFSGIESQSLF 673
Score = 34.2 bits (78), Expect = 0.094
Identities = 20/61 (32%), Positives = 29/61 (47%), Gaps = 12/61 (19%)
Query: 115 NQEDCGACYAFSIASAIQGQIFKSTSEI----------EELSIQQVVDCSIISGNLGCAG 164
NQ CG+CY S A + +I + ++ + LSIQ V+ CS + GC G
Sbjct: 402 NQLLCGSCYIASQMYAFKRRIEIALTKNLDKKYLNNFDDLLSIQTVLSCSFY--DQGCNG 459
Query: 165 G 165
G
Sbjct: 460 G 460
>gnl|CDD|219764 pfam08246, Inhibitor_I29, Cathepsin propeptide inhibitor domain
(I29). This domain is found at the N-terminus of some
C1 peptidases such as Cathepsin L where it acts as a
propeptide. There are also a number of proteins that
are composed solely of multiple copies of this domain
such as the peptidase inhibitor salarin. This family is
classified as I29 by MEROPS.
Length = 58
Score = 49.2 bits (118), Expect = 3e-08
Identities = 16/54 (29%), Positives = 26/54 (48%), Gaps = 3/54 (5%)
Query: 14 QKKYKKDYRKKATDSKKKLHWQSNHKKIHTHNQEAQQGLHGYTLRENHLSDLHP 67
+KKY K Y + + + ++ N + I HN ++G YTL N +DL
Sbjct: 5 KKKYGKSYYSEEEELYRFQIFKENLRFIEEHN---KKGNVSYTLGLNQFADLTD 55
>gnl|CDD|240381 PTZ00364, PTZ00364, dipeptidyl-peptidase I precursor; Provisional.
Length = 548
Score = 52.2 bits (125), Expect = 2e-07
Identities = 57/307 (18%), Positives = 103/307 (33%), Gaps = 49/307 (15%)
Query: 45 NQEAQQGLHGYTLRENHLSDLHPRHYIKEMTRLTHSRIRRTLVRSPESNESVLIPDHLDW 104
N Q R + Y K + +S P W
Sbjct: 152 NIHYVQRPGPVNPRRLPVLVPTGDPYSKSRSARKAKTASFGFRQSFSHQLGDPPPAAWSW 211
Query: 105 REKG---FIT---PDWNQEDCGACYAFSIASAIQGQIFKSTSEIEE------LSIQQVVD 152
+ G F+ P C + Y + +A+ ++ +++ + LS + V+D
Sbjct: 212 GDVGGASFLPAAPPASPGRGCNSSYVEAALAAMMARVMVASNRTDPLGQQTFLSARHVLD 271
Query: 153 CSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDY--PYK---GKQSICKFKRPNIVVDIS 207
CS GCAGG + + G++ + Y PY G + CK +RP+ +
Sbjct: 272 CSQYGQ--GCAGGFPEEVGKFAE-TFGILTTDSYYIPYDSGDGVERACKTRRPSRRYYFT 328
Query: 208 SWSVL-----PPQDEHALKVTLATVGPIAVSI--------NASPHTFQLYASGIYDDEAC 254
++ L D + + GP+ S+ T + + D
Sbjct: 329 NYGPLGGYYGAVTDPDEIIWEIYRHGPVPASVYANSDWYNCDENSTEDVRYVSLDDYSTA 388
Query: 255 TSD---------YVNHAMLLVGY-----TRNSWILKNWWS--HHWGDNGYMYLKRGNNRC 298
++D VNH +L++G+ + W++ + W W D G + RG N
Sbjct: 389 SADRPLRHYFASNVNHTVLIIGWGTDENGGDYWLVLDPWGSRRSWCDGGTRKIARGVNAY 448
Query: 299 GIANYAV 305
I + V
Sbjct: 449 NIESEVV 455
>gnl|CDD|214853 smart00848, Inhibitor_I29, Cathepsin propeptide inhibitor domain
(I29). This domain is found at the N-terminus of some
C1 peptidases such as Cathepsin L where it acts as a
propeptide. There are also a number of proteins that
are composed solely of multiple copies of this domain
such as the peptidase inhibitor salarin. This family is
classified as I29 by MEROPS. Peptide proteinase
inhibitors can be found as single domain proteins or as
single or multiple domains within proteins; these are
referred to as either simple or compound inhibitors,
respectively. In many cases they are synthesised as
part of a larger precursor protein, either as a
prepropeptide or as an N-terminal domain associated
with an inactive peptidase or zymogen. This domain
prevents access of the substrate to the active site.
Removal of the N-terminal inhibitor domain either by
interaction with a second peptidase or by autocatalytic
cleavage activates the zymogen. Other inhibitors
interact directly with proteinases using a simple
noncovalent lock and key mechanism; while yet others
use a conformational change-based trapping mechanism
that depends on their structural and thermodynamic
properties.
Length = 57
Score = 42.2 bits (100), Expect = 9e-06
Identities = 17/57 (29%), Positives = 30/57 (52%), Gaps = 7/57 (12%)
Query: 15 KKYKKDYRKKATDSKKKLH----WQSNHKKIHTHNQEAQQGLHGYTLRENHLSDLHP 67
+++KK + K + +++ ++ N KKI HN++ + H Y L N SDL P
Sbjct: 2 EQWKKKHGKSYSSEEEEARRFAIFKENLKKIEEHNKKYE---HSYKLGVNQFSDLTP 55
>gnl|CDD|227207 COG4870, COG4870, Cysteine protease [Posttranslational
modification, protein turnover, chaperones].
Length = 372
Score = 46.0 bits (109), Expect = 1e-05
Identities = 42/225 (18%), Positives = 86/225 (38%), Gaps = 32/225 (14%)
Query: 92 SNESVLIPDHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIFKST----SEIEELSI 147
N S +P + D R++G ++P +Q G+C+AF+ +++ + + SE ++
Sbjct: 93 LNASASLPSYFDRRDEGKVSPVKDQGSGGSCWAFATTRSLESYLNPESAWDFSENNMKNL 152
Query: 148 QQVVDCSIISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSIC-------KFKRP 200
V G + + +++G + + +D PY K +
Sbjct: 153 LGVPYEKGFDYTSNDGGNADMSAAYLTEWSGPVYETDD-PYSENSYFSPTNLPVTKHVQE 211
Query: 201 NIVVDISSWSVLPPQDEHALKVTLATVGPIAVSINASPHTFQLYASGIYDDEACTSDYVN 260
++ S L D +K G ++ S+ + + +
Sbjct: 212 AQIIP-SRKKYL---DNGNIKAMFGFYGAVSSSMYIDATNSLGICIPYPYVD--SGENWG 265
Query: 261 HAMLLVGY--------------TRNSWILKNWWSHHWGDNGYMYL 291
HA+L+VGY ++I+KN W +WG+NGY ++
Sbjct: 266 HAVLIVGYDDSFDINNFKYGPPGDGAFIIKNSWGTNWGENGYFWI 310
>gnl|CDD|185641 PTZ00462, PTZ00462, Serine-repeat antigen protein; Provisional.
Length = 1004
Score = 37.0 bits (85), Expect = 0.012
Identities = 16/44 (36%), Positives = 24/44 (54%), Gaps = 9/44 (20%)
Query: 254 CTSDYVNHAMLLVGY---------TRNSWILKNWWSHHWGDNGY 288
C D +HA+ +VGY ++ WI++N W +WGD GY
Sbjct: 716 CGDDTADHAVNIVGYGNYINDEDEKKSYWIVRNSWGKYWGDEGY 759
>gnl|CDD|201147 pfam00313, CSD, 'Cold-shock' DNA-binding domain.
Length = 66
Score = 31.0 bits (71), Expect = 0.099
Identities = 20/56 (35%), Positives = 25/56 (44%), Gaps = 11/56 (19%)
Query: 107 KGFITPDWNQEDCGACYAFSIASAIQGQIFKSTSEIEELSIQQVVDCSIISGNLGC 162
GFITP+ +D F SAIQG F+S L Q V+ I+ G G
Sbjct: 14 FGFITPEDGDKD-----VFVHFSAIQGDGFRS------LQEGQRVEFDIVEGTKGP 58
>gnl|CDD|224309 COG1391, GlnE, Glutamine synthetase adenylyltransferase
[Posttranslational modification, protein turnover,
chaperones / Signal transduction mechanisms].
Length = 963
Score = 31.2 bits (71), Expect = 0.84
Identities = 11/60 (18%), Positives = 18/60 (30%), Gaps = 4/60 (6%)
Query: 47 EAQQGLHGYTLRENHLSDLHPRHYIKEMTRLTHSRIRRTLVRSPESNE--SVLIPDHLDW 104
E + L P + ++ + RRT+ E L+P LD
Sbjct: 471 EGDEDEEDTLRTLAALGFEDPEQILTHISAFRNGSYRRTI--GERGRERLDELMPRLLDA 528
>gnl|CDD|143430 cd07112, ALDH_GABALDH-PuuC, Escherichia coli NADP+-dependent
gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase
PuuC-like. NADP+-dependent,
gamma-glutamyl-gamma-aminobutyraldehyde dehydrogenase
(GABALDH) PuuC of Escherichia coli which catalyzes the
conversion of putrescine to 4-aminobutanoate and other
similar sequences are present in this CD.
Length = 462
Score = 29.5 bits (67), Expect = 2.3
Identities = 18/52 (34%), Positives = 25/52 (48%), Gaps = 7/52 (13%)
Query: 215 QDEHALKVTLATVGPI----AVSINASPHTFQLYASGI---YDDEACTSDYV 259
+DE AL TL PI AV + ++ +TF+ YA I Y + A T
Sbjct: 65 RDELALLETLDMGKPISDALAVDVPSAANTFRWYAEAIDKVYGEVAPTGPDA 116
>gnl|CDD|147930 pfam06035, BTLCP, Bacterial transglutaminase-like cysteine
proteinase BTLCP. Members of this family are predicted
to be bacterial transglutaminase-like cysteine
proteinases. They contain a conserved Cys-His-Asp
catalytic triad. Their structure is predicted to be
similar to that of Salmonella typhimurium
N-hydroxyarylamine O-acetyltransferase, in pfam00797,
however they lack the sub-domain which is important for
arylamine recognition.
Length = 169
Score = 28.4 bits (64), Expect = 3.5
Identities = 15/35 (42%), Positives = 18/35 (51%), Gaps = 2/35 (5%)
Query: 261 HAMLLVGYTRNSWILKN--WWSHHWGDNGYMYLKR 293
HA+L V R ++L N W D GY YLKR
Sbjct: 113 HAVLTVRTDRGDFVLDNLTDEVLAWSDTGYRYLKR 147
>gnl|CDD|232865 TIGR00189, tesB, acyl-CoA thioesterase II. Function: hydrolyzes a
broad range of acyl-CoA thioesters. Physiological
function is not known. Subunit: homotetramer [Fatty acid
and phospholipid metabolism, Biosynthesis].
Length = 271
Score = 28.5 bits (64), Expect = 4.1
Identities = 11/37 (29%), Positives = 16/37 (43%)
Query: 100 DHLDWREKGFITPDWNQEDCGACYAFSIASAIQGQIF 136
DH W + F DW C + A ++G+IF
Sbjct: 218 DHSIWFHRPFRADDWLLYKCSSPSAGGSRGLVEGKIF 254
>gnl|CDD|236694 PRK10436, PRK10436, hypothetical protein; Provisional.
Length = 462
Score = 28.7 bits (65), Expect = 4.4
Identities = 29/120 (24%), Positives = 43/120 (35%), Gaps = 36/120 (30%)
Query: 62 LSDLHPRHYIKEMTRLTHSRIRR-------------TLVRS--P----ESNESVLIPDHL 102
LS LH + + RL I R LVR P +++E + +P ++
Sbjct: 316 LSTLHTNSTSETLVRLQQMGIARWMLASALKLVIAQRLVRKLCPHCRQQASEPIHLPPNI 375
Query: 103 DWREKGFITPDWNQEDCGACYA-----------FSIASAIQGQIFKSTS--EIEELSIQQ 149
W G + P W C CY I +Q I + S E+E + QQ
Sbjct: 376 -WP--GPL-PHWQAVGCEHCYHGYYGRTALFEVLPITPVLQQAIASNASPEELETHARQQ 431
>gnl|CDD|227250 COG4913, COG4913, Uncharacterized protein conserved in bacteria
[Function unknown].
Length = 1104
Score = 28.8 bits (64), Expect = 4.8
Identities = 12/41 (29%), Positives = 21/41 (51%), Gaps = 2/41 (4%)
Query: 13 PQKKYKKDYRKKATD-SKKKLHWQSNHKKIHTHNQEAQQGL 52
P +++KD R+K D S +L +N K+ T + + L
Sbjct: 590 PTTRWEKDDRRKLGDRSTYRLGS-TNDAKVETLRETVKAML 629
>gnl|CDD|217015 pfam02395, Peptidase_S6, Immunoglobulin A1 protease. This family
consists of immunoglobulin A1 protease proteins. The
immunoglobulin A1 protease cleaves immunoglobulin IgA
and is found in pathogenic bacteria such as Neisseria
gonorrhoeae. Not all of the members of this family are
IgA proteases, espP from E. coli O157:H7 cleaves human
coagulation factor V and hbp is a hemoglobin protease
from E. coli EB1.
Length = 758
Score = 28.2 bits (63), Expect = 6.7
Identities = 19/51 (37%), Positives = 25/51 (49%), Gaps = 4/51 (7%)
Query: 155 IISGNLGCAGGSLRNTLNYVQFAGGLMKEEDYPYKGKQSICKFKRPNIVVD 205
I +G G L+N++N Q AGGL E +Y K S +K I VD
Sbjct: 308 IFTGQNGTI--VLKNSIN--QGAGGLFFEGNYTVKVSASNQTWKGAGIDVD 354
>gnl|CDD|219055 pfam06484, Ten_N, Teneurin Intracellular Region. This family is
found in the intracellular N-terminal region of the
Teneurin family of proteins. These proteins are
'pair-rule' genes and are involved in tissue patterning,
specifically probably neural patterning. The
intracellular domain is cleaved in response to
homophilic interaction of the extracellular domain, and
translocates to the nucleus. Here it probably carries
out some transcriptional regulatory activity. The
length of this region and the conservation suggests that
there may be two structural domains here (personal obs:C
Yeats).
Length = 370
Score = 28.1 bits (62), Expect = 6.8
Identities = 23/95 (24%), Positives = 42/95 (44%), Gaps = 8/95 (8%)
Query: 13 PQKKYKKDYRKKATDSKKKLHWQSNHKKI--HTHNQEAQQGLHGYTLRENHLSDLHPRH- 69
PQK Y KA D ++ + + K + H ++ ++QG +TLRE + P H
Sbjct: 26 PQKSYSSSETLKAYDHDSRMAYGNRVKDLVHHESDEFSRQGTD-FTLRELGFGEPSPPHR 84
Query: 70 --YIKEMTRLTHSRIRRTLVRSPESN-ESVLIPDH 101
Y +M L H + ++ + ++ P+H
Sbjct: 85 SGYRSDMG-LPHRGYSVSTGSDADTETDGIMSPEH 118
>gnl|CDD|182046 PRK09719, PRK09719, hypothetical protein; Provisional.
Length = 89
Score = 26.2 bits (57), Expect = 7.9
Identities = 14/55 (25%), Positives = 25/55 (45%), Gaps = 4/55 (7%)
Query: 41 IHTHNQEAQQGLHGYTLRENHLSDLH----PRHYIKEMTRLTHSRIRRTLVRSPE 91
IH + + +H T E+ + +H RH ++ R + ++ R RSPE
Sbjct: 19 IHRRVVDKRTSMHSRTASESTGARIHRPWCARHQVRPAWRCQYDKLHRVPFRSPE 73
Database: CDD.v3.10
Posted date: Mar 20, 2013 7:55 AM
Number of letters in database: 10,937,602
Number of sequences in database: 44,354
Lambda K H
0.319 0.134 0.425
Gapped
Lambda K H
0.267 0.0764 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 44354
Number of Hits to DB: 15,784,908
Number of extensions: 1460056
Number of successful extensions: 1182
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1139
Number of HSP's successfully gapped: 37
Length of query: 309
Length of database: 10,937,602
Length adjustment: 97
Effective length of query: 212
Effective length of database: 6,635,264
Effective search space: 1406675968
Effective search space used: 1406675968
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (26.4 bits)