Query 021321
Match_columns 314
No_of_seqs 295 out of 2128
Neff 8.6
Searched_HMMs 46136
Date Fri Mar 29 09:19:22 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/021321.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/021321hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 2E-38 4.3E-43 302.7 29.1 223 76-314 40-269 (455)
2 PRK10898 serine endoprotease; 100.0 1.5E-37 3.2E-42 288.5 28.0 213 76-314 45-258 (353)
3 TIGR02038 protease_degS peripl 100.0 1.9E-37 4.1E-42 287.9 28.7 216 73-314 42-257 (351)
4 PRK10942 serine endoprotease; 100.0 6.4E-36 1.4E-40 286.8 28.1 223 76-314 38-290 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 7.4E-35 1.6E-39 278.1 26.4 221 78-314 3-236 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 9.7E-27 2.1E-31 216.7 22.6 219 76-314 33-251 (347)
7 KOG1320 Serine protease [Postt 99.8 7.3E-19 1.6E-23 164.8 14.6 224 74-309 126-356 (473)
8 PF13365 Trypsin_2: Trypsin-li 99.7 7.3E-17 1.6E-21 126.6 9.1 117 122-271 1-120 (120)
9 KOG1421 Predicted signaling-as 99.6 3.8E-15 8.2E-20 142.0 13.0 209 76-313 52-268 (955)
10 PF00089 Trypsin: Trypsin; In 99.6 7.7E-14 1.7E-18 120.5 19.0 170 119-300 24-220 (220)
11 cd00190 Tryp_SPc Trypsin-like 99.5 1.1E-12 2.5E-17 114.0 19.4 176 118-302 23-231 (232)
12 smart00020 Tryp_SPc Trypsin-li 99.4 1.6E-11 3.5E-16 106.9 16.5 172 118-298 24-227 (229)
13 COG3591 V8-like Glu-specific e 99.2 1.9E-09 4.2E-14 94.2 15.9 170 117-304 61-250 (251)
14 KOG3627 Trypsin [Amino acid tr 98.8 5.6E-07 1.2E-11 80.0 18.3 175 120-304 38-254 (256)
15 PF00863 Peptidase_C4: Peptida 98.8 1.8E-07 3.8E-12 81.3 13.9 167 83-294 14-185 (235)
16 COG5640 Secreted trypsin-like 98.6 4.7E-06 1E-10 75.5 17.1 54 250-305 223-279 (413)
17 KOG1320 Serine protease [Postt 98.6 1.1E-07 2.5E-12 89.9 6.7 195 81-301 55-251 (473)
18 KOG1421 Predicted signaling-as 98.4 3.5E-06 7.6E-11 81.7 12.7 203 82-309 524-732 (955)
19 PF05579 Peptidase_S32: Equine 98.1 1.7E-05 3.6E-10 69.2 9.2 117 120-277 114-230 (297)
20 PF03761 DUF316: Domain of unk 97.7 0.0029 6.2E-08 57.2 17.4 111 176-299 159-274 (282)
21 PF10459 Peptidase_S46: Peptid 97.6 0.00033 7.2E-09 70.5 10.3 23 120-142 47-69 (698)
22 PF00548 Peptidase_C3: 3C cyst 97.5 0.002 4.2E-08 54.0 12.1 140 117-275 22-170 (172)
23 PF10459 Peptidase_S46: Peptid 97.4 0.00033 7.1E-09 70.6 6.4 61 245-305 623-688 (698)
24 PF05580 Peptidase_S55: SpoIVB 96.7 0.036 7.7E-07 47.5 11.9 42 249-296 174-215 (218)
25 PF08192 Peptidase_S64: Peptid 96.6 0.014 3.1E-07 57.4 10.0 118 176-303 541-688 (695)
26 PF00949 Peptidase_S7: Peptida 96.1 0.0054 1.2E-07 48.7 3.1 33 246-278 88-120 (132)
27 PF02122 Peptidase_S39: Peptid 95.6 0.07 1.5E-06 45.7 8.1 154 116-296 26-184 (203)
28 TIGR02860 spore_IV_B stage IV 95.2 0.24 5.1E-06 46.9 11.1 42 249-296 354-395 (402)
29 PF00944 Peptidase_S3: Alphavi 95.0 0.026 5.6E-07 44.4 3.2 29 250-278 101-129 (158)
30 PF03510 Peptidase_C24: 2C end 94.0 0.19 4.1E-06 38.2 5.9 103 122-263 1-105 (105)
31 PF09342 DUF1986: Domain of un 94.0 2.1 4.5E-05 37.6 13.0 94 117-216 25-131 (267)
32 PF05416 Peptidase_C37: Southa 92.9 0.12 2.6E-06 48.4 3.8 137 119-277 378-528 (535)
33 PF00947 Pico_P2A: Picornaviru 90.9 0.36 7.8E-06 37.8 4.0 32 244-276 79-110 (127)
34 PF02907 Peptidase_S29: Hepati 90.9 0.35 7.7E-06 38.2 3.9 42 252-296 105-146 (148)
35 PF02395 Peptidase_S6: Immunog 87.0 3.6 7.7E-05 42.5 9.1 49 252-303 213-266 (769)
36 PF01732 DUF31: Putative pepti 81.5 1.1 2.3E-05 42.3 2.5 23 251-273 351-373 (374)
37 COG5510 Predicted small secret 78.8 1.9 4.2E-05 26.9 2.1 24 29-52 1-24 (44)
38 PF12381 Peptidase_C3G: Tungro 75.0 3.2 6.9E-05 35.7 3.1 56 243-304 168-229 (231)
39 PRK10081 entericidin B membran 68.6 4.5 9.8E-05 26.0 2.0 24 29-52 1-24 (48)
40 COG3056 Uncharacterized lipopr 60.8 12 0.00027 31.3 3.7 16 41-56 22-37 (204)
41 PF00571 CBS: CBS domain CBS d 58.7 9 0.0002 24.8 2.3 22 254-275 28-49 (57)
42 PRK14864 putative biofilm stre 58.0 66 0.0014 24.4 7.0 10 118-127 93-102 (104)
43 COG0298 HypC Hydrogenase matur 51.9 38 0.00083 24.3 4.5 47 167-215 5-52 (82)
44 PRK15396 murein lipoprotein; P 46.9 20 0.00044 25.7 2.5 21 32-52 3-23 (78)
45 PF02743 Cache_1: Cache domain 43.6 27 0.00059 24.5 2.9 30 259-303 19-48 (81)
46 PF05578 Peptidase_S31: Pestiv 42.8 92 0.002 25.4 6.0 128 119-278 50-185 (211)
47 PF14827 Cache_3: Sensory doma 39.4 21 0.00045 27.4 1.8 19 258-276 93-111 (116)
48 PF01732 DUF31: Putative pepti 34.9 28 0.0006 32.8 2.3 24 119-142 35-68 (374)
49 COG3065 Slp Starvation-inducib 33.8 2.1E+02 0.0046 24.0 6.9 11 44-54 17-27 (191)
50 cd04627 CBS_pair_14 The CBS do 33.3 28 0.00062 26.2 1.7 22 254-275 97-118 (123)
51 cd04618 CBS_pair_5 The CBS dom 30.4 82 0.0018 22.8 3.8 50 254-307 22-72 (98)
52 PRK10672 rare lipoprotein A; P 29.6 2.6E+02 0.0057 26.2 7.6 29 111-141 85-113 (361)
53 COG3290 CitA Signal transducti 29.5 62 0.0014 31.9 3.6 18 259-276 143-160 (537)
54 cd04603 CBS_pair_KefB_assoc Th 29.2 40 0.00087 24.9 1.9 22 254-275 85-106 (111)
55 PF10049 DUF2283: Protein of u 28.8 38 0.00082 21.8 1.5 12 263-274 36-47 (50)
56 cd04620 CBS_pair_7 The CBS dom 28.5 39 0.00083 24.9 1.7 21 255-275 90-110 (115)
57 PF07172 GRP: Glycine rich pro 27.0 47 0.001 24.8 1.9 9 38-46 9-17 (95)
58 cd04597 CBS_pair_DRTGG_assoc2 25.7 58 0.0013 24.4 2.3 22 254-275 87-108 (113)
59 cd04643 CBS_pair_30 The CBS do 25.5 49 0.0011 24.3 1.8 17 259-275 95-111 (116)
60 PF08669 GCV_T_C: Glycine clea 25.5 78 0.0017 23.0 2.9 23 256-278 34-56 (95)
61 cd01739 LSm11_C The eukaryotic 25.3 1.2E+02 0.0026 20.9 3.4 39 149-187 7-45 (66)
62 cd04592 CBS_pair_EriC_assoc_eu 23.8 65 0.0014 25.2 2.3 22 254-275 22-43 (133)
63 cd04641 CBS_pair_28 The CBS do 23.5 66 0.0014 24.0 2.3 22 253-274 21-42 (120)
64 cd04619 CBS_pair_6 The CBS dom 23.2 57 0.0012 24.1 1.8 22 254-275 88-109 (114)
65 PRK14864 putative biofilm stre 22.0 65 0.0014 24.5 1.8 9 149-157 77-85 (104)
66 cd04602 CBS_pair_IMPDH_2 This 21.8 68 0.0015 23.6 2.0 22 254-275 88-109 (114)
67 cd04614 CBS_pair_1 The CBS dom 21.7 74 0.0016 22.9 2.1 50 254-307 22-71 (96)
68 cd04607 CBS_pair_NTP_transfera 21.6 64 0.0014 23.6 1.8 22 254-275 87-108 (113)
69 COG3448 CBS-domain-containing 21.3 61 0.0013 29.6 1.8 22 254-275 344-365 (382)
70 cd04582 CBS_pair_ABC_OpuCA_ass 21.0 67 0.0014 23.1 1.8 22 254-275 80-101 (106)
71 COG5428 Uncharacterized conser 20.9 72 0.0016 22.2 1.7 16 263-278 37-52 (69)
72 cd04583 CBS_pair_ABC_OpuCA_ass 20.7 74 0.0016 22.9 2.0 21 255-275 84-104 (109)
73 cd04617 CBS_pair_4 The CBS dom 20.6 69 0.0015 23.8 1.8 22 254-275 89-113 (118)
74 PF01455 HupF_HypC: HupF/HypC 20.2 3.1E+02 0.0066 19.0 5.4 43 167-212 5-47 (68)
75 PRK10781 rcsF outer membrane l 20.2 1.1E+02 0.0024 24.4 2.8 15 44-58 10-24 (133)
76 PRK13835 conjugal transfer pro 20.1 2.2E+02 0.0048 23.0 4.6 23 74-96 43-65 (145)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=2e-38 Score=302.74 Aligned_cols=223 Identities=32% Similarity=0.512 Sum_probs=180.7
Q ss_pred hHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhcc------ccCCcccceEEEEEEcC-CCEEEeccccccCCCcCCC
Q 021321 76 DRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVD------GEYAKVEGTGSGFVWDK-FGHIVTNYHVVAKLATDTS 148 (314)
Q Consensus 76 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~------~~~~~~~~~GsGfiI~~-~g~VLT~aHvv~~~~~~~~ 148 (314)
.++.++++++.||||.|.+......+......|...+ .......+.||||+|++ +||||||+|||+
T Consensus 40 ~~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~------- 112 (455)
T PRK10139 40 PSLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN------- 112 (455)
T ss_pred ccHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC-------
Confidence 3699999999999999998765432211111111111 11123457999999985 699999999999
Q ss_pred CcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEe
Q 021321 149 GLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVS 228 (314)
Q Consensus 149 ~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs 228 (314)
+++.+.|++.+++ .++|++++.|+.+||||||++.+ ..+++++|+++..+++|++|+++|||++...+++.|+|+
T Consensus 113 ~a~~i~V~~~dg~----~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS 187 (455)
T PRK10139 113 QAQKISIQLNDGR----EFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIIS 187 (455)
T ss_pred CCCEEEEEECCCC----EEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEc
Confidence 5678899987644 78999999999999999999843 368899999999999999999999999999999999999
Q ss_pred cccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcCc
Q 021321 229 GLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYGT 308 (314)
Q Consensus 229 ~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~~ 308 (314)
...+...... .+..++++|+.+++|+|||||||.+|+||||+++..... +...|++||||++.+++++++|+++|+
T Consensus 188 ~~~r~~~~~~--~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~--~~~~gigfaIP~~~~~~v~~~l~~~g~ 263 (455)
T PRK10139 188 ALGRSGLNLE--GLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPG--GGSVGIGFAIPSNMARTLAQQLIDFGE 263 (455)
T ss_pred cccccccCCC--CcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCC--CCccceEEEEEhHHHHHHHHHHhhcCc
Confidence 8876422211 235689999999999999999999999999999876542 235789999999999999999999999
Q ss_pred cCCCCC
Q 021321 309 PYSNRF 314 (314)
Q Consensus 309 ~~~~~~ 314 (314)
+.|+|+
T Consensus 264 v~r~~L 269 (455)
T PRK10139 264 IKRGLL 269 (455)
T ss_pred ccccce
Confidence 999986
No 2
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=1.5e-37 Score=288.48 Aligned_cols=213 Identities=34% Similarity=0.551 Sum_probs=175.9
Q ss_pred hHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEE
Q 021321 76 DRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKV 155 (314)
Q Consensus 76 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v 155 (314)
.++.++++++.||||.|.+..... .........+.||||+|+++||||||+||++ +++.+.|
T Consensus 45 ~~~~~~~~~~~psvV~v~~~~~~~-----------~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~-------~a~~i~V 106 (353)
T PRK10898 45 ASYNQAVRRAAPAVVNVYNRSLNS-----------TSHNQLEIRTLGSGVIMDQRGYILTNKHVIN-------DADQIIV 106 (353)
T ss_pred chHHHHHHHhCCcEEEEEeEeccc-----------cCcccccccceeeEEEEeCCeEEEecccEeC-------CCCEEEE
Confidence 478899999999999999855321 0011223457999999998899999999999 5677888
Q ss_pred EEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEeccccccc
Q 021321 156 SLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIP 235 (314)
Q Consensus 156 ~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~ 235 (314)
.+.++ + .+++++++.|+.+||||||++.. .+++++|+++..+++|++|+++|||.+...+++.|+|++..+...
T Consensus 107 ~~~dg--~--~~~a~vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~ 180 (353)
T PRK10898 107 ALQDG--R--VFEALLVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGL 180 (353)
T ss_pred EeCCC--C--EEEEEEEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEecccccc
Confidence 88764 3 78899999999999999999854 578899988888999999999999999888999999998776432
Q ss_pred CCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCC-CccceEEEEehHHHHHHHHHHHHcCccCCCCC
Q 021321 236 SPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTG-LSSGVNFAIPIDTVVRTVPYLIVYGTPYSNRF 314 (314)
Q Consensus 236 ~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~-~~~~~~~aipi~~i~~~l~~l~~~~~~~~~~~ 314 (314)
...+ ...++++|+.+.+|+|||||+|.+|+||||+++.....+.+ ...+++||||++.+++++++|+++|++.|+|+
T Consensus 181 ~~~~--~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~l 258 (353)
T PRK10898 181 SPTG--RQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYI 258 (353)
T ss_pred CCcc--ccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccccccc
Confidence 2222 24689999999999999999999999999999876543221 23689999999999999999999999999986
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=1.9e-37 Score=287.90 Aligned_cols=216 Identities=37% Similarity=0.565 Sum_probs=178.1
Q ss_pred ccchHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCCCCcce
Q 021321 73 LEEDRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHR 152 (314)
Q Consensus 73 ~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~ 152 (314)
....++.++++++.||||+|++.....+. .......+.||||+|+++||||||+||++ +++.
T Consensus 42 ~~~~~~~~~~~~~~psVV~I~~~~~~~~~-----------~~~~~~~~~GSG~vi~~~G~IlTn~HVV~-------~~~~ 103 (351)
T TIGR02038 42 TVEISFNKAVRRAAPAVVNIYNRSISQNS-----------LNQLSIQGLGSGVIMSKEGYILTNYHVIK-------KADQ 103 (351)
T ss_pred ccchhHHHHHHhcCCcEEEEEeEeccccc-----------cccccccceEEEEEEeCCeEEEecccEeC-------CCCE
Confidence 44457999999999999999975532210 01123457899999998899999999998 5677
Q ss_pred EEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccc
Q 021321 153 CKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGR 232 (314)
Q Consensus 153 ~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~ 232 (314)
+.|.+.++ + .+++++++.|+.+||||||++.. .+++++++++..+++|++|+++|||.+...+.+.|+|+...+
T Consensus 104 i~V~~~dg--~--~~~a~vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r 177 (351)
T TIGR02038 104 IVVALQDG--R--KFEAELVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGR 177 (351)
T ss_pred EEEEECCC--C--EEEEEEEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccC
Confidence 88888764 3 78999999999999999999854 478889988888999999999999999889999999998876
Q ss_pred cccCCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcCccCCC
Q 021321 233 EIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYGTPYSN 312 (314)
Q Consensus 233 ~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~~~~~~ 312 (314)
...... ....++++|+.+.+|+|||||||.+|+||||+++.....+.....+++|+||++.+++++++|+++|++.|+
T Consensus 178 ~~~~~~--~~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~ 255 (351)
T TIGR02038 178 NGLSSV--GRQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRG 255 (351)
T ss_pred cccCCC--CcceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccce
Confidence 433222 224689999999999999999999999999999765433223346899999999999999999999999998
Q ss_pred CC
Q 021321 313 RF 314 (314)
Q Consensus 313 ~~ 314 (314)
|+
T Consensus 256 ~l 257 (351)
T TIGR02038 256 YI 257 (351)
T ss_pred Ee
Confidence 85
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=6.4e-36 Score=286.81 Aligned_cols=223 Identities=36% Similarity=0.525 Sum_probs=179.2
Q ss_pred hHHHHHHHHhCCceEEEEeeeeecCC---CC-Cccchhhc---c---c-------------------cCCcccceEEEEE
Q 021321 76 DRVVQLFQETSPSVVSIQDLELSKNP---KS-TSSELMLV---D---G-------------------EYAKVEGTGSGFV 126 (314)
Q Consensus 76 ~~~~~~~~~~~~svV~I~~~~~~~~~---~~-~~~~~~~~---~---~-------------------~~~~~~~~GsGfi 126 (314)
.++.++++++.||||.|++......+ .+ .+..||.. + . ......+.||||+
T Consensus 38 ~~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~i 117 (473)
T PRK10942 38 PSLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVI 117 (473)
T ss_pred ccHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEE
Confidence 36999999999999999987654321 00 01122210 0 0 0112356899999
Q ss_pred EcC-CCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCC
Q 021321 127 WDK-FGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRV 205 (314)
Q Consensus 127 I~~-~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~ 205 (314)
|++ +||||||+||+. +++.++|++.+++ .+++++++.|+.+||||||++.+ ..+++++|+++..+++
T Consensus 118 i~~~~G~IlTn~HVv~-------~a~~i~V~~~dg~----~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~ 185 (473)
T PRK10942 118 IDADKGYVVTNNHVVD-------NATKIKVQLSDGR----KFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRV 185 (473)
T ss_pred EECCCCEEEeChhhcC-------CCCEEEEEECCCC----EEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCC
Confidence 986 599999999999 5678899887643 78999999999999999999743 3688999999999999
Q ss_pred CCEEEEEEcCCCCCCCeEeeEEecccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCcc
Q 021321 206 GQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSS 285 (314)
Q Consensus 206 G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~ 285 (314)
|++|+++|+|++...+++.|+|+...+.... ...+.+++++|+.+++|+|||||+|.+|+||||+++..... +...
T Consensus 186 G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~~--~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~--g~~~ 261 (473)
T PRK10942 186 GDYTVAIGNPYGLGETVTSGIVSALGRSGLN--VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPD--GGNI 261 (473)
T ss_pred CCEEEEEcCCCCCCcceeEEEEEEeecccCC--cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCC--CCcc
Confidence 9999999999999899999999988764211 12345789999999999999999999999999999876543 2346
Q ss_pred ceEEEEehHHHHHHHHHHHHcCccCCCCC
Q 021321 286 GVNFAIPIDTVVRTVPYLIVYGTPYSNRF 314 (314)
Q Consensus 286 ~~~~aipi~~i~~~l~~l~~~~~~~~~~~ 314 (314)
+++|+||++.+++++++|+++|++.|||+
T Consensus 262 g~gfaIP~~~~~~v~~~l~~~g~v~rg~l 290 (473)
T PRK10942 262 GIGFAIPSNMVKNLTSQMVEYGQVKRGEL 290 (473)
T ss_pred cEEEEEEHHHHHHHHHHHHhcccccccee
Confidence 89999999999999999999999999985
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=7.4e-35 Score=278.12 Aligned_cols=221 Identities=40% Similarity=0.554 Sum_probs=178.8
Q ss_pred HHHHHHHhCCceEEEEeeeeecCCCC---C---ccchhhc-c------ccCCcccceEEEEEEcCCCEEEeccccccCCC
Q 021321 78 VVQLFQETSPSVVSIQDLELSKNPKS---T---SSELMLV-D------GEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLA 144 (314)
Q Consensus 78 ~~~~~~~~~~svV~I~~~~~~~~~~~---~---~~~~~~~-~------~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~ 144 (314)
+.++++++.||||.|.+......... . ...++.. . .......+.||||+|+++||||||+||++
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~--- 79 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVD--- 79 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcC---
Confidence 67899999999999998764432111 0 1112211 0 01223567999999998899999999999
Q ss_pred cCCCCcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEe
Q 021321 145 TDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTT 224 (314)
Q Consensus 145 ~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~ 224 (314)
+++.+.|.+.++. .+++++++.|+.+||||||++.+ ..++++.|+++..+++|++|+++|||++...+++.
T Consensus 80 ----~~~~i~V~~~~~~----~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~ 150 (428)
T TIGR02037 80 ----GADEITVTLSDGR----EFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTS 150 (428)
T ss_pred ----CCCeEEEEeCCCC----EEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEE
Confidence 5677888887643 78899999999999999999854 46899999888899999999999999999999999
Q ss_pred eEEecccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHH
Q 021321 225 GVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLI 304 (314)
Q Consensus 225 G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~ 304 (314)
|+|+...+... ....+..++++|+.+.+|+|||||||.+|+||||+++..... +...+++||||++.+++++++|+
T Consensus 151 G~vs~~~~~~~--~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~--g~~~g~~faiP~~~~~~~~~~l~ 226 (428)
T TIGR02037 151 GIVSALGRSGL--GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPS--GGNVGIGFAIPSNMAKNVVDQLI 226 (428)
T ss_pred EEEEecccCcc--CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCC--CCccceEEEEEhHHHHHHHHHHH
Confidence 99998876521 122345689999999999999999999999999999876542 23568999999999999999999
Q ss_pred HcCccCCCCC
Q 021321 305 VYGTPYSNRF 314 (314)
Q Consensus 305 ~~~~~~~~~~ 314 (314)
++|++.|+|+
T Consensus 227 ~~g~~~~~~l 236 (428)
T TIGR02037 227 EGGKVQRGWL 236 (428)
T ss_pred hcCcCcCCcC
Confidence 9999999986
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=9.7e-27 Score=216.65 Aligned_cols=219 Identities=43% Similarity=0.613 Sum_probs=178.1
Q ss_pred hHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEE
Q 021321 76 DRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKV 155 (314)
Q Consensus 76 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v 155 (314)
..+.++++++.|+||+|........ ..++..........+.||||+++++|||+|+.|++. +++.+.+
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~-----~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~-------~a~~i~v 100 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKL-----RSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIA-------GAEEITV 100 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecc-----hhcccCCcccccccccccEEEEcCCeEEEecceecC-------CcceEEE
Confidence 5788999999999999998665432 011100000001158999999998899999999999 5577777
Q ss_pred EEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEeccccccc
Q 021321 156 SLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIP 235 (314)
Q Consensus 156 ~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~ 235 (314)
.+.+ |+ .+++++++.|+..|+|++|++.... ++.+.++++..++.|+++.++|+|++...+++.|+++...+. .
T Consensus 101 ~l~d--g~--~~~a~~vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~ 174 (347)
T COG0265 101 TLAD--GR--EVPAKLVGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G 174 (347)
T ss_pred EeCC--CC--EEEEEEEecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c
Confidence 7744 44 7899999999999999999986543 788899999999999999999999999999999999998885 2
Q ss_pred CCCCccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcCccCCCCC
Q 021321 236 SPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYGTPYSNRF 314 (314)
Q Consensus 236 ~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~~~~~~~~ 314 (314)
......+.+++|+|+.+++|+||||++|.+|++|||++......+. ..+++|+||++.+++++++++.+|++.|+++
T Consensus 175 v~~~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~--~~gigfaiP~~~~~~v~~~l~~~G~v~~~~l 251 (347)
T COG0265 175 VGSAGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGG--SSGIGFAIPVNLVAPVLDELISKGKVVRGYL 251 (347)
T ss_pred ccCcccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCC--cceeEEEecHHHHHHHHHHHHHcCCcccccc
Confidence 2111224678999999999999999999999999999998776432 4568999999999999999999899999874
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.80 E-value=7.3e-19 Score=164.82 Aligned_cols=224 Identities=39% Similarity=0.424 Sum_probs=170.8
Q ss_pred cchHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCC--CCcc
Q 021321 74 EEDRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDT--SGLH 151 (314)
Q Consensus 74 ~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~--~~~~ 151 (314)
.....+.+.++...|+|.|....--.... ...........||||+++.||+++||+||+....... .+..
T Consensus 126 ~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~--------~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~ 197 (473)
T KOG1320|consen 126 YKAFVAAVFEECDLAVVYIESEEFWKGMN--------PFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTV 197 (473)
T ss_pred hhhhHHHhhhcccceEEEEeeccccCCCc--------ccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcc
Confidence 35678889999999999999743211110 1223345667999999999999999999997532211 1112
Q ss_pred --eEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEec
Q 021321 152 --RCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSG 229 (314)
Q Consensus 152 --~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~ 229 (314)
.+.|...++.|+ ..++.+.+.|+..|+|+++++.+....++++++-+..+..|+++..+|.|++...+.+.|.++.
T Consensus 198 l~~vqi~aa~~~~~--s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~ 275 (473)
T KOG1320|consen 198 LLRVQIDAAIGPGN--SGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSG 275 (473)
T ss_pred eeeEEEEEeecCCc--cCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeeccccc
Confidence 244444444334 6788999999999999999976543377888888899999999999999999999999999988
Q ss_pred ccccccCCCC---ccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHc
Q 021321 230 LGREIPSPNG---RAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVY 306 (314)
Q Consensus 230 ~~~~~~~~~~---~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~ 306 (314)
..+....... ....+++++|+.+..|.||+|++|.+|++||+++..... .+...+++|++|++.++.++.+..++
T Consensus 276 ~~R~~~~lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~r--i~~~~~iSf~~p~d~vl~~v~r~~e~ 353 (473)
T KOG1320|consen 276 QLRKSFKLGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTR--IGFSHGISFKIPIDTVLVIVLRLGEF 353 (473)
T ss_pred ccccccccCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEE--eeccccceeccCchHhhhhhhhhhhh
Confidence 8775443222 345678999999999999999999999999999887654 23357899999999999999888654
Q ss_pred Ccc
Q 021321 307 GTP 309 (314)
Q Consensus 307 ~~~ 309 (314)
...
T Consensus 354 ~~~ 356 (473)
T KOG1320|consen 354 QIS 356 (473)
T ss_pred cee
Confidence 443
No 8
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.70 E-value=7.3e-17 Score=126.58 Aligned_cols=117 Identities=33% Similarity=0.492 Sum_probs=69.1
Q ss_pred EEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEE--EEEEEeCCC-CcEEEEEEeeCCCccceeecC
Q 021321 122 GSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYRE--GKMVGCDPA-YDLAVLKVDVEGFELKPVVLG 198 (314)
Q Consensus 122 GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~--a~v~~~d~~-~DlAlL~v~~~~~~~~~~~l~ 198 (314)
||||+|+++|+||||+||+.+...... .....+.+...++. ... ++++..++. +|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~-~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~D~All~v~------------ 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQ-PDNSSVEVVFPDGR--RVPPVAEVVYFDPDDYDLALLKVD------------ 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G--TCSEEEEEETTSC--EEETEEEEEEEETT-TTEEEEEES------------
T ss_pred CEEEEEcCCceEEEchhheeccccccc-CCCCEEEEEecCCC--EEeeeEEEEEECCccccEEEEEEe------------
Confidence 899999997799999999996432110 12334444433344 345 899999998 999999997
Q ss_pred CCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEEE
Q 021321 199 TSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGV 271 (314)
Q Consensus 199 ~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI 271 (314)
.....+... ...+........... ... ...+ +++.+.+|+|||||||.+|+||||
T Consensus 66 ---------~~~~~~~~~-----~~~~~~~~~~~~~~~--~~~-~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 66 ---------PWTGVGGGV-----RVPGSTSGVSPTSTN--DNR-MLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ---------CEEEEEEEE-----EEEEEEEEEEEEEEE--ETE-EEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ---------cccceeeee-----EeeeeccccccccCc--ccc-eeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 000000000 000000000000000 000 0113 799999999999999999999997
No 9
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.62 E-value=3.8e-15 Score=142.04 Aligned_cols=209 Identities=25% Similarity=0.301 Sum_probs=161.4
Q ss_pred hHHHHHHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcC-CCEEEeccccccCCCcCCCCcceEE
Q 021321 76 DRVVQLFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDK-FGHIVTNYHVVAKLATDTSGLHRCK 154 (314)
Q Consensus 76 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~-~g~VLT~aHvv~~~~~~~~~~~~~~ 154 (314)
.++...+.++-+|||.|+...... ++......+.+|||++++ .||+|||+|++.. +.-...
T Consensus 52 e~w~~~ia~VvksvVsI~~S~v~~------------fdtesag~~~atgfvvd~~~gyiLtnrhvv~p------gP~va~ 113 (955)
T KOG1421|consen 52 EDWRNTIANVVKSVVSIRFSAVRA------------FDTESAGESEATGFVVDKKLGYILTNRHVVAP------GPFVAS 113 (955)
T ss_pred hhhhhhhhhhcccEEEEEehheee------------cccccccccceeEEEEecccceEEEeccccCC------CCceeE
Confidence 378889999999999999765321 222345567899999996 4899999999985 445556
Q ss_pred EEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCC---ccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEeccc
Q 021321 155 VSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGF---ELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLG 231 (314)
Q Consensus 155 v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~---~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~ 231 (314)
+.+.+.. ..+...++.|+.+|+.+++.++... .+..+.+ ..+..++|.+++++|+..+...++..|.++.+.
T Consensus 114 avf~n~e----e~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~l-ap~~akvgseirvvgNDagEklsIlagflSrld 188 (955)
T KOG1421|consen 114 AVFDNHE----EIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICL-APELAKVGSEIRVVGNDAGEKLSILAGFLSRLD 188 (955)
T ss_pred EEecccc----cCCcccccCCchhhcceeecChhhcceeeeecccc-CccccccCCceEEecCCccceEEeehhhhhhcc
Confidence 6665543 4556677889999999999986532 2445556 335668999999999988888888889999888
Q ss_pred ccccCCCCcccc----ceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcC
Q 021321 232 REIPSPNGRAIR----GAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYG 307 (314)
Q Consensus 232 ~~~~~~~~~~~~----~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~ 307 (314)
+....+.+..+. .++|..+....|.||+|++|.+|..|.++..+... .+..|++|++.+++.|.=+..+.
T Consensus 189 r~apdyg~~~yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~s------sas~ffLpLdrV~RaL~clq~n~ 262 (955)
T KOG1421|consen 189 RNAPDYGEDTYNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS------SASDFFLPLDRVVRALRCLQNNT 262 (955)
T ss_pred CCCccccccccccccceeeeehhcCCCCCCCCceecccceEEeeecCCccc------ccccceeeccchhhhhhhhhcCC
Confidence 866554433222 35677777889999999999999999999887654 34569999999999999998888
Q ss_pred ccCCCC
Q 021321 308 TPYSNR 313 (314)
Q Consensus 308 ~~~~~~ 313 (314)
.++||-
T Consensus 263 PItRGt 268 (955)
T KOG1421|consen 263 PITRGT 268 (955)
T ss_pred Ccccce
Confidence 888874
No 10
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.61 E-value=7.7e-14 Score=120.49 Aligned_cols=170 Identities=21% Similarity=0.254 Sum_probs=110.1
Q ss_pred cceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEec-----CCCCeEEEEEEEEEeC-------CCCcEEEEEEe
Q 021321 119 EGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFD-----AKGNGFYREGKMVGCD-------PAYDLAVLKVD 186 (314)
Q Consensus 119 ~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~-----~~g~~~~~~a~v~~~d-------~~~DlAlL~v~ 186 (314)
...|+|++|++ .+|||++||+.. ...+.+.+.. ..+....+...-+..+ ..+|+|||+++
T Consensus 24 ~~~C~G~li~~-~~vLTaahC~~~-------~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~ 95 (220)
T PF00089_consen 24 RFFCTGTLISP-RWVLTAAHCVDG-------ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLD 95 (220)
T ss_dssp EEEEEEEEEET-TEEEEEGGGHTS-------GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEES
T ss_pred CeeEeEEeccc-cccccccccccc-------ccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 67899999997 799999999994 3456665543 1221123333333222 25799999998
Q ss_pred eC---CCccceeecCCC-CCCCCCCEEEEEEcCCCCCC----CeEeeEEecccc---cccCCCCccccceEEEee----c
Q 021321 187 VE---GFELKPVVLGTS-HDLRVGQSCFAIGNPYGFED----TLTTGVVSGLGR---EIPSPNGRAIRGAIQTDA----A 251 (314)
Q Consensus 187 ~~---~~~~~~~~l~~~-~~~~~G~~v~~iG~p~~~~~----~~~~G~vs~~~~---~~~~~~~~~~~~~i~~~~----~ 251 (314)
.+ ...+.++.+... ..+..|+.+.++||+..... ......+..+.. ... .........++... .
T Consensus 96 ~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~-~~~~~~~~~~c~~~~~~~~ 174 (220)
T PF00089_consen 96 RPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSS-YNDNLTPNMICAGSSGSGD 174 (220)
T ss_dssp SSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHH-TTTTSTTTEEEEETTSSSB
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc-ccccccccccccccccccc
Confidence 76 334677887552 34588999999999975332 233333332211 111 11112245677765 7
Q ss_pred cCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHH
Q 021321 252 INSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTV 300 (314)
Q Consensus 252 ~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l 300 (314)
.+.|+|||||++.++.|+||++.+.. +.. .....+.+++..+.+||
T Consensus 175 ~~~g~sG~pl~~~~~~lvGI~s~~~~-c~~--~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 175 ACQGDSGGPLICNNNYLVGIVSFGEN-CGS--PNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp GGTTTTTSEEEETTEEEEEEEEEESS-SSB--TTSEEEEEEGGGGHHHH
T ss_pred ccccccccccccceeeecceeeecCC-CCC--CCcCEEEEEHHHhhccC
Confidence 89999999999876679999998832 221 22357889999998886
No 11
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.53 E-value=1.1e-12 Score=113.96 Aligned_cols=176 Identities=18% Similarity=0.150 Sum_probs=104.7
Q ss_pred ccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCC-----eEEEEEEEEEeC-------CCCcEEEEEE
Q 021321 118 VEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGN-----GFYREGKMVGCD-------PAYDLAVLKV 185 (314)
Q Consensus 118 ~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~-----~~~~~a~v~~~d-------~~~DlAlL~v 185 (314)
....|+|++|++ .+|||+|||+... ....+.|.+...... ...+...-+..+ ..+|||||++
T Consensus 23 ~~~~C~GtlIs~-~~VLTaAhC~~~~-----~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L 96 (232)
T cd00190 23 GRHFCGGSLISP-RWVLTAAHCVYSS-----APSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKL 96 (232)
T ss_pred CcEEEEEEEeeC-CEEEECHHhcCCC-----CCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEE
Confidence 356899999997 8999999999853 134566665432211 112223333333 3589999999
Q ss_pred eeCC---CccceeecCCCC-CCCCCCEEEEEEcCCCCCC-----CeEeeEEeccc---ccccCCC-CccccceEEE----
Q 021321 186 DVEG---FELKPVVLGTSH-DLRVGQSCFAIGNPYGFED-----TLTTGVVSGLG---REIPSPN-GRAIRGAIQT---- 248 (314)
Q Consensus 186 ~~~~---~~~~~~~l~~~~-~~~~G~~v~~iG~p~~~~~-----~~~~G~vs~~~---~~~~~~~-~~~~~~~i~~---- 248 (314)
+.+- ..+.|+.|.... .+..|+.++++||+..... ......+..+. +...... .......++.
T Consensus 97 ~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~ 176 (232)
T cd00190 97 KRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLE 176 (232)
T ss_pred CCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCC
Confidence 8652 236788885543 6778899999999765322 12222222111 1100000 0011223433
Q ss_pred -eeccCCCCcccceecCC---CeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHH
Q 021321 249 -DAAINSGNSGGPLMNSF---GHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPY 302 (314)
Q Consensus 249 -~~~~~~G~SGGPl~n~~---G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~ 302 (314)
+...|.|+|||||+... +.++||.+++.. ++. .........+....+||++
T Consensus 177 ~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~--~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 177 GGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG-CAR--PNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred CCCccccCCCCCcEEEEeCCEEEEEEEEehhhc-cCC--CCCCCEEEEcHHhhHHhhc
Confidence 23478999999999764 789999999864 321 1223355667888888764
No 12
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.40 E-value=1.6e-11 Score=106.90 Aligned_cols=172 Identities=18% Similarity=0.167 Sum_probs=99.0
Q ss_pred ccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCe----EEEEEEEEEeC-------CCCcEEEEEEe
Q 021321 118 VEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNG----FYREGKMVGCD-------PAYDLAVLKVD 186 (314)
Q Consensus 118 ~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~----~~~~a~v~~~d-------~~~DlAlL~v~ 186 (314)
....|+|++|++ .+|||+|||+.... ...+.|.+....... ......-+..+ ..+|||||+++
T Consensus 24 ~~~~C~GtlIs~-~~VLTaahC~~~~~-----~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~ 97 (229)
T smart00020 24 GRHFCGGSLISP-RWVLTAAHCVYGSD-----PSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLK 97 (229)
T ss_pred CCcEEEEEEecC-CEEEECHHHcCCCC-----CcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEEC
Confidence 356899999997 89999999998531 245677765433211 12233333322 46899999998
Q ss_pred eC---CCccceeecCCC-CCCCCCCEEEEEEcCCCCC------CCeEeeEEecccc---cccCCCC-ccccceEEE----
Q 021321 187 VE---GFELKPVVLGTS-HDLRVGQSCFAIGNPYGFE------DTLTTGVVSGLGR---EIPSPNG-RAIRGAIQT---- 248 (314)
Q Consensus 187 ~~---~~~~~~~~l~~~-~~~~~G~~v~~iG~p~~~~------~~~~~G~vs~~~~---~~~~~~~-~~~~~~i~~---- 248 (314)
.+ ...+.++.|... ..+..++.+.+.||+.... .......+..+.. ....... ......++.
T Consensus 98 ~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~ 177 (229)
T smart00020 98 SPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLE 177 (229)
T ss_pred cccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCC
Confidence 65 223677777543 3567789999999986542 1111222221111 1000000 001123333
Q ss_pred -eeccCCCCcccceecCCC--eEEEEEcccccCCCCCCccceEEEEehHHHHH
Q 021321 249 -DAAINSGNSGGPLMNSFG--HVIGVNTATFTRKGTGLSSGVNFAIPIDTVVR 298 (314)
Q Consensus 249 -~~~~~~G~SGGPl~n~~G--~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~ 298 (314)
....|.|+|||||+...+ .++||++++. .++. .........+....+
T Consensus 178 ~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~--~~~~~~~~~i~~~~~ 227 (229)
T smart00020 178 GGKDACQGDSGGPLVCNDGRWVLVGIVSWGS-GCAR--PGKPGVYTRVSSYLD 227 (229)
T ss_pred CCCcccCCCCCCeeEEECCCEEEEEEEEECC-CCCC--CCCCCEEEEeccccc
Confidence 345789999999997543 8999999986 3321 122334455554443
No 13
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.15 E-value=1.9e-09 Score=94.18 Aligned_cols=170 Identities=18% Similarity=0.127 Sum_probs=97.1
Q ss_pred cccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEE--ecCCCC-eEEEEEEEEE-eCC---CCcEEEEEEeeCC
Q 021321 117 KVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSL--FDAKGN-GFYREGKMVG-CDP---AYDLAVLKVDVEG 189 (314)
Q Consensus 117 ~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~--~~~~g~-~~~~~a~v~~-~d~---~~DlAlL~v~~~~ 189 (314)
.++..+++|+|++ ..+||++||+..... +...+.+.. ...++. .+.+...... ... ..|.+...+....
T Consensus 61 tG~~~~~~~lI~p-ntvLTa~Hc~~s~~~---G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~ 136 (251)
T COG3591 61 TGRLCTAATLIGP-NTVLTAGHCIYSPDY---GEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAA 136 (251)
T ss_pred CCcceeeEEEEcC-ceEEEeeeEEecCCC---ChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHH
Confidence 3444667799998 899999999985432 112222211 111111 1111111111 112 3455555553211
Q ss_pred C--------ccceeecCCCCCCCCCCEEEEEEcCCCCCCCeE----eeEEecccccccCCCCccccceEEEeeccCCCCc
Q 021321 190 F--------ELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLT----TGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNS 257 (314)
Q Consensus 190 ~--------~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~----~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~S 257 (314)
. ......+......+.++.+.++|||.......+ .+.+..+. ...+.+++.+++|+|
T Consensus 137 ~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~-----------~~~l~y~~dT~pG~S 205 (251)
T COG3591 137 LESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK-----------GNKLFYDADTLPGSS 205 (251)
T ss_pred hccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEe-----------cceEEEEecccCCCC
Confidence 1 122223333456678899999999987653322 22222211 135888999999999
Q ss_pred ccceecCCCeEEEEEcccccCCCCCCccceE-EEEehHHHHHHHHHHH
Q 021321 258 GGPLMNSFGHVIGVNTATFTRKGTGLSSGVN-FAIPIDTVVRTVPYLI 304 (314)
Q Consensus 258 GGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~-~aipi~~i~~~l~~l~ 304 (314)
|+|+++.+.++||+++.+....+. ...+ .+.-...++++|+++.
T Consensus 206 GSpv~~~~~~vigv~~~g~~~~~~---~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 206 GSPVLISKDEVIGVHYNGPGANGG---SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CCceEecCceEEEEEecCCCcccc---cccCcceEecHHHHHHHHHhh
Confidence 999999988999999988664432 2223 3344567788887764
No 14
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.80 E-value=5.6e-07 Score=80.00 Aligned_cols=175 Identities=19% Similarity=0.167 Sum_probs=98.3
Q ss_pred ceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecC-------CC---CeEEEEEEEEEeC-------CC-CcEE
Q 021321 120 GTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDA-------KG---NGFYREGKMVGCD-------PA-YDLA 181 (314)
Q Consensus 120 ~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~-------~g---~~~~~~a~v~~~d-------~~-~DlA 181 (314)
..+.|.+|++ .+|||++||+.+.. .. .+.|.+... .+ ....+. +++ .+ .. +|||
T Consensus 38 ~~Cggsli~~-~~vltaaHC~~~~~----~~-~~~V~~G~~~~~~~~~~~~~~~~~~v~-~~i-~H~~y~~~~~~~nDia 109 (256)
T KOG3627|consen 38 HLCGGSLISP-RWVLTAAHCVKGAS----AS-LYTVRLGEHDINLSVSEGEEQLVGDVE-KII-VHPNYNPRTLENNDIA 109 (256)
T ss_pred eeeeeEEeeC-CEEEEChhhCCCCC----Cc-ceEEEECccccccccccCchhhhceee-EEE-ECCCCCCCCCCCCCEE
Confidence 3788888865 79999999998531 00 455555311 01 111122 232 22 13 8999
Q ss_pred EEEEeeC---CCccceeecCCCCC---CCCCCEEEEEEcCCCCC------CCeEeeEEeccc---ccccCCCC-ccccce
Q 021321 182 VLKVDVE---GFELKPVVLGTSHD---LRVGQSCFAIGNPYGFE------DTLTTGVVSGLG---REIPSPNG-RAIRGA 245 (314)
Q Consensus 182 lL~v~~~---~~~~~~~~l~~~~~---~~~G~~v~~iG~p~~~~------~~~~~G~vs~~~---~~~~~~~~-~~~~~~ 245 (314)
||+++.+ ...+.++.|..... ...+..+++.||+.... .......+.-+. +....... ......
T Consensus 110 ll~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~ 189 (256)
T KOG3627|consen 110 LLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTM 189 (256)
T ss_pred EEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCE
Confidence 9999865 23466777743332 34458899999975321 112222222221 11111100 001123
Q ss_pred EEEe-----eccCCCCcccceecCC---CeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHH
Q 021321 246 IQTD-----AAINSGNSGGPLMNSF---GHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLI 304 (314)
Q Consensus 246 i~~~-----~~~~~G~SGGPl~n~~---G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~ 304 (314)
++.. ...|.|+|||||+..+ ..++||++++...++.....+. ...+....+|+++.+
T Consensus 190 ~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~v--yt~V~~y~~WI~~~~ 254 (256)
T KOG3627|consen 190 LCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGV--YTRVSSYLDWIKENI 254 (256)
T ss_pred EeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeE--EeEhHHhHHHHHHHh
Confidence 5554 2368999999999664 5999999999765433222333 566777888887754
No 15
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.79 E-value=1.8e-07 Score=81.27 Aligned_cols=167 Identities=15% Similarity=0.204 Sum_probs=85.1
Q ss_pred HHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCC
Q 021321 83 QETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKG 162 (314)
Q Consensus 83 ~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g 162 (314)
.-+...|++|.... ......=-|+..+ .+|+|++|.++.. ...++|.. ..|
T Consensus 14 n~Ia~~ic~l~n~s-------------------~~~~~~l~gigyG--~~iItn~HLf~~n------ng~L~i~s--~hG 64 (235)
T PF00863_consen 14 NPIASNICRLTNES-------------------DGGTRSLYGIGYG--SYIITNAHLFKRN------NGELTIKS--QHG 64 (235)
T ss_dssp HHHHTTEEEEEEEE-------------------TTEEEEEEEEEET--TEEEEEGGGGSST------TCEEEEEE--TTE
T ss_pred chhhheEEEEEEEe-------------------CCCeEEEEEEeEC--CEEEEChhhhccC------CCeEEEEe--Cce
Confidence 34566788887432 2333455667776 5999999999753 23455554 333
Q ss_pred CeEEEE---EEEEEeCCCCcEEEEEEeeCCCccceeec-CCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCC
Q 021321 163 NGFYRE---GKMVGCDPAYDLAVLKVDVEGFELKPVVL-GTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPN 238 (314)
Q Consensus 163 ~~~~~~---a~v~~~d~~~DlAlL~v~~~~~~~~~~~l-~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~ 238 (314)
.- .+. .--+..-+..||.++|+.. +++|.+- .....+..+++|+++|.-+..... .-.|+.........+
T Consensus 65 ~f-~v~nt~~lkv~~i~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~--~s~vSesS~i~p~~~ 138 (235)
T PF00863_consen 65 EF-TVPNTTQLKVHPIEGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSI--SSTVSESSWIYPEEN 138 (235)
T ss_dssp EE-EECEGGGSEEEE-TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCCC--EEEEEEEEEEEEETT
T ss_pred EE-EcCCccccceEEeCCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCCe--eEEECCceEEeecCC
Confidence 21 111 1122344789999999963 4555431 133678899999999975443222 123332222111112
Q ss_pred CccccceEEEeeccCCCCcccceecC-CCeEEEEEcccccCCCCCCccceEEEEehH
Q 021321 239 GRAIRGAIQTDAAINSGNSGGPLMNS-FGHVIGVNTATFTRKGTGLSSGVNFAIPID 294 (314)
Q Consensus 239 ~~~~~~~i~~~~~~~~G~SGGPl~n~-~G~vvGI~s~~~~~~~~~~~~~~~~aipi~ 294 (314)
..+-.+-..+..|+-|+||++. ||.+|||++..... ...+|+.|+.
T Consensus 139 ----~~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~------~~~N~F~~f~ 185 (235)
T PF00863_consen 139 ----SHFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT------SSRNYFTPFP 185 (235)
T ss_dssp ----TTEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT------TSSEEEEE--
T ss_pred ----CCeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC------CCeEEEEcCC
Confidence 2345566667899999999986 99999999987653 3456777654
No 16
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.58 E-value=4.7e-06 Score=75.50 Aligned_cols=54 Identities=22% Similarity=0.266 Sum_probs=39.6
Q ss_pred eccCCCCcccceecC--CCeE-EEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHH
Q 021321 250 AAINSGNSGGPLMNS--FGHV-IGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIV 305 (314)
Q Consensus 250 ~~~~~G~SGGPl~n~--~G~v-vGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~ 305 (314)
...|.|+||||+|-. +|++ +||++|+.+.++.....+ ...-++....||++.++
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~g--VyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPG--VYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcce--eEEehhHHHHHHHHHhc
Confidence 357899999999954 5665 999999988765433334 44558889999888544
No 17
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=98.56 E-value=1.1e-07 Score=89.93 Aligned_cols=195 Identities=25% Similarity=0.300 Sum_probs=128.6
Q ss_pred HHHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecC
Q 021321 81 LFQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDA 160 (314)
Q Consensus 81 ~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~ 160 (314)
..+....|++.+.+..... .....|+... +....|+||.+.. ..++|++|++.... +...+.+. .
T Consensus 55 ~~~~~~~s~~~v~~~~~~~------~~~~pw~~~~-q~~~~~s~f~i~~-~~lltn~~~v~~~~------~~~~v~v~-~ 119 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEP------SSVLPWQRTR-QFSSGGSGFAIYG-KKLLTNAHVVAPNN------DHKFVTVK-K 119 (473)
T ss_pred CccccccceeEEEeecccc------cccCcceeee-hhcccccchhhcc-cceeecCccccccc------cccccccc-c
Confidence 3445566777777654322 1111233222 6677899999986 78999999998432 22233332 3
Q ss_pred CCCeEEEEEEEEEeCCCCcEEEEEEeeCCCc--cceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCC
Q 021321 161 KGNGFYREGKMVGCDPAYDLAVLKVDVEGFE--LKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPN 238 (314)
Q Consensus 161 ~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~--~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~ 238 (314)
.|....+.+++...-.+.|+|++.++..... ..++.++ +-+...+.++++| +....++.|.|.......+...
T Consensus 120 ~gs~~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~--~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~ 194 (473)
T KOG1320|consen 120 HGSPRKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELG--DIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHS 194 (473)
T ss_pred CCCchhhhhhHHHhhhcccceEEEEeeccccCCCcccccC--CCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCC
Confidence 3444466788888888999999999865332 2334443 3345557899998 6667899999998766543322
Q ss_pred CccccceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHH
Q 021321 239 GRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVP 301 (314)
Q Consensus 239 ~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~ 301 (314)
+. ....+++++.+.+|+||+|.+...+++.|+........ ..+.+.+|.-.+..++.
T Consensus 195 ~~-~l~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~-----~~i~~~i~~~~~~~~~~ 251 (473)
T KOG1320|consen 195 ST-VLLRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTP-----ENILYVIPLGVSSHFRT 251 (473)
T ss_pred Cc-ceeeEEEEEeecCCccCCCeEEccccccceEEEEEecC-----Ccccceeecceeeeecc
Confidence 22 23468999999999999999987689999999886432 13457777665555443
No 18
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=98.42 E-value=3.5e-06 Score=81.72 Aligned_cols=203 Identities=15% Similarity=0.128 Sum_probs=133.6
Q ss_pred HHHhCCceEEEEeeeeecCCCCCccchhhccccCCcccceEEEEEEcC-CCEEEeccccccCCCcCCCCcceEEEEEecC
Q 021321 82 FQETSPSVVSIQDLELSKNPKSTSSELMLVDGEYAKVEGTGSGFVWDK-FGHIVTNYHVVAKLATDTSGLHRCKVSLFDA 160 (314)
Q Consensus 82 ~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GsGfiI~~-~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~ 160 (314)
.+++..+.|.+....... -++.......|||.|++. .|++++.+.++.- ...+.+|++.+.
T Consensus 524 ~~~i~~~~~~v~~~~~~~------------l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~------d~~d~~vt~~dS 585 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVN------------LDGVSSDIYKGTALIMDTSKGLGVVSRSVVPS------DAKDQRVTEADS 585 (955)
T ss_pred hhHHhhhhhhheeceeec------------cccchhhhhcCceEEEEccCCceeEecccCCc------hhhceEEeeccc
Confidence 456666777776544322 111222456899999984 4999999999974 456778888776
Q ss_pred CCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccc-cccCCCC
Q 021321 161 KGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGR-EIPSPNG 239 (314)
Q Consensus 161 ~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~-~~~~~~~ 239 (314)
. ...|.+...++...+|.+|.++. ....++| ....+..||++...|+-.........-.+..+.. .+.....
T Consensus 586 ~----~i~a~~~fL~~t~n~a~~kydp~--~~~~~kl-~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~ 658 (955)
T KOG1421|consen 586 D----GIPANVSFLHPTENVASFKYDPA--LEVQLKL-TDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVM 658 (955)
T ss_pred c----cccceeeEecCccceeEeccChh--Hhhhhcc-ceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCC
Confidence 5 56888999999999999999854 2345666 4467889999999999755432211111211110 0000000
Q ss_pred ccc----cceEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcCcc
Q 021321 240 RAI----RGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYGTP 309 (314)
Q Consensus 240 ~~~----~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~~~ 309 (314)
.++ .+.|.+++.+.-++--|-+.|.+|+|+|++-....+.-.+...-+-|.+.+.++++.|++|+..+.+
T Consensus 659 pr~r~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~ 732 (955)
T KOG1421|consen 659 PRFRATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSA 732 (955)
T ss_pred cceeecceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCC
Confidence 001 2345565554444445577888999999998887765545555677889999999999999766544
No 19
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=98.12 E-value=1.7e-05 Score=69.22 Aligned_cols=117 Identities=22% Similarity=0.301 Sum_probs=62.6
Q ss_pred ceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCC
Q 021321 120 GTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGT 199 (314)
Q Consensus 120 ~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~ 199 (314)
++|+.|-++.+-.|+|+.||+.+ +..++.... . . +...++.+-|+|.-.++.-...+|.++++.
T Consensus 114 Gsggvft~~~~~vvvTAtHVlg~--------~~a~v~~~g---~--~---~~~tF~~~GDfA~~~~~~~~G~~P~~k~a~ 177 (297)
T PF05579_consen 114 GSGGVFTIGGNTVVVTATHVLGG--------NTARVSGVG---T--R---RMLTFKKNGDFAEADITNWPGAAPKYKFAQ 177 (297)
T ss_dssp EEEEEEECTTEEEEEEEHHHCBT--------TEEEEEETT---E--E---EEEEEEEETTEEEEEETTS-S---B--B-T
T ss_pred cccceEEECCeEEEEEEEEEcCC--------CeEEEEecc---e--E---EEEEEeccCcEEEEECCCCCCCCCceeecC
Confidence 34444445544579999999973 333444321 1 1 334556778999999943333577777742
Q ss_pred CCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEEEEccccc
Q 021321 200 SHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIGVNTATFT 277 (314)
Q Consensus 200 ~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~ 277 (314)
-..|.--|.- ...+..|.|..- ..+++ ..+||||+|++..+|.+||+|+..-.
T Consensus 178 ---~~~GrAyW~t------~tGvE~G~ig~~-------------~~~~f---T~~GDSGSPVVt~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 178 ---NYTGRAYWLT------STGVEPGFIGGG-------------GAVCF---TGPGDSGSPVVTEDGDLIGVHTGSNK 230 (297)
T ss_dssp ---T-SEEEEEEE------TTEEEEEEEETT-------------EEEES---S-GGCTT-EEEETTC-EEEEEEEEET
T ss_pred ---CcccceEEEc------ccCcccceecCc-------------eEEEE---cCCCCCCCccCcCCCCEEEEEecCCC
Confidence 1233333322 122444554421 12333 35799999999999999999997643
No 20
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.74 E-value=0.0029 Score=57.21 Aligned_cols=111 Identities=17% Similarity=0.210 Sum_probs=67.2
Q ss_pred CCCcEEEEEEeeC-CCccceeecCCCC-CCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCCCccccceEEEeeccC
Q 021321 176 PAYDLAVLKVDVE-GFELKPVVLGTSH-DLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPNGRAIRGAIQTDAAIN 253 (314)
Q Consensus 176 ~~~DlAlL~v~~~-~~~~~~~~l~~~~-~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~ 253 (314)
..++++||+++.+ .....++.|+++. ....|+.+.+.|+... .......+.-..... ....+......+
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~~-------~~~~~~~~~~~~ 229 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCTK-------CAYSICTKQYSC 229 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEeec-------cceeEecccccC
Confidence 4579999999865 2467788886643 4667899999888211 112222222111100 122355556678
Q ss_pred CCCcccceecC-CC--eEEEEEcccccCCCCCCccceEEEEehHHHHHH
Q 021321 254 SGNSGGPLMNS-FG--HVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRT 299 (314)
Q Consensus 254 ~G~SGGPl~n~-~G--~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~ 299 (314)
.|++|||++.. +| .||||.+....... ....+++.+..+++-
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~----~~~~~f~~v~~~~~~ 274 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGNYECN----KNNSYFFNVSWYQDE 274 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCCCccc----ccccEEEEHHHhhhh
Confidence 99999999832 44 58999987653321 124577777776654
No 21
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=97.63 E-value=0.00033 Score=70.52 Aligned_cols=23 Identities=30% Similarity=0.226 Sum_probs=20.8
Q ss_pred ceEEEEEEcCCCEEEeccccccC
Q 021321 120 GTGSGFVWDKFGHIVTNYHVVAK 142 (314)
Q Consensus 120 ~~GsGfiI~~~g~VLT~aHvv~~ 142 (314)
+-|||.||+++|+||||.||.-+
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred CceeEEEEcCCceEEecchhhhh
Confidence 46999999999999999999863
No 22
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.53 E-value=0.002 Score=54.00 Aligned_cols=140 Identities=18% Similarity=0.231 Sum_probs=78.5
Q ss_pred cccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCC---CCcEEEEEEeeCCCccc
Q 021321 117 KVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDP---AYDLAVLKVDVEGFELK 193 (314)
Q Consensus 117 ~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~---~~DlAlL~v~~~~~~~~ 193 (314)
.....++++.|.. .++|...|.-. .. ++.+. |..+.....+...+. ..|+++++++.. .+++
T Consensus 22 ~g~~t~l~~gi~~-~~~lvp~H~~~--------~~--~i~i~---g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~-~kfr 86 (172)
T PF00548_consen 22 KGEFTMLALGIYD-RYFLVPTHEEP--------ED--TIYID---GVEYKVDDSVVLVDRDGVDTDLTLVKLPRN-PKFR 86 (172)
T ss_dssp TEEEEEEEEEEEB-TEEEEEGGGGG--------CS--EEEET---TEEEEEEEEEEEEETTSSEEEEEEEEEESS-S-B-
T ss_pred CceEEEecceEee-eEEEEECcCCC--------cE--EEEEC---CEEEEeeeeEEEecCCCcceeEEEEEccCC-cccC
Confidence 4567889989986 89999999221 22 33332 333333333333443 469999999753 2332
Q ss_pred eee--cCCCCCCCCCCEEEEEEcCCCCCCC-eEeeEEecccccccCCCCccccceEEEeeccCCCCcccceecC---CCe
Q 021321 194 PVV--LGTSHDLRVGQSCFAIGNPYGFEDT-LTTGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNS---FGH 267 (314)
Q Consensus 194 ~~~--l~~~~~~~~G~~v~~iG~p~~~~~~-~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~---~G~ 267 (314)
-+. |.+ ......+...++ +....... ...+.+...+.. ..++......+.++++...|+.||||+.. .++
T Consensus 87 DIrk~~~~-~~~~~~~~~l~v-~~~~~~~~~~~v~~v~~~~~i--~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~ 162 (172)
T PF00548_consen 87 DIRKFFPE-SIPEYPECVLLV-NSTKFPRMIVEVGFVTNFGFI--NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGK 162 (172)
T ss_dssp -GGGGSBS-SGGTEEEEEEEE-ESSSSTCEEEEEEEEEEEEEE--EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTE
T ss_pred chhhhhcc-ccccCCCcEEEE-ECCCCccEEEEEEEEeecCcc--ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCcc
Confidence 221 111 111233334444 33333322 233444433221 12334456678888888999999999953 678
Q ss_pred EEEEEccc
Q 021321 268 VIGVNTAT 275 (314)
Q Consensus 268 vvGI~s~~ 275 (314)
++|||.++
T Consensus 163 i~GiHvaG 170 (172)
T PF00548_consen 163 IIGIHVAG 170 (172)
T ss_dssp EEEEEEEE
T ss_pred EEEEEecc
Confidence 99999985
No 23
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=97.37 E-value=0.00033 Score=70.56 Aligned_cols=61 Identities=20% Similarity=0.288 Sum_probs=47.7
Q ss_pred eEEEeeccCCCCcccceecCCCeEEEEEcccccCCCCC-----CccceEEEEehHHHHHHHHHHHH
Q 021321 245 AIQTDAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTG-----LSSGVNFAIPIDTVVRTVPYLIV 305 (314)
Q Consensus 245 ~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~-----~~~~~~~aipi~~i~~~l~~l~~ 305 (314)
.+.++..+..|+||+|++|.+|||||+++-+..+.-.+ .....+..+-+.++..+|+++-.
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~g 688 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVYG 688 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHhC
Confidence 46677889999999999999999999999775543221 12345788899999999988743
No 24
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.73 E-value=0.036 Score=47.50 Aligned_cols=42 Identities=29% Similarity=0.483 Sum_probs=32.6
Q ss_pred eeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHH
Q 021321 249 DAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTV 296 (314)
Q Consensus 249 ~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i 296 (314)
+..+..||||+|++ .+|++||=++..+.. ....+|.++++..
T Consensus 174 TGGIvqGMSGSPI~-qdGKLiGAVthvf~~-----dp~~Gygi~ie~M 215 (218)
T PF05580_consen 174 TGGIVQGMSGSPII-QDGKLIGAVTHVFVN-----DPTKGYGIFIEWM 215 (218)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEec-----CCCceeeecHHHH
Confidence 34577899999999 599999999988654 2456788986653
No 25
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.63 E-value=0.014 Score=57.44 Aligned_cols=118 Identities=19% Similarity=0.345 Sum_probs=70.7
Q ss_pred CCCcEEEEEEeeCC-------Ccc------ceeecCCC------CCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccC
Q 021321 176 PAYDLAVLKVDVEG-------FEL------KPVVLGTS------HDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPS 236 (314)
Q Consensus 176 ~~~DlAlL~v~~~~-------~~~------~~~~l~~~------~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~ 236 (314)
.-.|+||++++..- .++ |.+.+.+. ..+.+|..|+-+|.-.+ .+.|++.++.- ...
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~kl-vyw 615 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKL-VYW 615 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEE-EEe
Confidence 34699999997431 011 22223211 34577899999987654 45666665532 112
Q ss_pred CCCccc-cceEEEe----eccCCCCcccceecCCC------eEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHH
Q 021321 237 PNGRAI-RGAIQTD----AAINSGNSGGPLMNSFG------HVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYL 303 (314)
Q Consensus 237 ~~~~~~-~~~i~~~----~~~~~G~SGGPl~n~~G------~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l 303 (314)
.++... .+++... .-...||||+-|++.-+ .|+||..+..++ ...++...|+..|.+-|++.
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge-----~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE-----QKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc-----cceeeccCcHHHHHHHHHHh
Confidence 222221 2333333 12457999999998633 399999886443 35688889988877666654
No 26
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.08 E-value=0.0054 Score=48.74 Aligned_cols=33 Identities=24% Similarity=0.440 Sum_probs=23.5
Q ss_pred EEEeeccCCCCcccceecCCCeEEEEEcccccC
Q 021321 246 IQTDAAINSGNSGGPLMNSFGHVIGVNTATFTR 278 (314)
Q Consensus 246 i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~ 278 (314)
...+....+|.||+|+||.+|++|||...+...
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 444556789999999999999999999887654
No 27
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.59 E-value=0.07 Score=45.72 Aligned_cols=154 Identities=19% Similarity=0.163 Sum_probs=47.3
Q ss_pred CcccceEEEEEE-cCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEE-EEEEEEeCCCCcEEEEEEeeC---CC
Q 021321 116 AKVEGTGSGFVW-DKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYR-EGKMVGCDPAYDLAVLKVDVE---GF 190 (314)
Q Consensus 116 ~~~~~~GsGfiI-~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~-~a~v~~~d~~~DlAlL~v~~~---~~ 190 (314)
+...+.++.+-. +-+..++|++||.... ... ....+ |+.... +-+.+..+...|++||+.... ..
T Consensus 26 ~~hvGya~cv~l~~g~~~L~ta~Hv~~~~-------~~~-~~~k~--g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~L 95 (203)
T PF02122_consen 26 GSHVGYATCVRLFDGEDALLTARHVWSRP-------SKV-TSLKT--GEKIPLAEFTDLLESRIADFVILRGPPNWESKL 95 (203)
T ss_dssp --------EEEE----EEEEE-HHHHTSS-------S----EEET--TEEEE--S-EEEEE-TTT-EEEEE--HHHHHHH
T ss_pred ccccccceEEECcCCccceecccccCCCc-------cce-eEcCC--CCcccchhChhhhCCCccCEEEEecCcCHHHHh
Confidence 444455555442 2234799999999852 111 12222 221111 123444678899999999722 00
Q ss_pred ccceeecCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccCCCCccccceEEEeeccCCCCcccceecCCCeEEE
Q 021321 191 ELKPVVLGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMNSFGHVIG 270 (314)
Q Consensus 191 ~~~~~~l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n~~G~vvG 270 (314)
..+.+.+.....+ .-| .-..+....+........+.... ..+..+-+...+|.||.|+|+.+ +++|
T Consensus 96 g~k~~~~~~~~~~-------~~g--~~~~y~~~~~~~~~~sa~i~g~~----~~~~~vls~T~~G~SGtp~y~g~-~vvG 161 (203)
T PF02122_consen 96 GVKAAQLSQNSQL-------AKG--PVSFYGFSSGEWPCSSAKIPGTE----GKFASVLSNTSPGWSGTPYYSGK-NVVG 161 (203)
T ss_dssp T-----B----SE-------EEE--ESSTTSEEEEEEEEEE-S----S----TTEEEE-----TT-TT-EEE-SS--EEE
T ss_pred Ccccccccchhhh-------CCC--CeeeeeecCCCceeccCcccccc----CcCCceEcCCCCCCCCCCeEECC-CceE
Confidence 1233333111111 001 01112222222211111111111 23556667788999999999877 8999
Q ss_pred EEcccccCCCCCCccceEEEEehHHH
Q 021321 271 VNTATFTRKGTGLSSGVNFAIPIDTV 296 (314)
Q Consensus 271 I~s~~~~~~~~~~~~~~~~aipi~~i 296 (314)
++....... ..+++++-.|+--+
T Consensus 162 vH~G~~~~~---~~~n~n~~spip~~ 184 (203)
T PF02122_consen 162 VHTGSPSGS---NRENNNRMSPIPPI 184 (203)
T ss_dssp EEEEE---------------------
T ss_pred eecCccccc---cccccccccccccc
Confidence 999852211 13455555555444
No 28
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.23 E-value=0.24 Score=46.88 Aligned_cols=42 Identities=29% Similarity=0.511 Sum_probs=31.6
Q ss_pred eeccCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHH
Q 021321 249 DAAINSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTV 296 (314)
Q Consensus 249 ~~~~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i 296 (314)
+..+..||||+|++ .+|++||=++-.+.+. +..+|+|-++..
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvnd-----pt~GYGi~ie~M 395 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVND-----PTSGYGVYIEWM 395 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEecC-----CCcceeehHHHH
Confidence 34567899999999 6999999888876652 445688855443
No 29
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total loss of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=94.96 E-value=0.026 Score=44.39 Aligned_cols=29 Identities=21% Similarity=0.438 Sum_probs=24.8
Q ss_pred eccCCCCcccceecCCCeEEEEEcccccC
Q 021321 250 AAINSGNSGGPLMNSFGHVIGVNTATFTR 278 (314)
Q Consensus 250 ~~~~~G~SGGPl~n~~G~vvGI~s~~~~~ 278 (314)
..-.+|+||-|++|..|+||||+-.+..+
T Consensus 101 g~g~~GDSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 101 GVGKPGDSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp TS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred CCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence 34579999999999999999999988764
No 30
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=94.04 E-value=0.19 Score=38.17 Aligned_cols=103 Identities=20% Similarity=0.365 Sum_probs=54.0
Q ss_pred EEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCC
Q 021321 122 GSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSH 201 (314)
Q Consensus 122 GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~ 201 (314)
|-++-|.+ |.++|+.||.+.. +.+. |.. + ++ ...+.|+|+++.+.. ..+..++++
T Consensus 1 G~avHIGn-G~~vt~tHva~~~-------~~v~-------g~~--f--~~--~~~~ge~~~v~~~~~--~~p~~~ig~-- 55 (105)
T PF03510_consen 1 GWAVHIGN-GRYVTVTHVAKSS-------DSVD-------GQP--F--KI--VKTDGELCWVQSPLV--HLPAAQIGT-- 55 (105)
T ss_pred CceEEeCC-CEEEEEEEEeccC-------ceEc-------CcC--c--EE--EEeccCEEEEECCCC--CCCeeEecc--
Confidence 34677886 9999999999843 2221 221 1 22 224569999998753 355666643
Q ss_pred CCCCCCEEEEEEcCCCCCCCeE--eeEEecccccccCCCCccccceEEEeeccCCCCcccceec
Q 021321 202 DLRVGQSCFAIGNPYGFEDTLT--TGVVSGLGREIPSPNGRAIRGAIQTDAAINSGNSGGPLMN 263 (314)
Q Consensus 202 ~~~~G~~v~~iG~p~~~~~~~~--~G~vs~~~~~~~~~~~~~~~~~i~~~~~~~~G~SGGPl~n 263 (314)
|.+++ |+.+...... .+... ...+....-...+...+.+||-|-|.||
T Consensus 56 ----g~Pv~---~~~~~~~~t~~~~~~~~-------t~~~~v~G~~~~~~~~T~~GDCGlPY~d 105 (105)
T PF03510_consen 56 ----GKPVY---DTWGLHPVTTWSEGTYN-------TPTGTVNGWHVKITNPTKKGDCGLPYFD 105 (105)
T ss_pred ----CCCEE---ecCCCccEEEeccceEE-------cCCcEEEEEEEeCCCCccCCccCCcccC
Confidence 44455 3333222111 11111 0111100112233336789999999986
No 31
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=94.04 E-value=2.1 Score=37.59 Aligned_cols=94 Identities=16% Similarity=0.260 Sum_probs=57.1
Q ss_pred cccceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEE-E---EEEEEEeC-----CCCcEEEEEEee
Q 021321 117 KVEGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFY-R---EGKMVGCD-----PAYDLAVLKVDV 187 (314)
Q Consensus 117 ~~~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~-~---~a~v~~~d-----~~~DlAlL~v~~ 187 (314)
.+...++|++|++ .|+|++-.|+.+..- ....+.+.+ +.|+.+. + .-++...| +..++.||.++.
T Consensus 25 dG~~~CsgvLlD~-~WlLvsssCl~~I~L---~~~Yvsall--G~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~ 98 (267)
T PF09342_consen 25 DGRYWCSGVLLDP-HWLLVSSSCLRGISL---SHHYVSALL--GGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQ 98 (267)
T ss_pred cCeEEEEEEEecc-ceEEEeccccCCccc---ccceEEEEe--cCcceecccCCChheEEEeeeeeeccccceeeeeecC
Confidence 4567999999998 899999999986421 113344444 3233211 0 01233333 678999999987
Q ss_pred CCC---ccceeecCC-CCCCCCCCEEEEEEcCC
Q 021321 188 EGF---ELKPVVLGT-SHDLRVGQSCFAIGNPY 216 (314)
Q Consensus 188 ~~~---~~~~~~l~~-~~~~~~G~~v~~iG~p~ 216 (314)
+.. .+.|.-+.+ .......+.++++|.-.
T Consensus 99 ~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 99 PANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred cccceeeecccccccccCCCCCCCceEEEEccc
Confidence 632 244555533 23444456899999765
No 32
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=92.85 E-value=0.12 Score=48.42 Aligned_cols=137 Identities=21% Similarity=0.316 Sum_probs=67.9
Q ss_pred cceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEeeC-CCccceeec
Q 021321 119 EGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDVE-GFELKPVVL 197 (314)
Q Consensus 119 ~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~-~~~~~~~~l 197 (314)
-++|-||-|++ ...+|+-||+.. +...+ | |. +..-+..+..-+++-+++..+ ..+++-+-|
T Consensus 378 fGsGWGfWVS~-~lfITttHViP~------g~~E~---F----Gv----~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiL 439 (535)
T PF05416_consen 378 FGSGWGFWVSP-TLFITTTHVIPP------GAKEA---F----GV----PISQIQVHKSGEFCRFRFPKPIRPDVTGMIL 439 (535)
T ss_dssp ETTEEEEESSS-SEEEEEGGGS-S------TTSEE---T----TE----ECGGEEEEEETTEEEEEESS-SSTTS---EE
T ss_pred cCCceeeeecc-eEEEEeeeecCC------cchhh---h----CC----ChhHeEEeeccceEEEecCCCCCCCccceee
Confidence 36899999998 899999999974 22211 1 11 111123445577888888654 224555556
Q ss_pred CCCCCCCCCCEEEE-EEcCCCC--CCCeEeeEEecccccccCCCCccccceEEE-------eeccCCCCcccceecCCC-
Q 021321 198 GTSHDLRVGQSCFA-IGNPYGF--EDTLTTGVVSGLGREIPSPNGRAIRGAIQT-------DAAINSGNSGGPLMNSFG- 266 (314)
Q Consensus 198 ~~~~~~~~G~~v~~-iG~p~~~--~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~-------~~~~~~G~SGGPl~n~~G- 266 (314)
. +-...|.-+.+ |=.+.|. ...+..|...+..-.-....++ ..++.+ |..+.+|+.|+|-+-..|
T Consensus 440 -E-eGapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ--~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgN 515 (535)
T PF05416_consen 440 -E-EGAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQ--MGMLLTGANAKGMDLGTIPGDCGCPYVYKRGN 515 (535)
T ss_dssp ---SS--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEE--EEEETTSTT-SSTTTS--TTGTT-EEEEEETT
T ss_pred -c-cCCCCceEEEEEEEcCCccchhhhhhhccceeEEEcceeecce--eeeeeecCCccccccCCCCCCCCCceeeecCC
Confidence 2 33455665544 4455443 2234444443321100000011 112222 334679999999997655
Q ss_pred --eEEEEEccccc
Q 021321 267 --HVIGVNTATFT 277 (314)
Q Consensus 267 --~vvGI~s~~~~ 277 (314)
-|+|+|.+...
T Consensus 516 d~VV~GVH~AAtr 528 (535)
T PF05416_consen 516 DWVVIGVHAAATR 528 (535)
T ss_dssp EEEEEEEEEEE-S
T ss_pred cEEEEEEEehhcc
Confidence 48999998754
No 33
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=90.94 E-value=0.36 Score=37.82 Aligned_cols=32 Identities=28% Similarity=0.353 Sum_probs=23.8
Q ss_pred ceEEEeeccCCCCcccceecCCCeEEEEEcccc
Q 021321 244 GAIQTDAAINSGNSGGPLMNSFGHVIGVNTATF 276 (314)
Q Consensus 244 ~~i~~~~~~~~G~SGGPl~n~~G~vvGI~s~~~ 276 (314)
+++.......||+.||+|+.. --||||++++.
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~C~-HGViGi~Tagg 110 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILRCK-HGVIGIVTAGG 110 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEEET-TCEEEEEEEEE
T ss_pred CceeecccCCCCCCCceeEeC-CCeEEEEEeCC
Confidence 455556678999999999964 45999999874
No 34
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=90.89 E-value=0.35 Score=38.20 Aligned_cols=42 Identities=31% Similarity=0.717 Sum_probs=28.2
Q ss_pred cCCCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHH
Q 021321 252 INSGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTV 296 (314)
Q Consensus 252 ~~~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i 296 (314)
...|.||||++..+|.+|||..+..-.. +....+-|. |++.+
T Consensus 105 ~lkGSSGgPiLC~~GH~vG~f~aa~~tr--gvak~i~f~-P~e~l 146 (148)
T PF02907_consen 105 DLKGSSGGPILCPSGHAVGMFRAAVCTR--GVAKAIDFI-PVETL 146 (148)
T ss_dssp HHTT-TT-EEEETTSEEEEEEEEEEEET--TEEEEEEEE-EHHHH
T ss_pred EEecCCCCcccCCCCCEEEEEEEEEEcC--CceeeEEEE-eeeec
Confidence 4579999999999999999988775432 223344554 77654
No 35
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=87.01 E-value=3.6 Score=42.51 Aligned_cols=49 Identities=22% Similarity=0.226 Sum_probs=30.4
Q ss_pred cCCCCccccee--cC-C--CeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHH
Q 021321 252 INSGNSGGPLM--NS-F--GHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYL 303 (314)
Q Consensus 252 ~~~G~SGGPl~--n~-~--G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l 303 (314)
..+||||+||| |. + ..++|+.+......+ .......+|.+++.++.++.
T Consensus 213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g---~~~~~~~~~~~f~~~~~~~d 266 (769)
T PF02395_consen 213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNG---KGNWWNVIPPDFINQIKQND 266 (769)
T ss_dssp --TT-TT-EEEEEETTTTEEEEEEEEEEECCCCH---SEEEEEEECHHHHHHHHHHC
T ss_pred cccCcCCCceEEEEccCCeEEEEEEEccccccCC---ccceeEEecHHHHHHHHhhh
Confidence 46899999998 43 3 347899887654322 12445678888887777664
No 36
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=81.46 E-value=1.1 Score=42.35 Aligned_cols=23 Identities=22% Similarity=0.505 Sum_probs=20.8
Q ss_pred ccCCCCcccceecCCCeEEEEEc
Q 021321 251 AINSGNSGGPLMNSFGHVIGVNT 273 (314)
Q Consensus 251 ~~~~G~SGGPl~n~~G~vvGI~s 273 (314)
.+..|.||+.|+|.+|++|||..
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 56689999999999999999975
No 37
>COG5510 Predicted small secreted protein [Function unknown]
Probab=78.84 E-value=1.9 Score=26.93 Aligned_cols=24 Identities=25% Similarity=0.350 Sum_probs=17.2
Q ss_pred ccchhhHHHHHHHHHHHHhhhcCC
Q 021321 29 TRRSSIGFGSSVILSSFLVNFCSP 52 (314)
Q Consensus 29 ~~~~~~~~~~~~~~~~~~~~~~~~ 52 (314)
||++.+.+.++++++++++++|++
T Consensus 1 mmk~t~l~i~~vll~s~llaaCNT 24 (44)
T COG5510 1 MMKKTILLIALVLLASTLLAACNT 24 (44)
T ss_pred CchHHHHHHHHHHHHHHHHHHhhh
Confidence 355666667777778888899974
No 38
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=75.01 E-value=3.2 Score=35.71 Aligned_cols=56 Identities=18% Similarity=0.431 Sum_probs=38.8
Q ss_pred cceEEEeeccCCCCcccceecC----CCeEEEEEcccccCCCCCCccceEEEEehH--HHHHHHHHHH
Q 021321 243 RGAIQTDAAINSGNSGGPLMNS----FGHVIGVNTATFTRKGTGLSSGVNFAIPID--TVVRTVPYLI 304 (314)
Q Consensus 243 ~~~i~~~~~~~~G~SGGPl~n~----~G~vvGI~s~~~~~~~~~~~~~~~~aipi~--~i~~~l~~l~ 304 (314)
...+++......|+-|||++-. .-+++||+.++... .+.+||-++. .+.+.+.+|.
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~------~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN------HAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc------ccceehhhhhHHHHHHHHHhhc
Confidence 3456777888999999999832 34799999998643 4566776653 4555555543
No 39
>PRK10081 entericidin B membrane lipoprotein; Provisional
Probab=68.56 E-value=4.5 Score=26.01 Aligned_cols=24 Identities=21% Similarity=0.265 Sum_probs=14.8
Q ss_pred ccchhhHHHHHHHHHHHHhhhcCC
Q 021321 29 TRRSSIGFGSSVILSSFLVNFCSP 52 (314)
Q Consensus 29 ~~~~~~~~~~~~~~~~~~~~~~~~ 52 (314)
||++++.++++++++++.+.+|.+
T Consensus 1 MmKk~i~~i~~~l~~~~~l~~CnT 24 (48)
T PRK10081 1 MVKKTIAAIFSVLVLSTVLTACNT 24 (48)
T ss_pred ChHHHHHHHHHHHHHHHHHhhhhh
Confidence 355655555555556666688873
No 40
>COG3056 Uncharacterized lipoprotein [Cell envelope biogenesis, outer membrane]
Probab=60.85 E-value=12 Score=31.32 Aligned_cols=16 Identities=25% Similarity=0.482 Sum_probs=10.5
Q ss_pred HHHHHHhhhcCCCCCC
Q 021321 41 ILSSFLVNFCSPSSTL 56 (314)
Q Consensus 41 ~~~~~~~~~~~~~~~~ 56 (314)
+++.+++++|...+..
T Consensus 22 laa~~lLagC~a~~~t 37 (204)
T COG3056 22 LAAIFLLAGCAAPPTT 37 (204)
T ss_pred HHHHHHHHhcCCCCce
Confidence 3445666899876664
No 41
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=58.68 E-value=9 Score=24.75 Aligned_cols=22 Identities=23% Similarity=0.483 Sum_probs=18.7
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+.+.-|++|.+|+++|+++..
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~~ 49 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISRS 49 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEHH
T ss_pred cCCcEEEEEecCCEEEEEEEHH
Confidence 4678899999999999999853
No 42
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=58.02 E-value=66 Score=24.45 Aligned_cols=10 Identities=0% Similarity=0.019 Sum_probs=5.8
Q ss_pred ccceEEEEEE
Q 021321 118 VEGTGSGFVW 127 (314)
Q Consensus 118 ~~~~GsGfiI 127 (314)
....||..|.
T Consensus 93 ~~~~atA~iY 102 (104)
T PRK14864 93 GQWYSQAILY 102 (104)
T ss_pred CeEEEEEEEe
Confidence 3456666665
No 43
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=51.85 E-value=38 Score=24.33 Aligned_cols=47 Identities=23% Similarity=0.330 Sum_probs=30.9
Q ss_pred EEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEE-EEcC
Q 021321 167 REGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFA-IGNP 215 (314)
Q Consensus 167 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~-iG~p 215 (314)
.+++++..+.++++|++.+-.-...+ .+.|-. ..++.|++|.+ +||-
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV-~l~Lv~-~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREV-NLDLVG-EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEE-Eeeeec-CccccCCEEEEEeeEE
Confidence 46788888888889999885422122 222312 37899999876 6774
No 44
>PRK15396 murein lipoprotein; Provisional
Probab=46.88 E-value=20 Score=25.70 Aligned_cols=21 Identities=33% Similarity=0.290 Sum_probs=13.0
Q ss_pred hhhHHHHHHHHHHHHhhhcCC
Q 021321 32 SSIGFGSSVILSSFLVNFCSP 52 (314)
Q Consensus 32 ~~~~~~~~~~~~~~~~~~~~~ 52 (314)
+..+++.+++++++++++|+.
T Consensus 3 ~~kl~l~av~ls~~LLaGCAs 23 (78)
T PRK15396 3 RTKLVLGAVILGSTLLAGCSS 23 (78)
T ss_pred hhHHHHHHHHHHHHHHHHcCC
Confidence 334555555665667799973
No 45
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=43.56 E-value=27 Score=24.53 Aligned_cols=30 Identities=23% Similarity=0.472 Sum_probs=21.9
Q ss_pred cceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHH
Q 021321 259 GPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYL 303 (314)
Q Consensus 259 GPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l 303 (314)
-|+++.+|+++|++.. .+.++.+.++++++
T Consensus 19 ~pi~~~~g~~~Gvv~~---------------di~l~~l~~~i~~~ 48 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI---------------DISLDQLSEIISNI 48 (81)
T ss_dssp EEEEETTTEEEEEEEE---------------EEEHHHHHHHHTTS
T ss_pred EEEECCCCCEEEEEEE---------------EeccceeeeEEEee
Confidence 5788889999998864 35667777766664
No 46
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=42.84 E-value=92 Score=25.38 Aligned_cols=128 Identities=23% Similarity=0.285 Sum_probs=65.8
Q ss_pred cceEEEEEEcCCCEEEeccccccCCCcCCCCcceEEEEEecCCCCeEEEEEEEEEeCCC--CcEEEEEEeeCCCccceee
Q 021321 119 EGTGSGFVWDKFGHIVTNYHVVAKLATDTSGLHRCKVSLFDAKGNGFYREGKMVGCDPA--YDLAVLKVDVEGFELKPVV 196 (314)
Q Consensus 119 ~~~GsGfiI~~~g~VLT~aHvv~~~~~~~~~~~~~~v~~~~~~g~~~~~~a~v~~~d~~--~DlAlL~v~~~~~~~~~~~ 196 (314)
++.-+|+-+...|-|-.--||..+. .+.+-|.-|+. +++..+.+ .|=.- + -++
T Consensus 50 rgletgwaythqggissvdhvt~gk----------d~lvcdsmgrt-----rvvcqsnnk~tde~e---------y-gvk 104 (211)
T PF05578_consen 50 RGLETGWAYTHQGGISSVDHVTAGK----------DLLVCDSMGRT-----RVVCQSNNKMTDETE---------Y-GVK 104 (211)
T ss_pred hcccccceeeccCCcccceeeecCC----------ceEEecCCCce-----EEEEccCCcccchhh---------c-ccc
Confidence 3466788887778787778887642 12233333332 23222211 11100 0 111
Q ss_pred cCCCCCCCCCCEEEEEEcCCCCCCCeEeeEEecccccccC-----CCCccccceEEEeeccCCCCcccceecC-CCeEEE
Q 021321 197 LGTSHDLRVGQSCFAIGNPYGFEDTLTTGVVSGLGREIPS-----PNGRAIRGAIQTDAAINSGNSGGPLMNS-FGHVIG 270 (314)
Q Consensus 197 l~~~~~~~~G~~v~~iG~p~~~~~~~~~G~vs~~~~~~~~-----~~~~~~~~~i~~~~~~~~G~SGGPl~n~-~G~vvG 270 (314)
.......|..+|++ +|.....+.+.|.+-.+.+.-.. ..+. --.+|..-..|.||=|+|.. .|++||
T Consensus 105 --tdsgcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gt----paf~~~knlkg~s~~pifeassgr~vg 177 (211)
T PF05578_consen 105 --TDSGCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGT----PAFFDLKNLKGWSGLPIFEASSGRVVG 177 (211)
T ss_pred --cCCCCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCC----cceeeccccCCCCCCceeeccCCcEEE
Confidence 12335667888888 66554444444444333221000 0000 01223334579999999965 899999
Q ss_pred EEcccccC
Q 021321 271 VNTATFTR 278 (314)
Q Consensus 271 I~s~~~~~ 278 (314)
=+-.+.++
T Consensus 178 r~k~gkn~ 185 (211)
T PF05578_consen 178 RVKVGKNE 185 (211)
T ss_pred EEEecCCC
Confidence 88766554
No 47
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=39.37 E-value=21 Score=27.36 Aligned_cols=19 Identities=37% Similarity=0.482 Sum_probs=13.8
Q ss_pred ccceecCCCeEEEEEcccc
Q 021321 258 GGPLMNSFGHVIGVNTATF 276 (314)
Q Consensus 258 GGPl~n~~G~vvGI~s~~~ 276 (314)
-.|++|.+|++||+++.+.
T Consensus 93 ~~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 93 FAPVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEEE-TTS-EEEEEEEEE
T ss_pred EEeeECCCCcEEEEEEEEE
Confidence 3688888999999998654
No 48
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=34.89 E-value=28 Score=32.80 Aligned_cols=24 Identities=29% Similarity=0.476 Sum_probs=19.2
Q ss_pred cceEEEEEEcC----CC------EEEeccccccC
Q 021321 119 EGTGSGFVWDK----FG------HIVTNYHVVAK 142 (314)
Q Consensus 119 ~~~GsGfiI~~----~g------~VLT~aHvv~~ 142 (314)
...|||+|++- ++ ++.||.||+..
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~~ 68 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVASN 68 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhcc
Confidence 46899999971 22 89999999983
No 49
>COG3065 Slp Starvation-inducible outer membrane lipoprotein [Cell envelope biogenesis, outer membrane]
Probab=33.83 E-value=2.1e+02 Score=24.03 Aligned_cols=11 Identities=27% Similarity=0.570 Sum_probs=6.7
Q ss_pred HHHhhhcCCCC
Q 021321 44 SFLVNFCSPSS 54 (314)
Q Consensus 44 ~~~~~~~~~~~ 54 (314)
+|++++|...+
T Consensus 17 aflLsgC~tiP 27 (191)
T COG3065 17 AFLLSGCVTIP 27 (191)
T ss_pred HHHHhhcccCC
Confidence 45568887433
No 50
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=33.28 E-value=28 Score=26.20 Aligned_cols=22 Identities=27% Similarity=0.407 Sum_probs=18.0
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+.+.=|++|.+|+++|+++..
T Consensus 97 ~~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 97 EGISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred cCCceEEEECCCCcEEEEEeHH
Confidence 4556679999889999999875
No 51
>cd04618 CBS_pair_5 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=30.42 E-value=82 Score=22.84 Aligned_cols=50 Identities=18% Similarity=0.094 Sum_probs=31.4
Q ss_pred CCCcccceecCC-CeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcC
Q 021321 254 SGNSGGPLMNSF-GHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYG 307 (314)
Q Consensus 254 ~G~SGGPl~n~~-G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~ 307 (314)
.+.++-|++|.+ |+++|+++...-... ......-|-..+.+.++.+.+++
T Consensus 22 ~~~~~~~Vvd~~~~~~~Givt~~Dl~~~----~~~~~v~~~~~l~~a~~~m~~~~ 72 (98)
T cd04618 22 NGIRSAPLWDSRKQQFVGMLTITDFILI----LRLVSIHPERSLFDAALLLLKNK 72 (98)
T ss_pred cCCceEEEEeCCCCEEEEEEEHHHHhhh----eeeEEeCCCCcHHHHHHHHHHCC
Confidence 456788999874 899999996422110 00233445556777777776654
No 52
>PRK10672 rare lipoprotein A; Provisional
Probab=29.63 E-value=2.6e+02 Score=26.25 Aligned_cols=29 Identities=24% Similarity=0.155 Sum_probs=18.0
Q ss_pred ccccCCcccceEEEEEEcCCCEEEecccccc
Q 021321 111 VDGEYAKVEGTGSGFVWDKFGHIVTNYHVVA 141 (314)
Q Consensus 111 ~~~~~~~~~~~GsGfiI~~~g~VLT~aHvv~ 141 (314)
|.+.....+...+|-.++. +-+|+||-.-
T Consensus 85 wYg~~f~G~~TA~Ge~~~~--~~~tAAH~tL 113 (361)
T PRK10672 85 IYDAEAGSNLTASGERFDP--NALTAAHPTL 113 (361)
T ss_pred EeCCccCCCcCcCceeecC--CcCeeeccCC
Confidence 3333334455667777764 5799999654
No 53
>COG3290 CitA Signal transduction histidine kinase regulating citrate/malate metabolism [Signal transduction mechanisms]
Probab=29.49 E-value=62 Score=31.92 Aligned_cols=18 Identities=28% Similarity=0.551 Sum_probs=15.7
Q ss_pred cceecCCCeEEEEEcccc
Q 021321 259 GPLMNSFGHVIGVNTATF 276 (314)
Q Consensus 259 GPl~n~~G~vvGI~s~~~ 276 (314)
.|+||.+|++||+++-++
T Consensus 143 ~PI~d~~g~~IGvVsVG~ 160 (537)
T COG3290 143 VPIFDEDGKQIGVVSVGY 160 (537)
T ss_pred cceECCCCCEEEEEEEee
Confidence 599999999999998764
No 54
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=29.16 E-value=40 Score=24.88 Aligned_cols=22 Identities=9% Similarity=0.146 Sum_probs=17.2
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+.+--|++|.+|+++|+++..
T Consensus 85 ~~~~~lpVvd~~~~~~Giit~~ 106 (111)
T cd04603 85 TEPPVVAVVDKEGKLVGTIYER 106 (111)
T ss_pred cCCCeEEEEcCCCeEEEEEEhH
Confidence 3455569999889999999864
No 55
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=28.76 E-value=38 Score=21.84 Aligned_cols=12 Identities=17% Similarity=0.559 Sum_probs=7.7
Q ss_pred cCCCeEEEEEcc
Q 021321 263 NSFGHVIGVNTA 274 (314)
Q Consensus 263 n~~G~vvGI~s~ 274 (314)
|.+|++|||-..
T Consensus 36 d~~G~ivGIEIl 47 (50)
T PF10049_consen 36 DEDGRIVGIEIL 47 (50)
T ss_pred CCCCCEEEEEEE
Confidence 456777777543
No 56
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=28.51 E-value=39 Score=24.93 Aligned_cols=21 Identities=29% Similarity=0.431 Sum_probs=17.1
Q ss_pred CCcccceecCCCeEEEEEccc
Q 021321 255 GNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 255 G~SGGPl~n~~G~vvGI~s~~ 275 (314)
+...-|++|.+|+++|+++..
T Consensus 90 ~~~~~pVvd~~~~~~Gvit~~ 110 (115)
T cd04620 90 QIRHLPVLDDQGQLIGLVTAE 110 (115)
T ss_pred CCceEEEEcCCCCEEEEEEhH
Confidence 445679999889999999864
No 57
>PF07172 GRP: Glycine rich protein family; InterPro: IPR010800 This family consists of glycine rich proteins. Some of them may be involved in resistance to environmental stress [].
Probab=26.98 E-value=47 Score=24.80 Aligned_cols=9 Identities=22% Similarity=0.560 Sum_probs=3.3
Q ss_pred HHHHHHHHH
Q 021321 38 SSVILSSFL 46 (314)
Q Consensus 38 ~~~~~~~~~ 46 (314)
++++|+++|
T Consensus 9 L~l~LA~lL 17 (95)
T PF07172_consen 9 LGLLLAALL 17 (95)
T ss_pred HHHHHHHHH
Confidence 333333333
No 58
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=25.69 E-value=58 Score=24.40 Aligned_cols=22 Identities=18% Similarity=0.176 Sum_probs=18.3
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+...-|++|.+|+++|+++..
T Consensus 87 ~~~~~lpVvd~~~~l~Givt~~ 108 (113)
T cd04597 87 HNIRTLPVVDDDGTPAGIITLL 108 (113)
T ss_pred cCCCEEEEECCCCeEEEEEEHH
Confidence 4667789999899999999864
No 59
>cd04643 CBS_pair_30 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.52 E-value=49 Score=24.29 Aligned_cols=17 Identities=29% Similarity=0.430 Sum_probs=14.9
Q ss_pred cceecCCCeEEEEEccc
Q 021321 259 GPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 259 GPl~n~~G~vvGI~s~~ 275 (314)
-|++|.+|+++|+++..
T Consensus 95 ~~Vv~~~~~~~Gvit~~ 111 (116)
T cd04643 95 LPVVDDDGIFIGIITRR 111 (116)
T ss_pred eeEEeCCCeEEEEEEHH
Confidence 68999889999999874
No 60
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=25.50 E-value=78 Score=23.01 Aligned_cols=23 Identities=22% Similarity=0.329 Sum_probs=18.9
Q ss_pred CcccceecCCCeEEEEEcccccC
Q 021321 256 NSGGPLMNSFGHVIGVNTATFTR 278 (314)
Q Consensus 256 ~SGGPl~n~~G~vvGI~s~~~~~ 278 (314)
..|.|+++.+|+.||.+++....
T Consensus 34 ~~g~~v~~~~g~~vG~vTS~~~s 56 (95)
T PF08669_consen 34 RGGEPVYDEDGKPVGRVTSGAYS 56 (95)
T ss_dssp STTCEEEETTTEEEEEEEEEEEE
T ss_pred CCCCEEEECCCcEEeEEEEEeEC
Confidence 45789998799999999988553
No 61
>cd01739 LSm11_C The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2-like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-member ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=25.31 E-value=1.2e+02 Score=20.93 Aligned_cols=39 Identities=26% Similarity=0.284 Sum_probs=29.9
Q ss_pred CcceEEEEEecCCCCeEEEEEEEEEeCCCCcEEEEEEee
Q 021321 149 GLHRCKVSLFDAKGNGFYREGKMVGCDPAYDLAVLKVDV 187 (314)
Q Consensus 149 ~~~~~~v~~~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 187 (314)
....++|.+...+|-.-...+.++++|...+++|.-++.
T Consensus 7 er~RVrV~iR~~~gvrG~~~G~lvAFDK~wNm~L~DV~E 45 (66)
T cd01739 7 ERIRVRVHIRTFKGLRGVCSGFLVAFDKFWNMALVDVDE 45 (66)
T ss_pred CCcEEEEEEecccCcccEEEEEEEeeeeehhheehhhhh
Confidence 345677777665555557889999999999999988864
No 62
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC ClC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. ClC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the ClC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually
Probab=23.82 E-value=65 Score=25.15 Aligned_cols=22 Identities=18% Similarity=0.058 Sum_probs=18.2
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+.++-|++|.+|+++|+++..
T Consensus 22 ~~~~~~~VvD~~g~l~Givt~~ 43 (133)
T cd04592 22 EKQSCVLVVDSDDFLEGILTLG 43 (133)
T ss_pred cCCCEEEEECCCCeEEEEEEHH
Confidence 3557889999999999999954
No 63
>cd04641 CBS_pair_28 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=23.55 E-value=66 Score=23.96 Aligned_cols=22 Identities=23% Similarity=0.362 Sum_probs=18.2
Q ss_pred CCCCcccceecCCCeEEEEEcc
Q 021321 253 NSGNSGGPLMNSFGHVIGVNTA 274 (314)
Q Consensus 253 ~~G~SGGPl~n~~G~vvGI~s~ 274 (314)
..+.+.-|++|.+|+++|+++.
T Consensus 21 ~~~~~~~pVv~~~~~~~Giv~~ 42 (120)
T cd04641 21 ERRVSALPIVDENGKVVDVYSR 42 (120)
T ss_pred HcCCCeeeEECCCCeEEEEEeH
Confidence 3466788999989999999874
No 64
>cd04619 CBS_pair_6 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=23.23 E-value=57 Score=24.09 Aligned_cols=22 Identities=23% Similarity=0.327 Sum_probs=17.4
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+...=|++|.+|+++|+++..
T Consensus 88 ~~~~~lpVvd~~~~~~Gvi~~~ 109 (114)
T cd04619 88 RGLKNIPVVDENARPLGVLNAR 109 (114)
T ss_pred cCCCeEEEECCCCcEEEEEEhH
Confidence 3555678998889999999864
No 65
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=22.04 E-value=65 Score=24.51 Aligned_cols=9 Identities=11% Similarity=0.250 Sum_probs=3.9
Q ss_pred CcceEEEEE
Q 021321 149 GLHRCKVSL 157 (314)
Q Consensus 149 ~~~~~~v~~ 157 (314)
|+..++|.-
T Consensus 77 GA~yYrIi~ 85 (104)
T PRK14864 77 GADYYVIVM 85 (104)
T ss_pred CCCEEEEEE
Confidence 444444443
No 66
>cd04602 CBS_pair_IMPDH_2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the inosine 5' monophosphate dehydrogenase (IMPDH) protein. IMPDH is an essential enzyme that catalyzes the first step unique to GTP synthesis, playing a key role in the regulation of cell proliferation and differentiation. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain in IMPDH have been associated with retinitis pigmentos
Probab=21.76 E-value=68 Score=23.63 Aligned_cols=22 Identities=23% Similarity=0.365 Sum_probs=17.4
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+...-|++|.+|+++|+++..
T Consensus 88 ~~~~~~pVv~~~~~~~Gvit~~ 109 (114)
T cd04602 88 SKKGKLPIVNDDGELVALVTRS 109 (114)
T ss_pred cCCCceeEECCCCeEEEEEEHH
Confidence 3445679998889999999864
No 67
>cd04614 CBS_pair_1 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=21.74 E-value=74 Score=22.87 Aligned_cols=50 Identities=18% Similarity=0.074 Sum_probs=31.2
Q ss_pred CCCcccceecCCCeEEEEEcccccCCCCCCccceEEEEehHHHHHHHHHHHHcC
Q 021321 254 SGNSGGPLMNSFGHVIGVNTATFTRKGTGLSSGVNFAIPIDTVVRTVPYLIVYG 307 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~~~~~~~~~~~~~~~aipi~~i~~~l~~l~~~~ 307 (314)
.+.++-|++|.+|+++|+++...-... ....+.-+-+.+.+.++.+.+++
T Consensus 22 ~~~~~~~V~d~~~~~~Giv~~~dl~~~----~~~~~v~~~~~l~~a~~~m~~~~ 71 (96)
T cd04614 22 ANVKALPVLDDDGKLSGIITERDLIAK----SEVVTATKRTTVSECAQKMKRNR 71 (96)
T ss_pred cCCCeEEEECCCCCEEEEEEHHHHhcC----CCcEEecCCCCHHHHHHHHHHhC
Confidence 466788999989999999986532110 11333344455666776666554
No 68
>cd04607 CBS_pair_NTP_transferase_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domain associated with the NTP (Nucleotidyl transferase) domain downstream. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=21.60 E-value=64 Score=23.62 Aligned_cols=22 Identities=23% Similarity=0.398 Sum_probs=17.5
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+...-|++|.+|+++|+++..
T Consensus 87 ~~~~~~~Vv~~~~~~~Gvit~~ 108 (113)
T cd04607 87 RSIRHLPILDEEGRVVGLATLD 108 (113)
T ss_pred CCCCEEEEECCCCCEEEEEEhH
Confidence 3455678998889999999864
No 69
>COG3448 CBS-domain-containing membrane protein [Signal transduction mechanisms]
Probab=21.29 E-value=61 Score=29.60 Aligned_cols=22 Identities=23% Similarity=0.551 Sum_probs=17.6
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.|.--=|++|.+|+++||++..
T Consensus 344 ~g~H~lpvld~~g~lvGIvsQt 365 (382)
T COG3448 344 EGLHALPVLDAAGKLVGIVSQT 365 (382)
T ss_pred CCcceeeEEcCCCcEEEEeeHH
Confidence 3444569999999999999864
No 70
>cd04582 CBS_pair_ABC_OpuCA_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzi
Probab=20.97 E-value=67 Score=23.10 Aligned_cols=22 Identities=23% Similarity=0.167 Sum_probs=17.2
Q ss_pred CCCcccceecCCCeEEEEEccc
Q 021321 254 SGNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~G~vvGI~s~~ 275 (314)
.+.+--|++|.+|+++|+++..
T Consensus 80 ~~~~~~~Vv~~~~~~~Gvi~~~ 101 (106)
T cd04582 80 HDMSWLPCVDEDGRYVGEVTQR 101 (106)
T ss_pred CCCCeeeEECCCCcEEEEEEHH
Confidence 3445578999889999999864
No 71
>COG5428 Uncharacterized conserved small protein [Function unknown]
Probab=20.85 E-value=72 Score=22.22 Aligned_cols=16 Identities=25% Similarity=0.405 Sum_probs=13.2
Q ss_pred cCCCeEEEEEcccccC
Q 021321 263 NSFGHVIGVNTATFTR 278 (314)
Q Consensus 263 n~~G~vvGI~s~~~~~ 278 (314)
|.+|+|+||-.|....
T Consensus 37 de~GkV~GiEi~~As~ 52 (69)
T COG5428 37 DENGKVIGIEIWNASA 52 (69)
T ss_pred cCCCcEEEEEEEchhh
Confidence 5789999999997653
No 72
>cd04583 CBS_pair_ABC_OpuCA_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in association with the ABC transporter OpuCA. OpuCA is the ATP binding component of a bacterial solute transporter that serves a protective role to cells growing in a hyperosmolar environment but the function of the CBS domains in OpuCA remains unknown. In the related ABC transporter, OpuA, the tandem CBS domains have been shown to function as sensors for ionic strength, whereby they control the transport activity through an electronic switching mechanism. ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. They are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyz
Probab=20.67 E-value=74 Score=22.91 Aligned_cols=21 Identities=24% Similarity=0.453 Sum_probs=16.9
Q ss_pred CCcccceecCCCeEEEEEccc
Q 021321 255 GNSGGPLMNSFGHVIGVNTAT 275 (314)
Q Consensus 255 G~SGGPl~n~~G~vvGI~s~~ 275 (314)
+...-|++|.+|+++|+++..
T Consensus 84 ~~~~~~vv~~~g~~~Gvit~~ 104 (109)
T cd04583 84 GPKYVPVVDEDGKLVGLITRS 104 (109)
T ss_pred CCceeeEECCCCeEEEEEehH
Confidence 445568999889999999864
No 73
>cd04617 CBS_pair_4 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=20.62 E-value=69 Score=23.79 Aligned_cols=22 Identities=27% Similarity=0.229 Sum_probs=16.5
Q ss_pred CCCcccceecCC---CeEEEEEccc
Q 021321 254 SGNSGGPLMNSF---GHVIGVNTAT 275 (314)
Q Consensus 254 ~G~SGGPl~n~~---G~vvGI~s~~ 275 (314)
.+..-=|++|.+ |+++|+++..
T Consensus 89 ~~~~~lpVvd~~~~~~~l~Gvit~~ 113 (118)
T cd04617 89 HQVDSLPVVEKVDEGLEVIGRITKT 113 (118)
T ss_pred cCCCEeeEEeCCCccceEEEEEEhh
Confidence 344557888876 7999999875
No 74
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=20.25 E-value=3.1e+02 Score=18.96 Aligned_cols=43 Identities=23% Similarity=0.297 Sum_probs=28.0
Q ss_pred EEEEEEEeCCCCcEEEEEEeeCCCccceeecCCCCCCCCCCEEEEE
Q 021321 167 REGKMVGCDPAYDLAVLKVDVEGFELKPVVLGTSHDLRVGQSCFAI 212 (314)
Q Consensus 167 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~i 212 (314)
++++++..+.....|++.... ....+.+.--.++++||+|.+-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence 567888887788999988753 2344444333458999998863
No 75
>PRK10781 rcsF outer membrane lipoprotein; Reviewed
Probab=20.17 E-value=1.1e+02 Score=24.39 Aligned_cols=15 Identities=27% Similarity=0.416 Sum_probs=8.2
Q ss_pred HHHhhhcCCCCCCCC
Q 021321 44 SFLVNFCSPSSTLPS 58 (314)
Q Consensus 44 ~~~~~~~~~~~~~~~ 58 (314)
.+++.+|......+.
T Consensus 10 ~L~LsGCS~l~~tp~ 24 (133)
T PRK10781 10 ALMLTGCSMLSRSPV 24 (133)
T ss_pred HHHHhhccccCcCCC
Confidence 344577875555433
No 76
>PRK13835 conjugal transfer protein TrbH; Provisional
Probab=20.11 E-value=2.2e+02 Score=22.99 Aligned_cols=23 Identities=22% Similarity=0.307 Sum_probs=16.7
Q ss_pred cchHHHHHHHHhCCceEEEEeee
Q 021321 74 EEDRVVQLFQETSPSVVSIQDLE 96 (314)
Q Consensus 74 ~~~~~~~~~~~~~~svV~I~~~~ 96 (314)
..+-+.++.+.+.|+--+|...+
T Consensus 43 A~D~vsqLae~~pPa~tt~~l~q 65 (145)
T PRK13835 43 AGDMVSRLAEQIGPGTTTIKLKK 65 (145)
T ss_pred HHHHHHHHHHhcCCCceEEEEee
Confidence 44566778999999987776543
Done!