Query 016641
Match_columns 385
No_of_seqs 367 out of 2614
Neff 8.1
Searched_HMMs 46136
Date Fri Mar 29 08:41:06 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016641.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016641hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 5.2E-48 1.1E-52 388.2 32.2 271 104-383 41-333 (455)
2 TIGR02038 protease_degS peripl 100.0 2.3E-46 4.9E-51 365.9 31.8 268 102-383 44-321 (351)
3 PRK10898 serine endoprotease; 100.0 5.6E-46 1.2E-50 362.9 32.0 267 103-383 45-322 (353)
4 PRK10942 serine endoprotease; 100.0 5E-45 1.1E-49 368.3 30.0 270 104-382 39-353 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 3E-44 6.6E-49 360.7 32.1 271 104-383 2-300 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 2.9E-35 6.3E-40 288.2 27.0 272 103-384 33-314 (347)
7 KOG1320 Serine protease [Postt 99.9 3.8E-24 8.2E-29 210.7 18.1 278 102-383 127-441 (473)
8 KOG1320 Serine protease [Postt 99.9 1.8E-22 3.9E-27 198.9 10.7 261 108-373 55-319 (473)
9 KOG1421 Predicted signaling-as 99.8 4.3E-20 9.3E-25 184.1 16.9 269 104-382 53-344 (955)
10 PF13365 Trypsin_2: Trypsin-li 99.6 2.5E-14 5.3E-19 117.7 12.0 107 142-279 1-120 (120)
11 PF00089 Trypsin: Trypsin; In 99.5 3.6E-13 7.7E-18 122.0 16.0 165 139-303 24-220 (220)
12 cd00190 Tryp_SPc Trypsin-like 99.5 1.3E-12 2.8E-17 119.2 16.8 166 139-304 24-230 (232)
13 KOG1421 Predicted signaling-as 99.4 2.5E-12 5.5E-17 129.2 16.8 259 109-381 524-809 (955)
14 smart00020 Tryp_SPc Trypsin-li 99.4 6.3E-12 1.4E-16 114.9 15.2 146 139-284 25-208 (229)
15 PF13180 PDZ_2: PDZ domain; PD 99.0 1E-09 2.2E-14 84.7 5.4 56 316-383 1-57 (82)
16 cd00987 PDZ_serine_protease PD 98.9 3.1E-09 6.8E-14 83.0 7.0 65 316-382 1-66 (90)
17 KOG3627 Trypsin [Amino acid tr 98.8 3.6E-07 7.9E-12 85.3 17.4 144 141-285 39-229 (256)
18 COG3591 V8-like Glu-specific e 98.7 4E-07 8.7E-12 84.0 14.0 157 141-307 65-250 (251)
19 cd00991 PDZ_archaeal_metallopr 98.5 1.8E-07 3.9E-12 71.6 5.8 44 340-383 9-53 (79)
20 cd00990 PDZ_glycyl_aminopeptid 98.5 1.4E-07 3.1E-12 72.0 5.0 50 316-379 1-51 (80)
21 cd00136 PDZ PDZ domain, also c 98.5 2.1E-07 4.6E-12 69.0 5.8 42 341-382 13-57 (70)
22 TIGR02037 degP_htrA_DO peripla 98.5 2.1E-07 4.5E-12 94.0 7.3 67 315-382 337-404 (428)
23 TIGR01713 typeII_sec_gspC gene 98.4 2.8E-07 6.2E-12 86.4 6.0 74 297-382 159-233 (259)
24 cd00989 PDZ_metalloprotease PD 98.4 6.5E-07 1.4E-11 68.0 5.5 42 341-382 12-54 (79)
25 cd00986 PDZ_LON_protease PDZ d 98.4 6.9E-07 1.5E-11 68.2 5.3 43 341-383 8-50 (79)
26 PF00863 Peptidase_C4: Peptida 98.3 3.3E-05 7.1E-10 70.8 15.9 143 149-305 40-195 (235)
27 cd00992 PDZ_signaling PDZ doma 98.2 2.6E-06 5.7E-11 65.0 5.6 56 316-382 12-70 (82)
28 cd00988 PDZ_CTP_protease PDZ d 98.2 2.8E-06 6E-11 65.5 4.9 42 341-382 13-57 (85)
29 PF00595 PDZ: PDZ domain (Also 97.9 1.2E-05 2.6E-10 61.6 4.4 53 316-378 10-63 (81)
30 smart00228 PDZ Domain present 97.9 1.8E-05 4E-10 60.5 5.1 41 341-381 26-67 (85)
31 COG5640 Secreted trypsin-like 97.9 0.00022 4.8E-09 68.2 13.0 50 259-308 223-279 (413)
32 PF12812 PDZ_1: PDZ-like domai 97.7 7.9E-05 1.7E-09 56.7 5.8 64 316-383 9-73 (78)
33 TIGR00054 RIP metalloprotease 97.6 6.6E-05 1.4E-09 75.6 5.4 42 341-382 203-245 (420)
34 PF03761 DUF316: Domain of unk 97.6 0.004 8.6E-08 59.2 17.3 106 186-301 160-273 (282)
35 PRK10779 zinc metallopeptidase 97.6 4.9E-05 1.1E-09 77.3 4.3 40 343-382 128-168 (449)
36 PF05579 Peptidase_S32: Equine 97.6 0.00062 1.3E-08 62.7 10.9 113 141-284 113-229 (297)
37 PRK10139 serine endoprotease; 97.6 9.6E-05 2.1E-09 75.1 5.7 42 341-382 390-432 (455)
38 PF00548 Peptidase_C3: 3C cyst 97.5 0.0018 3.9E-08 57.1 12.6 138 138-283 23-170 (172)
39 TIGR00054 RIP metalloprotease 97.5 9.6E-05 2.1E-09 74.5 4.9 43 341-383 128-171 (420)
40 PRK10942 serine endoprotease; 97.5 0.00011 2.4E-09 75.0 5.4 42 341-382 408-450 (473)
41 PRK10779 zinc metallopeptidase 97.5 0.00015 3.2E-09 73.8 5.3 42 341-382 221-263 (449)
42 KOG3553 Tax interaction protei 97.4 0.00017 3.7E-09 56.3 3.7 37 334-372 54-91 (124)
43 TIGR00225 prc C-terminal pepti 97.4 0.00024 5.2E-09 69.4 5.2 41 341-381 62-105 (334)
44 PLN00049 carboxyl-terminal pro 97.2 0.00047 1E-08 68.8 5.5 35 341-375 102-137 (389)
45 PF14685 Tricorn_PDZ: Tricorn 97.0 0.00094 2E-08 52.0 4.6 43 341-383 12-65 (88)
46 TIGR02860 spore_IV_B stage IV 97.0 0.001 2.2E-08 65.8 5.3 42 341-382 105-155 (402)
47 PF04495 GRASP55_65: GRASP55/6 96.5 0.0032 6.9E-08 53.4 4.3 60 315-382 25-86 (138)
48 COG0793 Prc Periplasmic protea 96.3 0.0051 1.1E-07 61.6 5.0 49 315-376 99-148 (406)
49 PF09342 DUF1986: Domain of un 96.2 0.029 6.2E-07 51.4 8.9 98 127-225 13-131 (267)
50 PF08192 Peptidase_S64: Peptid 96.0 0.054 1.2E-06 56.2 10.7 117 185-306 541-688 (695)
51 KOG3532 Predicted protein kina 95.9 0.0079 1.7E-07 62.0 4.2 42 341-382 398-440 (1051)
52 COG3480 SdrC Predicted secrete 95.8 0.0099 2.1E-07 56.3 4.2 43 341-383 130-172 (342)
53 PRK11186 carboxy-terminal prot 95.7 0.015 3.2E-07 61.6 5.3 34 341-374 255-292 (667)
54 PF10459 Peptidase_S46: Peptid 94.9 0.039 8.4E-07 58.8 5.4 20 141-160 48-68 (698)
55 PF05580 Peptidase_S55: SpoIVB 94.9 0.022 4.8E-07 51.3 3.0 41 259-299 175-215 (218)
56 KOG3129 26S proteasome regulat 94.9 0.031 6.8E-07 49.8 3.9 39 342-380 140-179 (231)
57 COG3975 Predicted protease wit 94.8 0.029 6.3E-07 56.7 3.9 32 339-370 460-492 (558)
58 PF02122 Peptidase_S39: Peptid 94.2 0.045 9.8E-07 49.4 3.4 136 149-298 41-183 (203)
59 PF00947 Pico_P2A: Picornaviru 93.6 0.31 6.7E-06 40.2 6.9 42 242-283 68-109 (127)
60 KOG3580 Tight junction protein 93.2 0.065 1.4E-06 54.7 2.9 44 340-383 428-472 (1027)
61 KOG3209 WW domain-containing p 91.9 0.11 2.4E-06 54.2 2.6 32 345-376 782-815 (984)
62 KOG3552 FERM domain protein FR 91.8 0.12 2.6E-06 55.4 2.8 34 342-375 76-109 (1298)
63 PF00949 Peptidase_S7: Peptida 91.6 0.31 6.7E-06 40.9 4.6 31 255-285 88-119 (132)
64 PRK09681 putative type II secr 91.6 0.22 4.8E-06 47.0 4.1 42 341-382 204-249 (276)
65 KOG3550 Receptor targeting pro 91.1 0.28 6.1E-06 41.5 3.8 35 341-375 115-151 (207)
66 PF00944 Peptidase_S3: Alphavi 90.4 0.31 6.7E-06 40.5 3.4 27 259-285 101-128 (158)
67 KOG2921 Intramembrane metallop 89.9 0.29 6.4E-06 47.8 3.3 42 340-381 219-262 (484)
68 TIGR02860 spore_IV_B stage IV 89.1 0.28 6E-06 48.9 2.6 41 259-299 355-395 (402)
69 KOG3542 cAMP-regulated guanine 88.7 0.29 6.2E-06 50.9 2.4 41 337-377 558-599 (1283)
70 PF10459 Peptidase_S46: Peptid 88.2 0.26 5.6E-06 52.7 1.8 54 254-307 623-687 (698)
71 COG3031 PulC Type II secretory 87.6 0.58 1.3E-05 42.8 3.4 44 341-384 207-251 (275)
72 KOG3606 Cell polarity protein 87.2 0.5 1.1E-05 43.9 2.8 72 306-382 164-239 (358)
73 KOG3651 Protein kinase C, alph 85.1 1.1 2.4E-05 42.4 4.0 37 342-378 31-69 (429)
74 KOG3571 Dishevelled 3 and rela 84.2 0.77 1.7E-05 46.3 2.7 36 340-375 276-313 (626)
75 PF02395 Peptidase_S6: Immunog 84.2 5.1 0.00011 43.5 9.0 65 142-209 67-131 (769)
76 COG0750 Predicted membrane-ass 83.3 1.7 3.6E-05 42.9 4.7 39 345-383 133-172 (375)
77 PF05416 Peptidase_C37: Southa 83.0 11 0.00025 37.5 10.0 137 140-285 379-528 (535)
78 KOG1892 Actin filament-binding 82.8 1.2 2.6E-05 48.3 3.5 42 337-378 956-999 (1629)
79 PF02907 Peptidase_S29: Hepati 82.3 1.1 2.5E-05 37.3 2.6 23 262-284 106-129 (148)
80 KOG3209 WW domain-containing p 82.2 1.1 2.3E-05 47.1 2.9 37 341-377 923-961 (984)
81 PF03510 Peptidase_C24: 2C end 82.0 8.6 0.00019 30.9 7.3 53 144-210 3-55 (105)
82 KOG3580 Tight junction protein 81.0 1.6 3.5E-05 45.0 3.6 37 341-377 40-76 (1027)
83 KOG3605 Beta amyloid precursor 77.9 2.9 6.2E-05 43.6 4.3 75 293-373 708-789 (829)
84 KOG3551 Syntrophins (type beta 73.2 1.5 3.2E-05 43.0 0.8 35 341-375 110-146 (506)
85 KOG0609 Calcium/calmodulin-dep 71.9 3 6.5E-05 42.6 2.7 41 342-382 147-191 (542)
86 KOG0606 Microtubule-associated 67.9 3.4 7.3E-05 45.8 2.2 32 344-375 661-693 (1205)
87 cd01720 Sm_D2 The eukaryotic S 58.1 21 0.00045 27.6 4.5 37 158-195 10-46 (87)
88 COG0260 PepB Leucyl aminopepti 56.1 11 0.00024 38.7 3.4 40 332-373 291-330 (485)
89 PF01732 DUF31: Putative pepti 51.1 11 0.00023 37.4 2.4 22 260-281 351-373 (374)
90 cd00600 Sm_like The eukaryotic 48.9 52 0.0011 23.1 5.1 33 163-196 7-39 (63)
91 KOG3549 Syntrophins (type gamm 48.7 12 0.00026 36.3 2.1 33 343-375 82-116 (505)
92 cd01727 LSm8 The eukaryotic Sm 48.1 93 0.002 23.0 6.5 32 163-195 10-41 (74)
93 PRK05015 aminopeptidase B; Pro 45.7 20 0.00044 36.0 3.3 39 333-373 230-268 (424)
94 cd01728 LSm1 The eukaryotic Sm 45.7 93 0.002 23.2 6.1 32 163-195 13-44 (74)
95 PRK00737 small nuclear ribonuc 44.7 59 0.0013 24.0 4.9 33 163-196 15-47 (72)
96 cd01731 archaeal_Sm1 The archa 44.4 61 0.0013 23.5 4.9 33 163-196 11-43 (68)
97 cd01726 LSm6 The eukaryotic Sm 44.2 56 0.0012 23.6 4.7 32 163-195 11-42 (67)
98 cd01722 Sm_F The eukaryotic Sm 43.9 54 0.0012 23.9 4.6 32 163-195 12-43 (68)
99 cd01730 LSm3 The eukaryotic Sm 43.2 49 0.0011 25.1 4.4 31 163-194 12-42 (82)
100 cd01729 LSm7 The eukaryotic Sm 41.7 66 0.0014 24.4 4.9 31 163-194 13-43 (81)
101 cd01732 LSm5 The eukaryotic Sm 41.6 59 0.0013 24.4 4.6 31 163-194 14-44 (76)
102 cd06168 LSm9 The eukaryotic Sm 41.1 72 0.0016 23.9 4.9 32 163-195 11-42 (75)
103 cd01717 Sm_B The eukaryotic Sm 40.8 65 0.0014 24.2 4.7 32 163-195 11-42 (79)
104 KOG3605 Beta amyloid precursor 40.1 11 0.00025 39.4 0.6 29 346-374 678-708 (829)
105 cd01719 Sm_G The eukaryotic Sm 39.0 81 0.0018 23.3 4.9 31 163-194 11-41 (72)
106 cd01735 LSm12_N LSm12 belongs 38.8 1.2E+02 0.0027 21.8 5.5 33 163-196 7-39 (61)
107 PF11874 DUF3394: Domain of un 36.4 46 0.001 29.6 3.7 28 341-368 122-150 (183)
108 PF09465 LBR_tudor: Lamin-B re 35.1 1.6E+02 0.0035 20.7 5.8 37 161-197 8-44 (55)
109 PRK00913 multifunctional amino 34.9 28 0.00061 35.8 2.4 31 343-373 301-331 (483)
110 smart00651 Sm snRNP Sm protein 34.3 1.1E+02 0.0023 21.8 4.9 33 163-196 9-41 (67)
111 PTZ00412 leucyl aminopeptidase 33.9 34 0.00073 35.7 2.8 40 332-373 337-376 (569)
112 PF12381 Peptidase_C3G: Tungro 33.8 32 0.0007 31.3 2.3 53 254-307 170-229 (231)
113 cd01721 Sm_D3 The eukaryotic S 33.6 1.1E+02 0.0024 22.4 4.9 32 163-195 11-42 (70)
114 PF00883 Peptidase_M17: Cytoso 33.0 22 0.00048 34.3 1.3 30 344-373 133-162 (311)
115 COG1958 LSM1 Small nuclear rib 32.3 95 0.0021 23.2 4.5 33 163-196 18-50 (79)
116 cd00433 Peptidase_M17 Cytosol 31.9 33 0.0007 35.2 2.3 31 343-373 287-317 (468)
117 PF01423 LSM: LSM domain ; In 31.3 1.3E+02 0.0028 21.3 4.9 35 162-197 8-42 (67)
118 COG0298 HypC Hydrogenase matur 29.6 1.2E+02 0.0027 23.0 4.4 47 176-224 5-52 (82)
119 KOG3938 RGS-GAIP interacting p 27.9 21 0.00046 33.4 0.2 38 343-380 151-190 (334)
120 KOG2597 Predicted aminopeptida 26.9 67 0.0015 33.1 3.5 43 329-373 310-352 (513)
121 PF11730 DUF3297: Protein of u 25.2 44 0.00096 24.4 1.4 32 347-378 5-37 (71)
122 cd01723 LSm4 The eukaryotic Sm 24.1 2.1E+02 0.0046 21.2 5.0 33 162-195 11-43 (76)
123 KOG1738 Membrane-associated gu 23.1 53 0.0012 34.5 2.0 31 343-373 227-259 (638)
124 KOG3834 Golgi reassembly stack 22.9 69 0.0015 32.2 2.7 39 344-382 112-152 (462)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=5.2e-48 Score=388.20 Aligned_cols=271 Identities=23% Similarity=0.367 Sum_probs=234.0
Q ss_pred cHHHHHHHhCCCeEEEEeeecCCC----------CC---CCCCCCCCCcceEEEEEec--CCEEEecccccCCCceEEEE
Q 016641 104 NAYAAIELALDSVVKIFTVSSSPN----------YG---LPWQNKSQRETTGSGFVIP--GKKILTNAHVVADSTFVLVR 168 (385)
Q Consensus 104 ~~~~~~~~~~~SVV~I~~~~~~~~----------~~---~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~~~~~i~V~ 168 (385)
++.++++++.||||.|.+...... ++ .||+......+.||||+|+ +||||||+|||+++..+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 589999999999999998653210 11 1333333446789999997 58999999999999999999
Q ss_pred EcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeeccccccc
Q 016641 169 KHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV 246 (385)
Q Consensus 169 ~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~ 246 (385)
+. |+++++|++++.|+.+||||||++... .+++++|+++. ++||+|+++|+|++.. .+++.|+|++..+....
T Consensus 121 ~~-dg~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~-~tvt~GivS~~~r~~~~ 195 (455)
T PRK10139 121 LN-DGREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLG-QTATSGIISALGRSGLN 195 (455)
T ss_pred EC-CCCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCC-CceEEEEEccccccccC
Confidence 97 899999999999999999999998643 58899999876 5699999999999987 59999999998775322
Q ss_pred CCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccc
Q 016641 247 HGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSC 322 (385)
Q Consensus 247 ~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~ 322 (385)
. .....+||+|+++++|||||||+ .+|+||||+++... +..+++||||++.+++++++|+++|++. ++|||+.+
T Consensus 196 ~-~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~ 273 (455)
T PRK10139 196 L-EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKG 273 (455)
T ss_pred C-CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEE
Confidence 2 22346899999999999999999 99999999998764 3467999999999999999999999998 89999999
Q ss_pred cccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 323 QTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 323 ~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
+++ +++.++.+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..|+.+.|.
T Consensus 274 ~~l-~~~~~~~lgl~~-~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~ 333 (455)
T PRK10139 274 TEM-SADIAKAFNLDV-QRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIA 333 (455)
T ss_pred EEC-CHHHHHhcCCCC-CCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence 999 788999999873 579999999999999999 99999999999999999999988764
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.3e-46 Score=365.89 Aligned_cols=268 Identities=24% Similarity=0.397 Sum_probs=230.4
Q ss_pred CCcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEEE
Q 016641 102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQV 180 (385)
Q Consensus 102 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v 180 (385)
...+.++++++.||||+|.+.....+. + ......+.||||+|+ +||||||+|||.+++.+.|++. ||+.++|++
T Consensus 44 ~~~~~~~~~~~~psVV~I~~~~~~~~~---~-~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~-dg~~~~a~v 118 (351)
T TIGR02038 44 EISFNKAVRRAAPAVVNIYNRSISQNS---L-NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQ-DGRKFEAEL 118 (351)
T ss_pred chhHHHHHHhcCCcEEEEEeEeccccc---c-ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEEC-CCCEEEEEE
Confidence 346889999999999999986543321 1 122345689999998 7899999999999999999997 889999999
Q ss_pred EEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEec
Q 016641 181 EAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQID 258 (385)
Q Consensus 181 ~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d 258 (385)
++.|+.+||||||++... +++++++++. ++||+|+++|||++.. .+++.|+|+...+..... .....++++|
T Consensus 119 v~~d~~~DlAvlkv~~~~----~~~~~l~~s~~~~~G~~V~aiG~P~~~~-~s~t~GiIs~~~r~~~~~-~~~~~~iqtd 192 (351)
T TIGR02038 119 VGSDPLTDLAVLKIEGDN----LPTIPVNLDRPPHVGDVVLAIGNPYNLG-QTITQGIISATGRNGLSS-VGRQNFIQTD 192 (351)
T ss_pred EEecCCCCEEEEEecCCC----CceEeccCcCccCCCCEEEEEeCCCCCC-CcEEEEEEEeccCcccCC-CCcceEEEEC
Confidence 999999999999999764 6788887654 6799999999999877 589999999987754322 2234689999
Q ss_pred ccCCCCCCCccee-eCCEEEEEEeeecC-----CCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641 259 AAINPGNSGGPAI-MGNKVAGVAFQNLS-----GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (385)
Q Consensus 259 ~~i~~G~SGGPL~-~~G~vVGI~s~~~~-----~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (385)
+.+++|||||||+ .+|+||||+++.+. ...+++|+||++.+++++++++++|++. ++|||+.++++ ++..++
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~ 270 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQ 270 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHH
Confidence 9999999999999 99999999987543 1257899999999999999999999987 89999999998 678888
Q ss_pred hcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 333 ~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..||.+.|.
T Consensus 271 ~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~ 321 (351)
T TIGR02038 271 GLGLPD-LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIA 321 (351)
T ss_pred hcCCCc-cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence 899974 479999999999999999 99999999999999999999987763
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=5.6e-46 Score=362.95 Aligned_cols=267 Identities=21% Similarity=0.357 Sum_probs=226.9
Q ss_pred CcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEEEE
Q 016641 103 TNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVE 181 (385)
Q Consensus 103 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~ 181 (385)
..+.++++++.+|||.|.+....... .......+.||||+|+ +||||||+|||+++..+.|++. |++.++|+++
T Consensus 45 ~~~~~~~~~~~psvV~v~~~~~~~~~----~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv 119 (353)
T PRK10898 45 ASYNQAVRRAAPAVVNVYNRSLNSTS----HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQ-DGRVFEALLV 119 (353)
T ss_pred chHHHHHHHhCCcEEEEEeEeccccC----cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeC-CCCEEEEEEE
Confidence 46889999999999999986543211 1123345789999998 7899999999999999999997 8899999999
Q ss_pred EecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecc
Q 016641 182 AVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDA 259 (385)
Q Consensus 182 ~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~ 259 (385)
+.|+.+||||||++... +++++++++. ++||+|+++|||++.. .+++.|+|+...+...... ....++|+|+
T Consensus 120 ~~d~~~DlAvl~v~~~~----l~~~~l~~~~~~~~G~~V~aiG~P~g~~-~~~t~Giis~~~r~~~~~~-~~~~~iqtda 193 (353)
T PRK10898 120 GSDSLTDLAVLKINATN----LPVIPINPKRVPHIGDVVLAIGNPYNLG-QTITQGIISATGRIGLSPT-GRQNFLQTDA 193 (353)
T ss_pred EEcCCCCEEEEEEcCCC----CCeeeccCcCcCCCCCEEEEEeCCCCcC-CCcceeEEEeccccccCCc-cccceEEecc
Confidence 99999999999998753 7788888654 6799999999999876 5899999998876533222 2236799999
Q ss_pred cCCCCCCCccee-eCCEEEEEEeeecCC------CCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641 260 AINPGNSGGPAI-MGNKVAGVAFQNLSG------AENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (385)
Q Consensus 260 ~i~~G~SGGPL~-~~G~vVGI~s~~~~~------~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (385)
.+++|||||||+ .+|+||||+++.+.. ..+++||||++.+++++++++++|++. ++|||+.++++ ++..+.
T Consensus 194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~ 271 (353)
T PRK10898 194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQ 271 (353)
T ss_pred ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHH
Confidence 999999999999 999999999976542 257899999999999999999999998 89999999988 555666
Q ss_pred hcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 333 ~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.++++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus 272 ~~~~~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~ 322 (353)
T PRK10898 272 GGGIDQ-LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVA 322 (353)
T ss_pred hcCCCC-CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence 677753 489999999999999999 99999999999999999999887663
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=5e-45 Score=368.25 Aligned_cols=270 Identities=26% Similarity=0.391 Sum_probs=229.9
Q ss_pred cHHHHHHHhCCCeEEEEeeecCCC--------C--CC----CCC----------------------CCCCCcceEEEEEe
Q 016641 104 NAYAAIELALDSVVKIFTVSSSPN--------Y--GL----PWQ----------------------NKSQRETTGSGFVI 147 (385)
Q Consensus 104 ~~~~~~~~~~~SVV~I~~~~~~~~--------~--~~----p~~----------------------~~~~~~~~GSGfiI 147 (385)
++.++++++.|+||.|.+...... + +| |+. ......+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 589999999999999998653211 0 11 110 00112467999999
Q ss_pred c--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEec
Q 016641 148 P--GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGY 223 (385)
Q Consensus 148 ~--~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~ 223 (385)
+ +||||||+|||.+++.++|++. |+++++|++++.|+.+||||||++... .+++++|+++. ++|++|+++|+
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~G~~V~aiG~ 194 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLS-DGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRVGDYTVAIGN 194 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEEC-CCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEcC
Confidence 8 4899999999999999999997 899999999999999999999997543 58899999765 56999999999
Q ss_pred CCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchH
Q 016641 224 PQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVI 299 (385)
Q Consensus 224 p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i 299 (385)
|++.. .+++.|+|+...+..... ..+..+|++|+.+++|||||||+ .+|+||||+++... +..+++|+||++.+
T Consensus 195 P~g~~-~tvt~GiVs~~~r~~~~~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~ 272 (473)
T PRK10942 195 PYGLG-ETVTSGIVSALGRSGLNV-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMV 272 (473)
T ss_pred CCCCC-cceeEEEEEEeecccCCc-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHH
Confidence 99887 589999999987642211 12346899999999999999999 99999999998654 33568999999999
Q ss_pred HHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641 300 KHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTG 378 (385)
Q Consensus 300 ~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l 378 (385)
++++++|.++|++. ++|||+.++++ ++++++.++++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..+|
T Consensus 273 ~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~-~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl 349 (473)
T PRK10942 273 KNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDA-QRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAAL 349 (473)
T ss_pred HHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCC-CCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHH
Confidence 99999999999998 89999999999 788999999974 589999999999999999 999999999999999999999
Q ss_pred Hhhh
Q 016641 379 SHSM 382 (385)
Q Consensus 379 ~~~l 382 (385)
.+.|
T Consensus 350 ~~~l 353 (473)
T PRK10942 350 RAQV 353 (473)
T ss_pred HHHH
Confidence 8876
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=3e-44 Score=360.71 Aligned_cols=271 Identities=29% Similarity=0.409 Sum_probs=230.9
Q ss_pred cHHHHHHHhCCCeEEEEeeecCCC-------------CCC---CCC----CCCCCcceEEEEEec-CCEEEecccccCCC
Q 016641 104 NAYAAIELALDSVVKIFTVSSSPN-------------YGL---PWQ----NKSQRETTGSGFVIP-GKKILTNAHVVADS 162 (385)
Q Consensus 104 ~~~~~~~~~~~SVV~I~~~~~~~~-------------~~~---p~~----~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~ 162 (385)
++.++++++.||||.|.+...... ++. |.. ......+.||||+|+ +||||||+||+.++
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~ 81 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA 81 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence 367899999999999998652211 110 000 012235689999999 78999999999999
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRV 240 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~ 240 (385)
..+.|++. +++.++|++++.|+.+||||||++... .+++++|+++. +.|++|+++|||++.. .+++.|+|+..
T Consensus 82 ~~i~V~~~-~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~-~~~t~G~vs~~ 156 (428)
T TIGR02037 82 DEITVTLS-DGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLG-QTVTSGIVSAL 156 (428)
T ss_pred CeEEEEeC-CCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCC-CcEEEEEEEec
Confidence 99999997 899999999999999999999998752 58999998754 6699999999999987 58999999988
Q ss_pred ccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcCeeeeee
Q 016641 241 EPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFC 316 (385)
Q Consensus 241 ~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~ 316 (385)
.+... .......++++|+.+++|+|||||+ .+|+||||+++... +..+++||||++.+++++++|+++|++. ++
T Consensus 157 ~~~~~-~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~ 234 (428)
T TIGR02037 157 GRSGL-GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RG 234 (428)
T ss_pred ccCcc-CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CC
Confidence 76532 2223346899999999999999999 99999999987654 3467899999999999999999999987 89
Q ss_pred ccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 317 SLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 317 ~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
|||+.++++ +++.++.+|++. ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus 235 ~lGi~~~~~-~~~~~~~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~ 300 (428)
T TIGR02037 235 WLGVTIQEV-TSDLAKSLGLEK-QRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIG 300 (428)
T ss_pred cCceEeecC-CHHHHHHcCCCC-CCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence 999999999 788999999975 579999999999999999 99999999999999999999988763
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=2.9e-35 Score=288.16 Aligned_cols=272 Identities=29% Similarity=0.399 Sum_probs=228.6
Q ss_pred CcHHHHHHHhCCCeEEEEeeecCCC-CCCCCCCCCC-CcceEEEEEec-CCEEEecccccCCCceEEEEEcCCCcEEEEE
Q 016641 103 TNAYAAIELALDSVVKIFTVSSSPN-YGLPWQNKSQ-RETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQ 179 (385)
Q Consensus 103 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~p~~~~~~-~~~~GSGfiI~-~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~ 179 (385)
..+..+++++.++||.|........ .+++-..... ..+.||||+++ ++||+||.|++.++..+.+.+. |+++++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~-dg~~~~a~ 111 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLA-DGREVPAK 111 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeC-CCCEEEEE
Confidence 5788999999999999998664332 1111111101 14789999999 9999999999999999999985 99999999
Q ss_pred EEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEe
Q 016641 180 VEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQI 257 (385)
Q Consensus 180 v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~ 257 (385)
+++.|+..|+|+||++... .++.+.++++. ++|+.++++|+|++.. .+++.|+|+...+...........+||+
T Consensus 112 ~vg~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~-~tvt~Givs~~~r~~v~~~~~~~~~Iqt 187 (347)
T COG0265 112 LVGKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLG-QTVTSGIVSALGRTGVGSAGGYVNFIQT 187 (347)
T ss_pred EEecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCcc-cceeccEEeccccccccCcccccchhhc
Confidence 9999999999999999864 26777888776 5599999999999976 5999999999988622221225578999
Q ss_pred cccCCCCCCCccee-eCCEEEEEEeeecCCC---CceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhh
Q 016641 258 DAAINPGNSGGPAI-MGNKVAGVAFQNLSGA---ENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN 333 (385)
Q Consensus 258 d~~i~~G~SGGPL~-~~G~vVGI~s~~~~~~---~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~ 333 (385)
|+++++|+||||++ .+|++|||++...... .+++|++|++.++.+++++.++|++. ++|+|+...++ +...+
T Consensus 188 dAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~-- 263 (347)
T COG0265 188 DAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA-- 263 (347)
T ss_pred ccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--
Confidence 99999999999999 9999999999987743 35899999999999999999988877 89999999887 44444
Q ss_pred cCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhcc
Q 016641 334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSMLF 384 (385)
Q Consensus 334 ~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~~ 384 (385)
+|++ ...|++|.+|.+++||+++ ++.||||+++||+++.+..++.+.+..
T Consensus 264 ~g~~-~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~ 314 (347)
T COG0265 264 LGLP-VAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVAS 314 (347)
T ss_pred cCCC-CCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhc
Confidence 7766 4689999999999999999 999999999999999999999988753
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.92 E-value=3.8e-24 Score=210.73 Aligned_cols=278 Identities=26% Similarity=0.265 Sum_probs=207.2
Q ss_pred CCcHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec-CCEEEecccccCCCc-----------eEEEEE
Q 016641 102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADST-----------FVLVRK 169 (385)
Q Consensus 102 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~-~g~ILT~aHvv~~~~-----------~i~V~~ 169 (385)
.....++.++-..++|.|....-. ....|+....-....||||+++ +++++||+||+.... .+.+..
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~-~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~a 205 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFW-KGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDA 205 (473)
T ss_pred hhhHHHhhhcccceEEEEeecccc-CCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEE
Confidence 345677889999999999974321 1122444555567889999999 999999999997432 255655
Q ss_pred c-CCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcc--cCCCeEEEEecCCCCCCceEEEeeEeeccccccc
Q 016641 170 H-GSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV 246 (385)
Q Consensus 170 ~-~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~ 246 (385)
. +.+..+++.+.+.|+..|+|+++++.+. ...++++++-.. ..|+++.++|.|++..+ +.+.|+++...|..+.
T Consensus 206 a~~~~~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~ 282 (473)
T KOG1320|consen 206 AIGPGNSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFK 282 (473)
T ss_pred eecCCccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccc
Confidence 4 2348899999999999999999997653 136677777555 45999999999999985 8999999988776544
Q ss_pred CC----CceeeEEEecccCCCCCCCccee-eCCEEEEEEeeecC---CCCceEEEEecchHHHHHHHHHHcC---eee--
Q 016641 247 HG----ATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHG---KYV-- 313 (385)
Q Consensus 247 ~~----~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~~~~~aip~~~i~~~l~~l~~~g---~~~-- 313 (385)
-+ ....+++|+|+.++.|+|||||+ .+|++||+++.... -..+++|++|.+.++.++.+..+.. +..
T Consensus 283 lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~ 362 (473)
T KOG1320|consen 283 LGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKP 362 (473)
T ss_pred cCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccC
Confidence 21 23446899999999999999999 99999999988654 2357899999999998888763222 211
Q ss_pred ---eeeccCccccccc----cHHHHhhcCCC-CccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 314 ---GFCSLGLSCQTTE----NVQLRNNFGMR-SEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 314 ---~~~~lGi~~~~~~----~~~~~~~~g~~-~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.+.|+|+..--+. ...+.+.+-.+ ...+|++|.+|.+++++... +++||+|++|||++|.+..++.+.|-
T Consensus 363 ~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~ 441 (473)
T KOG1320|consen 363 LVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIE 441 (473)
T ss_pred cccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHH
Confidence 1235665442221 01111223223 23479999999999999999 99999999999999999999998763
No 8
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.87 E-value=1.8e-22 Score=198.95 Aligned_cols=261 Identities=52% Similarity=0.764 Sum_probs=235.3
Q ss_pred HHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEecCCEEEecccccC---CCceEEEEEcCCCcEEEEEEEEec
Q 016641 108 AIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIPGKKILTNAHVVA---DSTFVLVRKHGSPTKYRAQVEAVG 184 (385)
Q Consensus 108 ~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~~g~ILT~aHvv~---~~~~i~V~~~~~g~~~~a~v~~~d 184 (385)
..+...+|++.+......+.+..||+...+....|+||.+....++|++|+++ +...+.+..++.-+.|.+++...-
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 45677889999999999999999999998888999999999999999999999 666777776667778899999999
Q ss_pred CCCCeEEEEecCCcccccceeeecCCcccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCC
Q 016641 185 HECDLAILIVESDEFWEGMHFLELGDIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPG 264 (385)
Q Consensus 185 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G 264 (385)
.++|+|++.++..+||+.+.|+++++.+...+-++++| + +...++.|.|.......+.........+++|+++++|
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~ 210 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---G-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG 210 (473)
T ss_pred hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---C-CcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence 99999999999999999999999999999989999999 3 3479999999998887777776777789999999999
Q ss_pred CCCccee-eCCEEEEEEeeecCCCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCce
Q 016641 265 NSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGV 343 (385)
Q Consensus 265 ~SGGPL~-~~G~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv 343 (385)
+||+|.+ ..+++.|+.+......+++++.+|.-.+.++.......+.+.++++++...+.+++.+.++.+.|..+ .|+
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~-~g~ 289 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE-TGV 289 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc-cce
Confidence 9999999 55999999999875444789999999999999998888888889999999999988999999999887 899
Q ss_pred EEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 344 LVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 344 ~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
.+.++.+-+.|.+.++.||+|+.+||+.|.
T Consensus 290 ~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig 319 (473)
T KOG1320|consen 290 LISKINQTDAAINPGNSGGPLLNLDGEVIG 319 (473)
T ss_pred eeeeecccchhhhcccCCCcEEEecCcEee
Confidence 999999999999999999999999999995
No 9
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.84 E-value=4.3e-20 Score=184.14 Aligned_cols=269 Identities=19% Similarity=0.226 Sum_probs=207.1
Q ss_pred cHHHHHHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec--CCEEEecccccCCC-ceEEEEEcCCCcEEEEEE
Q 016641 104 NAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVADS-TFVLVRKHGSPTKYRAQV 180 (385)
Q Consensus 104 ~~~~~~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~~~-~~i~V~~~~~g~~~~a~v 180 (385)
++...+.++-+|||.|...... +++........++||+++ .||||||+|++... -.-.+.+. +..+.+.-.
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~-----~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~-n~ee~ei~p 126 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVR-----AFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFD-NHEEIEIYP 126 (955)
T ss_pred hhhhhhhhhcccEEEEEehhee-----ecccccccccceeEEEEecccceEEEeccccCCCCceeEEEec-ccccCCccc
Confidence 7888999999999999987642 344445566789999999 78999999999854 34455554 556677778
Q ss_pred EEecCCCCeEEEEecCCcc-cccceeeecCCc-ccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCC-----ceee
Q 016641 181 EAVGHECDLAILIVESDEF-WEGMHFLELGDI-PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGA-----TQLM 253 (385)
Q Consensus 181 ~~~d~~~DlAlLkv~~~~~-~~~~~~l~l~~~-~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~-----~~~~ 253 (385)
++.|+-+|+.++|.++... ...++-+++... .++|.++.++|+..+.. .++..|.++.+++.....+. ....
T Consensus 127 vyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagEk-lsIlagflSrldr~apdyg~~~yndfnTf 205 (955)
T KOG1421|consen 127 VYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGEK-LSILAGFLSRLDRNAPDYGEDTYNDFNTF 205 (955)
T ss_pred ccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccce-EEeehhhhhhccCCCccccccccccccce
Confidence 8999999999999987642 124555666543 47899999999987765 68889999988875443221 2234
Q ss_pred EEEecccCCCCCCCccee-eCCEEEEEEeeecCCCCceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHh
Q 016641 254 AIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (385)
Q Consensus 254 ~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (385)
++|.......|.||.|++ -+|..|.++..+.. ....+|++|++.+++.|.-++.+.-+. |+.|-+++-.- ..+.++
T Consensus 206 y~QaasstsggssgspVv~i~gyAVAl~agg~~-ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~r 282 (955)
T KOG1421|consen 206 YIQAASSTSGGSSGSPVVDIPGYAVALNAGGSI-SSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDECR 282 (955)
T ss_pred eeeehhcCCCCCCCCceecccceEEeeecCCcc-cccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHHH
Confidence 688888889999999999 99999999876543 345689999999999998888555444 67777666544 445666
Q ss_pred hcCCCC-----------ccCce-EEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhh
Q 016641 333 NFGMRS-----------EVTGV-LVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 333 ~~g~~~-----------~~~Gv-~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.+|++. +..|+ +|..|.++|||++.|++||++++||+.-+.++.++.+.|
T Consensus 283 rlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~iL 344 (955)
T KOG1421|consen 283 RLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQIL 344 (955)
T ss_pred hcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHHHH
Confidence 777754 24555 556799999999999999999999999999999988765
No 10
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.57 E-value=2.5e-14 Score=117.65 Aligned_cols=107 Identities=36% Similarity=0.485 Sum_probs=70.8
Q ss_pred EEEEEecCC-EEEecccccC--------CCceEEEEEcCCCcEEE--EEEEEecCC-CCeEEEEecCCcccccceeeecC
Q 016641 142 GSGFVIPGK-KILTNAHVVA--------DSTFVLVRKHGSPTKYR--AQVEAVGHE-CDLAILIVESDEFWEGMHFLELG 209 (385)
Q Consensus 142 GSGfiI~~g-~ILT~aHvv~--------~~~~i~V~~~~~g~~~~--a~v~~~d~~-~DlAlLkv~~~~~~~~~~~l~l~ 209 (385)
||||+|++. +||||+||+. ....+.+... ++.... ++++..|+. .|+|||+++... .
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~D~All~v~~~~------~---- 69 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFP-DGRRVPPVAEVVYFDPDDYDLALLKVDPWT------G---- 69 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEET-TSCEEETEEEEEEEETT-TTEEEEEESCEE------E----
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEec-CCCEEeeeEEEEEECCccccEEEEEEeccc------c----
Confidence 899999954 9999999998 4567888876 666677 999999999 999999999100 0
Q ss_pred CcccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEE
Q 016641 210 DIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGV 279 (385)
Q Consensus 210 ~~~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI 279 (385)
.+......+. ..+..... ........+ +++.+.+|+|||||+ .+|+||||
T Consensus 70 ----~~~~~~~~~~---------~~~~~~~~------~~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 70 ----VGGGVRVPGS---------TSGVSPTS------TNDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ----EEEEEEEEEE---------EEEEEEEE------EEETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ----eeeeeEeeee---------cccccccc------CcccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000000 00000000 000111124 799999999999999 99999997
No 11
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.52 E-value=3.6e-13 Score=122.05 Aligned_cols=165 Identities=21% Similarity=0.239 Sum_probs=107.9
Q ss_pred cceEEEEEecCCEEEecccccCCCceEEEEEc------CCC--cEEEEEEEEec----C---CCCeEEEEecCC-ccccc
Q 016641 139 ETTGSGFVIPGKKILTNAHVVADSTFVLVRKH------GSP--TKYRAQVEAVG----H---ECDLAILIVESD-EFWEG 202 (385)
Q Consensus 139 ~~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~------~~g--~~~~a~v~~~d----~---~~DlAlLkv~~~-~~~~~ 202 (385)
...|+|++|++.||||++||+.+...+.+.+. .++ ..+..+-+..+ . .+|+|||+++.+ .+.+.
T Consensus 24 ~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~ 103 (220)
T PF00089_consen 24 RFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDN 103 (220)
T ss_dssp EEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSS
T ss_pred CeeEeEEecccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 46799999999999999999999555655432 122 23444433332 2 579999999987 45567
Q ss_pred ceeeecCCcc---cCCCeEEEEecCCCCCCc---eEEEeeEeecc---cccccCCCceeeEEEecc----cCCCCCCCcc
Q 016641 203 MHFLELGDIP---FLQQAVAVVGYPQGGDNI---SVTKGVVSRVE---PTQYVHGATQLMAIQIDA----AINPGNSGGP 269 (385)
Q Consensus 203 ~~~l~l~~~~---~~G~~V~~iG~p~~~~~~---~~~~G~Vs~~~---~~~~~~~~~~~~~i~~d~----~i~~G~SGGP 269 (385)
+.++.+.... ..|+.+.++||+...... .+....+..+. +............+.... ..|.|+||||
T Consensus 104 ~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~p 183 (220)
T PF00089_consen 104 IQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGGP 183 (220)
T ss_dssp BEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTSE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 8899998733 568999999998753322 33333333222 221111111112344444 7899999999
Q ss_pred eeeCC-EEEEEEeeecC-C-CCceEEEEecchHHHHH
Q 016641 270 AIMGN-KVAGVAFQNLS-G-AENIGYIIPVPVIKHFI 303 (385)
Q Consensus 270 L~~~G-~vVGI~s~~~~-~-~~~~~~aip~~~i~~~l 303 (385)
|+.++ +|+||++.... + .....+..+++...+|+
T Consensus 184 l~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 184 LICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 99444 59999998733 1 12358889998887775
No 12
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.49 E-value=1.3e-12 Score=119.19 Aligned_cols=166 Identities=19% Similarity=0.192 Sum_probs=98.2
Q ss_pred cceEEEEEecCCEEEecccccCCC--ceEEEEEcC--------CCcEEEEEEEEec-------CCCCeEEEEecCC-ccc
Q 016641 139 ETTGSGFVIPGKKILTNAHVVADS--TFVLVRKHG--------SPTKYRAQVEAVG-------HECDLAILIVESD-EFW 200 (385)
Q Consensus 139 ~~~GSGfiI~~g~ILT~aHvv~~~--~~i~V~~~~--------~g~~~~a~v~~~d-------~~~DlAlLkv~~~-~~~ 200 (385)
...|+|++|++.+|||+|||+.+. ..+.|.+.. ....+..+-+..+ ..+||||||++.+ .+.
T Consensus 24 ~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~ 103 (232)
T cd00190 24 RHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS 103 (232)
T ss_pred cEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence 368999999999999999999874 455555421 1222334333333 3589999999875 344
Q ss_pred ccceeeecCCc--c-cCCCeEEEEecCCCCCC----ceEEE---eeEeecccccccC--CCceeeEEEe-----cccCCC
Q 016641 201 EGMHFLELGDI--P-FLQQAVAVVGYPQGGDN----ISVTK---GVVSRVEPTQYVH--GATQLMAIQI-----DAAINP 263 (385)
Q Consensus 201 ~~~~~l~l~~~--~-~~G~~V~~iG~p~~~~~----~~~~~---G~Vs~~~~~~~~~--~~~~~~~i~~-----d~~i~~ 263 (385)
..+.|++|... . ..|+.+.+.||...... ..... .++....+..... .......+.. +...|+
T Consensus 104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~ 183 (232)
T cd00190 104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQ 183 (232)
T ss_pred CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCcccc
Confidence 56889999866 2 45899999999765321 11222 2222222211111 0000111211 345789
Q ss_pred CCCCccee-eC---CEEEEEEeeecCCC--CceEEEEecchHHHHHH
Q 016641 264 GNSGGPAI-MG---NKVAGVAFQNLSGA--ENIGYIIPVPVIKHFIT 304 (385)
Q Consensus 264 G~SGGPL~-~~---G~vVGI~s~~~~~~--~~~~~aip~~~i~~~l~ 304 (385)
|+|||||+ .. +.++||.+....-. ...+....+....+|++
T Consensus 184 gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 184 GDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred CCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence 99999999 43 78999999864311 12334444444555543
No 13
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.45 E-value=2.5e-12 Score=129.18 Aligned_cols=259 Identities=15% Similarity=0.130 Sum_probs=181.1
Q ss_pred HHHhCCCeEEEEeeecCCCCCCCCCCCCCCcceEEEEEec--CCEEEecccccC-CCceEEEEEcCCCcEEEEEEEEecC
Q 016641 109 IELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVA-DSTFVLVRKHGSPTKYRAQVEAVGH 185 (385)
Q Consensus 109 ~~~~~~SVV~I~~~~~~~~~~~p~~~~~~~~~~GSGfiI~--~g~ILT~aHvv~-~~~~i~V~~~~~g~~~~a~v~~~d~ 185 (385)
.+++..+.|.+......+- +.-......|||.|++ +|+++++..++. +..+.+|+.. |...++|.+...++
T Consensus 524 ~~~i~~~~~~v~~~~~~~l-----~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~-dS~~i~a~~~fL~~ 597 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNL-----DGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEA-DSDGIPANVSFLHP 597 (955)
T ss_pred hhHHhhhhhhheeceeecc-----ccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeec-ccccccceeeEecC
Confidence 4667777787776443221 1112234569999998 899999999997 6778888886 77889999999999
Q ss_pred CCCeEEEEecCCcccccceeeecCCcc-cCCCeEEEEecCCCCCCc----eEEEeeEeecccccc-cCCCceeeEEEecc
Q 016641 186 ECDLAILIVESDEFWEGMHFLELGDIP-FLQQAVAVVGYPQGGDNI----SVTKGVVSRVEPTQY-VHGATQLMAIQIDA 259 (385)
Q Consensus 186 ~~DlAlLkv~~~~~~~~~~~l~l~~~~-~~G~~V~~iG~p~~~~~~----~~~~G~Vs~~~~~~~-~~~~~~~~~i~~d~ 259 (385)
..++|.+|.++.. ...++|.+.. ..||+|...|+....+.. +++.-.+....+... .......+.|..++
T Consensus 598 t~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~ 673 (955)
T KOG1421|consen 598 TENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMD 673 (955)
T ss_pred ccceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEec
Confidence 9999999998764 3455665544 459999999998765421 222111111111111 11123445677766
Q ss_pred cCCCCCCCccee-eCCEEEEEEeeecC---CCC--ceEEEEecchHHHHHHHHHHcCeeeeeeccCccccccccHHHHhh
Q 016641 260 AINPGNSGGPAI-MGNKVAGVAFQNLS---GAE--NIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN 333 (385)
Q Consensus 260 ~i~~G~SGGPL~-~~G~vVGI~s~~~~---~~~--~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~ 333 (385)
.+.-++--|-+. .+|+|+|++-.... +.. ..-|.+.+..++.+|++|+..+..+ ...+|+.+..+ +...++.
T Consensus 674 nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~ 751 (955)
T KOG1421|consen 674 NLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQART 751 (955)
T ss_pred cccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhc
Confidence 655444446677 99999999866554 111 1345567789999999999877765 45688888888 6777788
Q ss_pred cCCCCc------------cCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhh
Q 016641 334 FGMRSE------------VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHS 381 (385)
Q Consensus 334 ~g~~~~------------~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~ 381 (385)
+|+|.+ .+=.+|++|.+.-+-- |..||||+++|||.|+...||.+.
T Consensus 752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~ki--l~~gdiilsvngk~itr~~dl~d~ 809 (955)
T KOG1421|consen 752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLHKI--LGVGDIILSVNGKMITRLSDLHDF 809 (955)
T ss_pred cCCCHHHHhhhhhcCCCcceEEEEEeeccCcccc--cccccEEEEecCeEEeeehhhhhh
Confidence 888752 3456788898865433 999999999999999999999873
No 14
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.40 E-value=6.3e-12 Score=114.92 Aligned_cols=146 Identities=18% Similarity=0.189 Sum_probs=91.0
Q ss_pred cceEEEEEecCCEEEecccccCCCc--eEEEEEcCCC-------cEEEEEEEEec-------CCCCeEEEEecCC-cccc
Q 016641 139 ETTGSGFVIPGKKILTNAHVVADST--FVLVRKHGSP-------TKYRAQVEAVG-------HECDLAILIVESD-EFWE 201 (385)
Q Consensus 139 ~~~GSGfiI~~g~ILT~aHvv~~~~--~i~V~~~~~g-------~~~~a~v~~~d-------~~~DlAlLkv~~~-~~~~ 201 (385)
...|+|.+|++.+|||++||+.+.. .+.|.+.... ..+.+.-+..+ ..+|||||+++.+ .+.+
T Consensus 25 ~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~ 104 (229)
T smart00020 25 RHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSD 104 (229)
T ss_pred CcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCC
Confidence 4579999999999999999998743 6666654211 33344433322 3589999999876 3445
Q ss_pred cceeeecCCc---ccCCCeEEEEecCCCCC-----CceEEEeeEeecc---cccccC-----CCceeeEEEe--cccCCC
Q 016641 202 GMHFLELGDI---PFLQQAVAVVGYPQGGD-----NISVTKGVVSRVE---PTQYVH-----GATQLMAIQI--DAAINP 263 (385)
Q Consensus 202 ~~~~l~l~~~---~~~G~~V~~iG~p~~~~-----~~~~~~G~Vs~~~---~~~~~~-----~~~~~~~i~~--d~~i~~ 263 (385)
.+.|+.|... ...++.+.+.||+.... ........+..+. +..... .......... ....|+
T Consensus 105 ~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~ 184 (229)
T smart00020 105 NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQ 184 (229)
T ss_pred ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccC
Confidence 6889999864 34589999999986542 0112222222211 111000 0111111111 355789
Q ss_pred CCCCcceeeCC---EEEEEEeeec
Q 016641 264 GNSGGPAIMGN---KVAGVAFQNL 284 (385)
Q Consensus 264 G~SGGPL~~~G---~vVGI~s~~~ 284 (385)
|+|||||+.++ .++||++...
T Consensus 185 gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 185 GDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCCCeeEEECCCEEEEEEEEECC
Confidence 99999999444 9999999864
No 15
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.95 E-value=1e-09 Score=84.67 Aligned_cols=56 Identities=30% Similarity=0.439 Sum_probs=47.7
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
||||+.+..... ..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus 1 ~~lGv~~~~~~~------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~ 57 (82)
T PF13180_consen 1 GGLGVTVQNLSD------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILS 57 (82)
T ss_dssp -E-SEEEEECSC------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHH
T ss_pred CEECeEEEEccC------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHH
Confidence 579999877631 369999999999999999 99999999999999999999998874
No 16
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.92 E-value=3.1e-09 Score=82.99 Aligned_cols=65 Identities=31% Similarity=0.517 Sum_probs=56.3
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
+|+|+.++++ ++..++.++++ ...|++|.+|.++|||+++ |++||+|++|||++|.+..++.+.+
T Consensus 1 ~~~G~~~~~~-~~~~~~~~~~~-~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l 66 (90)
T cd00987 1 PWLGVTVQDL-TPDLAEELGLK-DTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRAL 66 (90)
T ss_pred CccceEEeEC-CHHHHHHcCCC-CCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHH
Confidence 5799999999 56666656654 3579999999999999999 9999999999999999999988765
No 17
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.79 E-value=3.6e-07 Score=85.34 Aligned_cols=144 Identities=19% Similarity=0.178 Sum_probs=85.9
Q ss_pred eEEEEEecCCEEEecccccCCCc--eEEEEEcC--------CC---cEE-EEEEEEecC-------C-CCeEEEEecCC-
Q 016641 141 TGSGFVIPGKKILTNAHVVADST--FVLVRKHG--------SP---TKY-RAQVEAVGH-------E-CDLAILIVESD- 197 (385)
Q Consensus 141 ~GSGfiI~~g~ILT~aHvv~~~~--~i~V~~~~--------~g---~~~-~a~v~~~d~-------~-~DlAlLkv~~~- 197 (385)
.|.|.+|++.||||++||+.+.. ...|.+.. .+ ... ..+++ .++ . +|||||+++.+
T Consensus 39 ~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~v 117 (256)
T KOG3627|consen 39 LCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEPV 117 (256)
T ss_pred eeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCCc
Confidence 78888889889999999999865 55555421 11 111 11233 221 3 89999999975
Q ss_pred cccccceeeecCCcc----cC-CCeEEEEecCCCCC-----CceE---EEeeEeecccccccCCC--ceeeEEEe-----
Q 016641 198 EFWEGMHFLELGDIP----FL-QQAVAVVGYPQGGD-----NISV---TKGVVSRVEPTQYVHGA--TQLMAIQI----- 257 (385)
Q Consensus 198 ~~~~~~~~l~l~~~~----~~-G~~V~~iG~p~~~~-----~~~~---~~G~Vs~~~~~~~~~~~--~~~~~i~~----- 257 (385)
.|.+.+.|+.|.... .. +..+++.||+.... .... ...+++...+....... .....+..
T Consensus 118 ~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~ 197 (256)
T KOG3627|consen 118 TFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEG 197 (256)
T ss_pred ccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCC
Confidence 566788999886432 22 48899999865321 1112 22233322221111110 00011332
Q ss_pred cccCCCCCCCccee-eC---CEEEEEEeeecC
Q 016641 258 DAAINPGNSGGPAI-MG---NKVAGVAFQNLS 285 (385)
Q Consensus 258 d~~i~~G~SGGPL~-~~---G~vVGI~s~~~~ 285 (385)
....|.|||||||+ .+ ..++||++++..
T Consensus 198 ~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 198 GKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred CCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 23469999999999 44 699999999754
No 18
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.69 E-value=4e-07 Score=83.97 Aligned_cols=157 Identities=19% Similarity=0.218 Sum_probs=88.8
Q ss_pred eEEEEEecCCEEEecccccCCCc----eEEEEEc---CC-CcEEEEE--EEEec-C---CCCeEEEEecCCcc------c
Q 016641 141 TGSGFVIPGKKILTNAHVVADST----FVLVRKH---GS-PTKYRAQ--VEAVG-H---ECDLAILIVESDEF------W 200 (385)
Q Consensus 141 ~GSGfiI~~g~ILT~aHvv~~~~----~i~V~~~---~~-g~~~~a~--v~~~d-~---~~DlAlLkv~~~~~------~ 200 (385)
.|++|+|.++.+||++||+.... .+.+... ++ +..+..+ ..... . +.|.+...+.+..+ .
T Consensus 65 ~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~ 144 (251)
T COG3591 65 CTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIG 144 (251)
T ss_pred eeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCcc
Confidence 45669999999999999996432 2222211 11 1111111 11111 2 34555555543221 1
Q ss_pred ccce--eeecCCcccCCCeEEEEecCCCCCC---ceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCC
Q 016641 201 EGMH--FLELGDIPFLQQAVAVVGYPQGGDN---ISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGN 274 (385)
Q Consensus 201 ~~~~--~l~l~~~~~~G~~V~~iG~p~~~~~---~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G 274 (385)
.... ........++++.+.++|||.+..+ .....+.|..... ..++.++.+++|+||.|++ .+.
T Consensus 145 ~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~~~ 214 (251)
T COG3591 145 DVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLISKD 214 (251)
T ss_pred ccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEecCc
Confidence 1222 2222233366889999999987652 2233333332211 2578899999999999999 888
Q ss_pred EEEEEEeeecCCC--CceEEE-EecchHHHHHHHHH
Q 016641 275 KVAGVAFQNLSGA--ENIGYI-IPVPVIKHFITGVV 307 (385)
Q Consensus 275 ~vVGI~s~~~~~~--~~~~~a-ip~~~i~~~l~~l~ 307 (385)
+++|+++...... ...+++ .-...++++++++.
T Consensus 215 ~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 215 EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred eEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 9999998865522 122333 33345666666553
No 19
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.53 E-value=1.8e-07 Score=71.59 Aligned_cols=44 Identities=25% Similarity=0.343 Sum_probs=40.8
Q ss_pred cCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 340 ~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
..|++|.+|.++|||+++ ||+||+|++|||++|.+..++.+.|.
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~ 53 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALK 53 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHh
Confidence 379999999999999999 99999999999999999999987763
No 20
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.52 E-value=1.4e-07 Score=72.01 Aligned_cols=50 Identities=20% Similarity=0.086 Sum_probs=41.8
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHH
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGS 379 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~ 379 (385)
||+|+.+..- ..|++|.+|.++|||+++ |++||+|++|||++|.+..++.
T Consensus 1 ~~~G~~~~~~--------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l 51 (80)
T cd00990 1 PYLGLTLDKE--------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRL 51 (80)
T ss_pred CcccEEEEcc--------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHH
Confidence 4688777432 257999999999999999 9999999999999999855543
No 21
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.51 E-value=2.1e-07 Score=69.03 Aligned_cols=42 Identities=31% Similarity=0.434 Sum_probs=38.5
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~l 382 (385)
.|++|.+|.++|||+++ |++||+|++|||+++.+. .++.+.|
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l 57 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELL 57 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHH
Confidence 48999999999999998 999999999999999998 7777655
No 22
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.50 E-value=2.1e-07 Score=94.00 Aligned_cols=67 Identities=24% Similarity=0.386 Sum_probs=60.9
Q ss_pred eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
..|+|+.++++ ++..++.++++....|++|.+|.++|||+++ |++||+|++|||++|.+..++.++|
T Consensus 337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l 404 (428)
T TIGR02037 337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVL 404 (428)
T ss_pred ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence 46899999998 7888888898865689999999999999999 9999999999999999999998876
No 23
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.44 E-value=2.8e-07 Score=86.42 Aligned_cols=74 Identities=18% Similarity=0.223 Sum_probs=64.0
Q ss_pred chHHHHHHHHHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641 297 PVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 297 ~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~ 375 (385)
..++++++++.++|++. +.|+|+...... | ...|++|..+.+++||+++ ||+||+|++|||++|.+.
T Consensus 159 ~~~~~v~~~l~~~g~~~-~~~lgi~p~~~~--------g---~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~ 226 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKM-FDYIRLSPVMKN--------D---KLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDP 226 (259)
T ss_pred hhHHHHHHHHHHCHHhh-hheEeEEEEEeC--------C---ceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCH
Confidence 46788999999999887 889999875431 1 2479999999999999999 999999999999999999
Q ss_pred hhHHhhh
Q 016641 376 GTGSHSM 382 (385)
Q Consensus 376 ~~l~~~l 382 (385)
.++.+.+
T Consensus 227 ~~~~~~l 233 (259)
T TIGR01713 227 EQAFQAL 233 (259)
T ss_pred HHHHHHH
Confidence 9987765
No 24
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.37 E-value=6.5e-07 Score=68.03 Aligned_cols=42 Identities=26% Similarity=0.275 Sum_probs=38.8
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
..++|.+|.++|||+++ |++||+|++|||+++.+..++...|
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l 54 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAV 54 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence 45899999999999998 9999999999999999999998765
No 25
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.36 E-value=6.9e-07 Score=68.18 Aligned_cols=43 Identities=28% Similarity=0.278 Sum_probs=39.8
Q ss_pred CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.|++|.+|.++|||+..|++||+|++|||++|.+..++.+.|.
T Consensus 8 ~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~~~~~l~ 50 (79)
T cd00986 8 HGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEELIDYIQ 50 (79)
T ss_pred cCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence 6999999999999987799999999999999999999987764
No 26
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)) [, ]. Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process []. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.31 E-value=3.3e-05 Score=70.81 Aligned_cols=143 Identities=17% Similarity=0.200 Sum_probs=72.0
Q ss_pred CCEEEecccccC-CCceEEEEEcCCCcEEEEE-----EEEecCCCCeEEEEecCCcccccceeeecC---CcccCCCeEE
Q 016641 149 GKKILTNAHVVA-DSTFVLVRKHGSPTKYRAQ-----VEAVGHECDLAILIVESDEFWEGMHFLELG---DIPFLQQAVA 219 (385)
Q Consensus 149 ~g~ILT~aHvv~-~~~~i~V~~~~~g~~~~a~-----v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~---~~~~~G~~V~ 219 (385)
..||+|++|... +...++|... -|. |... -+..-+..||.++|+..+ +||.+-. ..+..+|+|.
T Consensus 40 G~~iItn~HLf~~nng~L~i~s~-hG~-f~v~nt~~lkv~~i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~ 112 (235)
T PF00863_consen 40 GSYIITNAHLFKRNNGELTIKSQ-HGE-FTVPNTTQLKVHPIEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVC 112 (235)
T ss_dssp TTEEEEEGGGGSSTTCEEEEEET-TEE-EEECEGGGSEEEE-TCSSEEEEE--TT-----S----S---B----TT-EEE
T ss_pred CCEEEEChhhhccCCCeEEEEeC-ceE-EEcCCccccceEEeCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEE
Confidence 779999999996 4456777764 332 2221 133345899999999874 3443322 3456799999
Q ss_pred EEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee--eCCEEEEEEeeecCCCCceEEEEec-
Q 016641 220 VVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI--MGNKVAGVAFQNLSGAENIGYIIPV- 296 (385)
Q Consensus 220 ~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~--~~G~vVGI~s~~~~~~~~~~~aip~- 296 (385)
++|.-+....... .||.........+. .+...-.....|+-|.||+ .||.+|||++..... ...+|+.|+
T Consensus 113 mVg~~fq~k~~~s---~vSesS~i~p~~~~---~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f~ 185 (235)
T PF00863_consen 113 MVGSNFQEKSISS---TVSESSWIYPEENS---HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPFP 185 (235)
T ss_dssp EEEEECSSCCCEE---EEEEEEEEEEETTT---TEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE--
T ss_pred EEEEEEEcCCeeE---EECCceEEeecCCC---CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcCC
Confidence 9998654432222 22222111111111 2344555567899999999 899999999986542 344566554
Q ss_pred -chHHHHHHH
Q 016641 297 -PVIKHFITG 305 (385)
Q Consensus 297 -~~i~~~l~~ 305 (385)
+.+..+++.
T Consensus 186 ~~f~~~~l~~ 195 (235)
T PF00863_consen 186 DDFEEFYLEN 195 (235)
T ss_dssp TTHHHHHCC-
T ss_pred HHHHHHHhcc
Confidence 334444433
No 27
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.21 E-value=2.6e-06 Score=65.04 Aligned_cols=56 Identities=23% Similarity=0.396 Sum_probs=45.9
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcC--ChhhHHhhh
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIA--NDGTGSHSM 382 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~--~~~~l~~~l 382 (385)
..+|+.++...+ ...|++|.+|.++|||+++ |++||+|++|||+++. +..++.+.+
T Consensus 12 ~~~G~~~~~~~~-----------~~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l 70 (82)
T cd00992 12 GGLGFSLRGGKD-----------SGGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELL 70 (82)
T ss_pred CCcCEEEeCccc-----------CCCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHH
Confidence 457877765421 0268999999999999999 9999999999999999 888877665
No 28
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.16 E-value=2.8e-06 Score=65.54 Aligned_cols=42 Identities=29% Similarity=0.442 Sum_probs=38.8
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~l 382 (385)
.+++|..|.++|||+++ |++||+|++|||+++.+. .++.+.+
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l 57 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLL 57 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHh
Confidence 68999999999999999 999999999999999998 8887655
No 29
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates [, ]. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences []. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.93 E-value=1.2e-05 Score=61.55 Aligned_cols=53 Identities=26% Similarity=0.382 Sum_probs=42.5
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTG 378 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l 378 (385)
..+||.+....+. ...|++|.+|.++|||+++ |++||.|++|||+.+.+....
T Consensus 10 ~~lG~~l~~~~~~----------~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~ 63 (81)
T PF00595_consen 10 GPLGFTLRGGSDN----------DEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHD 63 (81)
T ss_dssp SBSSEEEEEESTS----------SSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHH
T ss_pred CCcCEEEEecCCC----------CcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHH
Confidence 4578887654210 0259999999999999999 999999999999999987443
No 30
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.91 E-value=1.8e-05 Score=60.48 Aligned_cols=41 Identities=29% Similarity=0.371 Sum_probs=36.7
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHS 381 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~ 381 (385)
.|++|..|.++|||+++ |++||+|++|||+++.+..+....
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~ 67 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAV 67 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHH
Confidence 68999999999999999 999999999999999987655443
No 31
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.89 E-value=0.00022 Score=68.21 Aligned_cols=50 Identities=24% Similarity=0.355 Sum_probs=34.0
Q ss_pred ccCCCCCCCccee-eC--C-EEEEEEeeecCCCCc---eEEEEecchHHHHHHHHHH
Q 016641 259 AAINPGNSGGPAI-MG--N-KVAGVAFQNLSGAEN---IGYIIPVPVIKHFITGVVE 308 (385)
Q Consensus 259 ~~i~~G~SGGPL~-~~--G-~vVGI~s~~~~~~~~---~~~aip~~~i~~~l~~l~~ 308 (385)
...|+||||||++ .. | .-+||++++.....+ .+...-++....|++...+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence 4569999999999 43 4 489999997663222 2333446677778777553
No 32
>PF12812 PDZ_1: PDZ-like domain
Probab=97.73 E-value=7.9e-05 Score=56.73 Aligned_cols=64 Identities=17% Similarity=0.114 Sum_probs=56.7
Q ss_pred eccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
-|.|..++++ +-+.+++++++- |+++.....++++... +..|-||.+|||+++.+.++|.+++-
T Consensus 9 ~~~Ga~f~~L-s~q~aR~~~~~~---~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk 73 (78)
T PF12812_consen 9 EVCGAVFHDL-SYQQARQYGIPV---GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVK 73 (78)
T ss_pred EEcCeecccC-CHHHHHHhCCCC---CEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHH
Confidence 5799999999 788999999984 4666677899999998 99999999999999999999998864
No 33
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.63 E-value=6.6e-05 Score=75.64 Aligned_cols=42 Identities=19% Similarity=0.249 Sum_probs=39.8
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.|++|.+|.++|||+++ ||+||+|++|||++|.+.+|+.+.+
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l 245 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAV 245 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence 58999999999999999 9999999999999999999998776
No 34
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.62 E-value=0.004 Score=59.20 Aligned_cols=106 Identities=21% Similarity=0.261 Sum_probs=64.3
Q ss_pred CCCeEEEEecCCcccccceeeecCCcc---cCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCC
Q 016641 186 ECDLAILIVESDEFWEGMHFLELGDIP---FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAIN 262 (385)
Q Consensus 186 ~~DlAlLkv~~~~~~~~~~~l~l~~~~---~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~ 262 (385)
..+++||.++.+ +.....|+.|+++. ..++.+.+.|+.... .+....+.-..... ....+......+
T Consensus 160 ~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~---~~~~~~~~i~~~~~------~~~~~~~~~~~~ 229 (282)
T PF03761_consen 160 PYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG---KLKHRKLKITNCTK------CAYSICTKQYSC 229 (282)
T ss_pred ccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCCC---eEEEEEEEEEEeec------cceeEecccccC
Confidence 579999999887 33478899999865 238889888882221 22222222111100 112355566778
Q ss_pred CCCCCccee--eCCE--EEEEEeeecCCC-CceEEEEecchHHH
Q 016641 263 PGNSGGPAI--MGNK--VAGVAFQNLSGA-ENIGYIIPVPVIKH 301 (385)
Q Consensus 263 ~G~SGGPL~--~~G~--vVGI~s~~~~~~-~~~~~aip~~~i~~ 301 (385)
.|++||||+ .+|+ ||||.+...... .+..+++.+...++
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 999999999 5664 999987643211 12455555544443
No 35
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.62 E-value=4.9e-05 Score=77.27 Aligned_cols=40 Identities=20% Similarity=0.222 Sum_probs=37.3
Q ss_pred eEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 343 VLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 343 v~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.+|.+|.++|||++| ||+||+|+++||++|.+.+|+...+
T Consensus 128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v 168 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLAL 168 (449)
T ss_pred ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence 479999999999999 9999999999999999999997654
No 36
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.61 E-value=0.00062 Score=62.73 Aligned_cols=113 Identities=18% Similarity=0.194 Sum_probs=59.0
Q ss_pred eEEEEEec---CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCe
Q 016641 141 TGSGFVIP---GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQA 217 (385)
Q Consensus 141 ~GSGfiI~---~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~ 217 (385)
.|||-++. +-.|+|+.||+. ....+|.. .+.....+ .+..-|+|.-.++.-. ...|.+++++. ..|.
T Consensus 113 ~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~~~~t---F~~~GDfA~~~~~~~~--G~~P~~k~a~~-~~Gr- 182 (297)
T PF05579_consen 113 VGSGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTRRMLT---FKKNGDFAEADITNWP--GAAPKYKFAQN-YTGR- 182 (297)
T ss_dssp EEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEEEEEE---EEEETTEEEEEETTS---S---B--B-TT--SEE-
T ss_pred ccccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceEEEEE---EeccCcEEEEECCCCC--CCCCceeecCC-cccc-
Confidence 45555554 448999999998 55666664 33433333 3446799998884321 25677776621 1121
Q ss_pred EEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCccee-eCCEEEEEEeeec
Q 016641 218 VAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNL 284 (385)
Q Consensus 218 V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~ 284 (385)
.+ -.-.--+..|.|..-.+ +. -..+||||+|++ .+|.+|||++...
T Consensus 183 Ay------W~t~tGvE~G~ig~~~~------------~~---fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 183 AY------WLTSTGVEPGFIGGGGA------------VC---FTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp EE------EEETTEEEEEEEETTEE------------EE---SS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred eE------EEcccCcccceecCceE------------EE---EcCCCCCCCccCcCCCCEEEEEecCC
Confidence 11 00011334444432111 11 135799999999 9999999999753
No 37
>PRK10139 serine endoprotease; Provisional
Probab=97.57 E-value=9.6e-05 Score=75.12 Aligned_cols=42 Identities=19% Similarity=0.312 Sum_probs=40.2
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.|++|.+|.++|||+++ ||+||+|++|||++|.+..+|.+.|
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l 432 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVL 432 (455)
T ss_pred CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence 58999999999999999 9999999999999999999998876
No 38
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.54 E-value=0.0018 Score=57.11 Aligned_cols=138 Identities=20% Similarity=0.301 Sum_probs=80.6
Q ss_pred CcceEEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEE--EEEEecC---CCCeEEEEecCCc-ccccceeeecCCc
Q 016641 138 RETTGSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRA--QVEAVGH---ECDLAILIVESDE-FWEGMHFLELGDI 211 (385)
Q Consensus 138 ~~~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a--~v~~~d~---~~DlAlLkv~~~~-~~~~~~~l~l~~~ 211 (385)
....++++.|.+.|+|...| -.....+. + ++..++. .+...+. ..||++++++... |.+-.+.+. ...
T Consensus 23 g~~t~l~~gi~~~~~lvp~H-~~~~~~i~--i--~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~ 96 (172)
T PF00548_consen 23 GEFTMLALGIYDRYFLVPTH-EEPEDTIY--I--DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESI 96 (172)
T ss_dssp EEEEEEEEEEEBTEEEEEGG-GGGCSEEE--E--TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSG
T ss_pred ceEEEecceEeeeEEEEECc-CCCcEEEE--E--CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-ccc
Confidence 45678899999999999999 22233333 3 3444433 2223343 4699999997743 322223333 112
Q ss_pred ccCCCeEEEEecCCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCcceee----CCEEEEEEeee
Q 016641 212 PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIM----GNKVAGVAFQN 283 (385)
Q Consensus 212 ~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~----~G~vVGI~s~~ 283 (385)
+...+...++=.+. ........+.+...... ...+......+..+++..+|+-||||+. .++++||+.++
T Consensus 97 ~~~~~~~l~v~~~~-~~~~~~~v~~v~~~~~i-~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 97 PEYPECVLLVNSTK-FPRMIVEVGFVTNFGFI-NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp GTEEEEEEEEESSS-STCEEEEEEEEEEEEEE-EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred ccCCCcEEEEECCC-CccEEEEEEEEeecCcc-ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 23344444443332 22223444444443332 2223333457888888889999999995 68999999875
No 39
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.53 E-value=9.6e-05 Score=74.46 Aligned_cols=43 Identities=21% Similarity=0.239 Sum_probs=40.5
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.|.+|.+|.++|||+++ ||+||+|+++||+++.+..|+...+.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia 171 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIA 171 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHH
Confidence 68999999999999999 99999999999999999999987664
No 40
>PRK10942 serine endoprotease; Provisional
Probab=97.53 E-value=0.00011 Score=75.03 Aligned_cols=42 Identities=31% Similarity=0.510 Sum_probs=40.0
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.|++|.+|.++|||+++ |++||+|++|||++|.+..+|.+++
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l 450 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKIL 450 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence 58999999999999999 9999999999999999999998865
No 41
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.46 E-value=0.00015 Score=73.78 Aligned_cols=42 Identities=24% Similarity=0.303 Sum_probs=39.1
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.+++|.+|.++|||+++ ||+||+|++|||++|.+..|+.+.+
T Consensus 221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l 263 (449)
T PRK10779 221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLV 263 (449)
T ss_pred cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHH
Confidence 35899999999999999 9999999999999999999998765
No 42
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.40 E-value=0.00017 Score=56.26 Aligned_cols=37 Identities=27% Similarity=0.383 Sum_probs=32.7
Q ss_pred cCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEc
Q 016641 334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPI 372 (385)
Q Consensus 334 ~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i 372 (385)
++.+ ..|+||++|.++|||+.| |+.+|-|+.|||-..
T Consensus 54 f~yt--D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~Df 91 (124)
T KOG3553|consen 54 FSYT--DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDF 91 (124)
T ss_pred CCcC--CccEEEEEeccCChhhhhcceecceEEEecCcee
Confidence 4555 489999999999999999 999999999999653
No 43
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E.coli. E.coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.36 E-value=0.00024 Score=69.41 Aligned_cols=41 Identities=17% Similarity=0.170 Sum_probs=35.8
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh--hhHHhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTGSHS 381 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~--~~l~~~ 381 (385)
.+++|.+|.++|||+++ ||+||+|++|||++|.+. .++...
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~ 105 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVAL 105 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHh
Confidence 58999999999999999 999999999999999875 454443
No 44
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.21 E-value=0.00047 Score=68.77 Aligned_cols=35 Identities=29% Similarity=0.425 Sum_probs=32.9
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~ 375 (385)
.|++|..|.++|||+++ |++||+|++|||++|.+.
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~ 137 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGL 137 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCC
Confidence 48999999999999999 999999999999999864
No 45
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.05 E-value=0.00094 Score=51.97 Aligned_cols=43 Identities=26% Similarity=0.407 Sum_probs=30.6
Q ss_pred CceEEEeeCCC--------CHHhhh---cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 341 TGVLVNKINPL--------SDAHEI---LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 341 ~Gv~V~~V~~~--------spA~~a---L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.+..|.+|.++ ||..+. +++||+|++|||+++....+++..|.
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~ 65 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLE 65 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHH
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhc
Confidence 68899999886 676654 77999999999999999999888764
No 46
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=96.98 E-value=0.001 Score=65.78 Aligned_cols=42 Identities=29% Similarity=0.312 Sum_probs=36.7
Q ss_pred CceEEEeeC--------CCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKIN--------PLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~--------~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.||+|.... .+|||+++ ||+||+|++|||++|.+..|+.++|
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL 155 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLI 155 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHH
Confidence 699996642 26899999 9999999999999999999998765
No 47
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.52 E-value=0.0032 Score=53.35 Aligned_cols=60 Identities=25% Similarity=0.259 Sum_probs=41.6
Q ss_pred eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCC-CCEEEEECCEEcCChhhHHhhh
Q 016641 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKK-DDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~-GDiI~~vng~~i~~~~~l~~~l 382 (385)
...||++++.-. . .+. ...++-|.+|.|+|||++| |++ .|.|+.+|+..+.+.++|.+.+
T Consensus 25 ~g~LG~sv~~~~-~-----~~~--~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v 86 (138)
T PF04495_consen 25 QGLLGISVRFES-F-----EGA--EEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELV 86 (138)
T ss_dssp SSSS-EEEEEEE-------TTG--CCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHH
T ss_pred CCCCcEEEEEec-c-----ccc--ccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHH
Confidence 355888876542 1 111 2368999999999999999 999 6999999999999999988765
No 48
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=96.29 E-value=0.0051 Score=61.59 Aligned_cols=49 Identities=27% Similarity=0.392 Sum_probs=41.1
Q ss_pred eeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChh
Q 016641 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDG 376 (385)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~ 376 (385)
+..+|+.++.-. ..++.|.++.+++||+++ ||+||+|++|||+++....
T Consensus 99 ~~GiG~~i~~~~-------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~ 148 (406)
T COG0793 99 FGGIGIELQMED-------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVS 148 (406)
T ss_pred ccceeEEEEEec-------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCC
Confidence 456777765431 168999999999999999 9999999999999999875
No 49
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=96.23 E-value=0.029 Score=51.42 Aligned_cols=98 Identities=17% Similarity=0.223 Sum_probs=67.4
Q ss_pred CCCCCCCCCC--CCcceEEEEEecCCEEEecccccCCC----ceEEEEEcCCCcEEE------EEEEEec-----CCCCe
Q 016641 127 NYGLPWQNKS--QRETTGSGFVIPGKKILTNAHVVADS----TFVLVRKHGSPTKYR------AQVEAVG-----HECDL 189 (385)
Q Consensus 127 ~~~~p~~~~~--~~~~~GSGfiI~~g~ILT~aHvv~~~----~~i~V~~~~~g~~~~------a~v~~~d-----~~~Dl 189 (385)
++..||...- .+...|+|++|+..|||++..|+.+- ..+.+.+. .++.+. -++..+| ++.++
T Consensus 13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG-~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v 91 (267)
T PF09342_consen 13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLG-GGKTYLSVDGPHEQISRVDCFKDVPESNV 91 (267)
T ss_pred cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEec-CcceecccCCChheEEEeeeeeeccccce
Confidence 4556665432 24467999999999999999999863 34556553 444322 1333333 57899
Q ss_pred EEEEecCC-cccccceeeecCCcc---cCCCeEEEEecCC
Q 016641 190 AILIVESD-EFWEGMHFLELGDIP---FLQQAVAVVGYPQ 225 (385)
Q Consensus 190 AlLkv~~~-~~~~~~~~l~l~~~~---~~G~~V~~iG~p~ 225 (385)
+||.++.+ .|...+.|+-+.+.. ...+.++++|...
T Consensus 92 ~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 92 LLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 99999886 566778888887622 2256899999876
No 50
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.01 E-value=0.054 Score=56.23 Aligned_cols=117 Identities=20% Similarity=0.227 Sum_probs=68.6
Q ss_pred CCCCeEEEEecCCc-----cccc------ceeeecCCc--------ccCCCeEEEEecCCCCCCceEEEeeEeecccccc
Q 016641 185 HECDLAILIVESDE-----FWEG------MHFLELGDI--------PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQY 245 (385)
Q Consensus 185 ~~~DlAlLkv~~~~-----~~~~------~~~l~l~~~--------~~~G~~V~~iG~p~~~~~~~~~~G~Vs~~~~~~~ 245 (385)
.-.|+|||+++... +.++ -|.+.+.+. ...|.+|+=+|...+. +.|.|.++.-...
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy-----T~G~lNg~klvyw 615 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY-----TTGILNGIKLVYW 615 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc-----cceEecceEEEEe
Confidence 34699999998542 1122 233333321 1238899999987553 4566665532211
Q ss_pred cCCCce-eeEEEec----ccCCCCCCCccee-eCC------EEEEEEeeecCCCCceEEEEecchHHHHHHHH
Q 016641 246 VHGATQ-LMAIQID----AAINPGNSGGPAI-MGN------KVAGVAFQNLSGAENIGYIIPVPVIKHFITGV 306 (385)
Q Consensus 246 ~~~~~~-~~~i~~d----~~i~~G~SGGPL~-~~G------~vVGI~s~~~~~~~~~~~aip~~~i~~~l~~l 306 (385)
..+... .+++... .-...||||+=|+ .-+ .|+||..+.-+....+|++.|+..|.+-|++.
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence 122211 1233322 1235799999888 533 39999998655556789999987776666554
No 51
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.93 E-value=0.0079 Score=61.95 Aligned_cols=42 Identities=24% Similarity=0.380 Sum_probs=38.7
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
+-|.|..|.+++||.++ +++|||+++|||+||.+.++..+.+
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~ 440 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFL 440 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHH
Confidence 56889999999999999 9999999999999999999987765
No 52
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=95.84 E-value=0.0099 Score=56.32 Aligned_cols=43 Identities=26% Similarity=0.271 Sum_probs=40.7
Q ss_pred CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.||||..|..++|+...|+.||-|++|||+++.+.+|+.+.+.
T Consensus 130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~ 172 (342)
T COG3480 130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVS 172 (342)
T ss_pred eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHh
Confidence 6999999999999999999999999999999999999998763
No 53
>PRK11186 carboxy-terminal protease; Provisional
Probab=95.69 E-value=0.015 Score=61.64 Aligned_cols=34 Identities=24% Similarity=0.319 Sum_probs=29.6
Q ss_pred CceEEEeeCCCCHHhhh--cCCCCEEEEEC--CEEcCC
Q 016641 341 TGVLVNKINPLSDAHEI--LKKDDIILAFD--GVPIAN 374 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vn--g~~i~~ 374 (385)
.+++|.+|.|||||+++ |++||+|++|| |+++.+
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~d 292 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVD 292 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccc
Confidence 57899999999999995 99999999999 666554
No 54
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.90 E-value=0.039 Score=58.81 Aligned_cols=20 Identities=30% Similarity=0.277 Sum_probs=15.8
Q ss_pred eEEEEEec-CCEEEecccccC
Q 016641 141 TGSGFVIP-GKKILTNAHVVA 160 (385)
Q Consensus 141 ~GSGfiI~-~g~ILT~aHvv~ 160 (385)
-|||.+|+ +|+|+||.||+.
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred ceeEEEEcCCceEEecchhhh
Confidence 48888888 788888888864
No 55
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=94.87 E-value=0.022 Score=51.32 Aligned_cols=41 Identities=27% Similarity=0.374 Sum_probs=34.8
Q ss_pred ccCCCCCCCcceeeCCEEEEEEeeecCCCCceEEEEecchH
Q 016641 259 AAINPGNSGGPAIMGNKVAGVAFQNLSGAENIGYIIPVPVI 299 (385)
Q Consensus 259 ~~i~~G~SGGPL~~~G~vVGI~s~~~~~~~~~~~aip~~~i 299 (385)
..+.+|+||+|++.||++||=++..+.+....+|.++++..
T Consensus 175 GGIvqGMSGSPI~qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred CCEEecccCCCEEECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 35678999999999999999998888777788999986543
No 56
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=94.86 E-value=0.031 Score=49.84 Aligned_cols=39 Identities=31% Similarity=0.201 Sum_probs=35.0
Q ss_pred ceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHh
Q 016641 342 GVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSH 380 (385)
Q Consensus 342 Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~ 380 (385)
=+.|.+|.|+|||+.+ |+.||-|+++++..-.++..|++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~ 179 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQN 179 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHH
Confidence 5789999999999999 99999999999998888877654
No 57
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=94.79 E-value=0.029 Score=56.71 Aligned_cols=32 Identities=25% Similarity=0.261 Sum_probs=29.7
Q ss_pred ccCceEEEeeCCCCHHhhh-cCCCCEEEEECCE
Q 016641 339 EVTGVLVNKINPLSDAHEI-LKKDDIILAFDGV 370 (385)
Q Consensus 339 ~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~ 370 (385)
+..+.+|..|.++|||++| |.+||-|++|||.
T Consensus 460 ~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~ 492 (558)
T COG3975 460 EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI 492 (558)
T ss_pred cCCeeEEEecCCCChhHhccCCCccEEEEEcCc
Confidence 3467899999999999999 9999999999998
No 58
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide [].
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.15 E-value=0.045 Score=49.37 Aligned_cols=136 Identities=21% Similarity=0.171 Sum_probs=46.3
Q ss_pred CCEEEecccccCCCceEEEEEcCCCcEEEE---EEEEecCCCCeEEEEecCCcccc--cceeeecCCcccCCCeEEEEec
Q 016641 149 GKKILTNAHVVADSTFVLVRKHGSPTKYRA---QVEAVGHECDLAILIVESDEFWE--GMHFLELGDIPFLQQAVAVVGY 223 (385)
Q Consensus 149 ~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a---~v~~~d~~~DlAlLkv~~~~~~~--~~~~l~l~~~~~~G~~V~~iG~ 223 (385)
+..++|+.||..+...+.... .|..++- +.+..+...|++||+... ..+. .++.+.+....++ .-|-
T Consensus 41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P~-n~~s~Lg~k~~~~~~~~~~-----~~g~ 112 (203)
T PF02122_consen 41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGPP-NWESKLGVKAAQLSQNSQL-----AKGP 112 (203)
T ss_dssp -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE--H-HHHHHHT-----B----SE-----EEEE
T ss_pred ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecCc-CHHHHhCcccccccchhhh-----CCCC
Confidence 559999999999866555444 3344332 455667889999999983 2211 2444444322211 0010
Q ss_pred CCCCCCceEEEeeEeecccccccCCCceeeEEEecccCCCCCCCcceeeCCEEEEEEeeec--CCCCceEEEEecch
Q 016641 224 PQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQNL--SGAENIGYIIPVPV 298 (385)
Q Consensus 224 p~~~~~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~~G~vVGI~s~~~--~~~~~~~~aip~~~ 298 (385)
.... ....+.-...........+ .+...-+...+|.||.|++...++||++.... ...++.++..|+.-
T Consensus 113 -~~~y--~~~~~~~~~~sa~i~g~~~---~~~~vls~T~~G~SGtp~y~g~~vvGvH~G~~~~~~~~n~n~~spip~ 183 (203)
T PF02122_consen 113 -VSFY--GFSSGEWPCSSAKIPGTEG---KFASVLSNTSPGWSGTPYYSGKNVVGVHTGSPSGSNRENNNRMSPIPP 183 (203)
T ss_dssp -SSTT--SEEEEEEEEEE-S----ST---TEEEE-----TT-TT-EEE-SS-EEEEEEEE-----------------
T ss_pred -eeee--eecCCCceeccCccccccC---cCCceEcCCCCCCCCCCeEECCCceEeecCcccccccccccccccccc
Confidence 0000 0111000000000000111 13444456688999999994449999998742 23345566555443
No 59
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.61 E-value=0.31 Score=40.25 Aligned_cols=42 Identities=19% Similarity=0.286 Sum_probs=29.1
Q ss_pred cccccCCCceeeEEEecccCCCCCCCcceeeCCEEEEEEeee
Q 016641 242 PTQYVHGATQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQN 283 (385)
Q Consensus 242 ~~~~~~~~~~~~~i~~d~~i~~G~SGGPL~~~G~vVGI~s~~ 283 (385)
...+.....+..++....+..||+-||+|+++--||||++++
T Consensus 68 ~s~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~HGViGi~Tag 109 (127)
T PF00947_consen 68 ESEYYPKHYQYNLLIGEGPAEPGDCGGILRCKHGVIGIVTAG 109 (127)
T ss_dssp SBTTB-SEEEECEEEEE-SSSTT-TCSEEEETTCEEEEEEEE
T ss_pred CccCchhheecCceeecccCCCCCCCceeEeCCCeEEEEEeC
Confidence 333434444556667777889999999999776799999985
No 60
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=93.22 E-value=0.065 Score=54.71 Aligned_cols=44 Identities=23% Similarity=0.253 Sum_probs=37.5
Q ss_pred cCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 340 ~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
.-|+.|..|..+|||++- ||.||-|++||..+..+.--=+..+|
T Consensus 428 DVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlf 472 (1027)
T KOG3580|consen 428 DVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLF 472 (1027)
T ss_pred ceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHH
Confidence 369999999999999998 99999999999999888765444443
No 61
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.91 E-value=0.11 Score=54.18 Aligned_cols=32 Identities=38% Similarity=0.368 Sum_probs=29.2
Q ss_pred EEeeCCCCHHhhh--cCCCCEEEEECCEEcCChh
Q 016641 345 VNKINPLSDAHEI--LKKDDIILAFDGVPIANDG 376 (385)
Q Consensus 345 V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~ 376 (385)
|-+|.+||||++. ||+||-|++|||..|.+..
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~ls 815 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLS 815 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccC
Confidence 6789999999997 9999999999999998764
No 62
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=91.80 E-value=0.12 Score=55.38 Aligned_cols=34 Identities=26% Similarity=0.362 Sum_probs=31.8
Q ss_pred ceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCCh
Q 016641 342 GVLVNKINPLSDAHEILKKDDIILAFDGVPIAND 375 (385)
Q Consensus 342 Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~ 375 (385)
-|+|..|.+|+|+...|++||-|+.|||++|...
T Consensus 76 PviVr~VT~GGps~GKL~PGDQIl~vN~Epv~da 109 (1298)
T KOG3552|consen 76 PVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDA 109 (1298)
T ss_pred ceEEEEecCCCCccccccCCCeEEEecCcccccc
Confidence 4899999999999999999999999999999875
No 63
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=91.63 E-value=0.31 Score=40.86 Aligned_cols=31 Identities=23% Similarity=0.375 Sum_probs=21.7
Q ss_pred EEecccCCCCCCCccee-eCCEEEEEEeeecC
Q 016641 255 IQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS 285 (385)
Q Consensus 255 i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~ 285 (385)
...+..+.+|.||+|+| .+|++|||......
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE 119 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence 34445578899999999 99999999877654
No 64
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=91.58 E-value=0.22 Score=47.03 Aligned_cols=42 Identities=19% Similarity=0.292 Sum_probs=30.8
Q ss_pred CceEEEeeCCCCHH---hhh-cCCCCEEEEECCEEcCChhhHHhhh
Q 016641 341 TGVLVNKINPLSDA---HEI-LKKDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 341 ~Gv~V~~V~~~spA---~~a-L~~GDiI~~vng~~i~~~~~l~~~l 382 (385)
.|+.=-+|.|+.++ .++ ||+|||+++|||..+++.++..+++
T Consensus 204 ~Gl~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~ 249 (276)
T PRK09681 204 EGIVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALM 249 (276)
T ss_pred CCceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHH
Confidence 45222346777544 356 9999999999999999999766554
No 65
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=91.12 E-value=0.28 Score=41.53 Aligned_cols=35 Identities=26% Similarity=0.498 Sum_probs=31.2
Q ss_pred CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641 341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~ 375 (385)
.-+||++|.||+-|++- ||.||-+++|||..|..-
T Consensus 115 spiyisriipggvadrhgglkrgdqllsvngvsvege 151 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGE 151 (207)
T ss_pred CceEEEeecCCccccccCcccccceeEeecceeecch
Confidence 56999999999999974 999999999999888653
No 66
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=90.35 E-value=0.31 Score=40.54 Aligned_cols=27 Identities=22% Similarity=0.476 Sum_probs=22.5
Q ss_pred ccCCCCCCCccee-eCCEEEEEEeeecC
Q 016641 259 AAINPGNSGGPAI-MGNKVAGVAFQNLS 285 (385)
Q Consensus 259 ~~i~~G~SGGPL~-~~G~vVGI~s~~~~ 285 (385)
..-.+||||-|++ ..|+||||+..+..
T Consensus 101 g~g~~GDSGRpi~DNsGrVVaIVLGG~n 128 (158)
T PF00944_consen 101 GVGKPGDSGRPIFDNSGRVVAIVLGGAN 128 (158)
T ss_dssp TS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred CCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence 3457899999999 99999999988644
No 67
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=89.92 E-value=0.29 Score=47.81 Aligned_cols=42 Identities=24% Similarity=0.332 Sum_probs=37.3
Q ss_pred cCceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhHHhh
Q 016641 340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTGSHS 381 (385)
Q Consensus 340 ~~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l~~~ 381 (385)
..||.|.+|...||+..- |++||+|+++||-+|++.+|-.+-
T Consensus 219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ec 262 (484)
T KOG2921|consen 219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLEC 262 (484)
T ss_pred CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHH
Confidence 379999999999998864 999999999999999999886553
No 68
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=89.06 E-value=0.28 Score=48.86 Aligned_cols=41 Identities=27% Similarity=0.391 Sum_probs=34.2
Q ss_pred ccCCCCCCCcceeeCCEEEEEEeeecCCCCceEEEEecchH
Q 016641 259 AAINPGNSGGPAIMGNKVAGVAFQNLSGAENIGYIIPVPVI 299 (385)
Q Consensus 259 ~~i~~G~SGGPL~~~G~vVGI~s~~~~~~~~~~~aip~~~i 299 (385)
..+.+|+||+|++.||++||=++..+-++...||.|-++..
T Consensus 355 gGivqGMSGSPi~q~gkliGAvtHVfvndpt~GYGi~ie~M 395 (402)
T TIGR02860 355 GGIVQGMSGSPIIQNGKVIGAVTHVFVNDPTSGYGVYIEWM 395 (402)
T ss_pred CCEEecccCCCEEECCEEEEEEEEEEecCCCcceeehHHHH
Confidence 35678999999999999999998888787888899865443
No 69
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=88.66 E-value=0.29 Score=50.89 Aligned_cols=41 Identities=29% Similarity=0.377 Sum_probs=35.5
Q ss_pred CCccCceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhh
Q 016641 337 RSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGT 377 (385)
Q Consensus 337 ~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~ 377 (385)
.....|++|.+|.|+|.|++. ||.||-|++|||+...+..-
T Consensus 558 sEkGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~ 599 (1283)
T KOG3542|consen 558 SEKGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISA 599 (1283)
T ss_pred ccccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhH
Confidence 334579999999999999999 99999999999998776643
No 70
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=88.16 E-value=0.26 Score=52.66 Aligned_cols=54 Identities=19% Similarity=0.269 Sum_probs=35.8
Q ss_pred EEEecccCCCCCCCccee-eCCEEEEEEeeecC--------CCCce--EEEEecchHHHHHHHHH
Q 016641 254 AIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS--------GAENI--GYIIPVPVIKHFITGVV 307 (385)
Q Consensus 254 ~i~~d~~i~~G~SGGPL~-~~G~vVGI~s~~~~--------~~~~~--~~aip~~~i~~~l~~l~ 307 (385)
.+.++..+..||||+|++ .+|||||+++=+.. -.... +..|-+..|..+|+++-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 466778889999999999 99999999875432 11223 33444445666665543
No 71
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=87.55 E-value=0.58 Score=42.84 Aligned_cols=44 Identities=14% Similarity=0.094 Sum_probs=36.4
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhcc
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSMLF 384 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~~ 384 (385)
.|..+.=..+++-.++. ||+|||-+++||..+++.+++..+|..
T Consensus 207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~ 251 (275)
T COG3031 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQM 251 (275)
T ss_pred EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHh
Confidence 45555556667778888 999999999999999999999887754
No 72
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=87.24 E-value=0.5 Score=43.95 Aligned_cols=72 Identities=22% Similarity=0.417 Sum_probs=49.2
Q ss_pred HHHcCeeeeeeccCccccccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcC--ChhhHHhh
Q 016641 306 VVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIA--NDGTGSHS 381 (385)
Q Consensus 306 l~~~g~~~~~~~lGi~~~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~--~~~~l~~~ 381 (385)
|-++|.-. -||+...+-...+.. ..|+. ...|+.|++..||+-|+.- |.+.|-|++|||.+|. +.+++.++
T Consensus 164 L~khG~ek---PLGFYIRDG~SVRVt-p~Gle-kvpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDM 238 (358)
T KOG3606|consen 164 LHKHGSEK---PLGFYIRDGTSVRVT-PHGLE-KVPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDM 238 (358)
T ss_pred hhhcCCCC---CceEEEecCceEEec-ccccc-ccCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHH
Confidence 44566532 377766543221111 24554 3579999999999999975 8999999999999995 45666655
Q ss_pred h
Q 016641 382 M 382 (385)
Q Consensus 382 l 382 (385)
|
T Consensus 239 M 239 (358)
T KOG3606|consen 239 M 239 (358)
T ss_pred H
Confidence 4
No 73
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=85.15 E-value=1.1 Score=42.38 Aligned_cols=37 Identities=14% Similarity=0.217 Sum_probs=32.5
Q ss_pred ceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhH
Q 016641 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTG 378 (385)
Q Consensus 342 Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l 378 (385)
=+||..|..++||++- |+.||-|++|||..|....-+
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKv 69 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKV 69 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHH
Confidence 5899999999999975 999999999999999865443
No 74
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=84.23 E-value=0.77 Score=46.31 Aligned_cols=36 Identities=22% Similarity=0.276 Sum_probs=30.2
Q ss_pred cCceEEEeeCCCCHHh-hh-cCCCCEEEEECCEEcCCh
Q 016641 340 VTGVLVNKINPLSDAH-EI-LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 340 ~~Gv~V~~V~~~spA~-~a-L~~GDiI~~vng~~i~~~ 375 (385)
..|+||.+|.+++.-+ .. |.+||.|+.||.....++
T Consensus 276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENm 313 (626)
T KOG3571|consen 276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENM 313 (626)
T ss_pred CCceEEeeeccCceeeccCccCccceEEEeeecchhhc
Confidence 4799999999988744 45 999999999999876654
No 75
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=84.17 E-value=5.1 Score=43.52 Aligned_cols=65 Identities=12% Similarity=0.031 Sum_probs=35.2
Q ss_pred EEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecC
Q 016641 142 GSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELG 209 (385)
Q Consensus 142 GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~ 209 (385)
|...+|++.||+|.+|+..+...+..--. +...|...-..-.+..|+.+-|++.-. .++.|++..
T Consensus 67 G~aTLigpqYiVSV~HN~~gy~~v~FG~~-g~~~Y~iV~RNn~~~~Df~~pRLnK~V--TEvaP~~~t 131 (769)
T PF02395_consen 67 GVATLIGPQYIVSVKHNGKGYNSVSFGNE-GQNTYKIVDRNNYPSGDFHMPRLNKFV--TEVAPAEMT 131 (769)
T ss_dssp SS-EEEETTEEEBETTG-TSCCEECESCS-STCEEEEEEEEBETTSTEBEEEESS-----SS----BB
T ss_pred ceEEEecCCeEEEEEccCCCcCceeeccc-CCceEEEEEccCCCCcccceeecCceE--EEEeccccc
Confidence 77899999999999999955444433321 223343322222334699999998742 235555554
No 76
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=83.27 E-value=1.7 Score=42.93 Aligned_cols=39 Identities=28% Similarity=0.341 Sum_probs=34.5
Q ss_pred EEeeCCCCHHhhh-cCCCCEEEEECCEEcCChhhHHhhhc
Q 016641 345 VNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTGSHSML 383 (385)
Q Consensus 345 V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l~~~l~ 383 (385)
+..+..+|+|..+ |++||.|+++|++++.+..++...+.
T Consensus 133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~ 172 (375)
T COG0750 133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLV 172 (375)
T ss_pred eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHH
Confidence 3379999999999 99999999999999999998776553
No 77
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=82.95 E-value=11 Score=37.46 Aligned_cols=137 Identities=15% Similarity=0.143 Sum_probs=67.3
Q ss_pred ceEEEEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCeEE
Q 016641 140 TTGSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQAVA 219 (385)
Q Consensus 140 ~~GSGfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~ 219 (385)
+.|=||.+++..++|+-||+....+-..- .+-.-+.++..-+++-+++..+. ..+++-+-|.+...-|.-+.
T Consensus 379 GsGWGfWVS~~lfITttHViP~g~~E~FG-------v~i~~i~vh~sGeF~~~rFpk~i-RPDvtgmiLEeGapEGtV~s 450 (535)
T PF05416_consen 379 GSGWGFWVSPTLFITTTHVIPPGAKEAFG-------VPISQIQVHKSGEFCRFRFPKPI-RPDVTGMILEEGAPEGTVCS 450 (535)
T ss_dssp TTEEEEESSSSEEEEEGGGS-STTSEETT-------EECGGEEEEEETTEEEEEESS-S-STTS---EE-SS--TT-EEE
T ss_pred CCceeeeecceEEEEeeeecCCcchhhhC-------CChhHeEEeeccceEEEecCCCC-CCCccceeeccCCCCceEEE
Confidence 56789999999999999999853211100 01111223334566777776542 12566666655545565443
Q ss_pred E-EecCCCCC-CceEEEeeEeecccccccCCCceeeEEEec-------ccCCCCCCCccee-eCC---EEEEEEeeecC
Q 016641 220 V-VGYPQGGD-NISVTKGVVSRVEPTQYVHGATQLMAIQID-------AAINPGNSGGPAI-MGN---KVAGVAFQNLS 285 (385)
Q Consensus 220 ~-iG~p~~~~-~~~~~~G~Vs~~~~~~~~~~~~~~~~i~~d-------~~i~~G~SGGPL~-~~G---~vVGI~s~~~~ 285 (385)
+ |=.+.|.- .+.+..|....+.-....- .....++.+. -...|||-|.|-+ ..| -|+|++.+...
T Consensus 451 iLiKR~sGEllpLAvRMgt~AsmkIqgr~v-~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr 528 (535)
T PF05416_consen 451 ILIKRPSGELLPLAVRMGTHASMKIQGRTV-HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATR 528 (535)
T ss_dssp EEEE-TTSBEEEEEEEEEEEEEEEETTEEE-EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-S
T ss_pred EEEEcCCccchhhhhhhccceeEEEcceee-cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhcc
Confidence 3 34454422 1345566554432210000 1122333332 2456899999999 665 49999988654
No 78
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=82.76 E-value=1.2 Score=48.31 Aligned_cols=42 Identities=19% Similarity=0.232 Sum_probs=35.0
Q ss_pred CCccCceEEEeeCCCCHHhh-h-cCCCCEEEEECCEEcCChhhH
Q 016641 337 RSEVTGVLVNKINPLSDAHE-I-LKKDDIILAFDGVPIANDGTG 378 (385)
Q Consensus 337 ~~~~~Gv~V~~V~~~spA~~-a-L~~GDiI~~vng~~i~~~~~l 378 (385)
.++.-|+||.+|.+|++|+. . |+.||-+++|||..+-.+.+=
T Consensus 956 Gq~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQE 999 (1629)
T KOG1892|consen 956 GQRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQE 999 (1629)
T ss_pred CccccceEEEEeccCCccccccccccCceeeeecCcccccccHH
Confidence 34567999999999999885 4 999999999999987665543
No 79
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=82.32 E-value=1.1 Score=37.30 Aligned_cols=23 Identities=26% Similarity=0.466 Sum_probs=17.8
Q ss_pred CCCCCCccee-eCCEEEEEEeeec
Q 016641 262 NPGNSGGPAI-MGNKVAGVAFQNL 284 (385)
Q Consensus 262 ~~G~SGGPL~-~~G~vVGI~s~~~ 284 (385)
-.|.||||++ .+|.+|||..+..
T Consensus 106 lkGSSGgPiLC~~GH~vG~f~aa~ 129 (148)
T PF02907_consen 106 LKGSSGGPILCPSGHAVGMFRAAV 129 (148)
T ss_dssp HTT-TT-EEEETTSEEEEEEEEEE
T ss_pred EecCCCCcccCCCCCEEEEEEEEE
Confidence 4699999999 9999999976643
No 80
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=82.23 E-value=1.1 Score=47.15 Aligned_cols=37 Identities=14% Similarity=0.125 Sum_probs=32.8
Q ss_pred CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhh
Q 016641 341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGT 377 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~ 377 (385)
.+++|-++.+++||.+- +++||-|++|||+....+..
T Consensus 923 M~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH 961 (984)
T KOG3209|consen 923 MDLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTH 961 (984)
T ss_pred cceEEEEeccCCCccccCceeecceEEEecCcccCCCcH
Confidence 46999999999999975 99999999999998877653
No 81
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=82.00 E-value=8.6 Score=30.86 Aligned_cols=53 Identities=15% Similarity=0.129 Sum_probs=36.0
Q ss_pred EEEecCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccccceeeecCC
Q 016641 144 GFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGD 210 (385)
Q Consensus 144 GfiI~~g~ILT~aHvv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~ 210 (385)
++-|.+|.++|+.||.+..+.+. +.. -+++. .+.|+++++.+... .+.+++++
T Consensus 3 avHIGnG~~vt~tHva~~~~~v~------g~~--f~~~~--~~ge~~~v~~~~~~----~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIGNGRYVTVTHVAKSSDSVD------GQP--FKIVK--TDGELCWVQSPLVH----LPAAQIGT 55 (105)
T ss_pred eEEeCCCEEEEEEEEeccCceEc------CcC--cEEEE--eccCEEEEECCCCC----CCeeEecc
Confidence 67778999999999998765432 221 12222 35699999988764 56666765
No 82
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=81.00 E-value=1.6 Score=45.02 Aligned_cols=37 Identities=22% Similarity=0.424 Sum_probs=33.3
Q ss_pred CceEEEeeCCCCHHhhhcCCCCEEEEECCEEcCChhh
Q 016641 341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGT 377 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~~~~~ 377 (385)
.-++|++|.||+||+.-||.||-|+.|||....+...
T Consensus 40 tSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~h 76 (1027)
T KOG3580|consen 40 TSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLH 76 (1027)
T ss_pred eeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHH
Confidence 5689999999999999999999999999998877643
No 83
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=77.88 E-value=2.9 Score=43.64 Aligned_cols=75 Identities=15% Similarity=0.252 Sum_probs=54.2
Q ss_pred EEecchHHHHHHHHHHcCeeeeeeccCccc------cccccHHHHhhcCCCCccCceEEEeeCCCCHHhhh-cCCCCEEE
Q 016641 293 IIPVPVIKHFITGVVEHGKYVGFCSLGLSC------QTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIIL 365 (385)
Q Consensus 293 aip~~~i~~~l~~l~~~g~~~~~~~lGi~~------~~~~~~~~~~~~g~~~~~~Gv~V~~V~~~spA~~a-L~~GDiI~ 365 (385)
-+|.+..+.+++.+++.-.|. |-|.- ..+.-|+++-++|+. .++|| |=+...|+-|++. +++|--|+
T Consensus 708 GLPLstcQs~Ik~~KnQT~Vk----ltiV~cpPV~~V~I~RPd~kyQLGFS-VQNGi-ICSLlRGGIAERGGVRVGHRII 781 (829)
T KOG3605|consen 708 GLPLSTCQSIIKGLKNQTAVK----LNIVSCPPVTTVLIRRPDLRYQLGFS-VQNGI-ICSLLRGGIAERGGVRVGHRII 781 (829)
T ss_pred cccHHHHHHHHhcccccceEE----EEEecCCCceEEEeecccchhhccce-eeCcE-eehhhcccchhccCceeeeeEE
Confidence 478899999998887544432 22211 122356677778887 35786 5566779999999 99999999
Q ss_pred EECCEEcC
Q 016641 366 AFDGVPIA 373 (385)
Q Consensus 366 ~vng~~i~ 373 (385)
+|||+.|.
T Consensus 782 EINgQSVV 789 (829)
T KOG3605|consen 782 EINGQSVV 789 (829)
T ss_pred EECCceEE
Confidence 99999874
No 84
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=73.21 E-value=1.5 Score=42.99 Aligned_cols=35 Identities=26% Similarity=0.302 Sum_probs=30.8
Q ss_pred CceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641 341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~ 375 (385)
.-++|++|.+|-.|++. |..||.|++|||+.+.+.
T Consensus 110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~A 146 (506)
T KOG3551|consen 110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDA 146 (506)
T ss_pred CceehhHhccccccccccceeeccEEEEecchhhhhc
Confidence 34899999999999986 999999999999987654
No 85
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=71.92 E-value=3 Score=42.62 Aligned_cols=41 Identities=27% Similarity=0.295 Sum_probs=34.9
Q ss_pred ceEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh--hhHHhhh
Q 016641 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND--GTGSHSM 382 (385)
Q Consensus 342 Gv~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~--~~l~~~l 382 (385)
-++|.++..|+-+++. |+.||.|+++||..+.+. .++.++|
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l 191 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELL 191 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHH
Confidence 4899999999999986 999999999999999875 5555544
No 86
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=67.86 E-value=3.4 Score=45.83 Aligned_cols=32 Identities=28% Similarity=0.356 Sum_probs=29.2
Q ss_pred EEEeeCCCCHHhhh-cCCCCEEEEECCEEcCCh
Q 016641 344 LVNKINPLSDAHEI-LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 344 ~V~~V~~~spA~~a-L~~GDiI~~vng~~i~~~ 375 (385)
.|-.|.++|||..+ |++||.|+.+||+++...
T Consensus 661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl 693 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGL 693 (1205)
T ss_pred eeeeecCCCCccccCCCccceeEeccCcccchh
Confidence 57789999999999 999999999999998764
No 87
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.10 E-value=21 Score=27.64 Aligned_cols=37 Identities=24% Similarity=0.401 Sum_probs=30.4
Q ss_pred ccCCCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 158 VVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 158 vv~~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
++.....+.|.+. +++.+.+++.++|.+.++.|=...
T Consensus 10 ~~~~~~~V~V~lr-~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCR-NNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEc-CCCEEEEEEEEecCccEEEEcceE
Confidence 4445678899987 889999999999999999876553
No 88
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=56.08 E-value=11 Score=38.65 Aligned_cols=40 Identities=13% Similarity=0.130 Sum_probs=30.9
Q ss_pred hhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 332 NNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 332 ~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
..++++- +=+.|.-..++.|..++.||||||++.||+.|.
T Consensus 291 a~l~l~v--nv~~vl~~~ENm~~g~A~rPGDVits~~GkTVE 330 (485)
T COG0260 291 AELKLPV--NVVGVLPAVENMPSGNAYRPGDVITSMNGKTVE 330 (485)
T ss_pred HHcCCCc--eEEEEEeeeccCCCCCCCCCCCeEEecCCcEEE
Confidence 3456663 445566677788999999999999999999874
No 89
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=51.07 E-value=11 Score=37.41 Aligned_cols=22 Identities=32% Similarity=0.617 Sum_probs=19.5
Q ss_pred cCCCCCCCccee-eCCEEEEEEe
Q 016641 260 AINPGNSGGPAI-MGNKVAGVAF 281 (385)
Q Consensus 260 ~i~~G~SGGPL~-~~G~vVGI~s 281 (385)
.+..|.||+.++ .+|++|||.+
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 456899999999 9999999975
No 90
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.95 E-value=52 Score=23.06 Aligned_cols=33 Identities=18% Similarity=0.061 Sum_probs=27.3
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.|.+. +++.+.+.+...|...++.|-....
T Consensus 7 ~~V~V~l~-~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELK-DGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEEC-CCcEEEEEEEEECCCCCEEECCEEE
Confidence 46788886 8999999999999999987766543
No 91
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=48.70 E-value=12 Score=36.27 Aligned_cols=33 Identities=27% Similarity=0.301 Sum_probs=29.2
Q ss_pred eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCCh
Q 016641 343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIAND 375 (385)
Q Consensus 343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~ 375 (385)
|+|+++.++-.|+.. |=.||-|++|||..|..-
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c 116 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTAC 116 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecC
Confidence 899999999888865 889999999999998753
No 92
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.12 E-value=93 Score=23.01 Aligned_cols=32 Identities=9% Similarity=-0.025 Sum_probs=26.9
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
.++.|.+. +++.+.+++.++|...++.|=...
T Consensus 10 ~~V~V~l~-dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITV-DGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEEC-CCcEEEEEEEEEccccCEEccceE
Confidence 46778876 999999999999999998876653
No 93
>PRK05015 aminopeptidase B; Provisional
Probab=45.74 E-value=20 Score=36.01 Aligned_cols=39 Identities=18% Similarity=0.077 Sum_probs=29.2
Q ss_pred hcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 333 NFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 333 ~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
.++++. +=+.|--+.++.+..++.|+||||...||+.|.
T Consensus 230 ~~~l~~--nV~~il~~aENmisg~A~kpgDVIt~~nGkTVE 268 (424)
T PRK05015 230 TRGLNK--RVKLFLCCAENLISGNAFKLGDIITYRNGKTVE 268 (424)
T ss_pred hcCCCc--eEEEEEEecccCCCCCCCCCCCEEEecCCcEEe
Confidence 345553 334455667788888889999999999999874
No 94
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.69 E-value=93 Score=23.20 Aligned_cols=32 Identities=13% Similarity=0.033 Sum_probs=26.8
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
.++.|.+. +|+.+.+.+.++|+..++.|=...
T Consensus 13 k~v~V~l~-~gr~~~G~L~~fD~~~NlvL~d~~ 44 (74)
T cd01728 13 KKVVVLLR-DGRKLIGILRSFDQFANLVLQDTV 44 (74)
T ss_pred CEEEEEEc-CCeEEEEEEEEECCcccEEecceE
Confidence 56788886 899999999999999998876553
No 95
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=44.72 E-value=59 Score=23.98 Aligned_cols=33 Identities=21% Similarity=0.305 Sum_probs=27.8
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.|.+. +|+.+.+.+.++|...++.|-....
T Consensus 15 k~V~V~lk-~g~~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 15 SPVLVRLK-GGREFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred CEEEEEEC-CCCEEEEEEEEEcccceeEEeeEEE
Confidence 46788886 8999999999999999998777643
No 96
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=44.37 E-value=61 Score=23.48 Aligned_cols=33 Identities=18% Similarity=0.246 Sum_probs=28.3
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.|.+. +|+.+.+++.++|...++.|-....
T Consensus 11 ~~V~V~l~-~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLK-GGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEEC-CCCEEEEEEEEECCcceEEEeeEEE
Confidence 56888886 8999999999999999998877643
No 97
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.21 E-value=56 Score=23.64 Aligned_cols=32 Identities=22% Similarity=0.282 Sum_probs=27.1
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
..+.|.+. +|+.+.+++.++|...++.|=...
T Consensus 11 ~~V~V~Lk-~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLN-SGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEEC-CCCEEEEEEEEEccceeeEEeeEE
Confidence 46788886 889999999999999999876653
No 98
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=43.94 E-value=54 Score=23.87 Aligned_cols=32 Identities=19% Similarity=0.240 Sum_probs=26.9
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
..+.|.+. +|+.+.+++..+|...++.|=.+.
T Consensus 12 ~~V~V~Lk-~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLK-WGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEEC-CCcEEEEEEEEECCCEEEEEeeEE
Confidence 46788886 899999999999999999875553
No 99
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.21 E-value=49 Score=25.08 Aligned_cols=31 Identities=16% Similarity=0.195 Sum_probs=26.1
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv 194 (385)
..+.|.+. +|+.+.+++.++|.+.+|.|=..
T Consensus 12 k~V~V~l~-~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLR-GDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEEC-CCCEEEEEEEEEccceEEeccce
Confidence 56888886 88999999999999999876544
No 100
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.69 E-value=66 Score=24.40 Aligned_cols=31 Identities=6% Similarity=-0.003 Sum_probs=26.2
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv 194 (385)
.++.|.+. +|+.+.+.+.++|...+|.|=..
T Consensus 13 k~V~V~l~-~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQ-GGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEEC-CCcEEEEEEEEEcCcccEEecCE
Confidence 56788886 89999999999999999876554
No 101
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.61 E-value=59 Score=24.37 Aligned_cols=31 Identities=6% Similarity=0.104 Sum_probs=26.3
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv 194 (385)
..+.|.+. +++.+.+++.++|...++.|=..
T Consensus 14 ~~V~V~l~-~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMK-SDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEEC-CCeEEEEEEEEeccceEEEEccE
Confidence 57788886 88999999999999999976554
No 102
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.08 E-value=72 Score=23.85 Aligned_cols=32 Identities=3% Similarity=0.107 Sum_probs=26.7
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
..+.|.+. ||+.+.+.+..+|...+|.|=...
T Consensus 11 ~~v~V~l~-dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 11 RTMRIHMT-DGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CeEEEEEc-CCeEEEEEEEEEcCCCcEEecCcE
Confidence 46788886 999999999999999998765543
No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.79 E-value=65 Score=24.17 Aligned_cols=32 Identities=16% Similarity=0.107 Sum_probs=26.6
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
..+.|.+. ||+.+.+.+.++|...+|.|=...
T Consensus 11 ~~V~V~l~-dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQ-DGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEEC-CCcEEEEEEEEEcCccCEEcCCEE
Confidence 46788886 899999999999999998765543
No 104
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=40.08 E-value=11 Score=39.43 Aligned_cols=29 Identities=17% Similarity=0.257 Sum_probs=24.2
Q ss_pred EeeCCCCHHhhh--cCCCCEEEEECCEEcCC
Q 016641 346 NKINPLSDAHEI--LKKDDIILAFDGVPIAN 374 (385)
Q Consensus 346 ~~V~~~spA~~a--L~~GDiI~~vng~~i~~ 374 (385)
.....++||++. |..||-|++|||..+..
T Consensus 678 Anmm~~GpAarsgkLnIGDQiiaING~SLVG 708 (829)
T KOG3605|consen 678 ANMMHGGPAARSGKLNIGDQIMSINGTSLVG 708 (829)
T ss_pred HhcccCChhhhcCCccccceeEeecCceecc
Confidence 456678999987 99999999999986543
No 105
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.02 E-value=81 Score=23.29 Aligned_cols=31 Identities=6% Similarity=-0.016 Sum_probs=26.0
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv 194 (385)
..+.|.+. +|+.+.+++.++|...+|.|=..
T Consensus 11 k~V~V~L~-~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKLN-GNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEEC-CCeEEEEEEEEEcccccEEeccE
Confidence 46778886 88999999999999999877554
No 106
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=38.82 E-value=1.2e+02 Score=21.75 Aligned_cols=33 Identities=21% Similarity=0.184 Sum_probs=27.3
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.++.. .|..++++++.+|....+.+|+...
T Consensus 7 s~V~~kTc-~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTC-FEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEec-CCceEEEEEEEecCCCcEEEEECcc
Confidence 45666665 6889999999999999999998655
No 107
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=36.42 E-value=46 Score=29.56 Aligned_cols=28 Identities=21% Similarity=0.164 Sum_probs=24.7
Q ss_pred CceEEEeeCCCCHHhhh-cCCCCEEEEEC
Q 016641 341 TGVLVNKINPLSDAHEI-LKKDDIILAFD 368 (385)
Q Consensus 341 ~Gv~V~~V~~~spA~~a-L~~GDiI~~vn 368 (385)
..++|..|..||||+++ +.-|+.|+++-
T Consensus 122 ~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 56899999999999999 99999887763
No 108
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=35.09 E-value=1.6e+02 Score=20.71 Aligned_cols=37 Identities=24% Similarity=0.285 Sum_probs=29.6
Q ss_pred CCceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCC
Q 016641 161 DSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESD 197 (385)
Q Consensus 161 ~~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~ 197 (385)
....+.++.+++...|++++..+|...++.-++.+..
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 4567889999888888999999999999998888765
No 109
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=34.87 E-value=28 Score=35.80 Aligned_cols=31 Identities=16% Similarity=0.164 Sum_probs=25.4
Q ss_pred eEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 343 VLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 343 v~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
+-|--..++.|..++.||||||...||+.|.
T Consensus 301 ~~v~~l~ENm~~~~A~rPgDVi~~~~GkTVE 331 (483)
T PRK00913 301 VGVVAACENMPSGNAYRPGDVLTSMSGKTIE 331 (483)
T ss_pred EEEEEeeccCCCCCCCCCCCEEEECCCcEEE
Confidence 3445556788888999999999999999874
No 110
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=34.25 E-value=1.1e+02 Score=21.76 Aligned_cols=33 Identities=24% Similarity=0.301 Sum_probs=27.0
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.|.+. +|+.+.+.+...|...++.|=....
T Consensus 9 ~~V~V~l~-~g~~~~G~L~~~D~~~NlvL~~~~e 41 (67)
T smart00651 9 KRVLVELK-NGREYRGTLKGFDQFMNLVLEDVEE 41 (67)
T ss_pred cEEEEEEC-CCcEEEEEEEEECccccEEEccEEE
Confidence 46788886 8899999999999999987765543
No 111
>PTZ00412 leucyl aminopeptidase; Provisional
Probab=33.93 E-value=34 Score=35.69 Aligned_cols=40 Identities=13% Similarity=0.117 Sum_probs=29.0
Q ss_pred hhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 332 NNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 332 ~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
..++++- +=+-|.-..++.|..++.+|||||...||+.|.
T Consensus 337 A~Lklpv--nVv~iiplaENm~sg~A~rPGDVits~nGkTVE 376 (569)
T PTZ00412 337 AKLQLPV--NVVAAVGLAENAIGPESYHPSSIITSRKGLTVE 376 (569)
T ss_pred HHcCCCe--EEEEEEEhhhcCCCCCCCCCCCEeEecCCCEEe
Confidence 3456653 333445556678888889999999999999864
No 112
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=33.84 E-value=32 Score=31.26 Aligned_cols=53 Identities=15% Similarity=0.360 Sum_probs=36.2
Q ss_pred EEEecccCCCCCCCccee-eC----CEEEEEEeeecCCCCceEEEEec--chHHHHHHHHH
Q 016641 254 AIQIDAAINPGNSGGPAI-MG----NKVAGVAFQNLSGAENIGYIIPV--PVIKHFITGVV 307 (385)
Q Consensus 254 ~i~~d~~i~~G~SGGPL~-~~----G~vVGI~s~~~~~~~~~~~aip~--~~i~~~l~~l~ 307 (385)
.++...+...|+-|+|++ .+ -+++||+.++..+ .+.+||=++ +++++.++.|.
T Consensus 170 gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~-~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 170 GLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN-HAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc-ccceehhhhhHHHHHHHHHhhc
Confidence 356667778999999999 44 4799999986532 356777655 34555555543
No 113
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.60 E-value=1.1e+02 Score=22.36 Aligned_cols=32 Identities=16% Similarity=0.181 Sum_probs=27.8
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
..+.|.+. +|..+.+++...|...++.|-...
T Consensus 11 ~~V~VeLk-~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 11 HIVTVELK-TGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CEEEEEEC-CCcEEEEEEEEEcCCceeEEEEEE
Confidence 56788886 889999999999999999888774
No 114
>PF00883 Peptidase_M17: Cytosol aminopeptidase family, catalytic domain; InterPro: IPR000819 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belong to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear [, ]. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids []. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another []. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape []. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices []. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core []. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer []. 
The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain [].; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 3KZW_L 3KQX_C 3KQZ_L 3KR4_I 3KR5_J 3T8W_C 3H8F_D 3H8G_A 3H8E_B 3IJ3_A ....
Probab=33.01 E-value=22 Score=34.30 Aligned_cols=30 Identities=13% Similarity=0.201 Sum_probs=20.4
Q ss_pred EEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 344 LVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 344 ~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
-|--+.++.|..++.+|||||.+.||+.|.
T Consensus 133 ~~l~~~EN~i~~~a~~pgDVi~s~~GkTVE 162 (311)
T PF00883_consen 133 AVLPLAENMISGNAYRPGDVITSMNGKTVE 162 (311)
T ss_dssp EEEEEEEE--STTSTTTTEEEE-TTS-EEE
T ss_pred EEEEcccccCCCCCCCCCCEEEeCCCCEEE
Confidence 344455678888889999999999999873
No 115
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=32.33 E-value=95 Score=23.20 Aligned_cols=33 Identities=24% Similarity=0.293 Sum_probs=27.6
Q ss_pred ceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 016641 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (385)
Q Consensus 163 ~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~ 196 (385)
..+.|.+. +|+.+.+++.++|...++.|--+..
T Consensus 18 ~~V~V~lk-~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLK-NGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEEC-CCCEEEEEEEEEccceeEEEeceEE
Confidence 67888886 8899999999999999987765543
No 116
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=31.88 E-value=33 Score=35.24 Aligned_cols=31 Identities=16% Similarity=0.127 Sum_probs=25.3
Q ss_pred eEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 343 VLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 343 v~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
+-|.-..++.+..++.+|||||...||+.|.
T Consensus 287 ~~i~~~~EN~is~~A~rPgDVi~s~~GkTVE 317 (468)
T cd00433 287 VGVLPLAENMISGNAYRPGDVITSRSGKTVE 317 (468)
T ss_pred EEEEEeeecCCCCCCCCCCCEeEeCCCcEEE
Confidence 4455566788888889999999999999874
No 117
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=31.30 E-value=1.3e+02 Score=21.34 Aligned_cols=35 Identities=17% Similarity=0.167 Sum_probs=29.3
Q ss_pred CceEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCC
Q 016641 162 STFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESD 197 (385)
Q Consensus 162 ~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~~~ 197 (385)
...+.|.+. +|+.+.+.+...|...++.|-.....
T Consensus 8 g~~V~V~l~-~g~~~~G~L~~~D~~~Nl~L~~~~~~ 42 (67)
T PF01423_consen 8 GKRVRVELK-NGRTYRGTLVSFDQFMNLVLSDVTET 42 (67)
T ss_dssp TSEEEEEET-TSEEEEEEEEEEETTEEEEEEEEEEE
T ss_pred CcEEEEEEe-CCEEEEEEEEEeechheEEeeeEEEE
Confidence 357888886 89999999999999999988777653
No 118
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=29.60 E-value=1.2e+02 Score=23.02 Aligned_cols=47 Identities=21% Similarity=0.264 Sum_probs=30.5
Q ss_pred EEEEEEEecCCCCeEEEEecCCcccccceeeecCCcccCCCeEEE-EecC
Q 016641 176 YRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQAVAV-VGYP 224 (385)
Q Consensus 176 ~~a~v~~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~G~~V~~-iG~p 224 (385)
+|++++..|...++|++.+-.-. ..+.---++...++|+.|.+ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvk--reV~l~Lv~~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVK--REVNLDLVGEEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEe--EEEEeeeecCccccCCEEEEEeeEE
Confidence 57888999988889999886532 12222223335578998776 5654
No 119
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=27.86 E-value=21 Score=33.42 Aligned_cols=38 Identities=21% Similarity=0.200 Sum_probs=32.2
Q ss_pred eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcCChhhHHh
Q 016641 343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTGSH 380 (385)
Q Consensus 343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~~~~~l~~ 380 (385)
..|..+.++|--.+. +++||.|-+|||+.|-.-+.++-
T Consensus 151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeV 190 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEV 190 (334)
T ss_pred eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHH
Confidence 478899999988876 99999999999999987765543
No 120
>KOG2597 consensus Predicted aminopeptidase of the M17 family [General function prediction only]
Probab=26.91 E-value=67 Score=33.13 Aligned_cols=43 Identities=12% Similarity=0.075 Sum_probs=31.8
Q ss_pred HHHhhcCCCCccCceEEEeeCCCCHHhhhcCCCCEEEEECCEEcC
Q 016641 329 QLRNNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIA 373 (385)
Q Consensus 329 ~~~~~~g~~~~~~Gv~V~~V~~~spA~~aL~~GDiI~~vng~~i~ 373 (385)
+....+++|. +-..|.-.-+++|...+-|+||||+..||+.|.
T Consensus 310 ~a~~~l~~~i--n~~~v~plcENm~sg~A~kpgDVit~~nGKtve 352 (513)
T KOG2597|consen 310 RAAAQLSLPI--NVHAVLPLCENMPSGNATKPGDVITLRNGKTVE 352 (513)
T ss_pred HHHHhcCCCC--ceEEEEeeeccCCCccCCCCCcEEEecCCcEEE
Confidence 3344566663 444555566789999999999999999999874
No 121
>PF11730 DUF3297: Protein of unknown function (DUF3297); InterPro: IPR021724 This family is expressed in Proteobacteria and Actinobacteria. The function is not known.
Probab=25.24 E-value=44 Score=24.42 Aligned_cols=32 Identities=25% Similarity=0.251 Sum_probs=27.4
Q ss_pred eeCCCCHHhhh-cCCCCEEEEECCEEcCChhhH
Q 016641 347 KINPLSDAHEI-LKKDDIILAFDGVPIANDGTG 378 (385)
Q Consensus 347 ~V~~~spA~~a-L~~GDiI~~vng~~i~~~~~l 378 (385)
+++|.||-+.+ +-.-||=+.+||++=++.+++
T Consensus 5 S~~P~Sp~~~~~~l~~~iGIrfng~Er~nVeEY 37 (71)
T PF11730_consen 5 SINPRSPHYDAEVLERGIGIRFNGKERTNVEEY 37 (71)
T ss_pred ccCCCChhhHHHHHhcCcceEECCeEcccceeE
Confidence 57899999988 777899999999998887764
No 122
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.14 E-value=2.1e+02 Score=21.20 Aligned_cols=33 Identities=12% Similarity=0.126 Sum_probs=27.8
Q ss_pred CceEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 016641 162 STFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (385)
Q Consensus 162 ~~~i~V~~~~~g~~~~a~v~~~d~~~DlAlLkv~ 195 (385)
...+.|.+. +|..+.+++..+|...++.|-.+.
T Consensus 11 g~~V~VeLk-ng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 11 NHPMLVELK-NGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCEEEEEEC-CCCEEEEEEEEEcCCCceEEEeEE
Confidence 357788886 789999999999999999887663
No 123
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=23.06 E-value=53 Score=34.50 Aligned_cols=31 Identities=19% Similarity=0.177 Sum_probs=27.8
Q ss_pred eEEEeeCCCCHHhhh--cCCCCEEEEECCEEcC
Q 016641 343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIA 373 (385)
Q Consensus 343 v~V~~V~~~spA~~a--L~~GDiI~~vng~~i~ 373 (385)
.+|+++.++|||... |..||-|+.||++.+.
T Consensus 227 h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvV 259 (638)
T KOG1738|consen 227 HVTSKIFEQSPADYRQKILDGDEVLQINEQTVV 259 (638)
T ss_pred eeccccccCChHHHhhcccCccceeeecccccc
Confidence 477889999999987 9999999999999865
No 124
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=22.86 E-value=69 Score=32.19 Aligned_cols=39 Identities=18% Similarity=0.115 Sum_probs=31.1
Q ss_pred EEEeeCCCCHHhhh-cC-CCCEEEEECCEEcCChhhHHhhh
Q 016641 344 LVNKINPLSDAHEI-LK-KDDIILAFDGVPIANDGTGSHSM 382 (385)
Q Consensus 344 ~V~~V~~~spA~~a-L~-~GDiI~~vng~~i~~~~~l~~~l 382 (385)
=|-+|.++|||+.| |+ -+|-|+.+-.......+||...|
T Consensus 112 Hvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lI 152 (462)
T KOG3834|consen 112 HVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLI 152 (462)
T ss_pred eeeecCCCCHHHhcccccccceEecchhhhccchHHHHHHH
Confidence 46789999999999 88 68999999555666777776654
Done!