Query psy18066
Match_columns 375
No_of_seqs 383 out of 3565
Neff 8.2
Searched_HMMs 46136
Date Fri Aug 16 21:45:48 2013
Command hhsearch -i /work/01045/syshi/Psyhhblits/psy18066.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/18066hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 1.3E-55 2.9E-60 439.2 38.9 287 74-374 41-358 (455)
2 TIGR02038 protease_degS peripl 100.0 6.4E-54 1.4E-58 415.4 42.7 287 73-374 45-346 (351)
3 PRK10898 serine endoprotease; 100.0 1.9E-53 4.1E-58 411.9 42.2 286 74-374 46-347 (353)
4 PRK10942 serine endoprotease; 100.0 1.4E-52 3E-57 419.4 38.7 287 74-374 39-379 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 1.2E-51 2.7E-56 410.7 37.7 286 75-374 3-325 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 1E-41 2.2E-46 331.0 33.0 287 73-374 33-338 (347)
7 KOG1320|consensus 100.0 8.8E-32 1.9E-36 262.0 20.5 308 66-373 121-465 (473)
8 KOG1421|consensus 99.9 1.3E-25 2.8E-30 221.1 22.6 291 74-374 53-369 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 3.6E-17 7.7E-22 133.7 11.8 111 103-242 1-120 (120)
10 KOG1421|consensus 99.6 8.1E-14 1.8E-18 138.5 23.0 277 79-374 524-829 (955)
11 PF13180 PDZ_2: PDZ domain; PD 99.6 7.9E-15 1.7E-19 112.4 11.7 79 280-374 1-82 (82)
12 cd00987 PDZ_serine_protease PD 99.4 1.8E-12 3.9E-17 100.7 11.7 86 280-371 1-89 (90)
13 PF00089 Trypsin: Trypsin; In 99.4 1.3E-11 2.8E-16 110.9 17.1 174 83-264 12-220 (220)
14 cd00990 PDZ_glycyl_aminopeptid 99.4 4.6E-12 9.9E-17 96.3 10.7 67 308-374 11-77 (80)
15 cd00190 Tryp_SPc Trypsin-like 99.3 6.3E-11 1.4E-15 107.2 16.9 178 83-266 12-231 (232)
16 cd00991 PDZ_archaeal_metallopr 99.3 1.9E-11 4.1E-16 92.9 11.2 68 306-373 7-77 (79)
17 KOG1320|consensus 99.3 4.7E-12 1E-16 124.4 9.6 250 78-342 55-319 (473)
18 TIGR01713 typeII_sec_gspC gene 99.3 5.6E-11 1.2E-15 110.3 12.5 99 257-374 158-259 (259)
19 cd00986 PDZ_LON_protease PDZ d 99.2 1.7E-10 3.7E-15 87.6 10.5 67 308-375 7-76 (79)
20 cd00989 PDZ_metalloprotease PD 99.2 2.8E-10 6.1E-15 86.1 11.0 65 309-373 12-78 (79)
21 smart00020 Tryp_SPc Trypsin-li 99.1 3.1E-09 6.8E-14 96.3 16.5 159 83-247 13-208 (229)
22 cd00988 PDZ_CTP_protease PDZ d 99.1 8.9E-10 1.9E-14 84.6 10.9 66 308-373 12-82 (85)
23 TIGR02037 degP_htrA_DO peripla 99.1 9.1E-10 2E-14 110.2 11.8 88 279-371 337-427 (428)
24 cd00136 PDZ PDZ domain, also c 99.0 4.1E-09 8.8E-14 77.7 8.4 53 309-361 13-69 (70)
25 COG3591 V8-like Glu-specific e 98.9 2.1E-07 4.6E-12 85.0 17.9 171 81-268 49-250 (251)
26 TIGR00054 RIP metalloprotease 98.8 1.7E-08 3.7E-13 100.6 10.4 66 309-374 203-270 (420)
27 PRK10779 zinc metallopeptidase 98.8 7.6E-09 1.7E-13 104.0 7.9 63 312-374 129-194 (449)
28 PRK10779 zinc metallopeptidase 98.8 4.5E-08 9.7E-13 98.5 11.0 66 309-374 221-288 (449)
29 PRK10942 serine endoprotease; 98.7 4.7E-08 1E-12 98.7 10.8 64 309-372 408-472 (473)
30 PRK10139 serine endoprotease; 98.7 7.9E-08 1.7E-12 96.6 10.5 64 309-372 390-454 (455)
31 cd00992 PDZ_signaling PDZ doma 98.7 1.4E-07 3E-12 71.6 8.8 53 309-361 26-81 (82)
32 TIGR00225 prc C-terminal pepti 98.7 1.3E-07 2.8E-12 91.6 10.0 66 309-374 62-131 (334)
33 smart00228 PDZ Domain present 98.6 1.9E-07 4.1E-12 71.1 8.8 57 309-365 26-85 (85)
34 PF00863 Peptidase_C4: Peptida 98.6 7.9E-06 1.7E-10 74.1 18.2 162 82-258 16-185 (235)
35 PF00595 PDZ: PDZ domain (Also 98.6 2.4E-07 5.2E-12 70.4 7.2 70 279-362 9-81 (81)
36 TIGR02860 spore_IV_B stage IV 98.5 4.5E-07 9.7E-12 88.4 10.5 66 308-373 104-179 (402)
37 PLN00049 carboxyl-terminal pro 98.5 4.2E-07 9E-12 89.8 10.4 64 309-372 102-169 (389)
38 TIGR03279 cyano_FeS_chp putati 98.5 3.9E-07 8.5E-12 89.4 7.6 60 313-373 2-62 (433)
39 TIGR00054 RIP metalloprotease 98.4 4.8E-07 1.1E-11 90.2 7.2 64 309-372 128-192 (420)
40 COG0793 Prc Periplasmic protea 98.3 2.5E-06 5.4E-11 84.4 9.1 59 309-367 112-174 (406)
41 PF14685 Tricorn_PDZ: Tricorn 98.2 1.2E-05 2.7E-10 61.8 8.8 65 308-372 11-88 (88)
42 PF04495 GRASP55_65: GRASP55/6 98.1 1.2E-05 2.6E-10 67.5 7.0 81 280-372 26-111 (138)
43 PRK09681 putative type II secr 98.0 1.6E-05 3.5E-10 73.8 8.0 66 310-375 205-276 (276)
44 KOG3627|consensus 98.0 0.0007 1.5E-08 62.6 18.3 162 102-267 39-253 (256)
45 PRK11186 carboxy-terminal prot 98.0 2.9E-05 6.2E-10 81.1 9.5 64 309-372 255-331 (667)
46 COG3480 SdrC Predicted secrete 97.9 4.7E-05 1E-09 71.0 8.3 65 309-374 130-198 (342)
47 PF05579 Peptidase_S32: Equine 97.9 8.8E-05 1.9E-09 67.5 9.4 124 100-255 111-237 (297)
48 PF12812 PDZ_1: PDZ-like domai 97.8 0.0001 2.2E-09 55.6 7.4 66 280-353 9-74 (78)
49 KOG3553|consensus 97.8 2.6E-05 5.6E-10 60.1 3.6 36 307-342 57-92 (124)
50 KOG3129|consensus 97.7 8.9E-05 1.9E-09 65.0 7.1 64 310-373 140-208 (231)
51 COG3975 Predicted protease wit 97.5 0.00013 2.8E-09 72.5 5.9 63 306-373 459-521 (558)
52 COG3031 PulC Type II secretory 97.4 0.00029 6.4E-09 63.1 6.2 65 309-373 207-274 (275)
53 COG5640 Secreted trypsin-like 97.2 0.014 3E-07 55.7 14.4 50 221-270 223-280 (413)
54 PF03761 DUF316: Domain of unk 97.1 0.051 1.1E-06 51.2 18.2 168 83-262 53-273 (282)
55 PF00548 Peptidase_C3: 3C cyst 96.9 0.029 6.2E-07 49.0 13.3 148 83-246 12-170 (172)
56 PF10459 Peptidase_S46: Peptid 96.6 0.0078 1.7E-07 63.4 8.9 23 101-123 47-69 (698)
57 PF08192 Peptidase_S64: Peptid 96.5 0.024 5.3E-07 58.2 10.8 117 148-268 541-689 (695)
58 KOG3532|consensus 96.4 0.011 2.4E-07 60.4 7.7 48 306-353 395-442 (1051)
59 PF05580 Peptidase_S55: SpoIVB 96.3 0.1 2.2E-06 46.7 12.6 158 100-260 19-215 (218)
60 PF10459 Peptidase_S46: Peptid 96.3 0.0062 1.3E-07 64.1 5.8 53 216-268 623-687 (698)
61 KOG3209|consensus 95.7 0.013 2.8E-07 60.2 4.8 53 313-365 782-838 (984)
62 PF00949 Peptidase_S7: Peptida 95.4 0.021 4.5E-07 47.4 4.0 115 83-247 4-118 (132)
63 KOG3580|consensus 95.3 0.026 5.5E-07 57.0 5.0 55 307-361 427-486 (1027)
64 PF00944 Peptidase_S3: Alphavi 95.1 0.035 7.7E-07 45.5 4.4 40 216-255 96-135 (158)
65 KOG3542|consensus 94.7 0.038 8.2E-07 56.6 4.3 56 307-362 560-617 (1283)
66 COG0750 Predicted membrane-ass 94.5 0.13 2.9E-06 50.4 7.9 53 315-367 135-193 (375)
67 KOG3571|consensus 94.4 0.061 1.3E-06 53.4 5.1 57 307-363 275-338 (626)
68 PF02122 Peptidase_S39: Peptid 94.3 0.23 5.1E-06 44.4 8.2 117 113-246 43-166 (203)
69 KOG3580|consensus 94.3 0.061 1.3E-06 54.4 4.9 66 307-372 217-286 (1027)
70 KOG3209|consensus 94.2 0.094 2E-06 54.1 6.1 56 310-365 924-982 (984)
71 KOG3552|consensus 94.1 0.08 1.7E-06 56.1 5.5 55 309-364 75-132 (1298)
72 PF09342 DUF1986: Domain of un 93.9 1.3 2.9E-05 40.4 12.1 97 84-187 17-131 (267)
73 KOG3550|consensus 93.8 0.24 5.2E-06 41.5 6.7 56 307-362 113-172 (207)
74 KOG3834|consensus 93.6 0.19 4.2E-06 49.1 6.8 60 304-363 10-72 (462)
75 TIGR02860 spore_IV_B stage IV 93.3 0.62 1.3E-05 46.0 9.8 43 217-260 351-395 (402)
76 KOG2921|consensus 93.2 0.12 2.5E-06 50.0 4.4 48 306-353 217-265 (484)
77 PF00947 Pico_P2A: Picornaviru 92.3 0.32 6.8E-06 39.8 5.2 39 214-253 78-116 (127)
78 KOG0606|consensus 92.1 0.23 5E-06 54.0 5.3 52 311-362 660-714 (1205)
79 KOG3651|consensus 91.8 0.44 9.6E-06 44.5 6.2 54 309-362 30-87 (429)
80 KOG3606|consensus 90.7 0.69 1.5E-05 42.6 6.2 60 306-365 191-254 (358)
81 PF03510 Peptidase_C24: 2C end 90.5 1.8 4E-05 34.3 7.6 54 105-171 3-56 (105)
82 KOG3551|consensus 89.8 0.35 7.6E-06 46.7 3.7 53 310-362 111-167 (506)
83 KOG1892|consensus 89.3 0.48 1E-05 50.7 4.6 58 308-365 959-1020 (1629)
84 KOG3605|consensus 88.8 0.61 1.3E-05 48.0 4.8 109 225-352 679-801 (829)
85 KOG3549|consensus 87.8 0.75 1.6E-05 43.8 4.4 53 310-362 81-137 (505)
86 PF02907 Peptidase_S29: Hepati 86.8 1 2.2E-05 37.2 4.1 123 105-260 16-146 (148)
87 KOG0609|consensus 86.8 1.6 3.5E-05 44.1 6.3 54 310-363 147-204 (542)
88 KOG3834|consensus 84.0 1.7 3.7E-05 42.7 4.9 51 313-363 113-166 (462)
89 PF02395 Peptidase_S6: Immunog 78.1 24 0.00051 38.2 11.4 63 101-168 65-130 (769)
90 KOG3605|consensus 77.1 1.6 3.6E-05 45.0 2.3 53 311-363 675-733 (829)
91 PF01732 DUF31: Putative pepti 73.2 2.4 5.2E-05 41.7 2.3 25 220-244 349-373 (374)
92 PF05416 Peptidase_C37: Southa 70.9 8.7 0.00019 37.9 5.4 138 100-250 378-530 (535)
93 PF11874 DUF3394: Domain of un 68.3 6.8 0.00015 34.4 3.8 31 307-337 120-150 (183)
94 cd00600 Sm_like The eukaryotic 68.2 14 0.0003 25.8 4.9 33 127-159 7-39 (63)
95 cd01726 LSm6 The eukaryotic Sm 66.4 13 0.00029 26.7 4.5 33 126-158 10-42 (67)
96 PRK00737 small nuclear ribonuc 66.1 15 0.00032 27.0 4.7 34 126-159 14-47 (72)
97 cd01731 archaeal_Sm1 The archa 65.1 16 0.00035 26.3 4.7 34 126-159 10-43 (68)
98 cd01722 Sm_F The eukaryotic Sm 65.0 14 0.0003 26.8 4.3 33 126-158 11-43 (68)
99 cd01730 LSm3 The eukaryotic Sm 63.2 14 0.0003 27.9 4.2 31 127-157 12-42 (82)
100 cd01732 LSm5 The eukaryotic Sm 63.0 15 0.00033 27.3 4.3 31 127-157 14-44 (76)
101 cd01720 Sm_D2 The eukaryotic S 62.3 16 0.00035 28.0 4.4 32 127-158 15-46 (87)
102 cd01717 Sm_B The eukaryotic Sm 62.3 17 0.00036 27.2 4.5 32 127-158 11-42 (79)
103 cd01735 LSm12_N LSm12 belongs 62.3 27 0.00057 24.9 5.2 33 127-159 7-39 (61)
104 cd06168 LSm9 The eukaryotic Sm 62.1 19 0.0004 26.8 4.6 31 127-157 11-41 (75)
105 cd01729 LSm7 The eukaryotic Sm 61.1 19 0.00042 27.1 4.7 31 127-157 13-43 (81)
106 cd01719 Sm_G The eukaryotic Sm 59.5 22 0.00049 26.1 4.6 31 127-157 11-41 (72)
107 cd01728 LSm1 The eukaryotic Sm 57.2 25 0.00053 26.1 4.5 32 126-157 12-43 (74)
108 smart00651 Sm snRNP Sm protein 56.2 28 0.0006 24.7 4.7 34 126-159 8-41 (67)
109 cd01721 Sm_D3 The eukaryotic S 55.9 28 0.00061 25.3 4.7 33 126-158 10-42 (70)
110 cd01727 LSm8 The eukaryotic Sm 54.9 27 0.0006 25.6 4.5 31 127-157 10-40 (74)
111 PF01423 LSM: LSM domain ; In 53.3 22 0.00049 25.2 3.8 34 127-160 9-42 (67)
112 PF02743 Cache_1: Cache domain 52.1 23 0.00049 26.0 3.8 35 229-271 18-52 (81)
113 KOG3938|consensus 49.7 23 0.00049 32.9 3.9 54 309-362 149-208 (334)
114 PF00571 CBS: CBS domain CBS d 47.7 19 0.00042 24.2 2.6 20 226-245 29-48 (57)
115 COG1958 LSM1 Small nuclear rib 47.4 39 0.00085 25.1 4.4 34 126-159 17-50 (79)
116 cd01733 LSm10 The eukaryotic S 47.2 32 0.00069 25.7 3.8 35 124-158 17-51 (78)
117 PF12381 Peptidase_C3G: Tungro 45.3 26 0.00057 31.5 3.5 53 215-267 169-228 (231)
118 cd01723 LSm4 The eukaryotic Sm 45.2 55 0.0012 24.2 4.8 34 125-158 10-43 (76)
119 cd01725 LSm2 The eukaryotic Sm 41.2 55 0.0012 24.6 4.3 33 126-158 11-43 (81)
120 COG0260 PepB Leucyl aminopepti 39.4 27 0.00059 35.6 3.1 31 311-342 300-330 (485)
121 COG0298 HypC Hydrogenase matur 37.5 71 0.0015 24.0 4.2 47 139-186 5-52 (82)
122 cd01724 Sm_D1 The eukaryotic S 36.0 78 0.0017 24.3 4.5 34 126-159 11-44 (90)
123 COG2104 ThiS Sulfur transfer p 33.8 72 0.0016 23.2 3.8 37 323-359 25-61 (68)
124 PF02601 Exonuc_VII_L: Exonucl 33.4 49 0.0011 31.6 3.8 36 100-137 279-314 (319)
125 PF14438 SM-ATX: Ataxin 2 SM d 30.9 1.1E+02 0.0025 22.3 4.6 28 127-154 13-43 (77)
126 COG5233 GRH1 Peripheral Golgi 30.8 32 0.00068 32.8 1.8 31 312-342 66-96 (417)
127 PF01732 DUF31: Putative pepti 29.1 34 0.00073 33.6 1.9 23 100-122 35-67 (374)
128 PF14827 Cache_3: Sensory doma 28.8 49 0.0011 26.3 2.5 18 229-246 93-110 (116)
129 PF01455 HupF_HypC: HupF/HypC 28.4 1.9E+02 0.0041 20.9 5.3 43 139-183 5-47 (68)
130 PRK06437 hypothetical protein; 27.9 67 0.0015 23.1 2.8 34 323-360 28-61 (67)
131 cd00433 Peptidase_M17 Cytosol 27.7 50 0.0011 33.6 2.8 29 313-342 289-317 (468)
132 PRK05015 aminopeptidase B; Pro 27.1 57 0.0012 32.6 3.0 29 313-342 240-268 (424)
133 PRK00913 multifunctional amino 26.7 60 0.0013 33.2 3.2 29 313-342 303-331 (483)
134 KOG1738|consensus 26.1 43 0.00092 34.9 2.0 44 309-352 225-271 (638)
135 cd04627 CBS_pair_14 The CBS do 26.0 47 0.001 26.0 1.9 21 226-246 98-118 (123)
136 PF00883 Peptidase_M17: Cytoso 25.4 45 0.00098 31.9 1.9 27 315-342 136-162 (311)
137 PRK05659 sulfur carrier protei 25.0 86 0.0019 22.1 2.9 36 324-359 24-59 (66)
138 cd04603 CBS_pair_KefB_assoc Th 23.4 64 0.0014 24.8 2.2 21 226-246 86-106 (111)
139 PF10049 DUF2283: Protein of u 23.1 54 0.0012 22.1 1.5 11 234-244 36-46 (50)
140 cd04620 CBS_pair_7 The CBS dom 22.3 66 0.0014 24.6 2.1 21 226-246 90-110 (115)
141 cd00565 ThiS ThiaminS ubiquiti 21.8 1E+02 0.0022 21.7 2.8 35 325-359 24-58 (65)
142 PTZ00412 leucyl aminopeptidase 20.9 75 0.0016 32.9 2.6 28 314-342 349-376 (569)
143 cd01718 Sm_E The eukaryotic Sm 20.8 1.7E+02 0.0037 21.9 3.9 31 127-157 19-51 (79)
144 cd00218 GlcAT-I Beta1,3-glucur 20.3 91 0.002 28.3 2.7 32 229-261 136-173 (223)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=1.3e-55 Score=439.18 Aligned_cols=287 Identities=31% Similarity=0.485 Sum_probs=255.3
Q ss_pred hHHHHHHHhCCceEEEEEeee-------------cCCC---------cCceEEEEEEeC-CCEEEecccccCCCCCceEE
Q psy18066 74 FVADVLENVEKSVVNIELVIP-------------YYRQ---------TMSNGSGFIATD-DGLIITNAHVVSGKPGAQII 130 (375)
Q Consensus 74 ~~~~v~e~~~~svV~I~~~~~-------------~~~~---------~~~~GSGfiI~~-~G~IlT~~Hvv~~~~~~~i~ 130 (375)
.+.++++++.||||.|.+... |.+. ..+.||||||++ +||||||+|||+++ ..+.
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a--~~i~ 118 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQA--QKIS 118 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCC--CEEE
Confidence 489999999999999987531 1110 146899999985 79999999999999 8999
Q ss_pred EEcCCCCEEEEEEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCC
Q psy18066 131 VTLPDGSKHKGAVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL 210 (375)
Q Consensus 131 V~~~~g~~~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~ 210 (375)
|++.||++++|++++.|+.+||||||++...++++++|+++..+++||+|+++|||+++..+++.|+||+..+.... .
T Consensus 119 V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~--~ 196 (455)
T PRK10139 119 IQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLN--L 196 (455)
T ss_pred EEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccC--C
Confidence 99999999999999999999999999987667999999999999999999999999999999999999998875321 1
Q ss_pred cccccEEEEeecCCCCCccceeeccCCeEEEEEeeecC-----CCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeE
Q psy18066 211 NKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGIT 285 (375)
Q Consensus 211 ~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~-----~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~ 285 (375)
....++||+|+++++|||||||+|.+|+||||+++... .|+|||||++.++++++++++++ ++.|+|||+.
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g----~v~r~~LGv~ 272 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG----EIKRGLLGIK 272 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC----cccccceeEE
Confidence 23467999999999999999999999999999998653 57999999999999999999988 4789999999
Q ss_pred EeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEE
Q psy18066 286 MLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFS 362 (375)
Q Consensus 286 ~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~ 362 (375)
+++++++.++.++ + +...|++|.+|.++|||+++||++||+|++|||++|++++++...+.. ++++.++|+
T Consensus 273 ~~~l~~~~~~~lg----l--~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V~ 346 (455)
T PRK10139 273 GTEMSADIAKAFN----L--DVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLL 346 (455)
T ss_pred EEECCHHHHHhcC----C--CCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEE
Confidence 9999999887764 2 345799999999999999999999999999999999999999988864 788999999
Q ss_pred ECCeEEEEEEEe
Q psy18066 363 HFKHSFLVESEL 374 (375)
Q Consensus 363 R~g~~~~v~~~l 374 (375)
|+|+.+++++++
T Consensus 347 R~G~~~~l~v~~ 358 (455)
T PRK10139 347 RNGKPLEVEVTL 358 (455)
T ss_pred ECCEEEEEEEEE
Confidence 999998888765
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP, which has two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=6.4e-54 Score=415.44 Aligned_cols=287 Identities=31% Similarity=0.432 Sum_probs=253.9
Q ss_pred chHHHHHHHhCCceEEEEEeeecCC-----CcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEec
Q psy18066 73 NFVADVLENVEKSVVNIELVIPYYR-----QTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALD 147 (375)
Q Consensus 73 ~~~~~v~e~~~~svV~I~~~~~~~~-----~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d 147 (375)
.++.++++++.||||.|++...... ...+.||||+|+++||||||+||+.++ +.+.|.+.||+.++|++++.|
T Consensus 45 ~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~--~~i~V~~~dg~~~~a~vv~~d 122 (351)
T TIGR02038 45 ISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKA--DQIVVALQDGRKFEAELVGSD 122 (351)
T ss_pred hhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCC--CEEEEEECCCCEEEEEEEEec
Confidence 3689999999999999988642111 125789999999999999999999999 899999999999999999999
Q ss_pred CCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCC
Q psy18066 148 VECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGN 227 (375)
Q Consensus 148 ~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~ 227 (375)
+.+||||||++... +++++++++..+++||+|+++|+|+++..+++.|+|++..+.... ......++|+|+.+++||
T Consensus 123 ~~~DlAvlkv~~~~-~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~--~~~~~~~iqtda~i~~Gn 199 (351)
T TIGR02038 123 PLTDLAVLKIEGDN-LPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS--SVGRQNFIQTDAAINAGN 199 (351)
T ss_pred CCCCEEEEEecCCC-CceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC--CCCcceEEEECCccCCCC
Confidence 99999999999865 899999988899999999999999999999999999998875321 123457899999999999
Q ss_pred ccceeeccCCeEEEEEeeecC-------CCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeeccHHHHHHHhhc
Q psy18066 228 SGGPLVNLDGEVIGINSMKVT-------AGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLNEKLIEQLRRD 300 (375)
Q Consensus 228 SGGPlvn~~G~VIGI~s~~~~-------~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~~~~~~~~~~~ 300 (375)
|||||+|.+|+||||+++... .+++|+||++.++++++++++++ ...|+|||+.+++++++.++.++
T Consensus 200 SGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----~~~r~~lGv~~~~~~~~~~~~lg-- 273 (351)
T TIGR02038 200 SGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----RVIRGYIGVSGEDINSVVAQGLG-- 273 (351)
T ss_pred CcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----cccceEeeeEEEECCHHHHHhcC--
Confidence 999999999999999987532 57899999999999999999987 37899999999999988877653
Q ss_pred cCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEe
Q psy18066 301 RHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 301 ~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l 374 (375)
+| ...|++|.+|.++|||+++||++||+|++|||++|.+++|+.+.+.. ++++.++|+|+|+.+++++++
T Consensus 274 --l~--~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 346 (351)
T TIGR02038 274 --LP--DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLELPVTI 346 (351)
T ss_pred --CC--ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 33 34799999999999999999999999999999999999999988864 788999999999999888876
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=1.9e-53 Score=411.89 Aligned_cols=286 Identities=30% Similarity=0.435 Sum_probs=251.0
Q ss_pred hHHHHHHHhCCceEEEEEeeec--CCC---cCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecC
Q psy18066 74 FVADVLENVEKSVVNIELVIPY--YRQ---TMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDV 148 (375)
Q Consensus 74 ~~~~v~e~~~~svV~I~~~~~~--~~~---~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~ 148 (375)
++.++++++.||||.|.+.... ..+ ..+.||||+|+++||||||+||++++ +++.|++.||+.++|++++.|+
T Consensus 46 ~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a--~~i~V~~~dg~~~~a~vv~~d~ 123 (353)
T PRK10898 46 SYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDA--DQIIVALQDGRVFEALLVGSDS 123 (353)
T ss_pred hHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCC--CEEEEEeCCCCEEEEEEEEEcC
Confidence 5899999999999999986521 111 14789999999999999999999999 8999999999999999999999
Q ss_pred CCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCc
Q psy18066 149 ECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNS 228 (375)
Q Consensus 149 ~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~S 228 (375)
.+||||||++... +++++++++..+++||+|+++|||++...+++.|+|++..+..... .....++|+|+++++|||
T Consensus 124 ~~DlAvl~v~~~~-l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~--~~~~~~iqtda~i~~GnS 200 (353)
T PRK10898 124 LTDLAVLKINATN-LPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP--TGRQNFLQTDASINHGNS 200 (353)
T ss_pred CCCEEEEEEcCCC-CCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC--ccccceEEeccccCCCCC
Confidence 9999999998764 8999999888899999999999999998999999999887753211 233578999999999999
Q ss_pred cceeeccCCeEEEEEeeecC--------CCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeeccHHHHHHHhhc
Q psy18066 229 GGPLVNLDGEVIGINSMKVT--------AGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLNEKLIEQLRRD 300 (375)
Q Consensus 229 GGPlvn~~G~VIGI~s~~~~--------~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~~~~~~~~~~~ 300 (375)
||||+|.+|+||||+++... .+++||||++.++++++++++++ ++.|+|||+.++++++..+..+.
T Consensus 201 GGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----~~~~~~lGi~~~~~~~~~~~~~~-- 274 (353)
T PRK10898 201 GGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----RVIRGYIGIGGREIAPLHAQGGG-- 274 (353)
T ss_pred cceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----cccccccceEEEECCHHHHHhcC--
Confidence 99999999999999997542 47899999999999999999987 47899999999999877654432
Q ss_pred cCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEe
Q psy18066 301 RHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 301 ~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l 374 (375)
+ +...|++|.+|.++|||+++||++||+|++|||++|.++.++.+.+.. ++++.++|+|+++.+++.+++
T Consensus 275 --~--~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 347 (353)
T PRK10898 275 --I--DQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVTI 347 (353)
T ss_pred --C--CCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 2 234899999999999999999999999999999999999999888864 688999999999998888775
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.4e-52 Score=419.39 Aligned_cols=287 Identities=35% Similarity=0.517 Sum_probs=254.1
Q ss_pred hHHHHHHHhCCceEEEEEeee--------------cCC----------------------------C---cCceEEEEEE
Q psy18066 74 FVADVLENVEKSVVNIELVIP--------------YYR----------------------------Q---TMSNGSGFIA 108 (375)
Q Consensus 74 ~~~~v~e~~~~svV~I~~~~~--------------~~~----------------------------~---~~~~GSGfiI 108 (375)
+++++++++.||||.|.+... |.+ . ..+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 499999999999999986440 000 0 1357999999
Q ss_pred eC-CCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCC
Q psy18066 109 TD-DGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPL 187 (375)
Q Consensus 109 ~~-~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~ 187 (375)
+. +||||||+||+.++ +++.|++.|+++|+|++++.|+.+||||||++...++++++|+++..+++|++|+++|+|+
T Consensus 119 ~~~~G~IlTn~HVv~~a--~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~s~~l~~G~~V~aiG~P~ 196 (473)
T PRK10942 119 DADKGYVVTNNHVVDNA--TKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMADSDALRVGDYTVAIGNPY 196 (473)
T ss_pred ECCCCEEEeChhhcCCC--CEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecCccccCCCCEEEEEcCCC
Confidence 96 59999999999999 8999999999999999999999999999999876679999999999999999999999999
Q ss_pred CCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeeccCCeEEEEEeeecC-----CCeEEEEehHHHHH
Q psy18066 188 TLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIE 262 (375)
Q Consensus 188 g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~-----~g~g~aip~~~i~~ 262 (375)
++..+++.|+|++..+.... ...+.++|++|+++++|||||||+|.+|+||||+++... .+++||||++.+++
T Consensus 197 g~~~tvt~GiVs~~~r~~~~--~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~ 274 (473)
T PRK10942 197 GLGETVTSGIVSALGRSGLN--VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN 274 (473)
T ss_pred CCCcceeEEEEEEeecccCC--cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence 99999999999998875221 134567899999999999999999999999999998653 46999999999999
Q ss_pred HHHHHHhcCCCcceeeeeeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 263 FLTNYKRKDKDRTITHKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 263 ~l~~l~~~~~~~~~~~~~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
+++++++++ .+.|+|||+.++++++++++.++ + +...|++|.+|.++|||+++||++||+|++|||++|+
T Consensus 275 v~~~l~~~g----~v~rg~lGv~~~~l~~~~a~~~~----l--~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~ 344 (473)
T PRK10942 275 LTSQMVEYG----QVKRGELGIMGTELNSELAKAMK----V--DAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPIS 344 (473)
T ss_pred HHHHHHhcc----ccccceeeeEeeecCHHHHHhcC----C--CCCCceEEEEECCCChHHHcCCCCCCEEEEECCEECC
Confidence 999999987 47899999999999999887754 2 3457999999999999999999999999999999999
Q ss_pred CHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEe
Q psy18066 343 SAKDIYAALEV---VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 343 ~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l 374 (375)
++++|...+.. ++++.++|+|+|+.+++.+++
T Consensus 345 s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l 379 (473)
T PRK10942 345 SFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVEL 379 (473)
T ss_pred CHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEe
Confidence 99999988865 788999999999998888765
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.2e-51 Score=410.74 Aligned_cols=286 Identities=37% Similarity=0.576 Sum_probs=254.3
Q ss_pred HHHHHHHhCCceEEEEEee------e----------cCC-------------CcCceEEEEEEeCCCEEEecccccCCCC
Q psy18066 75 VADVLENVEKSVVNIELVI------P----------YYR-------------QTMSNGSGFIATDDGLIITNAHVVSGKP 125 (375)
Q Consensus 75 ~~~v~e~~~~svV~I~~~~------~----------~~~-------------~~~~~GSGfiI~~~G~IlT~~Hvv~~~~ 125 (375)
+.++++++.||||.|.+.. + |.+ ...+.||||+|+++||||||+||+.++
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~- 81 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA- 81 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC-
Confidence 7899999999999998753 0 111 015789999999999999999999999
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccc
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS 205 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~ 205 (375)
.++.|++.|++.++|++++.|+.+||||||++....+++++|+++..+++|++|+++|||++...+++.|+|++..+..
T Consensus 82 -~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~ 160 (428)
T TIGR02037 82 -DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSG 160 (428)
T ss_pred -CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCc
Confidence 8999999999999999999999999999999987669999999989999999999999999999999999999988753
Q ss_pred cccCCcccccEEEEeecCCCCCccceeeccCCeEEEEEeeecC-----CCeEEEEehHHHHHHHHHHHhcCCCcceeeee
Q psy18066 206 ETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDKDRTITHKK 280 (375)
Q Consensus 206 ~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~-----~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~ 280 (375)
. ....+..++++|+++++|||||||+|.+|+||||+++... .+++||||++.++++++++++++ .+.|+
T Consensus 161 ~--~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g----~~~~~ 234 (428)
T TIGR02037 161 L--GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG----KVQRG 234 (428)
T ss_pred c--CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC----cCcCC
Confidence 1 1133456899999999999999999999999999988643 57899999999999999999987 47899
Q ss_pred eeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEE
Q psy18066 281 YIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLV 357 (375)
Q Consensus 281 ~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v 357 (375)
|||+.+++++++.++.++ + ....|++|.+|.++|||+++||++||+|++|||++|+++.++...+.. ++++
T Consensus 235 ~lGi~~~~~~~~~~~~lg----l--~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~v 308 (428)
T TIGR02037 235 WLGVTIQEVTSDLAKSLG----L--EKQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIGTLKPGKKV 308 (428)
T ss_pred cCceEeecCCHHHHHHcC----C--CCCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEE
Confidence 999999999999888764 3 345799999999999999999999999999999999999999988865 7889
Q ss_pred EEEEEECCeEEEEEEEe
Q psy18066 358 NFQFSHFKHSFLVESEL 374 (375)
Q Consensus 358 ~l~v~R~g~~~~v~~~l 374 (375)
+++|+|+|+.+++++++
T Consensus 309 ~l~v~R~g~~~~~~v~l 325 (428)
T TIGR02037 309 TLGILRKGKEKTITVTL 325 (428)
T ss_pred EEEEEECCEEEEEEEEE
Confidence 99999999998888765
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1e-41 Score=330.99 Aligned_cols=287 Identities=35% Similarity=0.522 Sum_probs=253.7
Q ss_pred chHHHHHHHhCCceEEEEEeeecCC-------Cc----CceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEE
Q psy18066 73 NFVADVLENVEKSVVNIELVIPYYR-------QT----MSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKG 141 (375)
Q Consensus 73 ~~~~~v~e~~~~svV~I~~~~~~~~-------~~----~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a 141 (375)
..+.++++++.|+||.+.....-.. .. .+.||||+++++|||+||.||+.++ .++.+.+.||+++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a--~~i~v~l~dg~~~~a 110 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGA--EEITVTLADGREVPA 110 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCc--ceEEEEeCCCCEEEE
Confidence 4689999999999999988641110 00 4899999999999999999999998 899999999999999
Q ss_pred EEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEee
Q psy18066 142 AVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDA 221 (375)
Q Consensus 142 ~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~ 221 (375)
++++.|+..|+|+||++....++.+.++++..++.|++++++|+|+++..+++.|+++...+. .........++||+|+
T Consensus 111 ~~vg~d~~~dlavlki~~~~~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~v~~~~~~~~~IqtdA 189 (347)
T COG0265 111 KLVGKDPISDLAVLKIDGAGGLPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-GVGSAGGYVNFIQTDA 189 (347)
T ss_pred EEEecCCccCEEEEEeccCCCCceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-cccCcccccchhhccc
Confidence 999999999999999998655889999999999999999999999999999999999999986 2111123678999999
Q ss_pred cCCCCCccceeeccCCeEEEEEeeecC-----CCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeeccHHHHHH
Q psy18066 222 AITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLNEKLIEQ 296 (375)
Q Consensus 222 ~i~~G~SGGPlvn~~G~VIGI~s~~~~-----~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~~~~~~~ 296 (375)
++++||||||++|.+|++|||++.... .|++|+||++.++.++.++...+ ...|+|+|+.+.++++...
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G----~v~~~~lgv~~~~~~~~~~-- 263 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG----KVVRGYLGVIGEPLTADIA-- 263 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC----CccccccceEEEEcccccc--
Confidence 999999999999999999999998875 24899999999999999999866 4889999999998887654
Q ss_pred HhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEE
Q psy18066 297 LRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 297 ~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~ 373 (375)
+ ++ ....|++|.+|.+++||+++|++.||+|+++||+++.+..++...+.. ++++.+++.|+|+.+.+.++
T Consensus 264 ~----g~--~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~ 337 (347)
T COG0265 264 L----GL--PVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVT 337 (347)
T ss_pred c----CC--CCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Confidence 2 23 356789999999999999999999999999999999999999998876 67999999999999988887
Q ss_pred e
Q psy18066 374 L 374 (375)
Q Consensus 374 l 374 (375)
+
T Consensus 338 l 338 (347)
T COG0265 338 L 338 (347)
T ss_pred e
Confidence 6
No 7
>KOG1320|consensus
Probab=100.00 E-value=8.8e-32 Score=262.04 Aligned_cols=308 Identities=32% Similarity=0.413 Sum_probs=253.5
Q ss_pred CCcccccchHHHHHHHhCCceEEEEEeeecCCC--------cCceEEEEEEeCCCEEEecccccCCCCC---------ce
Q psy18066 66 PSLRSQFNFVADVLENVEKSVVNIELVIPYYRQ--------TMSNGSGFIATDDGLIITNAHVVSGKPG---------AQ 128 (375)
Q Consensus 66 ~~~~~~~~~~~~v~e~~~~svV~I~~~~~~~~~--------~~~~GSGfiI~~~G~IlT~~Hvv~~~~~---------~~ 128 (375)
.+++...+.+.++.++..+|+|.|+....|.+. +...|||||++.+|+++||+||+..... ..
T Consensus 121 gs~~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~ 200 (473)
T KOG1320|consen 121 GSPRKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLR 200 (473)
T ss_pred CCchhhhhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceee
Confidence 345555677899999999999999987644332 2778999999999999999999886521 13
Q ss_pred EEEEcCCC--CEEEEEEEEecCCCCeEEEEecCCC-CCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccc
Q psy18066 129 IIVTLPDG--SKHKGAVEALDVECDLAIIRCNFPN-NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSS 205 (375)
Q Consensus 129 i~V~~~~g--~~~~a~vv~~d~~~DlAlLki~~~~-~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~ 205 (375)
+.+...+| ..+++.+.+.|+..|+|+++++.+. .+++++++-+..+..|+++.++|.|+++.++.+.|+++...|..
T Consensus 201 vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~ 280 (473)
T KOG1320|consen 201 VQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKS 280 (473)
T ss_pred EEEEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccc
Confidence 77777766 8999999999999999999997653 48899999999999999999999999999999999999999887
Q ss_pred cccCCc---ccccEEEEeecCCCCCccceeeccCCeEEEEEeeecC-----CCeEEEEehHHHHHHHHHHHhcCC-C---
Q psy18066 206 ETLGLN---KTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT-----AGISFAIPIDYAIEFLTNYKRKDK-D--- 273 (375)
Q Consensus 206 ~~~~~~---~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~-----~g~g~aip~~~i~~~l~~l~~~~~-~--- 273 (375)
...+.. ...+++|+|+++++|+||||++|.+|++||+++.... .+++|++|.+.++.++.+..+.+. .
T Consensus 281 ~~lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~ 360 (473)
T KOG1320|consen 281 FKLGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPV 360 (473)
T ss_pred cccCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccc
Confidence 655443 4567899999999999999999999999999998765 789999999999999988865542 1
Q ss_pred -cceeeeeeeeeEEeeccHHHHHHHh-hccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHH
Q psy18066 274 -RTITHKKYIGITMLTLNEKLIEQLR-RDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAAL 351 (375)
Q Consensus 274 -~~~~~~~~lGi~~~~~~~~~~~~~~-~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l 351 (375)
.....+.|+|+....+...+..+.. +.+-+|.....++++.+|.+++++..+++++||+|++|||++|++..+|.+.+
T Consensus 361 ~~~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i 440 (473)
T KOG1320|consen 361 KPLVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELI 440 (473)
T ss_pred cCcccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHH
Confidence 1223356898888777665544332 33334434456899999999999999999999999999999999999999999
Q ss_pred hc---CCEEEEEEEECCeEEEEEEE
Q psy18066 352 EV---VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 352 ~~---~~~v~l~v~R~g~~~~v~~~ 373 (375)
+. .+++.+..+|..+..++.+.
T Consensus 441 ~~~~~~~~v~vl~~~~~e~~tl~Il 465 (473)
T KOG1320|consen 441 EECSTEDKVAVLDRRSAEDATLEIL 465 (473)
T ss_pred HhcCcCceEEEEEecCccceeEEec
Confidence 87 45888888898888777654
No 8
>KOG1421|consensus
Probab=99.94 E-value=1.3e-25 Score=221.10 Aligned_cols=291 Identities=18% Similarity=0.254 Sum_probs=239.3
Q ss_pred hHHHHHHHhCCceEEEEEee--ecCCC-c-CceEEEEEEeCC-CEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecC
Q psy18066 74 FVADVLENVEKSVVNIELVI--PYYRQ-T-MSNGSGFIATDD-GLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDV 148 (375)
Q Consensus 74 ~~~~v~e~~~~svV~I~~~~--~~~~~-~-~~~GSGfiI~~~-G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~ 148 (375)
.|+..+..+-+|||.|.... .|... . .+.+|||++++. ||||||+||+...| -.-.+.|.+..+.+.-.++.|+
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP-~va~avf~n~ee~ei~pvyrDp 131 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGP-FVASAVFDNHEEIEIYPVYRDP 131 (955)
T ss_pred hhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCC-ceeEEEecccccCCcccccCCc
Confidence 68999999999999999876 22222 1 678999999876 89999999998765 5667778888888888999999
Q ss_pred CCCeEEEEecCCC----CCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCC----cccccEEEEe
Q psy18066 149 ECDLAIIRCNFPN----NYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGL----NKTINYIQTD 220 (375)
Q Consensus 149 ~~DlAlLki~~~~----~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~----~~~~~~i~~d 220 (375)
.+|+.+++.++.. .+..+++. .+..++|.+++.+|+..+..-++..|.++.+++..+..+. +....++|..
T Consensus 132 VhdfGf~r~dps~ir~s~vt~i~la-p~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~Qaa 210 (955)
T KOG1421|consen 132 VHDFGFFRYDPSTIRFSIVTEICLA-PELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYIQAA 210 (955)
T ss_pred hhhcceeecChhhcceeeeeccccC-ccccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceeeeeh
Confidence 9999999998763 34455553 3556889999999998888788889999999998876654 3445678888
Q ss_pred ecCCCCCccceeeccCCeEEEEEeeecC-CCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeec----------
Q psy18066 221 AAITFGNSGGPLVNLDGEVIGINSMKVT-AGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTL---------- 289 (375)
Q Consensus 221 ~~i~~G~SGGPlvn~~G~VIGI~s~~~~-~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~---------- 289 (375)
.....|.||+|++|.+|..|.++..+.. .+.+|++|++.+++-|.-+++.. .+.|+.|-+++..-
T Consensus 211 sstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~----PItRGtLqvefl~k~~de~rrlGL 286 (955)
T KOG1421|consen 211 SSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNT----PITRGTLQVEFLHKLFDECRRLGL 286 (955)
T ss_pred hcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCC----CcccceEEEEEehhhhHHHHhcCC
Confidence 8999999999999999999999987653 67899999999999999998765 47888888887553
Q ss_pred cHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECCeE
Q psy18066 290 NEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFKHS 367 (375)
Q Consensus 290 ~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g~~ 367 (375)
+.|+.+.. ...+| ...+-++|..|.++|||++. |++||++++||+.-++++.++.+.|.. |+.++|+|+|+|++
T Consensus 287 ~sE~eqv~--r~k~P-~~tgmLvV~~vL~~gpa~k~-Le~GDillavN~t~l~df~~l~~iLDegvgk~l~LtI~Rggqe 362 (955)
T KOG1421|consen 287 SSEWEQVV--RTKFP-ERTGMLVVETVLPEGPAEKK-LEPGDILLAVNSTCLNDFEALEQILDEGVGKNLELTIQRGGQE 362 (955)
T ss_pred cHHHHHHH--HhcCc-ccceeEEEEEeccCCchhhc-cCCCcEEEEEcceehHHHHHHHHHHhhccCceEEEEEEeCCEE
Confidence 34443333 23466 55566778999999999987 999999999999999999999999987 78999999999999
Q ss_pred EEEEEEe
Q psy18066 368 FLVESEL 374 (375)
Q Consensus 368 ~~v~~~l 374 (375)
.+++++.
T Consensus 363 lel~vtv 369 (955)
T KOG1421|consen 363 LELTVTV 369 (955)
T ss_pred EEEEEEe
Confidence 8887764
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.73 E-value=3.6e-17 Score=133.65 Aligned_cols=111 Identities=35% Similarity=0.583 Sum_probs=76.1
Q ss_pred EEEEEEeCCCEEEecccccCCC------CCceEEEEcCCCCEEE--EEEEEecCC-CCeEEEEecCCCCCCeeecCCCCC
Q psy18066 103 GSGFIATDDGLIITNAHVVSGK------PGAQIIVTLPDGSKHK--GAVEALDVE-CDLAIIRCNFPNNYPALKLGKAAD 173 (375)
Q Consensus 103 GSGfiI~~~G~IlT~~Hvv~~~------~~~~i~V~~~~g~~~~--a~vv~~d~~-~DlAlLki~~~~~~~~~~l~~s~~ 173 (375)
||||+|+++|+||||+||+.+. ...++.+.+.++..++ +++++.|+. .|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~--------------- 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVD--------------- 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEES---------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEe---------------
Confidence 8999999999999999999954 2267898999998888 999999999 999999999
Q ss_pred CCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeeccCCeEEEE
Q psy18066 174 IRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGI 242 (375)
Q Consensus 174 ~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI 242 (375)
.....+.. ................ ......+ +++.+.+|+|||||||.+|+||||
T Consensus 66 -----~~~~~~~~-----~~~~~~~~~~~~~~~~---~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 66 -----PWTGVGGG-----VRVPGSTSGVSPTSTN---DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -----CEEEEEEE-----EEEEEEEEEEEEEEEE---ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -----cccceeee-----eEeeeeccccccccCc---ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00000000 0000000000000000 0111114 899999999999999999999997
No 10
>KOG1421|consensus
Probab=99.62 E-value=8.1e-14 Score=138.48 Aligned_cols=277 Identities=17% Similarity=0.169 Sum_probs=200.4
Q ss_pred HHHhCCceEEEEEeeecCCCc----CceEEEEEEeCC-CEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeE
Q psy18066 79 LENVEKSVVNIELVIPYYRQT----MSNGSGFIATDD-GLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLA 153 (375)
Q Consensus 79 ~e~~~~svV~I~~~~~~~~~~----~~~GSGfiI~~~-G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlA 153 (375)
.+++..+.|.+++..|..-++ ...|||.|++.+ |++++++.++...- .+.+|++.|.-.++|.+...|+..++|
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~-~d~~vt~~dS~~i~a~~~fL~~t~n~a 602 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDA-KDQRVTEADSDGIPANVSFLHPTENVA 602 (955)
T ss_pred hhHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCchh-hceEEeecccccccceeeEecCcccee
Confidence 567778888888776543222 567999999866 89999999997432 788999999889999999999999999
Q ss_pred EEEecCCCCCCeeecCCCCCCCCCCEEEEEecCCCCC-----CCeeeeEEeeeeccccccCC--cccccEEEEeecCCCC
Q psy18066 154 IIRCNFPNNYPALKLGKAADIRNGEFVIAMGSPLTLN-----NTNTFGIISNKQRSSETLGL--NKTINYIQTDAAITFG 226 (375)
Q Consensus 154 lLki~~~~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~-----~~~~~G~vs~~~~~~~~~~~--~~~~~~i~~d~~i~~G 226 (375)
.+|.++.. ...++|. ...+..||++.+.|+-.... .+++. ++........... ......|..++.+..+
T Consensus 603 ~~kydp~~-~~~~kl~-~~~v~~gD~~~f~g~~~~~r~ltaktsv~d--vs~~~~ps~~~pr~r~~n~e~Is~~~nlsT~ 678 (955)
T KOG1421|consen 603 SFKYDPAL-EVQLKLT-DTTVLRGDECTFEGFTEDLRALTAKTSVTD--VSVVIIPSSVMPRFRATNLEVISFMDNLSTS 678 (955)
T ss_pred EeccChhH-hhhhccc-eeeEecCCceeEecccccchhhcccceeee--eEEEEecCCCCcceeecceEEEEEecccccc
Confidence 99998865 3556663 45678899999999875542 12222 1111111111111 3455678888887777
Q ss_pred CccceeeccCCeEEEEEeeecCC-------CeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeecc---------
Q psy18066 227 NSGGPLVNLDGEVIGINSMKVTA-------GISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLN--------- 290 (375)
Q Consensus 227 ~SGGPlvn~~G~VIGI~s~~~~~-------g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~--------- 290 (375)
+--|-+.|.+|+|+|++-....+ -.-+.+.+..+++.|+.|+.+.+ .+...+|+.+..++
T Consensus 679 c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~----~rp~i~~vef~~i~laqar~lgl 754 (955)
T KOG1421|consen 679 CLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPS----ARPTIAGVEFSHITLAQARTLGL 754 (955)
T ss_pred ccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCC----CCceeeccceeeEEeehhhccCC
Confidence 77889999999999998665442 24577889999999999998874 33344566655543
Q ss_pred -HHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEEEEEECCeEEE
Q psy18066 291 -EKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNFQFSHFKHSFL 369 (375)
Q Consensus 291 -~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l~v~R~g~~~~ 369 (375)
.|+..+.....+. ..+-.+|+.|.+.-+ +. |..||||+++|||.|+.+.||.+.. .++..|.|+|..++
T Consensus 755 p~e~imk~e~es~~---~~ql~~ishv~~~~~--ki-l~~gdiilsvngk~itr~~dl~d~~----eid~~ilrdg~~~~ 824 (955)
T KOG1421|consen 755 PSEFIMKSEEESTI---PRQLYVISHVRPLLH--KI-LGVGDIILSVNGKMITRLSDLHDFE----EIDAVILRDGIEME 824 (955)
T ss_pred CHHHHhhhhhcCCC---cceEEEEEeeccCcc--cc-cccccEEEEecCeEEeeehhhhhhh----hhheeeeecCcEEE
Confidence 4444444322222 235677888887663 33 9999999999999999999999744 48899999999988
Q ss_pred EEEEe
Q psy18066 370 VESEL 374 (375)
Q Consensus 370 v~~~l 374 (375)
+.+++
T Consensus 825 ikipt 829 (955)
T KOG1421|consen 825 IKIPT 829 (955)
T ss_pred EEecc
Confidence 87754
No 11
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.61 E-value=7.9e-15 Score=112.40 Aligned_cols=79 Identities=33% Similarity=0.448 Sum_probs=68.6
Q ss_pred eeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCE
Q psy18066 280 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRL 356 (375)
Q Consensus 280 ~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~ 356 (375)
||||+.+...++ ..|++|.+|.++|||+++||++||+|++|||++|+++.++...+.. +++
T Consensus 1 ~~lGv~~~~~~~----------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD----------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC----------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC----------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 689999875432 3699999999999999999999999999999999999999999865 789
Q ss_pred EEEEEEECCeEEEEEEEe
Q psy18066 357 VNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 357 v~l~v~R~g~~~~v~~~l 374 (375)
++|+|+|+++.++++++|
T Consensus 65 v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 65 VTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp EEEEEEETTEEEEEEEE-
T ss_pred EEEEEEECCEEEEEEEEC
Confidence 999999999999999876
No 12
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.43 E-value=1.8e-12 Score=100.67 Aligned_cols=86 Identities=33% Similarity=0.419 Sum_probs=73.6
Q ss_pred eeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCE
Q psy18066 280 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRL 356 (375)
Q Consensus 280 ~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~ 356 (375)
+|+|+.+++++++....+. . ....|++|.+|.++|||+++||++||+|++|||+++.++.++...+.. ++.
T Consensus 1 ~~~G~~~~~~~~~~~~~~~----~--~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~ 74 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELG----L--KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDK 74 (90)
T ss_pred CccceEEeECCHHHHHHcC----C--CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCE
Confidence 5899999999987655421 1 345799999999999999999999999999999999999999888865 678
Q ss_pred EEEEEEECCeEEEEE
Q psy18066 357 VNFQFSHFKHSFLVE 371 (375)
Q Consensus 357 v~l~v~R~g~~~~v~ 371 (375)
+.+++.|+|+..++.
T Consensus 75 i~l~v~r~g~~~~~~ 89 (90)
T cd00987 75 VTLTVLRGGKELTVT 89 (90)
T ss_pred EEEEEEECCEEEEee
Confidence 999999999876654
No 13
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.41 E-value=1.3e-11 Score=110.94 Aligned_cols=174 Identities=24% Similarity=0.263 Sum_probs=115.5
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcC-------CC--CEEEEEEEEec----C-
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLP-------DG--SKHKGAVEALD----V- 148 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~-------~g--~~~~a~vv~~d----~- 148 (375)
.|.+|.|.... . ...++|++|+++ +|||++||+.+. .++.+.+. ++ ..+..+-+..+ +
T Consensus 12 ~p~~v~i~~~~----~-~~~C~G~li~~~-~vLTaahC~~~~--~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~ 83 (220)
T PF00089_consen 12 FPWVVSIRYSN----G-RFFCTGTLISPR-WVLTAAHCVDGA--SDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPS 83 (220)
T ss_dssp STTEEEEEETT----T-EEEEEEEEEETT-EEEEEGGGHTSG--GSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTT
T ss_pred CCeEEEEeeCC----C-CeeEeEEecccc-cccccccccccc--cccccccccccccccccccccccccccccccccccc
Confidence 36777777642 1 578999999988 999999999984 55666433 22 23444433332 2
Q ss_pred --CCCeEEEEecCC----CCCCeeecCCC-CCCCCCCEEEEEecCCCCCC----CeeeeEEeeeeccccc--cCCccccc
Q psy18066 149 --ECDLAIIRCNFP----NNYPALKLGKA-ADIRNGEFVIAMGSPLTLNN----TNTFGIISNKQRSSET--LGLNKTIN 215 (375)
Q Consensus 149 --~~DlAlLki~~~----~~~~~~~l~~s-~~~~~G~~v~~iG~p~g~~~----~~~~G~vs~~~~~~~~--~~~~~~~~ 215 (375)
..|+|||+++.+ ..+.++.+... ..+..|+.+.++||+..... ......+.-.....-. ........
T Consensus 84 ~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~ 163 (220)
T PF00089_consen 84 TYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPN 163 (220)
T ss_dssp TTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTT
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 579999999987 35667777652 34588999999999976432 2333333332221100 10012334
Q ss_pred EEEEee----cCCCCCccceeeccCCeEEEEEeeecC---C-CeEEEEehHHHHHHH
Q psy18066 216 YIQTDA----AITFGNSGGPLVNLDGEVIGINSMKVT---A-GISFAIPIDYAIEFL 264 (375)
Q Consensus 216 ~i~~d~----~i~~G~SGGPlvn~~G~VIGI~s~~~~---~-g~g~aip~~~i~~~l 264 (375)
.+.... ..+.|+|||||++.++.|+||++.... . ..+++.+++..++|+
T Consensus 164 ~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 164 MICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 566655 788999999999888889999998743 2 248889888777764
No 14
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.38 E-value=4.6e-12 Score=96.33 Aligned_cols=67 Identities=19% Similarity=0.066 Sum_probs=59.7
Q ss_pred CCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEEEEEECCeEEEEEEEe
Q psy18066 308 THGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 308 ~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l~v~R~g~~~~v~~~l 374 (375)
..|++|.+|.++|||+++||++||+|++|||++++++.++...+..++.+.+++.|+++..++.+++
T Consensus 11 ~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~~ 77 (80)
T cd00990 11 EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEYQAGDPVELTVFRDDRLIEVPLTL 77 (80)
T ss_pred CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 3579999999999999999999999999999999998877666655788999999999988887765
No 15
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.33 E-value=6.3e-11 Score=107.24 Aligned_cols=178 Identities=18% Similarity=0.169 Sum_probs=107.2
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCC---------CCEEEEEEEEec------
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPD---------GSKHKGAVEALD------ 147 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~---------g~~~~a~vv~~d------ 147 (375)
.|-+|.|.... ....++|.+|+++ +|||+|||+.+.+...+.|.+.. ...+..+-+..+
T Consensus 12 ~Pw~v~i~~~~-----~~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~ 85 (232)
T cd00190 12 FPWQVSLQYTG-----GRHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPS 85 (232)
T ss_pred CCCEEEEEccC-----CcEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCC
Confidence 46677776541 2678999999987 99999999987432456665542 122334444444
Q ss_pred -CCCCeEEEEecCCC----CCCeeecCCCC-CCCCCCEEEEEecCCCCCC-----CeeeeEEeeeeccc--cccC--Ccc
Q psy18066 148 -VECDLAIIRCNFPN----NYPALKLGKAA-DIRNGEFVIAMGSPLTLNN-----TNTFGIISNKQRSS--ETLG--LNK 212 (375)
Q Consensus 148 -~~~DlAlLki~~~~----~~~~~~l~~s~-~~~~G~~v~~iG~p~g~~~-----~~~~G~vs~~~~~~--~~~~--~~~ 212 (375)
...|||||+++.+- .+.|+.|.... .+..|+.+.+.||...... ......+.-..... .... ...
T Consensus 86 ~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~ 165 (232)
T cd00190 86 TYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTI 165 (232)
T ss_pred CCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccC
Confidence 35899999998652 36788886543 6778999999999765332 11111111111100 0000 000
Q ss_pred cccEEEE-----eecCCCCCccceeeccC---CeEEEEEeeecC----CCeEEEEehHHHHHHHHH
Q psy18066 213 TINYIQT-----DAAITFGNSGGPLVNLD---GEVIGINSMKVT----AGISFAIPIDYAIEFLTN 266 (375)
Q Consensus 213 ~~~~i~~-----d~~i~~G~SGGPlvn~~---G~VIGI~s~~~~----~g~g~aip~~~i~~~l~~ 266 (375)
....+-. +...|+|+|||||+... ..++||.+.... ...+.+..+...++|+++
T Consensus 166 ~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 166 TDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred CCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhhc
Confidence 0111111 34578899999999765 789999998653 223455666666666653
No 16
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.33 E-value=1.9e-11 Score=92.92 Aligned_cols=68 Identities=26% Similarity=0.316 Sum_probs=61.6
Q ss_pred CCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEE
Q psy18066 306 DLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 306 ~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~ 373 (375)
....|++|.+|.++|||+++||++||+|++|||+++.+++++...+.. ++.+.+++.|+++..+++++
T Consensus 7 ~~~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 7 EAVAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred ccCCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 456799999999999999999999999999999999999999998875 67899999999988877654
No 17
>KOG1320|consensus
Probab=99.32 E-value=4.7e-12 Score=124.36 Aligned_cols=250 Identities=23% Similarity=0.244 Sum_probs=174.3
Q ss_pred HHHHhCCceEEEEEee-------ecCCC--cCceEEEEEEeCCCEEEecccccCCCCC-ceEEEE-cCCCCEEEEEEEEe
Q psy18066 78 VLENVEKSVVNIELVI-------PYYRQ--TMSNGSGFIATDDGLIITNAHVVSGKPG-AQIIVT-LPDGSKHKGAVEAL 146 (375)
Q Consensus 78 v~e~~~~svV~I~~~~-------~~~~~--~~~~GSGfiI~~~G~IlT~~Hvv~~~~~-~~i~V~-~~~g~~~~a~vv~~ 146 (375)
..+...+|++.+.+.. ||... ....|+||.+... .++|++|++....+ ..+.+. ...-+.|.|++...
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~ 133 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAV 133 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHh
Confidence 3455566777776553 34332 2677999999865 99999999993311 334443 22346788999888
Q ss_pred cCCCCeEEEEecCCCCCCeee-cCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCC
Q psy18066 147 DVECDLAIIRCNFPNNYPALK-LGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITF 225 (375)
Q Consensus 147 d~~~DlAlLki~~~~~~~~~~-l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~ 225 (375)
-.+.|+|++.++..+-++.+. |..-+-+...+.++++| +....+|.|.|+.........+ ......+++|+++++
T Consensus 134 ~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~-~~~l~~vqi~aa~~~ 209 (473)
T KOG1320|consen 134 FEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHS-STVLLRVQIDAAIGP 209 (473)
T ss_pred hhcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCC-CcceeeEEEEEeecC
Confidence 899999999999765333322 32223345567888888 6667889999988876543222 234457999999999
Q ss_pred CCccceeeccCCeEEEEEeeec--CCCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeEEeecc-HHHHHHHhhccC
Q psy18066 226 GNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLN-EKLIEQLRRDRH 302 (375)
Q Consensus 226 G~SGGPlvn~~G~VIGI~s~~~--~~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~-~~~~~~~~~~~~ 302 (375)
|+||+|.+...+++.|+..... .+.+++.||.-.+..+.......+. ....++++...+.+. .+.. ..+.
T Consensus 210 ~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~---~~~f~~~nt~t~g~vs~~~R----~~~~ 282 (473)
T KOG1320|consen 210 GNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAI---GNGFGLLNTLTQGMVSGQLR----KSFK 282 (473)
T ss_pred CccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeecc---ccCceeeeeeeecccccccc----cccc
Confidence 9999999988899999998876 4577999999887777766555442 123455555544442 2211 1222
Q ss_pred CCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 303 IPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 303 ~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
+ +...|+.+.++.+-+.|-+. ++.||+|+.+||..|-
T Consensus 283 l--g~~~g~~i~~~~qtd~ai~~-~nsg~~ll~~DG~~Ig 319 (473)
T KOG1320|consen 283 L--GLETGVLISKINQTDAAINP-GNSGGPLLNLDGEVIG 319 (473)
T ss_pred c--Ccccceeeeeecccchhhhc-ccCCCcEEEecCcEee
Confidence 2 23378999999999988887 8999999999999884
No 18
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.26 E-value=5.6e-11 Score=110.31 Aligned_cols=99 Identities=16% Similarity=0.161 Sum_probs=84.3
Q ss_pred hHHHHHHHHHHHhcCCCcceeeeeeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEe
Q psy18066 257 IDYAIEFLTNYKRKDKDRTITHKKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIEL 336 (375)
Q Consensus 257 ~~~i~~~l~~l~~~~~~~~~~~~~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~v 336 (375)
...++++++++.+.+ ...+.|+|+.....+ +...|+.|..+.+++||+++||++||+|++|
T Consensus 158 ~~~~~~v~~~l~~~g----~~~~~~lgi~p~~~~---------------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~I 218 (259)
T TIGR01713 158 IVVSRRIIEELTKDP----QKMFDYIRLSPVMKN---------------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVAL 218 (259)
T ss_pred hhhHHHHHHHHHHCH----HhhhheEeEEEEEeC---------------CceeEEEEEecCCCCHHHHcCCCCCCEEEEE
Confidence 456788899999877 467899999975322 2346999999999999999999999999999
Q ss_pred CCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEe
Q psy18066 337 NKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 337 ng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l 374 (375)
||+++++++++.+.+.+ +++++++|.|+|+..++.+.+
T Consensus 219 NG~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 219 NGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred CCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 99999999999888765 578999999999998888764
No 19
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.20 E-value=1.7e-10 Score=87.56 Aligned_cols=67 Identities=24% Similarity=0.295 Sum_probs=60.4
Q ss_pred CCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEeC
Q psy18066 308 THGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESELK 375 (375)
Q Consensus 308 ~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l~ 375 (375)
..|++|.+|.++|||++ ||++||+|++|||+++.+++++...+.. ++.+.+++.|+|+...++++|.
T Consensus 7 ~~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~ 76 (79)
T cd00986 7 YHGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLILK 76 (79)
T ss_pred ecCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEEe
Confidence 36899999999999997 7999999999999999999999988864 6789999999999998888763
No 20
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19 E-value=2.8e-10 Score=86.08 Aligned_cols=65 Identities=22% Similarity=0.229 Sum_probs=58.4
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECCeEEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g~~~~v~~~ 373 (375)
..++|.+|.++|||+++||++||+|++|||+++++++++...+.. ++.+.+++.|+++..++.+.
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGETITLTLT 78 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEECCEEEEEEec
Confidence 358899999999999999999999999999999999999988876 57899999999987777654
No 21
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.12 E-value=3.1e-09 Score=96.28 Aligned_cols=159 Identities=21% Similarity=0.187 Sum_probs=95.8
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCC--------CEEEEEEEEec-------
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDG--------SKHKGAVEALD------- 147 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g--------~~~~a~vv~~d------- 147 (375)
.|-+|.|.... ....++|.+|+++ +|||++||+.+.....+.|.+... ..+...-+..+
T Consensus 13 ~Pw~~~i~~~~-----~~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~ 86 (229)
T smart00020 13 FPWQVSLQYRG-----GRHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPST 86 (229)
T ss_pred CCcEEEEEEcC-----CCcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCC
Confidence 45566665431 2678999999987 999999999875223667766632 22334433332
Q ss_pred CCCCeEEEEecCC----CCCCeeecCCC-CCCCCCCEEEEEecCCCCC--C----CeeeeEEeeeeccccc--cCC--cc
Q psy18066 148 VECDLAIIRCNFP----NNYPALKLGKA-ADIRNGEFVIAMGSPLTLN--N----TNTFGIISNKQRSSET--LGL--NK 212 (375)
Q Consensus 148 ~~~DlAlLki~~~----~~~~~~~l~~s-~~~~~G~~v~~iG~p~g~~--~----~~~~G~vs~~~~~~~~--~~~--~~ 212 (375)
...|||||+++.+ ..+.++.+... ..+..++.+.+.||+.... . ......+.......-. ... ..
T Consensus 87 ~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~ 166 (229)
T smart00020 87 YDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAI 166 (229)
T ss_pred CcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhcccccc
Confidence 4689999999875 24667777543 3567789999999986542 1 1111111111110000 000 00
Q ss_pred cccEEE-----EeecCCCCCccceeeccCC--eEEEEEeeec
Q psy18066 213 TINYIQ-----TDAAITFGNSGGPLVNLDG--EVIGINSMKV 247 (375)
Q Consensus 213 ~~~~i~-----~d~~i~~G~SGGPlvn~~G--~VIGI~s~~~ 247 (375)
....+. .....|+|+|||||+...+ .++||++...
T Consensus 167 ~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 167 TDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 000111 1355788999999997654 8999999875
No 22
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.11 E-value=8.9e-10 Score=84.59 Aligned_cols=66 Identities=24% Similarity=0.304 Sum_probs=59.0
Q ss_pred CCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc--CCEEEEEEEEC-CeEEEEEEE
Q psy18066 308 THGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV--VRLVNFQFSHF-KHSFLVESE 373 (375)
Q Consensus 308 ~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~--~~~v~l~v~R~-g~~~~v~~~ 373 (375)
..+++|..|.++|||+++||++||+|++|||+++.++ .++...+.. ++.+.+++.|+ ++...++++
T Consensus 12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~ 82 (85)
T cd00988 12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLT 82 (85)
T ss_pred CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEE
Confidence 3689999999999999999999999999999999999 999888854 67899999999 877777664
No 23
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.07 E-value=9.1e-10 Score=110.15 Aligned_cols=88 Identities=25% Similarity=0.433 Sum_probs=76.1
Q ss_pred eeeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CC
Q psy18066 279 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VR 355 (375)
Q Consensus 279 ~~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~ 355 (375)
+.|+|+.+.+++++..+++. +| ....|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. ++
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~----l~-~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~ 411 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELR----LK-GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGG 411 (428)
T ss_pred ccccceEEecCCHHHHHHcC----CC-cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCC
Confidence 46899999999988876643 33 224799999999999999999999999999999999999999998874 68
Q ss_pred EEEEEEEECCeEEEEE
Q psy18066 356 LVNFQFSHFKHSFLVE 371 (375)
Q Consensus 356 ~v~l~v~R~g~~~~v~ 371 (375)
++.++|+|+|+..++.
T Consensus 412 ~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 412 RVALLILRGGATIFVT 427 (428)
T ss_pred EEEEEEEECCEEEEEE
Confidence 8999999999987664
No 24
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.95 E-value=4.1e-09 Score=77.74 Aligned_cols=53 Identities=28% Similarity=0.383 Sum_probs=48.3
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc--CCEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV--VRLVNFQF 361 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~--~~~v~l~v 361 (375)
.|++|.+|.++|||+++||++||+|++|||+++.++ +++.+.+.. +++++|++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 489999999999999999999999999999999999 999999876 46777776
No 25
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.86 E-value=2.1e-07 Score=84.98 Aligned_cols=171 Identities=18% Similarity=0.116 Sum_probs=98.4
Q ss_pred HhCCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCc--eEEEEc----CCCC-EEEEE--EEEec-C--
Q psy18066 81 NVEKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGA--QIIVTL----PDGS-KHKGA--VEALD-V-- 148 (375)
Q Consensus 81 ~~~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~--~i~V~~----~~g~-~~~a~--vv~~d-~-- 148 (375)
--..+||..+... +....++|+|+++ .+||++||+...... ++.+.. .++. .+..+ ..... .
T Consensus 49 ~Py~av~~~~~~t-----G~~~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~ 122 (251)
T COG3591 49 FPYSAVVQFEAAT-----GRLCTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGEL 122 (251)
T ss_pred CCcceeEEeecCC-----CcceeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCce
Confidence 3345666444331 1333455999998 999999999865322 222221 1221 11111 11112 2
Q ss_pred -CCCeEEEEecCC---------CCCCeeecCCCCCCCCCCEEEEEecCCCCCCCe----eeeEEeeeeccccccCCcccc
Q psy18066 149 -ECDLAIIRCNFP---------NNYPALKLGKAADIRNGEFVIAMGSPLTLNNTN----TFGIISNKQRSSETLGLNKTI 214 (375)
Q Consensus 149 -~~DlAlLki~~~---------~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~----~~G~vs~~~~~~~~~~~~~~~ 214 (375)
+.|.+...+.+. .......+.-....+.++.+..+|||.+..+.. ..+.+... ..
T Consensus 123 ~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~-----------~~ 191 (251)
T COG3591 123 YKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI-----------KG 191 (251)
T ss_pred eccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE-----------ec
Confidence 345555555432 112223343445678899999999998754322 22222221 12
Q ss_pred cEEEEeecCCCCCccceeeccCCeEEEEEeeecC----CCeEEE-EehHHHHHHHHHHH
Q psy18066 215 NYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT----AGISFA-IPIDYAIEFLTNYK 268 (375)
Q Consensus 215 ~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~----~g~g~a-ip~~~i~~~l~~l~ 268 (375)
..++.++.+.+|+||+|+++.+.+|||+++.... .-.+++ .-...++++++++.
T Consensus 192 ~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 192 NKLFYDADTLPGSSGSPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred ceEEEEecccCCCCCCceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 3689999999999999999999999999988754 122322 33455666666553
No 26
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.83 E-value=1.7e-08 Score=100.57 Aligned_cols=66 Identities=21% Similarity=0.213 Sum_probs=61.0
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECCeEEEEEEEe
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g~~~~v~~~l 374 (375)
.|++|.+|.++|||+++||++||+|++|||++|++++|+.+.+.. ++++.+++.|+|+..+++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~ 270 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERNGETLSISLTP 270 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEECCEEEEEEEEE
Confidence 479999999999999999999999999999999999999999876 678999999999998888765
No 27
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.83 E-value=7.6e-09 Score=103.99 Aligned_cols=63 Identities=13% Similarity=0.032 Sum_probs=57.2
Q ss_pred EEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEEe
Q psy18066 312 LIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 312 ~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~l 374 (375)
+|.+|.++|||++||||+||+|+++||++|++++|+...+.. +++++++|.|+|+.+++++++
T Consensus 129 lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l 194 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTL 194 (449)
T ss_pred cccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEe
Confidence 689999999999999999999999999999999999887765 678999999999987777655
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.76 E-value=4.5e-08 Score=98.46 Aligned_cols=66 Identities=15% Similarity=0.148 Sum_probs=60.0
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECCeEEEEEEEe
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g~~~~v~~~l 374 (375)
.+++|.+|.++|||++|||++||+|+++||++|++++|+.+.+.. ++.+.+++.|+|+..++++++
T Consensus 221 ~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~ 288 (449)
T PRK10779 221 IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQGSPLSLTLTP 288 (449)
T ss_pred cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEECCEEEEEEEEe
Confidence 358899999999999999999999999999999999999998865 678999999999988877764
No 29
>PRK10942 serine endoprotease; Provisional
Probab=98.75 E-value=4.7e-08 Score=98.66 Aligned_cols=64 Identities=22% Similarity=0.351 Sum_probs=59.4
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc-CCEEEEEEEECCeEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV-VRLVNFQFSHFKHSFLVES 372 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~-~~~v~l~v~R~g~~~~v~~ 372 (375)
.|++|.+|.++|||+++||++||+|++|||++|.+++++.+++.. .+.+.|+|.|+|+.+++.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKPSVLALNIQRGDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEECCEEEEEEe
Confidence 589999999999999999999999999999999999999999976 5789999999999887765
No 30
>PRK10139 serine endoprotease; Provisional
Probab=98.70 E-value=7.9e-08 Score=96.59 Aligned_cols=64 Identities=27% Similarity=0.389 Sum_probs=59.3
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc-CCEEEEEEEECCeEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV-VRLVNFQFSHFKHSFLVES 372 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~-~~~v~l~v~R~g~~~~v~~ 372 (375)
.|++|.+|.++|||+++||++||+|++|||++|.+++++.+.+.. .+.+.++|+|+|+..++.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKPAIIALQIVRGNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCeEEEEEEECCEEEEEEe
Confidence 689999999999999999999999999999999999999999976 5689999999999887765
No 31
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.67 E-value=1.4e-07 Score=71.62 Aligned_cols=53 Identities=25% Similarity=0.308 Sum_probs=47.2
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcC--CHHHHHHHHhc-CCEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCH--SAKDIYAALEV-VRLVNFQF 361 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~--~~~~l~~~l~~-~~~v~l~v 361 (375)
.|++|.+|.++|||+++||++||+|++|||+++. +++++...+.. ...+.+++
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGDEVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCCeEEEEE
Confidence 6899999999999999999999999999999999 89999999876 34566654
No 32
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E.coli. E.coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.65 E-value=1.3e-07 Score=91.64 Aligned_cols=66 Identities=24% Similarity=0.218 Sum_probs=56.3
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc--CCEEEEEEEECCeEEEEEEEe
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV--VRLVNFQFSHFKHSFLVESEL 374 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~--~~~v~l~v~R~g~~~~v~~~l 374 (375)
.+++|.+|.++|||+++||++||+|++|||++|+++ .++...+.. ++++.+++.|+++...+++++
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~g~~~~~~v~l 131 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRAGKSKPLTFTL 131 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeCCCCceEEEEE
Confidence 589999999999999999999999999999999986 577776643 788999999998766655543
No 33
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.64 E-value=1.9e-07 Score=71.11 Aligned_cols=57 Identities=28% Similarity=0.310 Sum_probs=47.9
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHH--Hhc-CCEEEEEEEECC
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAA--LEV-VRLVNFQFSHFK 365 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~--l~~-~~~v~l~v~R~g 365 (375)
.|++|..|.++|||+++||++||+|++|||+++.++.+.... +.. ++.+.+++.|++
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEeCC
Confidence 699999999999999999999999999999999987554333 332 558999999864
No 34
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.56 E-value=7.9e-06 Score=74.15 Aligned_cols=162 Identities=19% Similarity=0.248 Sum_probs=89.8
Q ss_pred hCCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEE-----EEEecCCCCeEEEE
Q psy18066 82 VEKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGA-----VEALDVECDLAIIR 156 (375)
Q Consensus 82 ~~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~-----vv~~d~~~DlAlLk 156 (375)
+...|..|.... +.....--|+.+. + ||+|++|..+... ..++|...-|. |... -+..=+..||.++|
T Consensus 16 Ia~~ic~l~n~s---~~~~~~l~gigyG-~-~iItn~HLf~~nn-g~L~i~s~hG~-f~v~nt~~lkv~~i~~~Diviir 88 (235)
T PF00863_consen 16 IASNICRLTNES---DGGTRSLYGIGYG-S-YIITNAHLFKRNN-GELTIKSQHGE-FTVPNTTQLKVHPIEGRDIVIIR 88 (235)
T ss_dssp HHTTEEEEEEEE---TTEEEEEEEEEET-T-EEEEEGGGGSSTT-CEEEEEETTEE-EEECEGGGSEEEE-TCSSEEEEE
T ss_pred hhheEEEEEEEe---CCCeEEEEEEeEC-C-EEEEChhhhccCC-CeEEEEeCceE-EEcCCccccceEEeCCccEEEEe
Confidence 456788887653 1113344566665 3 9999999997653 66888876653 3221 23334689999999
Q ss_pred ecCCCCCCeeecC-CCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeecc
Q psy18066 157 CNFPNNYPALKLG-KAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNL 235 (375)
Q Consensus 157 i~~~~~~~~~~l~-~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~ 235 (375)
++.+ +||.+-. ..+.++.+|+|..+|.-+.... ....||......+ .....+..+..+...|+-|.||++.
T Consensus 89 mPkD--fpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p----~~~~~fWkHwIsTk~G~CG~PlVs~ 160 (235)
T PF00863_consen 89 MPKD--FPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYP----EENSHFWKHWISTKDGDCGLPLVST 160 (235)
T ss_dssp --TT--S----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEE----ETTTTEEEE-C---TT-TT-EEEET
T ss_pred CCcc--cCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEee----cCCCCeeEEEecCCCCccCCcEEEc
Confidence 9875 4443321 2367889999999998654332 2223343332221 2344667788888899999999986
Q ss_pred C-CeEEEEEeeecC-CCeEEEEehH
Q psy18066 236 D-GEVIGINSMKVT-AGISFAIPID 258 (375)
Q Consensus 236 ~-G~VIGI~s~~~~-~g~g~aip~~ 258 (375)
+ |.+|||++.... ...+|+.|..
T Consensus 161 ~Dg~IVGiHsl~~~~~~~N~F~~f~ 185 (235)
T PF00863_consen 161 KDGKIVGIHSLTSNTSSRNYFTPFP 185 (235)
T ss_dssp TT--EEEEEEEEETTTSSEEEEE--
T ss_pred CCCcEEEEEcCccCCCCeEEEEcCC
Confidence 4 999999998764 4567777765
No 35
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.56 E-value=2.4e-07 Score=70.44 Aligned_cols=70 Identities=24% Similarity=0.339 Sum_probs=54.9
Q ss_pred eeeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc-CC
Q psy18066 279 KKYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV-VR 355 (375)
Q Consensus 279 ~~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~-~~ 355 (375)
...||+.+..-.+ ....+++|.+|.++|||+++||++||.|++|||+.+.++ .++...+.. .+
T Consensus 9 ~~~lG~~l~~~~~--------------~~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~ 74 (81)
T PF00595_consen 9 NGPLGFTLRGGSD--------------NDEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN 74 (81)
T ss_dssp TSBSSEEEEEEST--------------SSSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS
T ss_pred CCCcCEEEEecCC--------------CCcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC
Confidence 3578888764321 012599999999999999999999999999999999976 566667765 45
Q ss_pred EEEEEEE
Q psy18066 356 LVNFQFS 362 (375)
Q Consensus 356 ~v~l~v~ 362 (375)
+++|+|.
T Consensus 75 ~v~L~V~ 81 (81)
T PF00595_consen 75 PVTLTVQ 81 (81)
T ss_dssp EEEEEEE
T ss_pred cEEEEEC
Confidence 7888764
No 36
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.55 E-value=4.5e-07 Score=88.39 Aligned_cols=66 Identities=18% Similarity=0.217 Sum_probs=57.2
Q ss_pred CCCeEEEEEc--------cCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECCeEEEEEEE
Q psy18066 308 THGVLIWRVM--------YNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 308 ~~g~~V~~v~--------~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g~~~~v~~~ 373 (375)
.+||+|.... .+|||+++||++||+|++|||++|++++|+.+.+.. ++.+.++|.|+++..++.++
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~tv~V~ 179 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKIIETVIK 179 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEEEEEEE
Confidence 3688886653 268999999999999999999999999999998876 67899999999998877764
No 37
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.54 E-value=4.2e-07 Score=89.77 Aligned_cols=64 Identities=17% Similarity=0.210 Sum_probs=55.6
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc--CCEEEEEEEECCeEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV--VRLVNFQFSHFKHSFLVES 372 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~--~~~v~l~v~R~g~~~~v~~ 372 (375)
.|++|..|.++|||+++||++||+|++|||++|.++ .++...+.. ++.+.++|.|+++..++++
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~g~~~~~~l 169 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRGPETRLVTL 169 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEECCEEEEEEE
Confidence 489999999999999999999999999999999864 677777753 6789999999998776655
No 38
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.46 E-value=3.9e-07 Score=89.37 Aligned_cols=60 Identities=18% Similarity=0.094 Sum_probs=52.9
Q ss_pred EEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEEEEE-ECCeEEEEEEE
Q psy18066 313 IWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNFQFS-HFKHSFLVESE 373 (375)
Q Consensus 313 V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l~v~-R~g~~~~v~~~ 373 (375)
|.+|.|+|||+++||++||+|++|||+++.+|.|+...+. ++.+.++|. |+|+..+++++
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~-~e~l~L~V~~rdGe~~~l~Ie 62 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA-DEELELEVLDANGESHQIEIE 62 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc-CCcEEEEEEcCCCeEEEEEEe
Confidence 5678999999999999999999999999999999988885 567899996 88887777664
No 39
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of peptidase family S2C. The matching region appears not to overlap the active site domain.
Probab=98.41 E-value=4.8e-07 Score=90.19 Aligned_cols=64 Identities=20% Similarity=0.151 Sum_probs=56.2
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc-CCEEEEEEEECCeEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV-VRLVNFQFSHFKHSFLVES 372 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~-~~~v~l~v~R~g~~~~v~~ 372 (375)
.|++|.+|.++|||++|||++||+|+++||+++.+++++...+.. .+++.+++.|+++...+++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~~~v~~~I~r~g~~~~l~v 192 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIAGEPMVEILAERENWTFEV 192 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhcccceEEEEEecCceEecc
Confidence 588999999999999999999999999999999999999887764 3678899999887766443
No 40
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.30 E-value=2.5e-06 Score=84.42 Aligned_cols=59 Identities=20% Similarity=0.245 Sum_probs=51.3
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc--CCEEEEEEEECCeE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV--VRLVNFQFSHFKHS 367 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~--~~~v~l~v~R~g~~ 367 (375)
.++.|.++.+++||++|||++||+|++|||+++... .+....++. |..|+|+|.|.+..
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~ 174 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGG 174 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCC
Confidence 789999999999999999999999999999999877 456666654 78999999997433
No 41
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.18 E-value=1.2e-05 Score=61.79 Aligned_cols=65 Identities=23% Similarity=0.266 Sum_probs=47.2
Q ss_pred CCCeEEEEEccC--------ChhhhCCCC--CCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEECC-eEEEEEE
Q psy18066 308 THGVLIWRVMYN--------SPAYLAGLH--QEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSHFK-HSFLVES 372 (375)
Q Consensus 308 ~~g~~V~~v~~~--------s~a~~aGl~--~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R~g-~~~~v~~ 372 (375)
..+..|.++.++ ||-.+.|+. +||+|++|||++++.-.++..+|.. ++.|.|+|.+.+ +.+++.+
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v~V 88 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTVVV 88 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEEEE
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEEEC
Confidence 468889999875 777777755 9999999999999999999999986 789999999966 4555543
No 42
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=98.06 E-value=1.2e-05 Score=67.46 Aligned_cols=81 Identities=23% Similarity=0.220 Sum_probs=57.0
Q ss_pred eeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCC-CCEEEEeCCEEcCCHHHHHHHHhc--CCE
Q psy18066 280 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQ-EDIIIELNKKPCHSAKDIYAALEV--VRL 356 (375)
Q Consensus 280 ~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~-gDiI~~vng~~v~~~~~l~~~l~~--~~~ 356 (375)
+-||++++--..+ .....++-|.+|.|+|||++|||++ .|.|+.+|+...++.++|.+.+.. ++.
T Consensus 26 g~LG~sv~~~~~~------------~~~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~ 93 (138)
T PF04495_consen 26 GLLGISVRFESFE------------GAEEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKP 93 (138)
T ss_dssp SSS-EEEEEEE-T------------TGCCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-
T ss_pred CCCcEEEEEeccc------------ccccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCc
Confidence 6788887643221 0235689999999999999999999 599999999999999999999876 678
Q ss_pred EEEEEEECC--eEEEEEE
Q psy18066 357 VNFQFSHFK--HSFLVES 372 (375)
Q Consensus 357 v~l~v~R~g--~~~~v~~ 372 (375)
+.|.|+... ..+.+++
T Consensus 94 l~L~Vyns~~~~vR~V~i 111 (138)
T PF04495_consen 94 LQLYVYNSKTDSVREVTI 111 (138)
T ss_dssp EEEEEEETTTTCEEEEEE
T ss_pred EEEEEEECCCCeEEEEEE
Confidence 999999743 3444554
No 43
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.03 E-value=1.6e-05 Score=73.81 Aligned_cols=66 Identities=15% Similarity=0.225 Sum_probs=52.6
Q ss_pred CeEEEEEccCC---hhhhCCCCCCCEEEEeCCEEcCCHHHHHHHH---hcCCEEEEEEEECCeEEEEEEEeC
Q psy18066 310 GVLIWRVMYNS---PAYLAGLHQEDIIIELNKKPCHSAKDIYAAL---EVVRLVNFQFSHFKHSFLVESELK 375 (375)
Q Consensus 310 g~~V~~v~~~s---~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l---~~~~~v~l~v~R~g~~~~v~~~l~ 375 (375)
|+.=..+.|+. --+++|||+||++++|||.++++.++..+++ ....+++|+|+|+|+..++.+.|+
T Consensus 205 Gl~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l~ 276 (276)
T PRK09681 205 GIVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIALR 276 (276)
T ss_pred CceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEcC
Confidence 43334556664 3578999999999999999999998655554 447889999999999999888764
No 44
>KOG3627|consensus
Probab=97.99 E-value=0.0007 Score=62.57 Aligned_cols=162 Identities=22% Similarity=0.237 Sum_probs=88.2
Q ss_pred eEEEEEEeCCCEEEecccccCCCCCc--eEEEEcCC---------C---CEE-EEEEEEecC-------C-CCeEEEEec
Q psy18066 102 NGSGFIATDDGLIITNAHVVSGKPGA--QIIVTLPD---------G---SKH-KGAVEALDV-------E-CDLAIIRCN 158 (375)
Q Consensus 102 ~GSGfiI~~~G~IlT~~Hvv~~~~~~--~i~V~~~~---------g---~~~-~a~vv~~d~-------~-~DlAlLki~ 158 (375)
...|.+|+++ ||+|++||+.+. . .+.|.+.. + ... -.+++ .++ . .|||||+++
T Consensus 39 ~Cggsli~~~-~vltaaHC~~~~--~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~ 114 (256)
T KOG3627|consen 39 LCGGSLISPR-WVLTAAHCVKGA--SASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLS 114 (256)
T ss_pred eeeeEEeeCC-EEEEChhhCCCC--CCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEEC
Confidence 4556677766 999999999986 3 55565531 1 111 11233 232 3 799999998
Q ss_pred CC----CCCCeeecCCCCC---CCCCCEEEEEecCCCCCC------CeeeeEEeeeeccc--cccCCc--ccccEEEEe-
Q psy18066 159 FP----NNYPALKLGKAAD---IRNGEFVIAMGSPLTLNN------TNTFGIISNKQRSS--ETLGLN--KTINYIQTD- 220 (375)
Q Consensus 159 ~~----~~~~~~~l~~s~~---~~~G~~v~~iG~p~g~~~------~~~~G~vs~~~~~~--~~~~~~--~~~~~i~~d- 220 (375)
.+ ..+.++.+..... ...+..+++.||+..... ......+.-..... ...... .....+-..
T Consensus 115 ~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~ 194 (256)
T KOG3627|consen 115 EPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGG 194 (256)
T ss_pred CCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCc
Confidence 74 3456677743332 344488888998643221 11111111111100 000000 001122222
Q ss_pred ----ecCCCCCccceeeccC---CeEEEEEeeecC-CC----eEEEEehHHHHHHHHHH
Q psy18066 221 ----AAITFGNSGGPLVNLD---GEVIGINSMKVT-AG----ISFAIPIDYAIEFLTNY 267 (375)
Q Consensus 221 ----~~i~~G~SGGPlvn~~---G~VIGI~s~~~~-~g----~g~aip~~~i~~~l~~l 267 (375)
...|.|+|||||+-.+ ..++||+++... -+ =+....+....+++++.
T Consensus 195 ~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~ 253 (256)
T KOG3627|consen 195 PEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN 253 (256)
T ss_pred cCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence 2368899999999765 589999999864 11 23345555555555553
No 45
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.98 E-value=2.9e-05 Score=81.06 Aligned_cols=64 Identities=19% Similarity=0.179 Sum_probs=50.0
Q ss_pred CCeEEEEEccCChhhhC-CCCCCCEEEEeC--CEEcCC-----HHHHHHHHhc--CCEEEEEEEEC---CeEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLA-GLHQEDIIIELN--KKPCHS-----AKDIYAALEV--VRLVNFQFSHF---KHSFLVES 372 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vn--g~~v~~-----~~~l~~~l~~--~~~v~l~v~R~---g~~~~v~~ 372 (375)
.+++|.+|.+||||+++ ||++||+|++|| |+++.+ .+++...++. |.+|.|+|.|+ ++.+.+++
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl 331 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTL 331 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEE
Confidence 47899999999999998 999999999999 555443 3577777764 78999999984 34444443
No 46
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.89 E-value=4.7e-05 Score=70.96 Aligned_cols=65 Identities=23% Similarity=0.309 Sum_probs=58.0
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEE-CCeEEEEEEEe
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSH-FKHSFLVESEL 374 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R-~g~~~~v~~~l 374 (375)
.|+++..|..++|+..- |+.||.|++|||+++.+.+++..++.+ |++|++++.| ++++..++.++
T Consensus 130 ~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl 198 (342)
T COG3480 130 AGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL 198 (342)
T ss_pred eeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence 69999999999998864 999999999999999999999999976 8999999997 77777666654
No 47
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.87 E-value=8.8e-05 Score=67.50 Aligned_cols=124 Identities=24% Similarity=0.318 Sum_probs=67.7
Q ss_pred CceEEEEEEeCCC--EEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC-CCCCCeeecCCCCCCCC
Q psy18066 100 MSNGSGFIATDDG--LIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF-PNNYPALKLGKAADIRN 176 (375)
Q Consensus 100 ~~~GSGfiI~~~G--~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~-~~~~~~~~l~~s~~~~~ 176 (375)
...|||=+...+| .|+|+.||+.+ +...|... +.. +....+..-|+|.-.++. +..+|.+++.+ -..
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg~---~~a~v~~~-g~~---~~~tF~~~GDfA~~~~~~~~G~~P~~k~a~---~~~ 180 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLGG---NTARVSGV-GTR---RMLTFKKNGDFAEADITNWPGAAPKYKFAQ---NYT 180 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCBT---TEEEEEET-TEE---EEEEEEEETTEEEEEETTS-S---B--B-T---T-S
T ss_pred ecccccceEEECCeEEEEEEEEEcCC---CeEEEEec-ceE---EEEEEeccCcEEEEECCCCCCCCCceeecC---Ccc
Confidence 3456655555444 99999999984 35555433 322 223345677999999943 45678888752 112
Q ss_pred CCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeeccCCeEEEEEeeecCCCeEEEE
Q psy18066 177 GEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFAI 255 (375)
Q Consensus 177 G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~~g~g~ai 255 (375)
|.--+.- ...+..|.|.... .=+-..+|+||+|++..+|.+||+++.....|.|+..
T Consensus 181 GrAyW~t------~tGvE~G~ig~~~----------------~~~fT~~GDSGSPVVt~dg~liGVHTGSn~~G~g~vT 237 (297)
T PF05579_consen 181 GRAYWLT------STGVEPGFIGGGG----------------AVCFTGPGDSGSPVVTEDGDLIGVHTGSNKRGSGAVT 237 (297)
T ss_dssp EEEEEEE------TTEEEEEEEETTE----------------EEESS-GGCTT-EEEETTC-EEEEEEEEETTTEEEEE
T ss_pred cceEEEc------ccCcccceecCce----------------EEEEcCCCCCCCccCcCCCCEEEEEecCCCcCceEEE
Confidence 3211111 1223344432111 0112357999999999999999999999887777643
No 48
>PF12812 PDZ_1: PDZ-like domain
Probab=97.81 E-value=0.0001 Score=55.56 Aligned_cols=66 Identities=21% Similarity=0.242 Sum_probs=55.0
Q ss_pred eeeeeEEeeccHHHHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc
Q psy18066 280 KYIGITMLTLNEKLIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV 353 (375)
Q Consensus 280 ~~lGi~~~~~~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~ 353 (375)
-|.|..+++|+.+.+.++. ..-|+++.....++++.+.|+..|.+|++|||+++.+.+++.+.+++
T Consensus 9 ~~~Ga~f~~Ls~q~aR~~~--------~~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ 74 (78)
T PF12812_consen 9 EVCGAVFHDLSYQQARQYG--------IPVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKK 74 (78)
T ss_pred EEcCeecccCCHHHHHHhC--------CCCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHh
Confidence 5789999999988887763 23345666778899988777999999999999999999999998864
No 49
>KOG3553|consensus
Probab=97.76 E-value=2.6e-05 Score=60.11 Aligned_cols=36 Identities=33% Similarity=0.443 Sum_probs=33.2
Q ss_pred CCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 307 LTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
...|++|++|.++|||+.|||+.+|.|+.+||...+
T Consensus 57 tD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfT 92 (124)
T KOG3553|consen 57 TDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFT 92 (124)
T ss_pred CCccEEEEEeccCChhhhhcceecceEEEecCceeE
Confidence 357999999999999999999999999999997754
No 50
>KOG3129|consensus
Probab=97.75 E-value=8.9e-05 Score=65.04 Aligned_cols=64 Identities=16% Similarity=0.126 Sum_probs=51.8
Q ss_pred CeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHH---HHhc--CCEEEEEEEECCeEEEEEEE
Q psy18066 310 GVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYA---ALEV--VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 310 g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~---~l~~--~~~v~l~v~R~g~~~~v~~~ 373 (375)
=++|.+|.|+|||++|||+.||.|+++....--++..|.. .... ++.+.++|.|.|+...+.++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~lt 208 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLT 208 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeC
Confidence 4789999999999999999999999998766666655443 2222 67899999999998887765
No 51
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.55 E-value=0.00013 Score=72.47 Aligned_cols=63 Identities=21% Similarity=0.110 Sum_probs=49.7
Q ss_pred CCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEEEEEECCeEEEEEEE
Q psy18066 306 DLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 306 ~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l~v~R~g~~~~v~~~ 373 (375)
....+.+|..|.++|||++|||.+||.|++|||. .+=...+.-++.+++++.|.++.++..++
T Consensus 459 ~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~-----s~~l~~~~~~d~i~v~~~~~~~L~e~~v~ 521 (558)
T COG3975 459 SEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI-----SDQLDRYKVNDKIQVHVFREGRLREFLVK 521 (558)
T ss_pred ccCCeeEEEecCCCChhHhccCCCccEEEEEcCc-----cccccccccccceEEEEccCCceEEeecc
Confidence 3457889999999999999999999999999998 11111222378899999999988877654
No 52
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.44 E-value=0.00029 Score=63.09 Aligned_cols=65 Identities=15% Similarity=0.209 Sum_probs=54.7
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc---CCEEEEEEEECCeEEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV---VRLVNFQFSHFKHSFLVESE 373 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~---~~~v~l~v~R~g~~~~v~~~ 373 (375)
.|..+.-..+++.-++.|||.||+.+++|+..+++.+++..++.. -..++++|.|+|+...+.+.
T Consensus 207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 355555556677889999999999999999999999998877765 57799999999999888765
No 53
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.17 E-value=0.014 Score=55.68 Aligned_cols=50 Identities=18% Similarity=0.270 Sum_probs=35.9
Q ss_pred ecCCCCCccceeecc--CCeE-EEEEeeecCC-----CeEEEEehHHHHHHHHHHHhc
Q psy18066 221 AAITFGNSGGPLVNL--DGEV-IGINSMKVTA-----GISFAIPIDYAIEFLTNYKRK 270 (375)
Q Consensus 221 ~~i~~G~SGGPlvn~--~G~V-IGI~s~~~~~-----g~g~aip~~~i~~~l~~l~~~ 270 (375)
...|.|+||||+|-. +|++ +||++|.... --+...-++....|+....++
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence 456899999999954 3766 8999998752 123455578888888885543
No 54
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.12 E-value=0.051 Score=51.17 Aligned_cols=168 Identities=18% Similarity=0.181 Sum_probs=96.0
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCC---------------Cc--e----------EEE----
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKP---------------GA--Q----------IIV---- 131 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~---------------~~--~----------i~V---- 131 (375)
.|=.|.+.... ........+|++|+++ +|||++|++-... +. . +.+
T Consensus 53 ~pW~v~v~~~~--~~~~~~~~~gtlIS~R-HiLtss~~~~~~~~~W~~~~~~~~~~C~~~~~~l~vP~~~l~~~~v~~~~ 129 (282)
T PF03761_consen 53 APWAVSVYTKN--HNEGNYFSTGTLISPR-HILTSSHCVMNDKSKWLNGEEFDNKKCEGNNNHLIVPEEVLSKIDVRCCN 129 (282)
T ss_pred CCCEEEEEecc--CcccceecceEEeccC-eEEEeeeEEEecccccccCcccccceeeCCCceEEeCHHHhccEEEEeec
Confidence 45566776653 1111233399999999 9999999986321 01 1 112
Q ss_pred EcCCC-----CEEEEEEEE--------ecCCCCeEEEEecCC--CCCCeeecCCCC-CCCCCCEEEEEecCCCCCCCeee
Q psy18066 132 TLPDG-----SKHKGAVEA--------LDVECDLAIIRCNFP--NNYPALKLGKAA-DIRNGEFVIAMGSPLTLNNTNTF 195 (375)
Q Consensus 132 ~~~~g-----~~~~a~vv~--------~d~~~DlAlLki~~~--~~~~~~~l~~s~-~~~~G~~v~~iG~p~g~~~~~~~ 195 (375)
....+ +...|.++. ....++++||+++.+ ....++.|.++. .+..++.+.+.|+.. .....+
T Consensus 130 ~~~~~~~~~~~v~ka~il~~C~~~~~~~~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~ 207 (282)
T PF03761_consen 130 CFSNGKCFSIKVKKAYILNGCKKIKKNFNRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKH 207 (282)
T ss_pred ccccCCcccceeEEEEEEecCCCcccccccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEE
Confidence 00000 112233321 134679999999988 678888887654 367789998888821 111222
Q ss_pred eEEeeeeccccccCCcccccEEEEeecCCCCCccceee---ccCCeEEEEEeeecCC---CeEEEEehHHHHH
Q psy18066 196 GIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLV---NLDGEVIGINSMKVTA---GISFAIPIDYAIE 262 (375)
Q Consensus 196 G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlv---n~~G~VIGI~s~~~~~---g~g~aip~~~i~~ 262 (375)
..+.-... ......+.++...+.|++||||+ |.+-.||||.+..... ...++..+..+++
T Consensus 208 ~~~~i~~~-------~~~~~~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 208 RKLKITNC-------TKCAYSICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred EEEEEEEe-------eccceeEecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 22211111 11233455666778999999998 3334589998765432 1456666665543
No 55
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue.
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.92 E-value=0.029 Score=49.03 Aligned_cols=148 Identities=20% Similarity=0.254 Sum_probs=84.7
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEE--EEEecC---CCCeEEEEe
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGA--VEALDV---ECDLAIIRC 157 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~--vv~~d~---~~DlAlLki 157 (375)
+..++.|.+. .....++++-|-.+ ++|.+.|. ... ..+.+ +|..++.. +.-.+. ..||+++++
T Consensus 12 ~~N~~~v~~~-----~g~~t~l~~gi~~~-~~lvp~H~-~~~--~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l 79 (172)
T PF00548_consen 12 KKNVVPVTTG-----KGEFTMLALGIYDR-YFLVPTHE-EPE--DTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKL 79 (172)
T ss_dssp HHHEEEEEET-----TEEEEEEEEEEEBT-EEEEEGGG-GGC--SEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEE
T ss_pred hccEEEEEeC-----CceEEEecceEeee-EEEEECcC-CCc--EEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEc
Confidence 3456666652 23667888888877 99999992 222 34444 45655432 233343 569999999
Q ss_pred cCCCCCCeee--cCCCCCCCCCCEEEEEecCCCCCC-CeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeec
Q psy18066 158 NFPNNYPALK--LGKAADIRNGEFVIAMGSPLTLNN-TNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVN 234 (375)
Q Consensus 158 ~~~~~~~~~~--l~~s~~~~~G~~v~~iG~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn 234 (375)
+....++-+. |.+ ......+...++-++ .... ....+.++..... . .+.......+..+++..+|+-||||+.
T Consensus 80 ~~~~kfrDIrk~~~~-~~~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~-~~g~~~~~~~~Y~~~t~~G~CG~~l~~ 155 (172)
T PF00548_consen 80 PRNPKFRDIRKFFPE-SIPEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-N-LSGTTTPRSLKYKAPTKPGMCGSPLVS 155 (172)
T ss_dssp ESSS-B--GGGGSBS-SGGTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-E-ETTEEEEEEEEEESEEETTGTTEEEEE
T ss_pred cCCcccCchhhhhcc-ccccCCCcEEEEECC-CCccEEEEEEEEeecCcc-c-cCCCEeeEEEEEccCCCCCccCCeEEE
Confidence 8754332111 101 111234444444332 2222 2334444443332 1 111344567889999999999999995
Q ss_pred c---CCeEEEEEeee
Q psy18066 235 L---DGEVIGINSMK 246 (375)
Q Consensus 235 ~---~G~VIGI~s~~ 246 (375)
. .++++||+.+.
T Consensus 156 ~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 156 RIGGQGKIIGIHVAG 170 (172)
T ss_dssp SCGGTTEEEEEEEEE
T ss_pred eeccCccEEEEEecc
Confidence 3 48999999885
No 56
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.63 E-value=0.0078 Score=63.41 Aligned_cols=23 Identities=39% Similarity=0.592 Sum_probs=20.4
Q ss_pred ceEEEEEEeCCCEEEecccccCC
Q psy18066 101 SNGSGFIATDDGLIITNAHVVSG 123 (375)
Q Consensus 101 ~~GSGfiI~~~G~IlT~~Hvv~~ 123 (375)
+-|||-+|+++|+|+||.||.-+
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred CceeEEEEcCCceEEecchhhhh
Confidence 34899999999999999999764
No 57
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.46 E-value=0.024 Score=58.18 Aligned_cols=117 Identities=20% Similarity=0.217 Sum_probs=73.3
Q ss_pred CCCCeEEEEecCCC--------------CCCeeecCCC------CCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccc
Q psy18066 148 VECDLAIIRCNFPN--------------NYPALKLGKA------ADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSET 207 (375)
Q Consensus 148 ~~~DlAlLki~~~~--------------~~~~~~l~~s------~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~ 207 (375)
.-.|+||++++..- .-|.+.+.+. ..+..|.+|+=+|.-.+ .|.|.+.+.....-.
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw~ 616 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYWA 616 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEec
Confidence 35699999998641 1223333221 24667999999998644 456666655321111
Q ss_pred cCCcccccEEEEe----ecCCCCCccceeeccCCe------EEEEEeeecC--CCeEEEEehHHHHHHHHHHH
Q psy18066 208 LGLNKTINYIQTD----AAITFGNSGGPLVNLDGE------VIGINSMKVT--AGISFAIPIDYAIEFLTNYK 268 (375)
Q Consensus 208 ~~~~~~~~~i~~d----~~i~~G~SGGPlvn~~G~------VIGI~s~~~~--~g~g~aip~~~i~~~l~~l~ 268 (375)
.+.....+++... .-...|+||+=+++.-+. |+||...... ..+|++.|++.|.+-|++..
T Consensus 617 dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~vT 689 (695)
T PF08192_consen 617 DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEVT 689 (695)
T ss_pred CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHhh
Confidence 1111122333333 122479999999986444 9999988654 47899999999888887764
No 58
>KOG3532|consensus
Probab=96.39 E-value=0.011 Score=60.43 Aligned_cols=48 Identities=23% Similarity=0.289 Sum_probs=44.1
Q ss_pred CCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc
Q psy18066 306 DLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV 353 (375)
Q Consensus 306 ~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~ 353 (375)
+....|.|..|.+++||.++.+++||++++|||.+|++.++..+.+..
T Consensus 395 ~~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s 442 (1051)
T KOG3532|consen 395 NTNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQS 442 (1051)
T ss_pred CCceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHh
Confidence 345678899999999999999999999999999999999999988876
No 59
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.30 E-value=0.1 Score=46.67 Aligned_cols=158 Identities=16% Similarity=0.204 Sum_probs=82.6
Q ss_pred CceEEEEEEeCC-CEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCC----------------CCeEEEEecC---
Q psy18066 100 MSNGSGFIATDD-GLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVE----------------CDLAIIRCNF--- 159 (375)
Q Consensus 100 ~~~GSGfiI~~~-G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~----------------~DlAlLki~~--- 159 (375)
.+.||=-+++++ +..--=.|.+.+.+ ....+.+.+|+.|++++....+. .-+.-+.-..
T Consensus 19 aGiGTlTf~dp~~~~fgALGH~I~D~d-t~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt~~G 97 (218)
T PF05580_consen 19 AGIGTLTFYDPETGTFGALGHGISDVD-TGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNTQFG 97 (218)
T ss_pred cCeEEEEEEECCCCcEEecCCeEEcCC-CCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEeccccc
Confidence 778888888863 56666688887763 33456667888888887766431 1111111111
Q ss_pred -------C-----CCCCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccC--C---cccccEEEEeec
Q psy18066 160 -------P-----NNYPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLG--L---NKTINYIQTDAA 222 (375)
Q Consensus 160 -------~-----~~~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~--~---~~~~~~i~~d~~ 222 (375)
. ...++++++....+++|..-+.--...........-+ ..+.+...... . -...+++....-
T Consensus 98 I~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~G~~ie~f~ieI-~~v~~~~~~~~k~~vi~vtd~~Ll~~TGG 176 (218)
T PF05580_consen 98 IYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVIDGTKIEEFDIEI-EKVLPQSSPSGKGMVIKVTDPRLLEKTGG 176 (218)
T ss_pred eeEEeccccccccccCceeEEEEHHHceEccEEEEEEEcCCeEEEeEEEE-EEEccCCCCCCCcEEEEECCcchhhhhCC
Confidence 1 1234455555566677753321100000000111111 11111110000 0 111233333445
Q ss_pred CCCCCccceeeccCCeEEEEEeeec--CCCeEEEEehHHH
Q psy18066 223 ITFGNSGGPLVNLDGEVIGINSMKV--TAGISFAIPIDYA 260 (375)
Q Consensus 223 i~~G~SGGPlvn~~G~VIGI~s~~~--~~g~g~aip~~~i 260 (375)
+-.|+||+|++ .+|++||=++... .+..||.++++..
T Consensus 177 IvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 177 IVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred EEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 66899999998 6999999877543 3567888987643
No 60
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.30 E-value=0.0062 Score=64.14 Aligned_cols=53 Identities=26% Similarity=0.363 Sum_probs=40.8
Q ss_pred EEEEeecCCCCCccceeeccCCeEEEEEeeecC----------C--CeEEEEehHHHHHHHHHHH
Q psy18066 216 YIQTDAAITFGNSGGPLVNLDGEVIGINSMKVT----------A--GISFAIPIDYAIEFLTNYK 268 (375)
Q Consensus 216 ~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~----------~--g~g~aip~~~i~~~l~~l~ 268 (375)
.+.++..|..||||+|++|.+|||||++.=..- . .-+..+-+..++.+++++-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 467888899999999999999999999864321 1 2355666777888887764
No 61
>KOG3209|consensus
Probab=95.74 E-value=0.013 Score=60.23 Aligned_cols=53 Identities=25% Similarity=0.246 Sum_probs=45.4
Q ss_pred EEEEccCChhhhCC-CCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEEECC
Q psy18066 313 IWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFSHFK 365 (375)
Q Consensus 313 V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~R~g 365 (375)
|..|.++|||++.| |+.||.|++|||+.|.+. .|+.+.++. |-+|+|+|....
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKdaGlsVtLtIip~e 838 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKDAGLSVTLTIIPPE 838 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHhcCceEEEEEcChh
Confidence 67889999999987 999999999999999876 577777776 888999997543
No 62
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.35 E-value=0.021 Score=47.40 Aligned_cols=115 Identities=26% Similarity=0.316 Sum_probs=57.1
Q ss_pred CCceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCCCC
Q psy18066 83 EKSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFPNN 162 (375)
Q Consensus 83 ~~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~~~ 162 (375)
.+.+-+|.....+.+. ...|.|+.- +|..-|-.|+..++ .+.. +++.+ .....|-..||. .....
T Consensus 4 ~~gvyri~~~~l~~g~-~q~gvg~~~--~gvfhtmwhvt~Ga---~L~~---~~~~~--~p~~~sv~~dli--~ygg~-- 68 (132)
T PF00949_consen 4 KDGVYRIMQPGLFWGK-RQIGVGVMK--EGVFHTMWHVTRGA---ALRW---GGKRL--DPSWGSVREDLI--SYGGP-- 68 (132)
T ss_dssp -SEEEEEEEEETT-EE-EEEEEEEEE--TTEEEEEHHHHTT-----EEE---TTEEE---EEEEETTTTEE--EESSS--
T ss_pred cCceEEEeeccccccc-eeccceeee--CCceeeeecCCCcc---eEEE---CCeee--ccchhhhhcChh--hcCCc--
Confidence 4566677666522122 446777655 57899999999987 3333 23222 122223344442 11111
Q ss_pred CCeeecCCCCCCCCCCEEEEEecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeeccCCeEEEE
Q psy18066 163 YPALKLGKAADIRNGEFVIAMGSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGI 242 (375)
Q Consensus 163 ~~~~~l~~s~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI 242 (375)
-+| ..--.|+++-.+|+ .+...+..+.+|.||+|+||.+|++|||
T Consensus 69 ---w~l---~~~~~g~evq~~G~-----------------------------~~~~~~~d~~~GsSGSpi~n~~g~ivGl 113 (132)
T PF00949_consen 69 ---WKL---DLKWHGEEVQQYGY-----------------------------GIGAIDLDFPKGSSGSPIFNQNGEIVGL 113 (132)
T ss_dssp ---------S----TS-EEEEC------------------------------EEEEE---S-TTGTT-EEEETTSCEEEE
T ss_pred ---ccC---CcccCCCEEEEECC-----------------------------eEEeeecccCCCCCCCceEcCCCcEEEE
Confidence 111 11122444444432 2234445577899999999999999999
Q ss_pred Eeeec
Q psy18066 243 NSMKV 247 (375)
Q Consensus 243 ~s~~~ 247 (375)
.....
T Consensus 114 Yg~g~ 118 (132)
T PF00949_consen 114 YGNGV 118 (132)
T ss_dssp EEEEE
T ss_pred Eccce
Confidence 87654
No 63
>KOG3580|consensus
Probab=95.27 E-value=0.026 Score=57.02 Aligned_cols=55 Identities=24% Similarity=0.269 Sum_probs=42.7
Q ss_pred CCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HH-HHHHHh--cCCEEEEEE
Q psy18066 307 LTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KD-IYAALE--VVRLVNFQF 361 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~-l~~~l~--~~~~v~l~v 361 (375)
..-|++|..|..+|||++-||+.||.|+.||..+..+. ++ +.-.|. .|+.++|.-
T Consensus 427 NDVGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtila 486 (1027)
T KOG3580|consen 427 NDVGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILA 486 (1027)
T ss_pred CceeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehh
Confidence 34699999999999999999999999999999988775 23 333332 277776643
No 64
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=95.06 E-value=0.035 Score=45.51 Aligned_cols=40 Identities=23% Similarity=0.323 Sum_probs=30.0
Q ss_pred EEEEeecCCCCCccceeeccCCeEEEEEeeecCCCeEEEE
Q psy18066 216 YIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISFAI 255 (375)
Q Consensus 216 ~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~~g~g~ai 255 (375)
+......-.+|+||-|++|..|+||||+.....+|.-.++
T Consensus 96 ftip~g~g~~GDSGRpi~DNsGrVVaIVLGG~neG~RTaL 135 (158)
T PF00944_consen 96 FTIPTGVGKPGDSGRPIFDNSGRVVAIVLGGANEGRRTAL 135 (158)
T ss_dssp EEEETTS-STTSTTEEEESTTSBEEEEEEEEEEETTEEEE
T ss_pred EEeccCCCCCCCCCCccCcCCCCEEEEEecCCCCCCceEE
Confidence 3344556679999999999999999999887665544443
No 65
>KOG3542|consensus
Probab=94.66 E-value=0.038 Score=56.61 Aligned_cols=56 Identities=20% Similarity=0.228 Sum_probs=45.6
Q ss_pred CCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCHHH--HHHHHhcCCEEEEEEE
Q psy18066 307 LTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSAKD--IYAALEVVRLVNFQFS 362 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~--l~~~l~~~~~v~l~v~ 362 (375)
...|++|.+|.|++.|++.|++-||.|++|||+...+... ..++|.....+.+++.
T Consensus 560 kGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrnnthLtltvK 617 (1283)
T KOG3542|consen 560 KGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRNNTHLTLTVK 617 (1283)
T ss_pred ccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcCCceEEEEEe
Confidence 4579999999999999999999999999999998876643 3456665555666655
No 66
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=94.52 E-value=0.13 Score=50.39 Aligned_cols=53 Identities=23% Similarity=0.206 Sum_probs=46.3
Q ss_pred EEccCChhhhCCCCCCCEEEEeCCEEcCCHHHHHHHHhc--CCE---EEEEEEE-CCeE
Q psy18066 315 RVMYNSPAYLAGLHQEDIIIELNKKPCHSAKDIYAALEV--VRL---VNFQFSH-FKHS 367 (375)
Q Consensus 315 ~v~~~s~a~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~--~~~---v~l~v~R-~g~~ 367 (375)
++..+|++..+|+++||.|+++|++++.+++++...+.. +.. +.+.+.| ++..
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~ 193 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA 193 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence 789999999999999999999999999999999887764 444 7888899 6655
No 67
>KOG3571|consensus
Probab=94.41 E-value=0.061 Score=53.38 Aligned_cols=57 Identities=14% Similarity=0.280 Sum_probs=41.7
Q ss_pred CCCCeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCHH--H----HHHHHhcCCEEEEEEEE
Q psy18066 307 LTHGVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSAK--D----IYAALEVVRLVNFQFSH 363 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~~--~----l~~~l~~~~~v~l~v~R 363 (375)
...|++|.++++++.-+.-| |.+||.|+.||.....++. + |++++.+...++++|-.
T Consensus 275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltvAk 338 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTVAK 338 (626)
T ss_pred CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEEee
Confidence 35799999999998755554 9999999999988776652 3 33444444557887753
No 68
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.34 E-value=0.23 Score=44.39 Aligned_cols=117 Identities=18% Similarity=0.180 Sum_probs=48.3
Q ss_pred EEEecccccCCCCCceEEEEcCCCCEEEE---EEEEecCCCCeEEEEecCC----CCCCeeecCCCCCCCCCCEEEEEec
Q psy18066 113 LIITNAHVVSGKPGAQIIVTLPDGSKHKG---AVEALDVECDLAIIRCNFP----NNYPALKLGKAADIRNGEFVIAMGS 185 (375)
Q Consensus 113 ~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a---~vv~~d~~~DlAlLki~~~----~~~~~~~l~~s~~~~~G~~v~~iG~ 185 (375)
.++|++||..+. ..+. .+.+|+..+- +.+..+...|++||+.... -.++.+.+...+.+..| .+..
T Consensus 43 ~L~ta~Hv~~~~--~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g----~~~~ 115 (203)
T PF02122_consen 43 ALLTARHVWSRP--SKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG----PVSF 115 (203)
T ss_dssp EEEE-HHHHTSS--S----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE----ESST
T ss_pred ceecccccCCCc--ccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC----Ceee
Confidence 999999999985 3332 3345555442 3555678999999999842 02344444222221110 0000
Q ss_pred CCCCCCCeeeeEEeeeeccccccCCcccccEEEEeecCCCCCccceeeccCCeEEEEEeee
Q psy18066 186 PLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDAAITFGNSGGPLVNLDGEVIGINSMK 246 (375)
Q Consensus 186 p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~ 246 (375)
.....+........+. + ....+...-+...+|.||.|+++.+ +++|+++..
T Consensus 116 -----y~~~~~~~~~~sa~i~--g--~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~ 166 (203)
T PF02122_consen 116 -----YGFSSGEWPCSSAKIP--G--TEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS 166 (203)
T ss_dssp -----TSEEEEEEEEEE-S--------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred -----eeecCCCceeccCccc--c--ccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence 0111111111111111 1 1223556667788999999999988 999999884
No 69
>KOG3580|consensus
Probab=94.30 E-value=0.061 Score=54.40 Aligned_cols=66 Identities=15% Similarity=0.172 Sum_probs=51.8
Q ss_pred CCCCeEEEEEccCChhhhC-CCCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEEECCeEEEEEE
Q psy18066 307 LTHGVLIWRVMYNSPAYLA-GLHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFSHFKHSFLVES 372 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~R~g~~~~v~~ 372 (375)
...-++|.++...+.|++- +|+.||+|++|||....++ .|-...+.+ ..+++|.|+|+.+...+.+
T Consensus 217 LgSqIFvKeit~~gLAardgnlqEGDiiLkINGtvteNmSLtDar~LIEkS~GKL~lvVlRD~~qtLiNi 286 (1027)
T KOG3580|consen 217 LGSQIFVKEITRTGLAARDGNLQEGDIILKINGTVTENMSLTDARKLIEKSRGKLQLVVLRDSQQTLINI 286 (1027)
T ss_pred ccchhhhhhhcccchhhccCCcccccEEEEECcEeeccccchhHHHHHHhccCceEEEEEecCCceeeec
Confidence 3456889999888877766 6999999999999887765 577777776 4569999999877655544
No 70
>KOG3209|consensus
Probab=94.24 E-value=0.094 Score=54.13 Aligned_cols=56 Identities=14% Similarity=0.173 Sum_probs=47.0
Q ss_pred CeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCHH--HHHHHHhcCCEEEEEEEECC
Q psy18066 310 GVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSAK--DIYAALEVVRLVNFQFSHFK 365 (375)
Q Consensus 310 g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~~--~l~~~l~~~~~v~l~v~R~g 365 (375)
+++|-+..+++||.+.| ++.||.|++|||+..+.+. +..+.+++|....+.++|.|
T Consensus 924 ~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~gg~~vll~Lr~g 982 (984)
T KOG3209|consen 924 DLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTHDRAIELIKQGGRRVLLLLRRG 982 (984)
T ss_pred ceEEEEeccCCCccccCceeecceEEEecCcccCCCcHHHHHHHHHhCCeEEEEEeccC
Confidence 68999999999999987 9999999999999999875 55667777766677777765
No 71
>KOG3552|consensus
Probab=94.14 E-value=0.08 Score=56.11 Aligned_cols=55 Identities=18% Similarity=0.281 Sum_probs=46.6
Q ss_pred CCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCC--HHHHHHHHhc-CCEEEEEEEEC
Q psy18066 309 HGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHS--AKDIYAALEV-VRLVNFQFSHF 364 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~--~~~l~~~l~~-~~~v~l~v~R~ 364 (375)
.-|+|..|.+|+|+. ..|++||.|+.|||++|+. ++.+.+.++. .+.|.|+|.+.
T Consensus 75 rPviVr~VT~GGps~-GKL~PGDQIl~vN~Epv~daprervIdlvRace~sv~ltV~qP 132 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSI-GKLQPGDQILAVNGEPVKDAPRERVIDLVRACESSVNLTVCQP 132 (1298)
T ss_pred CceEEEEecCCCCcc-ccccCCCeEEEecCcccccccHHHHHHHHHHHhhhcceEEecc
Confidence 569999999999987 4599999999999999985 5677788876 56788988773
No 72
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=93.89 E-value=1.3 Score=40.41 Aligned_cols=97 Identities=12% Similarity=0.139 Sum_probs=67.2
Q ss_pred CceEEEEEeeecCCCcCceEEEEEEeCCCEEEecccccCCCC--CceEEEEcCCCCEEEE------EEEEec-----CCC
Q psy18066 84 KSVVNIELVIPYYRQTMSNGSGFIATDDGLIITNAHVVSGKP--GAQIIVTLPDGSKHKG------AVEALD-----VEC 150 (375)
Q Consensus 84 ~svV~I~~~~~~~~~~~~~GSGfiI~~~G~IlT~~Hvv~~~~--~~~i~V~~~~g~~~~a------~vv~~d-----~~~ 150 (375)
|=...|...+ ....+|++|+++ |||++..|+.+-. ...+.+.++.++.+.- ++...| ++.
T Consensus 17 PWlA~IYvdG------~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S 89 (267)
T PF09342_consen 17 PWLADIYVDG------RYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPES 89 (267)
T ss_pred cceeeEEEcC------eEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeecccc
Confidence 5555565553 788999999998 9999999998742 2567788887776541 233333 578
Q ss_pred CeEEEEecCCC----CCCeeecCC-CCCCCCCCEEEEEecCC
Q psy18066 151 DLAIIRCNFPN----NYPALKLGK-AADIRNGEFVIAMGSPL 187 (375)
Q Consensus 151 DlAlLki~~~~----~~~~~~l~~-s~~~~~G~~v~~iG~p~ 187 (375)
+++||.++.+. -+.|.-+.+ +......+.++++|.-.
T Consensus 90 ~v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 90 NVLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred ceeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 99999999874 234554533 23445566899999764
No 73
>KOG3550|consensus
Probab=93.82 E-value=0.24 Score=41.52 Aligned_cols=56 Identities=18% Similarity=0.222 Sum_probs=42.3
Q ss_pred CCCCeEEEEEccCChhhhC-CCCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEE
Q psy18066 307 LTHGVLIWRVMYNSPAYLA-GLHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFS 362 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~ 362 (375)
....+||+.+.|++-|++. ||+-||.+++|||..|..- +...+.|.. ...|++.|.
T Consensus 113 qnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~gsvklvvr 172 (207)
T KOG3550|consen 113 QNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVGSVKLVVR 172 (207)
T ss_pred cCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcCcEEEEEe
Confidence 4567999999999999987 6999999999999988754 333444544 345677654
No 74
>KOG3834|consensus
Probab=93.65 E-value=0.19 Score=49.10 Aligned_cols=60 Identities=28% Similarity=0.303 Sum_probs=48.8
Q ss_pred CCCCCCCeEEEEEccCChhhhCCCCC-CCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEE
Q psy18066 304 PYDLTHGVLIWRVMYNSPAYLAGLHQ-EDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSH 363 (375)
Q Consensus 304 ~~~~~~g~~V~~v~~~s~a~~aGl~~-gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R 363 (375)
|.+...|.-|.+|.++|+|.+|||.+ -|-|++|||..++.-.|..+.+.+ -++|+++|+.
T Consensus 10 p~ggteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sekVkltv~n 72 (462)
T KOG3834|consen 10 PGGGTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEKVKLTVYN 72 (462)
T ss_pred ccCCceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccceEEEEEe
Confidence 34567899999999999999999998 589999999999977766555543 3459999874
No 75
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=93.29 E-value=0.62 Score=45.99 Aligned_cols=43 Identities=23% Similarity=0.492 Sum_probs=30.9
Q ss_pred EEEeecCCCCCccceeeccCCeEEEEEeee--cCCCeEEEEehHHH
Q psy18066 217 IQTDAAITFGNSGGPLVNLDGEVIGINSMK--VTAGISFAIPIDYA 260 (375)
Q Consensus 217 i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~--~~~g~g~aip~~~i 260 (375)
+....-+-.|+||+|++ .+|++||=++-. ..+..||+|-++..
T Consensus 351 l~~tgGivqGMSGSPi~-q~gkliGAvtHVfvndpt~GYGi~ie~M 395 (402)
T TIGR02860 351 LEKTGGIVQGMSGSPII-QNGKVIGAVTHVFVNDPTSGYGVYIEWM 395 (402)
T ss_pred hhHhCCEEecccCCCEE-ECCEEEEEEEEEEecCCCcceeehHHHH
Confidence 33334556899999999 799999976543 34667888866544
No 76
>KOG2921|consensus
Probab=93.17 E-value=0.12 Score=50.03 Aligned_cols=48 Identities=29% Similarity=0.386 Sum_probs=41.0
Q ss_pred CCCCCeEEEEEccCChhhhC-CCCCCCEEEEeCCEEcCCHHHHHHHHhc
Q psy18066 306 DLTHGVLIWRVMYNSPAYLA-GLHQEDIIIELNKKPCHSAKDIYAALEV 353 (375)
Q Consensus 306 ~~~~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vng~~v~~~~~l~~~l~~ 353 (375)
....|+.|++|...||+..- ||.+||+|+++||-+|++.+|..+.++.
T Consensus 217 a~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 217 AHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred hcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHh
Confidence 45689999999999996433 8999999999999999999998776653
No 77
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=92.30 E-value=0.32 Score=39.78 Aligned_cols=39 Identities=28% Similarity=0.417 Sum_probs=28.6
Q ss_pred ccEEEEeecCCCCCccceeeccCCeEEEEEeeecCCCeEE
Q psy18066 214 INYIQTDAAITFGNSGGPLVNLDGEVIGINSMKVTAGISF 253 (375)
Q Consensus 214 ~~~i~~d~~i~~G~SGGPlvn~~G~VIGI~s~~~~~g~g~ 253 (375)
.+++....+..||+-||+|+-..| ||||+++....-.+|
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Tagg~g~VaF 116 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRCKHG-VIGIVTAGGEGHVAF 116 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEETTC-EEEEEEEEETTEEEE
T ss_pred cCceeecccCCCCCCCceeEeCCC-eEEEEEeCCCceEEE
Confidence 456777778899999999995555 999999985544455
No 78
>KOG0606|consensus
Probab=92.09 E-value=0.23 Score=54.03 Aligned_cols=52 Identities=27% Similarity=0.358 Sum_probs=42.1
Q ss_pred eEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEE
Q psy18066 311 VLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFS 362 (375)
Q Consensus 311 ~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~ 362 (375)
-.|..|.++|||..+|++++|.|+.+||++|... .++.+.|.+ |.++.+++.
T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~~gn~v~~~tt 714 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLKSGNKVTLRTT 714 (1205)
T ss_pred eeeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHhcCCeeEEEee
Confidence 3578899999999999999999999999999865 577776654 666666543
No 79
>KOG3651|consensus
Probab=91.83 E-value=0.44 Score=44.50 Aligned_cols=54 Identities=19% Similarity=0.318 Sum_probs=43.7
Q ss_pred CCeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFS 362 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~ 362 (375)
.=++|..|-.++||++-| ++.||-|++|||..|+.- .++.+++.. -++|.+++.
T Consensus 30 PClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~eV~IhyN 87 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLNEVKIHYN 87 (429)
T ss_pred CeEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhccceEEEeh
Confidence 468899999999999887 999999999999999854 466677764 456777764
No 80
>KOG3606|consensus
Probab=90.67 E-value=0.69 Score=42.63 Aligned_cols=60 Identities=15% Similarity=0.182 Sum_probs=47.7
Q ss_pred CCCCCeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcC--CHHHHHHHHhc-CCEEEEEEEECC
Q psy18066 306 DLTHGVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCH--SAKDIYAALEV-VRLVNFQFSHFK 365 (375)
Q Consensus 306 ~~~~g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~--~~~~l~~~l~~-~~~v~l~v~R~g 365 (375)
....|++|....+++.|+..| |...|.+++|||.+|. +.+++.+++.. ...+-++|...+
T Consensus 191 ekvpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANshNLIiTVkPAN 254 (358)
T KOG3606|consen 191 EKVPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHNLIITVKPAN 254 (358)
T ss_pred cccCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcccceEEEecccc
Confidence 345799999999999999999 5689999999999985 77888887765 344666666443
No 81
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=90.47 E-value=1.8 Score=34.29 Aligned_cols=54 Identities=19% Similarity=0.386 Sum_probs=37.1
Q ss_pred EEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCCCCCCeeecCCC
Q psy18066 105 GFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFPNNYPALKLGKA 171 (375)
Q Consensus 105 GfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s 171 (375)
++-|. +|..+|+.||.+.+ +.+ +|..+ +++. ...|+++++.+... ++.+++++.
T Consensus 3 avHIG-nG~~vt~tHva~~~--~~v-----~g~~f--~~~~--~~ge~~~v~~~~~~-~p~~~ig~g 56 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSS--DSV-----DGQPF--KIVK--TDGELCWVQSPLVH-LPAAQIGTG 56 (105)
T ss_pred eEEeC-CCEEEEEEEEeccC--ceE-----cCcCc--EEEE--eccCEEEEECCCCC-CCeeEeccC
Confidence 55565 57999999999887 444 23222 3333 44599999998876 788888653
No 82
>KOG3551|consensus
Probab=89.80 E-value=0.35 Score=46.69 Aligned_cols=53 Identities=23% Similarity=0.268 Sum_probs=42.6
Q ss_pred CeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCH--HHHHHHHhc-CCEEEEEEE
Q psy18066 310 GVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSA--KDIYAALEV-VRLVNFQFS 362 (375)
Q Consensus 310 g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~--~~l~~~l~~-~~~v~l~v~ 362 (375)
-++|+.+-++-.|++++ |-.||.|++|||....+. ++..++|+. |++|.++|.
T Consensus 111 PIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKraGkeV~levK 167 (506)
T KOG3551|consen 111 PILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRAGKEVLLEVK 167 (506)
T ss_pred ceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHhhCceeeeeee
Confidence 47889999999999886 889999999999988754 566667765 888776653
No 83
>KOG1892|consensus
Probab=89.33 E-value=0.48 Score=50.73 Aligned_cols=58 Identities=12% Similarity=0.174 Sum_probs=44.8
Q ss_pred CCCeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCHH--HHHHHH-hcCCEEEEEEEECC
Q psy18066 308 THGVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSAK--DIYAAL-EVVRLVNFQFSHFK 365 (375)
Q Consensus 308 ~~g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~~--~l~~~l-~~~~~v~l~v~R~g 365 (375)
.-|+||..|++|++|+.-| |+.||.+++|||+..-... +..+++ +.|..|.++|...|
T Consensus 959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leVaKqg 1020 (1629)
T KOG1892|consen 959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEVAKQG 1020 (1629)
T ss_pred ccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEehhhhh
Confidence 3599999999999998876 9999999999999876553 333333 34888888886433
No 84
>KOG3605|consensus
Probab=88.82 E-value=0.61 Score=47.97 Aligned_cols=109 Identities=20% Similarity=0.353 Sum_probs=67.6
Q ss_pred CCCcccee-----eccCCeEEEEEeeecCCCeEEEEehHHHHHHHHHHHhcCCCcceeeeeeeeeE-------EeeccHH
Q psy18066 225 FGNSGGPL-----VNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRKDKDRTITHKKYIGIT-------MLTLNEK 292 (375)
Q Consensus 225 ~G~SGGPl-----vn~~G~VIGI~s~~~~~g~g~aip~~~i~~~l~~l~~~~~~~~~~~~~~lGi~-------~~~~~~~ 292 (375)
.=++|||. +|.--+++.||-.. -..+|.+.++.+++.+++.-. + -|-|. +.=.-|+
T Consensus 679 nmm~~GpAarsgkLnIGDQiiaING~S-----LVGLPLstcQs~Ik~~KnQT~----V---kltiV~cpPV~~V~I~RPd 746 (829)
T KOG3605|consen 679 NMMHGGPAARSGKLNIGDQIMSINGTS-----LVGLPLSTCQSIIKGLKNQTA----V---KLNIVSCPPVTTVLIRRPD 746 (829)
T ss_pred hcccCChhhhcCCccccceeEeecCce-----eccccHHHHHHHHhcccccce----E---EEEEecCCCceEEEeeccc
Confidence 34566766 33334555554332 234899999999999886531 1 11111 1111244
Q ss_pred HHHHHhhccCCCCCCCCCeEEEEEccCChhhhCCCCCCCEEEEeCCEEcCCH--HHHHHHHh
Q psy18066 293 LIEQLRRDRHIPYDLTHGVLIWRVMYNSPAYLAGLHQEDIIIELNKKPCHSA--KDIYAALE 352 (375)
Q Consensus 293 ~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~~~--~~l~~~l~ 352 (375)
...+|+ +...+|++- +...++-|++.|++.|-.|++|||+.|.-. +.+.++|.
T Consensus 747 ~kyQLG------FSVQNGiIC-SLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs 801 (829)
T KOG3605|consen 747 LRYQLG------FSVQNGIIC-SLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLS 801 (829)
T ss_pred chhhcc------ceeeCcEee-hhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHH
Confidence 444443 246678754 678999999999999999999999987532 34444444
No 85
>KOG3549|consensus
Probab=87.84 E-value=0.75 Score=43.85 Aligned_cols=53 Identities=19% Similarity=0.254 Sum_probs=46.0
Q ss_pred CeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCC--HHHHHHHHhc-CCEEEEEEE
Q psy18066 310 GVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHS--AKDIYAALEV-VRLVNFQFS 362 (375)
Q Consensus 310 g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~--~~~l~~~l~~-~~~v~l~v~ 362 (375)
.++|..+.++-.|+..| |=.||-|+.|||.-|+. -+|+...|++ |++|.++|.
T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtlTV~ 137 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTLTVK 137 (505)
T ss_pred cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEEEeH
Confidence 48899999998899888 55899999999999985 4788899987 899999985
No 86
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=86.81 E-value=1 Score=37.20 Aligned_cols=123 Identities=20% Similarity=0.297 Sum_probs=62.3
Q ss_pred EEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCC-CCCCeeecCCCCCCCCCCEEEEE
Q psy18066 105 GFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFP-NNYPALKLGKAADIRNGEFVIAM 183 (375)
Q Consensus 105 GfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~-~~~~~~~l~~s~~~~~G~~v~~i 183 (375)
|+.|+ |-+-|.+|--... ++--+.| +..-.+.+...|+..-..+.. ..+.|...+. ..++++
T Consensus 16 gt~vn--GV~wT~~HGagsr-----tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~-------~dlylV 78 (148)
T PF02907_consen 16 GTCVN--GVMWTVYHGAGSR-----TLAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS-------SDLYLV 78 (148)
T ss_dssp EEEET--TEEEEEHHHHTTS-----EEEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSSS-------SEEEEE
T ss_pred hhEEc--cEEEEEEecCCcc-----cccCCCC---cceEeEEcCCCCCcccccccccccCCccccCC-------ccEEEE
Confidence 55564 7889999975543 1221222 233445677889988877654 3466666643 235555
Q ss_pred ecCCCCCCCeeeeEEeeeeccccccCCcccccEEEEee--cCCCCCccceeeccCCeEEEEEeeecC-CC----eEEEEe
Q psy18066 184 GSPLTLNNTNTFGIISNKQRSSETLGLNKTINYIQTDA--AITFGNSGGPLVNLDGEVIGINSMKVT-AG----ISFAIP 256 (375)
Q Consensus 184 G~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~--~i~~G~SGGPlvn~~G~VIGI~s~~~~-~g----~g~aip 256 (375)
-+-. . .+-...+. +.... +..-. +...|.||||++..+|.+|||..+... .| +-| +|
T Consensus 79 tr~~----~----v~p~rr~g------d~~~~-L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f-~P 142 (148)
T PF02907_consen 79 TRDA----D----VIPVRRRG------DSRAS-LLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDF-IP 142 (148)
T ss_dssp -TTS---------EEEEEEES------TTEEE-EEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEE-EE
T ss_pred eccC----c----EeeeEEcC------CCceE-ecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEE-Ee
Confidence 3321 1 11111111 11111 11111 223699999999999999999876543 22 334 47
Q ss_pred hHHH
Q psy18066 257 IDYA 260 (375)
Q Consensus 257 ~~~i 260 (375)
.+.+
T Consensus 143 ~e~l 146 (148)
T PF02907_consen 143 VETL 146 (148)
T ss_dssp HHHH
T ss_pred eeec
Confidence 7654
No 87
>KOG0609|consensus
Probab=86.78 E-value=1.6 Score=44.06 Aligned_cols=54 Identities=24% Similarity=0.311 Sum_probs=45.3
Q ss_pred CeEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCC--HHHHHHHHhc-CCEEEEEEEE
Q psy18066 310 GVLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHS--AKDIYAALEV-VRLVNFQFSH 363 (375)
Q Consensus 310 g~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~--~~~l~~~l~~-~~~v~l~v~R 363 (375)
-++|..+..|+-+++.| |+.||.|.++||..+.+ ..++.++|.. ...+++.|..
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRGSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCCcEEEEEcc
Confidence 58999999999999988 78999999999999975 5788888876 3457777654
No 88
>KOG3834|consensus
Probab=83.97 E-value=1.7 Score=42.70 Aligned_cols=51 Identities=27% Similarity=0.401 Sum_probs=43.0
Q ss_pred EEEEccCChhhhCCCC-CCCEEEEeCCEEcCCHHHHHHHHhc--CCEEEEEEEE
Q psy18066 313 IWRVMYNSPAYLAGLH-QEDIIIELNKKPCHSAKDIYAALEV--VRLVNFQFSH 363 (375)
Q Consensus 313 V~~v~~~s~a~~aGl~-~gDiI~~vng~~v~~~~~l~~~l~~--~~~v~l~v~R 363 (375)
|-+|.++|||+.|||+ -+|-|+.+-+..-+..+||...++. ++.+++-|+.
T Consensus 113 vl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN 166 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYN 166 (462)
T ss_pred eeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEee
Confidence 6688999999999999 6899999955666788999998876 5678887774
No 89
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes.
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=78.05 E-value=24 Score=38.18 Aligned_cols=63 Identities=13% Similarity=0.150 Sum_probs=35.7
Q ss_pred ceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEec--CCCCeEEEEecCC-CCCCeeec
Q psy18066 101 SNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALD--VECDLAIIRCNFP-NNYPALKL 168 (375)
Q Consensus 101 ~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d--~~~DlAlLki~~~-~~~~~~~l 168 (375)
..|..-+|++. ||+|.+|+..+. . .|.|.+.....-+++.+. +..|+.+-|++.- .+..|+..
T Consensus 65 ~~G~aTLigpq-YiVSV~HN~~gy--~--~v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~ 130 (769)
T PF02395_consen 65 NKGVATLIGPQ-YIVSVKHNGKGY--N--SVSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFVTEVAPAEM 130 (769)
T ss_dssp TTSS-EEEETT-EEEBETTG-TSC--C--EECESCSSTCEEEEEEEEBETTSTEBEEEESS---SS----B
T ss_pred CCceEEEecCC-eEEEEEccCCCc--C--ceeecccCCceEEEEEccCCCCcccceeecCceEEEEecccc
Confidence 33778899987 999999998544 2 355554222333555553 4469999999863 22444444
No 90
>KOG3605|consensus
Probab=77.14 E-value=1.6 Score=44.96 Aligned_cols=53 Identities=21% Similarity=0.257 Sum_probs=34.5
Q ss_pred eEEEEEccCChhhhCC-CCCCCEEEEeCCEEcCCH--HHHHHHHh---cCCEEEEEEEE
Q psy18066 311 VLIWRVMYNSPAYLAG-LHQEDIIIELNKKPCHSA--KDIYAALE---VVRLVNFQFSH 363 (375)
Q Consensus 311 ~~V~~v~~~s~a~~aG-l~~gDiI~~vng~~v~~~--~~l~~~l~---~~~~v~l~v~R 363 (375)
|+|.+.+.++||++.| |..||.|++|||...... ..-..+++ ....|+++|.+
T Consensus 675 VViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~ 733 (829)
T KOG3605|consen 675 VVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVS 733 (829)
T ss_pred HHHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEec
Confidence 3345566788999987 999999999999876532 22233333 33446665543
No 91
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=73.16 E-value=2.4 Score=41.73 Aligned_cols=25 Identities=28% Similarity=0.472 Sum_probs=21.5
Q ss_pred eecCCCCCccceeeccCCeEEEEEe
Q psy18066 220 DAAITFGNSGGPLVNLDGEVIGINS 244 (375)
Q Consensus 220 d~~i~~G~SGGPlvn~~G~VIGI~s 244 (375)
+..+..|.||+.++|.+|++|||..
T Consensus 349 ~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 349 NYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred ccCCCCCCCcCeEECCCCCEEEEeC
Confidence 3356689999999999999999975
No 92
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=70.91 E-value=8.7 Score=37.93 Aligned_cols=138 Identities=18% Similarity=0.276 Sum_probs=63.0
Q ss_pred CceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCC--CCCCeeecCCCCCCCCC
Q psy18066 100 MSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFP--NNYPALKLGKAADIRNG 177 (375)
Q Consensus 100 ~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~--~~~~~~~l~~s~~~~~G 177 (375)
-++|=||.|+++ ..+|+-||+...- .++ |. .+-.-+..+..-+++-++++.+ .+++.+-|. .....|
T Consensus 378 fGsGWGfWVS~~-lfITttHViP~g~-~E~---FG----v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEG 446 (535)
T PF05416_consen 378 FGSGWGFWVSPT-LFITTTHVIPPGA-KEA---FG----VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEG 446 (535)
T ss_dssp ETTEEEEESSSS-EEEEEGGGS-STT-SEE---TT----EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT
T ss_pred cCCceeeeecce-EEEEeeeecCCcc-hhh---hC----CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCc
Confidence 356889999998 9999999998541 111 11 1112223444556677777654 235555552 222234
Q ss_pred CEEE-EEecCCCC--CCCeeeeEEeeeeccccccCCcccccEEE-------EeecCCCCCccceeeccCCe---EEEEEe
Q psy18066 178 EFVI-AMGSPLTL--NNTNTFGIISNKQRSSETLGLNKTINYIQ-------TDAAITFGNSGGPLVNLDGE---VIGINS 244 (375)
Q Consensus 178 ~~v~-~iG~p~g~--~~~~~~G~vs~~~~~~~~~~~~~~~~~i~-------~d~~i~~G~SGGPlvn~~G~---VIGI~s 244 (375)
.-+- +|-.|.|. +..+.-|......-.-... ..-..++. .|.-..||+-|.|-+=..|+ |+|+++
T Consensus 447 tV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v--~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~ 524 (535)
T PF05416_consen 447 TVCSILIKRPSGELLPLAVRMGTHASMKIQGRTV--HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHA 524 (535)
T ss_dssp -EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE--EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEE
T ss_pred eEEEEEEEcCCccchhhhhhhccceeEEEcceee--cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEe
Confidence 3222 23334332 1112222222221100000 00111111 23344589999999987775 789999
Q ss_pred eecCCC
Q psy18066 245 MKVTAG 250 (375)
Q Consensus 245 ~~~~~g 250 (375)
+...+|
T Consensus 525 AAtr~G 530 (535)
T PF05416_consen 525 AATRSG 530 (535)
T ss_dssp EE-SSS
T ss_pred hhccCC
Confidence 876544
No 93
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=68.34 E-value=6.8 Score=34.40 Aligned_cols=31 Identities=29% Similarity=0.287 Sum_probs=27.9
Q ss_pred CCCCeEEEEEccCChhhhCCCCCCCEEEEeC
Q psy18066 307 LTHGVLIWRVMYNSPAYLAGLHQEDIIIELN 337 (375)
Q Consensus 307 ~~~g~~V~~v~~~s~a~~aGl~~gDiI~~vn 337 (375)
..+.+.|.+|..+|||+++|+..++.|+++.
T Consensus 120 e~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 120 EGGKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred eCCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 3467899999999999999999999999884
No 94
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=68.23 E-value=14 Score=25.85 Aligned_cols=33 Identities=24% Similarity=0.285 Sum_probs=29.0
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
..+.|.+.||+.+.+.+.+.|+..++.|-....
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence 679999999999999999999999998766543
No 95
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=66.44 E-value=13 Score=26.71 Aligned_cols=33 Identities=24% Similarity=0.456 Sum_probs=29.2
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
+..+.|.+.+|+.|.+++.+.|+..+|.|-...
T Consensus 10 ~~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 10 GRPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CCeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 367999999999999999999999999886654
No 96
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=66.12 E-value=15 Score=26.97 Aligned_cols=34 Identities=18% Similarity=0.443 Sum_probs=29.9
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
+..+.|.+.+|+.|.+.+.+.|+..++.|-....
T Consensus 14 ~k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 14 NSPVLVRLKGGREFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred CCEEEEEECCCCEEEEEEEEEcccceeEEeeEEE
Confidence 3679999999999999999999999999877643
No 97
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=65.07 E-value=16 Score=26.33 Aligned_cols=34 Identities=18% Similarity=0.371 Sum_probs=30.1
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
+.++.|.+.+|+.+.+.+.++|+..+|.|-....
T Consensus 10 ~~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 10 NKPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CCEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 3679999999999999999999999999877643
No 98
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=65.03 E-value=14 Score=26.76 Aligned_cols=33 Identities=27% Similarity=0.415 Sum_probs=29.0
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
+..+.|.+.+|+.|.+++.+.|...+|.|=.+.
T Consensus 11 g~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 11 GKPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 367999999999999999999999999886553
No 99
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=63.16 E-value=14 Score=27.86 Aligned_cols=31 Identities=16% Similarity=0.296 Sum_probs=27.6
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
..+.|.+.+|+.+.+.+.++|...+|.|=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 6799999999999999999999999986443
No 100
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=62.97 E-value=15 Score=27.27 Aligned_cols=31 Identities=10% Similarity=0.248 Sum_probs=27.8
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 6799999999999999999999999986544
No 101
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=62.31 E-value=16 Score=27.97 Aligned_cols=32 Identities=22% Similarity=0.503 Sum_probs=28.6
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
..+.|.+.+++.+.+++.++|...+|.|=...
T Consensus 15 ~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 15 TQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred CEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 68999999999999999999999999876543
No 102
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=62.27 E-value=17 Score=27.16 Aligned_cols=32 Identities=31% Similarity=0.524 Sum_probs=28.2
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
..+.|.+.||+.+.+.+.++|...+|.|=...
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 67999999999999999999999999865543
No 103
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=62.25 E-value=27 Score=24.90 Aligned_cols=33 Identities=21% Similarity=0.333 Sum_probs=29.3
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
..+.+..-.|..++++++.+|....+.+|+.+.
T Consensus 7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 567788888999999999999999999999665
No 104
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=62.15 E-value=19 Score=26.78 Aligned_cols=31 Identities=16% Similarity=0.385 Sum_probs=27.7
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
..+.|.+.||+.+.+.+.++|...+|.|=..
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~ 41 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSA 41 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCc
Confidence 5799999999999999999999999986544
No 105
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=61.14 E-value=19 Score=27.06 Aligned_cols=31 Identities=19% Similarity=0.287 Sum_probs=27.7
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
.++.|.+.+|+.+.+.+.++|...+|.|=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 6799999999999999999999999987544
No 106
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=59.46 E-value=22 Score=26.05 Aligned_cols=31 Identities=16% Similarity=0.247 Sum_probs=27.9
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
.++.|.+.+|+.+.+.+.++|...+|.|=..
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccE
Confidence 6799999999999999999999999987554
No 107
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=57.22 E-value=25 Score=26.05 Aligned_cols=32 Identities=25% Similarity=0.420 Sum_probs=28.3
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
+.++.|.+.+|+.+.+.+.++|+..++.|=..
T Consensus 12 ~k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 12 DKKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CCEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 36799999999999999999999999987554
No 108
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=56.16 E-value=28 Score=24.70 Aligned_cols=34 Identities=21% Similarity=0.440 Sum_probs=29.2
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
+..+.|.+.||+.+.+.+.+.|+..++-|=....
T Consensus 8 ~~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~e 41 (67)
T smart00651 8 GKRVLVELKNGREYRGTLKGFDQFMNLVLEDVEE 41 (67)
T ss_pred CcEEEEEECCCcEEEEEEEEECccccEEEccEEE
Confidence 3679999999999999999999999998765543
No 109
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.92 E-value=28 Score=25.31 Aligned_cols=33 Identities=15% Similarity=0.244 Sum_probs=30.0
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
+..+.|.+.+|..|.+++...|...++.|-...
T Consensus 10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 468999999999999999999999999988774
No 110
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.91 E-value=27 Score=25.64 Aligned_cols=31 Identities=19% Similarity=0.278 Sum_probs=28.0
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki 157 (375)
.++.|.+.||+.+.+++.++|...++.|=..
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~ 40 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDS 40 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccce
Confidence 6789999999999999999999999987664
No 111
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=53.25 E-value=22 Score=25.22 Aligned_cols=34 Identities=18% Similarity=0.296 Sum_probs=30.3
Q ss_pred ceEEEEcCCCCEEEEEEEEecCCCCeEEEEecCC
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDVECDLAIIRCNFP 160 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~~ 160 (375)
..+.|.+.+|+.+.+.+...|...++.|-.....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~~ 42 (67)
T PF01423_consen 9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTET 42 (67)
T ss_dssp SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEEE
T ss_pred cEEEEEEeCCEEEEEEEEEeechheEEeeeEEEE
Confidence 6899999999999999999999999998877643
No 112
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=52.07 E-value=23 Score=26.04 Aligned_cols=35 Identities=29% Similarity=0.521 Sum_probs=27.4
Q ss_pred cceeeccCCeEEEEEeeecCCCeEEEEehHHHHHHHHHHHhcC
Q psy18066 229 GGPLVNLDGEVIGINSMKVTAGISFAIPIDYAIEFLTNYKRKD 271 (375)
Q Consensus 229 GGPlvn~~G~VIGI~s~~~~~g~g~aip~~~i~~~l~~l~~~~ 271 (375)
.-|+.+.+|+++|+... .+.++.+.++++++.-+.
T Consensus 18 s~pi~~~~g~~~Gvv~~--------di~l~~l~~~i~~~~~~~ 52 (81)
T PF02743_consen 18 SVPIYDDDGKIIGVVGI--------DISLDQLSEIISNIKFGN 52 (81)
T ss_dssp EEEEEETTTEEEEEEEE--------EEEHHHHHHHHTTSBBTT
T ss_pred EEEEECCCCCEEEEEEE--------EeccceeeeEEEeeEECC
Confidence 35788889999999865 488899888888876543
No 113
>KOG3938|consensus
Probab=49.68 E-value=23 Score=32.89 Aligned_cols=54 Identities=13% Similarity=0.070 Sum_probs=40.6
Q ss_pred CCeEEEEEccCChhhhC-CCCCCCEEEEeCCEEcCCHH--HHHHHHhc---CCEEEEEEE
Q psy18066 309 HGVLIWRVMYNSPAYLA-GLHQEDIIIELNKKPCHSAK--DIYAALEV---VRLVNFQFS 362 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vng~~v~~~~--~l~~~l~~---~~~v~l~v~ 362 (375)
.-++|..+.++|--++. -++.||.|-+|||+.+-.+. ++.++|+. +++.++.+.
T Consensus 149 GyAFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 149 GYAFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI 208 (334)
T ss_pred ceeeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence 34678888888876655 48899999999999998775 45556654 777777654
No 114
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=47.66 E-value=19 Score=24.17 Aligned_cols=20 Identities=45% Similarity=0.642 Sum_probs=17.2
Q ss_pred CCccceeeccCCeEEEEEee
Q psy18066 226 GNSGGPLVNLDGEVIGINSM 245 (375)
Q Consensus 226 G~SGGPlvn~~G~VIGI~s~ 245 (375)
+.+.-|++|.+|+++|+.+.
T Consensus 29 ~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 29 GISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp TSSEEEEESTTSBEEEEEEH
T ss_pred CCcEEEEEecCCEEEEEEEH
Confidence 56778999999999999775
No 115
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=47.41 E-value=39 Score=25.09 Aligned_cols=34 Identities=18% Similarity=0.352 Sum_probs=29.5
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
+..+.|.+.+|+.|.+++.++|...++.|--+..
T Consensus 17 ~~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 17 NKRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CCEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 3679999999999999999999999998775543
No 116
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=47.24 E-value=32 Score=25.71 Aligned_cols=35 Identities=14% Similarity=0.325 Sum_probs=30.5
Q ss_pred CCCceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 124 KPGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 124 ~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
..+..+.|.+.+|..|.+++...|...++.|-.+.
T Consensus 17 l~g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~ 51 (78)
T cd01733 17 LQGKVVTVELRNETTVTGRIASVDAFMNIRLAKVT 51 (78)
T ss_pred CCCCEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence 34468999999999999999999999999887765
No 117
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=45.30 E-value=26 Score=31.54 Aligned_cols=53 Identities=15% Similarity=0.330 Sum_probs=38.8
Q ss_pred cEEEEeecCCCCCccceeeccC----CeEEEEEeeecC-CCeEEEEehH--HHHHHHHHH
Q psy18066 215 NYIQTDAAITFGNSGGPLVNLD----GEVIGINSMKVT-AGISFAIPID--YAIEFLTNY 267 (375)
Q Consensus 215 ~~i~~d~~i~~G~SGGPlvn~~----G~VIGI~s~~~~-~g~g~aip~~--~i~~~l~~l 267 (375)
..++...+...|+-|||++-.+ -+++||+.++.. .+.|||-++. .+++.+..+
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANHAMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccccceehhhhhHHHHHHHHHhh
Confidence 3466777888999999998433 679999999875 5889987764 355555544
No 118
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.18 E-value=55 Score=24.17 Aligned_cols=34 Identities=15% Similarity=0.220 Sum_probs=30.1
Q ss_pred CCceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 125 PGAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 125 ~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
.+..+.|.+.+|+.+.+.+...|...++.+-.+.
T Consensus 10 ~g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 10 QNHPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 3468999999999999999999999999987664
No 119
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.19 E-value=55 Score=24.57 Aligned_cols=33 Identities=15% Similarity=0.366 Sum_probs=29.8
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEec
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCN 158 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~ 158 (375)
+..+.|.+.+|..+.+++...|...++-+-.++
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~ 43 (81)
T cd01725 11 GKEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS 43 (81)
T ss_pred CCEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence 468999999999999999999999999887765
No 120
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=39.36 E-value=27 Score=35.56 Aligned_cols=31 Identities=26% Similarity=0.283 Sum_probs=24.6
Q ss_pred eEEEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 311 VLIWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 311 ~~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
+.|.-..+|.|...| .+|||||++.||+.|.
T Consensus 300 ~~vl~~~ENm~~g~A-~rPGDVits~~GkTVE 330 (485)
T COG0260 300 VGVLPAVENMPSGNA-YRPGDVITSMNGKTVE 330 (485)
T ss_pred EEEEeeeccCCCCCC-CCCCCeEEecCCcEEE
Confidence 445556677787776 9999999999998874
No 121
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=37.45 E-value=71 Score=24.02 Aligned_cols=47 Identities=21% Similarity=0.346 Sum_probs=31.9
Q ss_pred EEEEEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEE-EecC
Q psy18066 139 HKGAVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIA-MGSP 186 (375)
Q Consensus 139 ~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~-iG~p 186 (375)
+|++++..|...++|++.+-.-..---+.|-. ..++.|++|++ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l~Lv~-~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREVNLDLVG-EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEEEeeeec-CccccCCEEEEEeeEE
Confidence 57888889988889999887643122233322 27889999886 6664
No 122
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.01 E-value=78 Score=24.31 Aligned_cols=34 Identities=12% Similarity=0.342 Sum_probs=30.2
Q ss_pred CceEEEEcCCCCEEEEEEEEecCCCCeEEEEecC
Q psy18066 126 GAQIIVTLPDGSKHKGAVEALDVECDLAIIRCNF 159 (375)
Q Consensus 126 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlLki~~ 159 (375)
+..+.|.+.+|..|.+.+...|...++.|-.+..
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~~ 44 (90)
T cd01724 11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVKL 44 (90)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEEE
Confidence 4689999999999999999999999999887643
No 123
>COG2104 ThiS Sulfur transfer protein involved in thiamine biosynthesis [Coenzyme metabolism]
Probab=33.84 E-value=72 Score=23.18 Aligned_cols=37 Identities=14% Similarity=0.117 Sum_probs=24.9
Q ss_pred hhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEE
Q psy18066 323 YLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNF 359 (375)
Q Consensus 323 ~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l 359 (375)
++.|+.+.-+++++||..+..-+-....+..++++++
T Consensus 25 ~~l~~~~~~vav~vNg~iVpr~~~~~~~l~~gD~iev 61 (68)
T COG2104 25 AQLGLNPEGVAVAVNGEIVPRSQWADTILKEGDRIEV 61 (68)
T ss_pred HHhCCCCceEEEEECCEEccchhhhhccccCCCEEEE
Confidence 3456777778888888888765555566666676544
No 124
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro: IPR004365) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=33.43 E-value=49 Score=31.59 Aligned_cols=36 Identities=28% Similarity=0.500 Sum_probs=30.6
Q ss_pred CceEEEEEEeCCCEEEecccccCCCCCceEEEEcCCCC
Q psy18066 100 MSNGSGFIATDDGLIITNAHVVSGKPGAQIIVTLPDGS 137 (375)
Q Consensus 100 ~~~GSGfiI~~~G~IlT~~Hvv~~~~~~~i~V~~~~g~ 137 (375)
-..|-.++.+++|.++|+..-+... +.+++.+.||.
T Consensus 279 L~RGYaiv~~~~g~vI~s~~~l~~g--d~i~i~l~DG~ 314 (319)
T PF02601_consen 279 LKRGYAIVRDKDGKVITSVKQLKPG--DEIEIRLADGS 314 (319)
T ss_pred HhCceEEEECCCCCEECCHHHCCCC--CEEEEEEcceE
Confidence 3457778888899999999999887 89999999984
No 125
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=30.92 E-value=1.1e+02 Score=22.31 Aligned_cols=28 Identities=25% Similarity=0.354 Sum_probs=21.0
Q ss_pred ceEEEEcCCCCEEEEEEEEecC---CCCeEE
Q psy18066 127 AQIIVTLPDGSKHKGAVEALDV---ECDLAI 154 (375)
Q Consensus 127 ~~i~V~~~~g~~~~a~vv~~d~---~~DlAl 154 (375)
..+.|++.||..|++-+...++ +.+++|
T Consensus 13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL 43 (77)
T PF14438_consen 13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL 43 (77)
T ss_dssp SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence 5789999999999999999988 666665
No 126
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=30.78 E-value=32 Score=32.83 Aligned_cols=31 Identities=35% Similarity=0.465 Sum_probs=27.0
Q ss_pred EEEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 312 LIWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 312 ~V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
-+-+|.+.+||+++|+-.||-|+.+|+-++.
T Consensus 66 ~~lrv~~~~~~e~~~~~~~dyilg~n~Dp~~ 96 (417)
T COG5233 66 EVLRVNPESPAEKAGMVVGDYILGINEDPLR 96 (417)
T ss_pred hheeccccChhHhhccccceeEEeecCCcHH
Confidence 4567889999999999999999999987664
No 127
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=29.07 E-value=34 Score=33.60 Aligned_cols=23 Identities=39% Similarity=0.564 Sum_probs=19.1
Q ss_pred CceEEEEEEeC----CC------EEEecccccC
Q psy18066 100 MSNGSGFIATD----DG------LIITNAHVVS 122 (375)
Q Consensus 100 ~~~GSGfiI~~----~G------~IlT~~Hvv~ 122 (375)
...|||.|+|- ++ |+.||.||+.
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~ 67 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVAS 67 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhc
Confidence 56899999972 23 9999999998
No 128
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=28.81 E-value=49 Score=26.33 Aligned_cols=18 Identities=39% Similarity=0.626 Sum_probs=13.3
Q ss_pred cceeeccCCeEEEEEeee
Q psy18066 229 GGPLVNLDGEVIGINSMK 246 (375)
Q Consensus 229 GGPlvn~~G~VIGI~s~~ 246 (375)
-.|++|.+|++||+++..
T Consensus 93 ~~PV~d~~g~viG~V~VG 110 (116)
T PF14827_consen 93 FAPVYDSDGKVIGVVSVG 110 (116)
T ss_dssp EEEEE-TTS-EEEEEEEE
T ss_pred EEeeECCCCcEEEEEEEE
Confidence 358899999999998765
No 129
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=28.36 E-value=1.9e+02 Score=20.91 Aligned_cols=43 Identities=19% Similarity=0.339 Sum_probs=30.1
Q ss_pred EEEEEEEecCCCCeEEEEecCCCCCCeeecCCCCCCCCCCEEEEE
Q psy18066 139 HKGAVEALDVECDLAIIRCNFPNNYPALKLGKAADIRNGEFVIAM 183 (375)
Q Consensus 139 ~~a~vv~~d~~~DlAlLki~~~~~~~~~~l~~s~~~~~G~~v~~i 183 (375)
+|++++..+.....|++..... ...+.+.--..+++||+|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G~--~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGGV--RREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETTE--EEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCCc--EEEEEEEEeCCCCCCCEEEEe
Confidence 6889999988899999987753 344544444558999999864
No 130
>PRK06437 hypothetical protein; Provisional
Probab=27.86 E-value=67 Score=23.07 Aligned_cols=34 Identities=18% Similarity=0.131 Sum_probs=24.1
Q ss_pred hhCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEEE
Q psy18066 323 YLAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNFQ 360 (375)
Q Consensus 323 ~~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l~ 360 (375)
++.|+.+..+.+.+||+.+. ....|..|+++.+.
T Consensus 28 ~~Lgi~~~~vaV~vNg~iv~----~~~~L~dgD~Veiv 61 (67)
T PRK06437 28 KDLGLDEEEYVVIVNGSPVL----EDHNVKKEDDVLIL 61 (67)
T ss_pred HHcCCCCccEEEEECCEECC----CceEcCCCCEEEEE
Confidence 56678888899999999986 22345557766553
No 131
>cd00433 Peptidase_M17 Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants.
Probab=27.70 E-value=50 Score=33.58 Aligned_cols=29 Identities=17% Similarity=0.100 Sum_probs=23.0
Q ss_pred EEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 313 IWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 313 V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
+.-..+|.+...+ .+|||||++.||+.|.
T Consensus 289 i~~~~EN~is~~A-~rPgDVi~s~~GkTVE 317 (468)
T cd00433 289 VLPLAENMISGNA-YRPGDVITSRSGKTVE 317 (468)
T ss_pred EEEeeecCCCCCC-CCCCCEeEeCCCcEEE
Confidence 3445567777766 9999999999998874
No 132
>PRK05015 aminopeptidase B; Provisional
Probab=27.11 E-value=57 Score=32.58 Aligned_cols=29 Identities=24% Similarity=0.050 Sum_probs=22.7
Q ss_pred EEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 313 IWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 313 V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
|--..+|.+...+ .++||||+.-||+.|.
T Consensus 240 il~~aENmisg~A-~kpgDVIt~~nGkTVE 268 (424)
T PRK05015 240 FLCCAENLISGNA-FKLGDIITYRNGKTVE 268 (424)
T ss_pred EEEecccCCCCCC-CCCCCEEEecCCcEEe
Confidence 3445567776666 9999999999998875
No 133
>PRK00913 multifunctional aminopeptidase A; Provisional
Probab=26.75 E-value=60 Score=33.18 Aligned_cols=29 Identities=17% Similarity=0.208 Sum_probs=23.1
Q ss_pred EEEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 313 IWRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 313 V~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
|.-..+|.|..+| .+|||||++.||+.|.
T Consensus 303 v~~l~ENm~~~~A-~rPgDVi~~~~GkTVE 331 (483)
T PRK00913 303 VVAACENMPSGNA-YRPGDVLTSMSGKTIE 331 (483)
T ss_pred EEEeeccCCCCCC-CCCCCEEEECCCcEEE
Confidence 3344567777776 9999999999998875
No 134
>KOG1738|consensus
Probab=26.12 E-value=43 Score=34.89 Aligned_cols=44 Identities=14% Similarity=0.214 Sum_probs=34.5
Q ss_pred CCeEEEEEccCChhhhC-CCCCCCEEEEeCCEEcCCHH--HHHHHHh
Q psy18066 309 HGVLIWRVMYNSPAYLA-GLHQEDIIIELNKKPCHSAK--DIYAALE 352 (375)
Q Consensus 309 ~g~~V~~v~~~s~a~~a-Gl~~gDiI~~vng~~v~~~~--~l~~~l~ 352 (375)
+-.+|.++.++|||+.. -|..||-|+.+|++.+-.|+ -+...|.
T Consensus 225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwqlk~vV~sL~ 271 (638)
T KOG1738|consen 225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQLKVVVSSLR 271 (638)
T ss_pred CceeccccccCChHHHhhcccCccceeeecccccccchhHhHHhhcc
Confidence 45678889999999877 49999999999999988774 3344443
No 135
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.99 E-value=47 Score=26.04 Aligned_cols=21 Identities=33% Similarity=0.428 Sum_probs=16.7
Q ss_pred CCccceeeccCCeEEEEEeee
Q psy18066 226 GNSGGPLVNLDGEVIGINSMK 246 (375)
Q Consensus 226 G~SGGPlvn~~G~VIGI~s~~ 246 (375)
+.+.=|++|.+|+++|+.+..
T Consensus 98 ~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 98 GISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred CCceEEEECCCCcEEEEEeHH
Confidence 344569999999999998764
No 136
>PF00883 Peptidase_M17: Cytosol aminopeptidase family, catalytic domain; InterPro: IPR000819 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. 
The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear [, ]. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids []. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another []. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape []. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices []. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core []. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer [].
The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain [].; GO: 0004177 aminopeptidase activity, 0006508 proteolysis, 0005622 intracellular; PDB: 3KZW_L 3KQX_C 3KQZ_L 3KR4_I 3KR5_J 3T8W_C 3H8F_D 3H8G_A 3H8E_B 3IJ3_A ....
Probab=25.36 E-value=45 Score=31.93 Aligned_cols=27 Identities=22% Similarity=0.175 Sum_probs=17.8
Q ss_pred EEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 315 RVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 315 ~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
-..+|.+..++ .++||||++.||+.|.
T Consensus 136 ~~~EN~i~~~a-~~pgDVi~s~~GkTVE 162 (311)
T PF00883_consen 136 PLAENMISGNA-YRPGDVITSMNGKTVE 162 (311)
T ss_dssp EEEEE--STTS-TTTTEEEE-TTS-EEE
T ss_pred EcccccCCCCC-CCCCCEEEeCCCCEEE
Confidence 34456666665 9999999999998874
No 137
>PRK05659 sulfur carrier protein ThiS; Validated
Probab=24.98 E-value=86 Score=22.06 Aligned_cols=36 Identities=19% Similarity=0.250 Sum_probs=22.6
Q ss_pred hCCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEE
Q psy18066 324 LAGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNF 359 (375)
Q Consensus 324 ~aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l 359 (375)
..++....+.+++||+-+...+--...|..|+++++
T Consensus 24 ~l~~~~~~vav~vNg~iv~r~~~~~~~l~~gD~vei 59 (66)
T PRK05659 24 REGLAGRRVAVEVNGEIVPRSQHASTALREGDVVEI 59 (66)
T ss_pred hcCCCCCeEEEEECCeEeCHHHcCcccCCCCCEEEE
Confidence 345666666677777777655544555666776555
No 138
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=23.42 E-value=64 Score=24.77 Aligned_cols=21 Identities=14% Similarity=0.242 Sum_probs=16.1
Q ss_pred CCccceeeccCCeEEEEEeee
Q psy18066 226 GNSGGPLVNLDGEVIGINSMK 246 (375)
Q Consensus 226 G~SGGPlvn~~G~VIGI~s~~ 246 (375)
+.+--|++|.+|+++|+.+..
T Consensus 86 ~~~~lpVvd~~~~~~Giit~~ 106 (111)
T cd04603 86 EPPVVAVVDKEGKLVGTIYER 106 (111)
T ss_pred CCCeEEEEcCCCeEEEEEEhH
Confidence 334458999889999998753
No 139
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=23.14 E-value=54 Score=22.08 Aligned_cols=11 Identities=36% Similarity=0.818 Sum_probs=8.2
Q ss_pred ccCCeEEEEEe
Q psy18066 234 NLDGEVIGINS 244 (375)
Q Consensus 234 n~~G~VIGI~s 244 (375)
|.+|++|||--
T Consensus 36 d~~G~ivGIEI 46 (50)
T PF10049_consen 36 DEDGRIVGIEI 46 (50)
T ss_pred CCCCCEEEEEE
Confidence 56789999853
No 140
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=22.33 E-value=66 Score=24.64 Aligned_cols=21 Identities=19% Similarity=0.379 Sum_probs=16.6
Q ss_pred CCccceeeccCCeEEEEEeee
Q psy18066 226 GNSGGPLVNLDGEVIGINSMK 246 (375)
Q Consensus 226 G~SGGPlvn~~G~VIGI~s~~ 246 (375)
+...-|++|.+|+++|+.+..
T Consensus 90 ~~~~~pVvd~~~~~~Gvit~~ 110 (115)
T cd04620 90 QIRHLPVLDDQGQLIGLVTAE 110 (115)
T ss_pred CCceEEEEcCCCCEEEEEEhH
Confidence 334578999899999998764
No 141
>cd00565 ThiS ThiaminS ubiquitin-like sulfur carrier protein. ThiS (ThiaminS) is a sulfur carrier protein involved in thiamin biosynthesis in bacteria. The ThiS fold, like those of two closely related proteins MoaD and Urm1, is similar to that of ubiquitin although there is little or no sequence similarity.
Probab=21.81 E-value=1e+02 Score=21.72 Aligned_cols=35 Identities=14% Similarity=0.162 Sum_probs=19.3
Q ss_pred CCCCCCCEEEEeCCEEcCCHHHHHHHHhcCCEEEE
Q psy18066 325 AGLHQEDIIIELNKKPCHSAKDIYAALEVVRLVNF 359 (375)
Q Consensus 325 aGl~~gDiI~~vng~~v~~~~~l~~~l~~~~~v~l 359 (375)
.++....+.+++||+.+..-+--...|..|++|.+
T Consensus 24 l~~~~~~i~V~vNg~~v~~~~~~~~~L~~gD~V~i 58 (65)
T cd00565 24 LGLDPRGVAVALNGEIVPRSEWASTPLQDGDRIEI 58 (65)
T ss_pred cCCCCCcEEEEECCEEcCHHHcCceecCCCCEEEE
Confidence 34556667777777776433222234555665544
No 142
>PTZ00412 leucyl aminopeptidase; Provisional
Probab=20.91 E-value=75 Score=32.91 Aligned_cols=28 Identities=14% Similarity=0.036 Sum_probs=21.8
Q ss_pred EEEccCChhhhCCCCCCCEEEEeCCEEcC
Q psy18066 314 WRVMYNSPAYLAGLHQEDIIIELNKKPCH 342 (375)
Q Consensus 314 ~~v~~~s~a~~aGl~~gDiI~~vng~~v~ 342 (375)
.-..+|.|...+ .+|||||++.||+.|.
T Consensus 349 iplaENm~sg~A-~rPGDVits~nGkTVE 376 (569)
T PTZ00412 349 VGLAENAIGPES-YHPSSIITSRKGLTVE 376 (569)
T ss_pred EEhhhcCCCCCC-CCCCCEeEecCCCEEe
Confidence 334557776666 9999999999998864
No 143
>cd01718 Sm_E The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit E binds subunits F and G to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.83 E-value=1.7e+02 Score=21.89 Aligned_cols=31 Identities=13% Similarity=0.185 Sum_probs=25.1
Q ss_pred ceEEEEcC--CCCEEEEEEEEecCCCCeEEEEe
Q psy18066 127 AQIIVTLP--DGSKHKGAVEALDVECDLAIIRC 157 (375)
Q Consensus 127 ~~i~V~~~--~g~~~~a~vv~~d~~~DlAlLki 157 (375)
..+.|.+. +|+.+.+.+.++|...+|.|=..
T Consensus 19 ~~V~V~l~~~~g~~~~G~L~gfD~~mNlvL~d~ 51 (79)
T cd01718 19 QRVQIWLYEQTDLRIEGVIIGFDEYMNLVLDDA 51 (79)
T ss_pred cEEEEEEEeCCCcEEEEEEEEEccceeEEEcCE
Confidence 35666665 89999999999999999887554
No 144
>cd00218 GlcAT-I Beta1,3-glucuronyltransferase I (GlcAT-I) is involved in the initial steps of proteoglycan synthesis. Beta1,3-glucuronyltransferase I (GlcAT-I) domain; GlcAT-I is a key enzyme involved in the initial steps of proteoglycan synthesis. GlcAT-I catalyzes the transfer of a glucuronic acid moiety from the uridine diphosphate-glucuronic acid (UDP-GlcUA) to the common linkage region of trisaccharide Gal-beta-(1-3)-Gal-beta-(1-4)-Xyl of proteoglycans. The enzyme has two subdomains that bind the donor and acceptor substrate separately. The active site is located at the cleft between both subdomains in which the trisaccharide molecule is oriented perpendicular to the UDP. This family has been classified as Glycosyltransferase family 43 (GT-43).
Probab=20.32 E-value=91 Score=28.35 Aligned_cols=32 Identities=22% Similarity=0.369 Sum_probs=24.0
Q ss_pred cceeeccCCeEEEEEeeecC------CCeEEEEehHHHH
Q psy18066 229 GGPLVNLDGEVIGINSMKVT------AGISFAIPIDYAI 261 (375)
Q Consensus 229 GGPlvn~~G~VIGI~s~~~~------~g~g~aip~~~i~ 261 (375)
-||+++ +|+|+|-++.... +--|||+.+..+.
T Consensus 136 egP~c~-~gkV~gw~~~w~~~R~f~idmAGFA~n~~ll~ 173 (223)
T cd00218 136 EGPVCE-NGKVVGWHTAWKPERPFPIDMAGFAFNSKLLW 173 (223)
T ss_pred eccEee-CCeEeEEecCCCCCCCCcceeeeEEEehhhhc
Confidence 468877 8999999987543 3458999887664
Done!