Query 002474
Match_columns 918
No_of_seqs 548 out of 3495
Neff 6.3
Searched_HMMs 46136
Date Fri Mar 29 00:39:03 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/002474.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/002474hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1421 Predicted signaling-as 100.0 1E-143 2E-148 1205.4 58.2 732 24-902 40-773 (955)
2 PRK10139 serine endoprotease; 100.0 2.9E-53 6.3E-58 490.0 46.9 377 36-455 40-446 (455)
3 TIGR02037 degP_htrA_DO peripla 100.0 3.4E-53 7.4E-58 488.6 46.8 384 37-456 2-421 (428)
4 PRK10942 serine endoprotease; 100.0 1.5E-51 3.2E-56 478.0 45.7 378 36-456 38-465 (473)
5 TIGR02038 protease_degS peripl 100.0 2.9E-44 6.4E-49 403.7 36.7 300 32-367 41-349 (351)
6 PRK10898 serine endoprotease; 100.0 6.3E-44 1.4E-48 400.9 35.9 300 33-368 42-351 (353)
7 COG0265 DegQ Trypsin-like seri 100.0 1E-33 2.2E-38 318.4 32.0 295 36-367 33-341 (347)
8 KOG1421 Predicted signaling-as 100.0 5.2E-33 1.1E-37 314.3 35.0 414 42-493 524-952 (955)
9 KOG1320 Serine protease [Postt 99.9 1.5E-22 3.3E-27 229.6 22.3 316 35-365 127-467 (473)
10 TIGR02038 protease_degS peripl 99.8 4.4E-19 9.5E-24 200.1 25.2 237 621-896 50-290 (351)
11 PRK10942 serine endoprotease; 99.8 1.5E-18 3.3E-23 202.5 21.4 199 649-885 110-309 (473)
12 TIGR02037 degP_htrA_DO peripla 99.8 8.2E-18 1.8E-22 194.8 22.9 226 621-885 6-255 (428)
13 PRK10898 serine endoprotease; 99.7 1.1E-16 2.3E-21 180.7 23.3 200 621-845 50-250 (353)
14 PRK10139 serine endoprotease; 99.7 2.1E-16 4.5E-21 183.8 23.9 226 621-884 45-287 (455)
15 KOG1320 Serine protease [Postt 99.7 4.5E-17 9.7E-22 185.4 15.8 376 41-456 55-457 (473)
16 PRK10779 zinc metallopeptidase 99.6 3.3E-14 7.2E-19 165.7 17.6 143 302-456 130-279 (449)
17 PF13365 Trypsin_2: Trypsin-li 99.5 1.1E-13 2.4E-18 130.4 12.5 110 70-216 1-120 (120)
18 PF12812 PDZ_1: PDZ-like domai 99.4 1.9E-13 4E-18 121.2 7.8 76 370-445 2-78 (78)
19 TIGR00054 RIP metalloprotease 99.4 1.8E-12 3.9E-17 149.8 15.7 131 297-456 128-261 (420)
20 COG0265 DegQ Trypsin-like seri 99.2 6.8E-10 1.5E-14 125.5 19.4 197 621-844 38-242 (347)
21 PF13180 PDZ_2: PDZ domain; PD 99.2 2.1E-10 4.6E-15 102.5 10.7 67 297-364 14-82 (82)
22 cd00987 PDZ_serine_protease PD 98.9 1.2E-08 2.6E-13 92.0 11.2 87 261-361 1-89 (90)
23 PF13180 PDZ_2: PDZ domain; PD 98.8 3.3E-08 7.2E-13 88.4 9.8 71 378-457 2-74 (82)
24 cd00986 PDZ_LON_protease PDZ d 98.8 5.4E-08 1.2E-12 86.3 10.5 70 297-367 8-78 (79)
25 cd00991 PDZ_archaeal_metallopr 98.8 4.6E-08 1E-12 87.0 10.1 68 295-363 8-77 (79)
26 PF00089 Trypsin: Trypsin; In 98.8 2E-07 4.4E-12 96.4 16.2 177 47-239 13-220 (220)
27 cd00990 PDZ_glycyl_aminopeptid 98.6 3E-07 6.5E-12 81.3 10.3 65 297-365 12-78 (80)
28 cd00989 PDZ_metalloprotease PD 98.6 2.1E-07 4.5E-12 82.0 9.0 63 300-362 14-77 (79)
29 cd00987 PDZ_serine_protease PD 98.5 3.6E-07 7.8E-12 82.3 9.4 78 378-455 2-82 (90)
30 TIGR01713 typeII_sec_gspC gene 98.5 9.9E-07 2.2E-11 95.9 14.3 98 233-363 159-258 (259)
31 cd00988 PDZ_CTP_protease PDZ d 98.5 5.3E-07 1.2E-11 80.6 9.7 66 297-363 13-82 (85)
32 KOG3580 Tight junction protein 98.5 5E-07 1.1E-11 103.2 10.4 62 400-461 430-495 (1027)
33 cd00190 Tryp_SPc Trypsin-like 98.4 4.1E-06 8.8E-11 87.3 15.3 162 46-221 12-208 (232)
34 cd00991 PDZ_archaeal_metallopr 98.3 2.5E-06 5.5E-11 75.8 9.3 58 399-456 10-69 (79)
35 KOG3209 WW domain-containing p 98.3 1.2E-06 2.7E-11 102.0 8.8 150 302-452 782-978 (984)
36 PF13365 Trypsin_2: Trypsin-li 98.3 7.8E-06 1.7E-10 76.8 12.3 55 652-710 1-65 (120)
37 smart00020 Tryp_SPc Trypsin-li 98.2 3.2E-05 6.9E-10 80.9 16.4 164 46-221 13-208 (229)
38 KOG3209 WW domain-containing p 98.2 1.3E-05 2.8E-10 93.9 13.5 147 300-453 676-835 (984)
39 cd00986 PDZ_LON_protease PDZ d 98.2 1.1E-05 2.4E-10 71.5 9.6 57 399-456 8-66 (79)
40 cd00136 PDZ PDZ domain, also c 98.2 7.2E-06 1.6E-10 70.4 8.2 53 298-351 14-69 (70)
41 TIGR00054 RIP metalloprotease 98.1 8.4E-06 1.8E-10 94.8 9.6 67 298-365 204-271 (420)
42 COG3591 V8-like Glu-specific e 98.1 6.2E-05 1.3E-09 80.8 14.0 158 47-222 38-225 (251)
43 PRK10779 zinc metallopeptidase 98.0 1.3E-05 2.8E-10 94.0 9.7 65 301-365 224-289 (449)
44 TIGR01713 typeII_sec_gspC gene 98.0 5.6E-05 1.2E-09 82.3 13.0 58 399-456 191-250 (259)
45 cd00136 PDZ PDZ domain, also c 98.0 1.8E-05 3.8E-10 68.0 7.2 52 400-452 14-69 (70)
46 cd00989 PDZ_metalloprotease PD 98.0 5.3E-05 1.2E-09 66.6 10.1 54 401-455 14-69 (79)
47 PF00863 Peptidase_C4: Peptida 97.9 0.00019 4.2E-09 76.4 14.3 165 43-232 14-184 (235)
48 cd00988 PDZ_CTP_protease PDZ d 97.9 5.3E-05 1.2E-09 67.6 8.7 56 399-455 13-72 (85)
49 PF00595 PDZ: PDZ domain (Also 97.8 5.3E-05 1.1E-09 67.4 7.5 55 297-352 25-81 (81)
50 PF14685 Tricorn_PDZ: Tricorn 97.8 0.00013 2.8E-09 66.5 9.5 60 297-356 11-81 (88)
51 TIGR00225 prc C-terminal pepti 97.8 5.3E-05 1.1E-09 85.6 8.4 69 298-367 63-134 (334)
52 cd00990 PDZ_glycyl_aminopeptid 97.8 0.00011 2.3E-09 64.9 8.3 54 399-455 12-67 (80)
53 smart00228 PDZ Domain present 97.8 6.8E-05 1.5E-09 66.3 7.1 58 297-355 26-85 (85)
54 TIGR02860 spore_IV_B stage IV 97.7 0.00012 2.6E-09 83.7 9.5 69 297-365 105-181 (402)
55 TIGR03279 cyano_FeS_chp putati 97.7 9E-05 1.9E-09 85.2 8.3 60 303-365 3-64 (433)
56 PLN00049 carboxyl-terminal pro 97.6 0.00015 3.2E-09 83.7 9.3 66 298-364 103-171 (389)
57 cd00992 PDZ_signaling PDZ doma 97.6 0.00014 3.1E-09 64.1 7.1 53 297-351 26-81 (82)
58 cd00992 PDZ_signaling PDZ doma 97.6 0.0002 4.4E-09 63.2 7.5 52 399-452 26-81 (82)
59 PF00595 PDZ: PDZ domain (Also 97.6 0.00019 4E-09 63.8 7.2 52 399-452 25-80 (81)
60 TIGR02860 spore_IV_B stage IV 97.5 0.00046 1E-08 79.0 10.5 57 399-456 105-171 (402)
61 KOG3580 Tight junction protein 97.5 0.00076 1.6E-08 78.0 11.9 58 299-356 41-99 (1027)
62 smart00228 PDZ Domain present 97.4 0.00047 1E-08 60.9 7.3 56 399-454 26-83 (85)
63 KOG3605 Beta amyloid precursor 97.3 0.00052 1.1E-08 80.4 7.3 116 303-441 678-802 (829)
64 TIGR00225 prc C-terminal pepti 97.2 0.00097 2.1E-08 75.4 8.9 68 400-471 63-134 (334)
65 COG0793 Prc Periplasmic protea 97.2 0.0011 2.4E-08 76.8 9.1 65 298-362 112-181 (406)
66 KOG3129 26S proteasome regulat 97.1 0.0013 2.8E-08 68.0 7.9 73 297-369 138-214 (231)
67 PF04495 GRASP55_65: GRASP55/6 97.1 0.0018 3.9E-08 64.0 8.6 90 377-474 26-118 (138)
68 COG0793 Prc Periplasmic protea 97.1 0.0015 3.3E-08 75.7 8.5 72 400-475 113-188 (406)
69 PLN00049 carboxyl-terminal pro 97.0 0.0024 5.2E-08 73.8 9.6 65 400-470 103-171 (389)
70 PRK09681 putative type II secr 97.0 0.0028 6.2E-08 69.2 9.4 67 297-364 204-275 (276)
71 KOG3834 Golgi reassembly stack 97.0 0.0086 1.9E-07 67.9 13.3 161 297-474 14-184 (462)
72 PF04495 GRASP55_65: GRASP55/6 96.9 0.0022 4.8E-08 63.4 7.4 67 299-365 44-114 (138)
73 COG3480 SdrC Predicted secrete 96.9 0.0024 5.2E-08 70.0 8.1 70 297-367 130-201 (342)
74 PF12812 PDZ_1: PDZ-like domai 96.7 0.0049 1.1E-07 55.0 6.8 64 262-341 10-74 (78)
75 TIGR03279 cyano_FeS_chp putati 96.7 0.0052 1.1E-07 71.0 8.8 57 406-470 7-63 (433)
76 PRK11186 carboxy-terminal prot 96.6 0.0053 1.1E-07 75.0 8.7 64 300-363 257-332 (667)
77 KOG3553 Tax interaction protei 96.3 0.0024 5.2E-08 58.5 2.4 48 394-441 54-105 (124)
78 COG3975 Predicted protease wit 96.3 0.0087 1.9E-07 69.7 7.4 85 263-367 439-525 (558)
79 COG3031 PulC Type II secretory 96.2 0.0096 2.1E-07 62.9 6.3 63 301-363 210-274 (275)
80 PRK11186 carboxy-terminal prot 96.1 0.015 3.2E-07 71.2 8.8 70 400-472 256-335 (667)
81 PF00089 Trypsin: Trypsin; In 96.0 0.13 2.8E-06 53.0 13.9 172 649-839 24-220 (220)
82 COG3480 SdrC Predicted secrete 95.9 0.022 4.7E-07 62.7 7.8 69 399-470 130-200 (342)
83 PRK09681 putative type II secr 95.9 0.03 6.5E-07 61.4 8.9 50 408-457 218-267 (276)
84 PF14685 Tricorn_PDZ: Tricorn 95.8 0.023 5E-07 51.9 6.5 51 406-457 29-81 (88)
85 PF10459 Peptidase_S46: Peptid 95.2 0.06 1.3E-06 66.3 9.1 42 117-158 200-252 (698)
86 cd00190 Tryp_SPc Trypsin-like 95.1 0.31 6.8E-06 50.5 13.0 94 648-748 23-133 (232)
87 PF10459 Peptidase_S46: Peptid 94.9 0.025 5.5E-07 69.5 4.7 54 191-244 624-688 (698)
88 PF05580 Peptidase_S55: SpoIVB 94.8 0.29 6.4E-06 51.6 11.6 167 66-235 18-215 (218)
89 smart00020 Tryp_SPc Trypsin-li 94.8 0.71 1.5E-05 48.0 14.7 94 648-748 24-133 (229)
90 PF05579 Peptidase_S32: Equine 94.6 0.3 6.5E-06 52.8 11.2 116 66-220 110-228 (297)
91 PF02122 Peptidase_S39: Peptid 94.1 0.014 3E-07 61.4 -0.1 143 67-232 29-181 (203)
92 PF00949 Peptidase_S7: Peptida 93.7 0.055 1.2E-06 53.1 3.2 31 191-221 88-118 (132)
93 KOG3550 Receptor targeting pro 93.6 0.17 3.7E-06 50.0 6.4 54 297-352 115-172 (207)
94 KOG3553 Tax interaction protei 93.3 0.019 4.1E-07 52.8 -0.6 36 296-332 58-94 (124)
95 KOG3571 Dishevelled 3 and rela 93.2 0.22 4.9E-06 57.7 7.4 87 376-484 260-351 (626)
96 KOG3532 Predicted protein kina 92.9 0.22 4.8E-06 59.4 7.1 51 296-346 396-447 (1051)
97 KOG3542 cAMP-regulated guanine 92.4 0.11 2.3E-06 61.7 3.6 53 399-452 562-616 (1283)
98 PF08192 Peptidase_S64: Peptid 92.2 0.57 1.2E-05 56.5 9.3 121 114-242 540-688 (695)
99 PF00548 Peptidase_C3: 3C cyst 91.9 1.8 4E-05 44.5 11.6 150 47-220 13-170 (172)
100 KOG3129 26S proteasome regulat 91.3 0.72 1.6E-05 48.3 7.7 57 401-457 141-201 (231)
101 KOG3552 FERM domain protein FR 91.1 0.23 5E-06 61.1 4.6 53 300-353 77-131 (1298)
102 KOG3550 Receptor targeting pro 90.3 0.55 1.2E-05 46.5 5.6 46 399-444 115-165 (207)
103 COG3031 PulC Type II secretory 90.0 0.63 1.4E-05 49.7 6.1 54 406-459 216-269 (275)
104 KOG3605 Beta amyloid precursor 89.1 1.3 2.9E-05 53.0 8.5 63 406-473 682-747 (829)
105 COG0750 Predicted membrane-ass 88.0 1.3 2.8E-05 50.6 7.7 55 304-358 135-194 (375)
106 KOG1892 Actin filament-binding 87.5 0.66 1.4E-05 57.3 4.8 60 296-356 959-1021(1629)
107 PF00863 Peptidase_C4: Peptida 87.4 3.7 8E-05 44.3 9.9 150 663-841 40-195 (235)
108 COG3975 Predicted protease wit 87.3 0.61 1.3E-05 54.9 4.2 73 375-456 435-513 (558)
109 KOG3627 Trypsin [Amino acid tr 86.8 17 0.00037 38.9 14.9 147 69-222 39-229 (256)
110 PF03761 DUF316: Domain of unk 85.4 32 0.00069 37.7 16.5 107 115-237 159-273 (282)
111 KOG3606 Cell polarity protein 81.0 2.5 5.3E-05 45.9 5.1 46 399-444 194-244 (358)
112 KOG3532 Predicted protein kina 80.7 3.7 8E-05 49.6 6.9 45 400-444 399-445 (1051)
113 KOG3542 cAMP-regulated guanine 80.4 1.3 2.9E-05 52.9 3.2 37 297-334 562-599 (1283)
114 PF00944 Peptidase_S3: Alphavi 77.9 1.9 4.2E-05 42.2 2.9 32 191-222 97-128 (158)
115 KOG3549 Syntrophins (type gamm 77.4 5.3 0.00011 44.8 6.4 55 401-457 82-141 (505)
116 PF11874 DUF3394: Domain of un 76.1 11 0.00023 39.4 7.9 81 332-425 62-150 (183)
117 KOG3551 Syntrophins (type beta 74.6 2.7 5.8E-05 47.7 3.3 52 300-351 112-166 (506)
118 KOG3549 Syntrophins (type gamm 74.1 3.5 7.6E-05 46.1 4.1 53 300-352 82-137 (505)
119 KOG0609 Calcium/calmodulin-dep 73.7 6.8 0.00015 46.5 6.5 54 299-353 147-204 (542)
120 KOG3551 Syntrophins (type beta 73.3 12 0.00025 42.8 7.9 50 406-457 119-171 (506)
121 KOG3938 RGS-GAIP interacting p 72.4 6.7 0.00015 42.6 5.5 50 406-455 158-210 (334)
122 PF01732 DUF31: Putative pepti 72.2 2.4 5.2E-05 48.9 2.4 30 190-219 345-374 (374)
123 KOG3571 Dishevelled 3 and rela 71.2 9.5 0.00021 44.9 6.8 80 268-353 253-338 (626)
124 KOG3651 Protein kinase C, alph 71.0 7.6 0.00017 42.8 5.6 53 298-352 31-87 (429)
125 COG0750 Predicted membrane-ass 69.1 11 0.00024 43.1 6.9 49 406-454 138-188 (375)
126 KOG2921 Intramembrane metallop 69.0 6.5 0.00014 45.0 4.8 45 296-341 219-265 (484)
127 KOG3606 Cell polarity protein 63.6 14 0.00031 40.3 5.8 55 296-352 193-251 (358)
128 KOG0606 Microtubule-associated 59.5 9.9 0.00022 48.7 4.4 52 405-457 666-718 (1205)
129 KOG3552 FERM domain protein FR 58.8 12 0.00025 47.0 4.7 47 406-455 84-132 (1298)
130 KOG0606 Microtubule-associated 57.6 15 0.00033 47.2 5.5 49 302-351 662-713 (1205)
131 KOG1892 Actin filament-binding 53.5 27 0.00058 44.1 6.5 54 399-454 960-1018(1629)
132 PF03510 Peptidase_C24: 2C end 52.4 35 0.00077 32.4 5.8 52 72-135 3-54 (105)
133 PF02907 Peptidase_S29: Hepati 46.2 13 0.00028 36.7 1.9 114 71-220 15-128 (148)
134 KOG3651 Protein kinase C, alph 44.5 47 0.001 37.0 6.0 54 400-455 31-89 (429)
135 KOG0609 Calcium/calmodulin-dep 41.3 30 0.00066 41.3 4.3 51 401-453 148-203 (542)
136 PF12381 Peptidase_C3G: Tungro 37.9 43 0.00094 35.7 4.4 56 188-243 168-229 (231)
137 KOG2921 Intramembrane metallop 35.2 46 0.00099 38.5 4.3 44 399-442 220-266 (484)
138 KOG0460 Mitochondrial translat 32.7 37 0.0008 38.7 3.1 36 720-761 269-304 (449)
139 KOG3938 RGS-GAIP interacting p 31.1 38 0.00082 37.1 2.7 56 297-352 148-208 (334)
140 KOG3834 Golgi reassembly stack 30.6 66 0.0014 37.6 4.7 62 304-365 115-178 (462)
141 KOG4407 Predicted Rho GTPase-a 30.1 36 0.00078 44.5 2.7 87 303-443 101-188 (1973)
142 COG0298 HypC Hydrogenase matur 28.7 1.2E+02 0.0027 27.4 5.0 47 689-745 3-52 (82)
143 PF12857 TOBE_3: TOBE-like dom 27.2 1.5E+02 0.0032 24.8 5.1 49 689-742 5-56 (58)
144 KOG4371 Membrane-associated pr 26.6 1.5E+02 0.0033 38.3 7.1 116 314-432 1186-1306(1332)
145 PF14275 DUF4362: Domain of un 23.1 2.2E+02 0.0047 26.9 5.9 53 416-472 1-53 (98)
146 PF01455 HupF_HypC: HupF/HypC 22.9 2.6E+02 0.0056 24.4 6.0 43 690-741 4-46 (68)
147 PF00548 Peptidase_C3: 3C cyst 22.6 8.5E+02 0.018 25.0 11.5 154 621-807 7-167 (172)
148 KOG1738 Membrane-associated gu 22.1 52 0.0011 40.1 2.0 37 297-333 224-262 (638)
149 PF03761 DUF316: Domain of unk 21.7 7.3E+02 0.016 27.0 10.9 92 702-813 161-255 (282)
150 cd01735 LSm12_N LSm12 belongs 21.5 2.2E+02 0.0048 24.5 5.1 32 95-126 8-39 (61)
151 PF08192 Peptidase_S64: Peptid 20.6 5.7E+02 0.012 32.0 10.1 90 733-843 587-686 (695)
No 1
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=1.1e-143 Score=1205.36 Aligned_cols=732 Identities=49% Similarity=0.749 Sum_probs=693.0
Q ss_pred CCCCCccCccCchhHHHHHHHhCCceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCC
Q 002474 24 VDPPLRENVATADDWRKALNKVVPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNR 103 (918)
Q Consensus 24 ~~~~~~~~~~~~~~~~~~vekv~~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg 103 (918)
+.|+.........+|+..+.++.+|||+|++...+.||++.++.+.||||+|++..|||||||||+.+++....+.|.|.
T Consensus 40 ~~p~~~~s~~~~e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ 119 (955)
T KOG1421|consen 40 PDPPLNESLATSEDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNH 119 (955)
T ss_pred cCCCCCcccchhhhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeEEEeccc
Confidence 33343333556679999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred eEEeEEEEEecCCCcEEEEEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCc
Q 002474 104 EEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGY 183 (918)
Q Consensus 104 ~~~~a~vv~~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~ 183 (918)
++++..++|+||.|||+++|++|+.+.+..+..++++++..++|.+++++||+.+++.++..|.+++++|++|.|++..|
T Consensus 120 ee~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~y 199 (955)
T KOG1421|consen 120 EEIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTY 199 (955)
T ss_pred ccCCcccccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccceEEeehhhhhhccCCCcccccccc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred cccceeEEEEeecCCCCCCCccEEcccceEEEeccccCCCCCcccccchhhHHHHHHHHHhcCCCccccccccccCCCcc
Q 002474 184 NDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSSSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTL 263 (918)
Q Consensus 184 ~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~~~~~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~L 263 (918)
+|||++|+|..+...+|+||+||+|.+|.+|++++++...++.+|++|+++++|+|.++++++ +++||+|
T Consensus 200 ndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~----------PItRGtL 269 (955)
T KOG1421|consen 200 NDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNT----------PITRGTL 269 (955)
T ss_pred ccccceeeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCC----------CcccceE
Confidence 999999999999999999999999999999999999999999999999999999999999998 8999999
Q ss_pred CeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCccccCCCCCCEEEEECCEEecCHHHHHHHHhccC
Q 002474 264 QVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLLDDGV 343 (918)
Q Consensus 264 gv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~ 343 (918)
+++|.++.+|+||+|||+.|||+.+|.++| ..+|+|+|+.|+++|||++.|++||++++||+.-+.+|..+.++|++..
T Consensus 270 qvefl~k~~de~rrlGL~sE~eqv~r~k~P-~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~iLDegv 348 (955)
T KOG1421|consen 270 QVEFLHKLFDECRRLGLSSEWEQVVRTKFP-ERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQILDEGV 348 (955)
T ss_pred EEEEehhhhHHHHhcCCcHHHHHHHHhcCc-ccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHHHHhhcc
Confidence 999999999999999999999999999999 8999999999999999999999999999999999999999999999999
Q ss_pred CCeEEEEEEECCeEEEEEEEeecCCCCCCCceeeecceeecccchhhhcccCCCCCcEEEEcC-CChhHHcCCCCCCEEE
Q 002474 344 DKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAEP-GYMLFRAGVPRHAIIK 422 (918)
Q Consensus 344 G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~p-gspA~~AGLk~GD~I~ 422 (918)
|+.+.|+|+|+|++.+++++++++|.++|+||++|||+.+|+++||+++.|.+|++||||+++ |+++.+.++. |++|.
T Consensus 349 gk~l~LtI~Rggqelel~vtvqdlh~itp~R~levcGav~hdlsyq~ar~y~lP~~GvyVa~~~gsf~~~~~~y-~~ii~ 427 (955)
T KOG1421|consen 349 GKNLELTIQRGGQELELTVTVQDLHGITPDRFLEVCGAVFHDLSYQLARLYALPVEGVYVASPGGSFRHRGPRY-GQIID 427 (955)
T ss_pred CceEEEEEEeCCEEEEEEEEeccccCCCCceEEEEcceEecCCCHHHHhhcccccCcEEEccCCCCccccCCcc-eEEEE
Confidence 999999999999999999999999999999999999999999999999999999999999985 5667777776 99999
Q ss_pred EECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcCCCCCCCceeeecCCCCCceeeecCCCCCCCC
Q 002474 423 KFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHEWYAPPQIYTRNDSSGLWSANPAILSEVLMP 502 (918)
Q Consensus 423 sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd~w~~~~~~~~r~d~tg~W~~~~~~~~~~~~~ 502 (918)
+||+|++++|++|+++++++++|+||+++|++++|+|+.++..+++|| ||||++|+++|||+||+||++++.+|+|
T Consensus 428 ~vanK~tPdLdaFidvlk~L~dg~rV~vry~hl~dkh~p~v~~v~iDr-Hwy~p~~~~trndetglWdrk~L~~pqP--- 503 (955)
T KOG1421|consen 428 SVANKPTPDLDAFIDVLKELPDGARVPVRYHHLTDKHSPRVTTVTIDR-HWYWPFREYTRNDETGLWDRKNLKDPQP--- 503 (955)
T ss_pred eecCCcCCCHHHHHHHHHhccCCCeeeEEEEEecCCCCceEEEEEEec-cccccceeeeeCCCcccccccccCCCCc---
Confidence 999999999999999999999999999999999999999999999999 9999999999999999999999999998
Q ss_pred CCCCCCCCcCcccccccccccccccccccccCccccccccccchhhcccccccccccCCCcccccccccccccccCCccc
Q 002474 503 SSGINGGVQGVASQTVSICGELVHMEHMHQRNNQELTDGVTSMETACEHASAESISRGESDNGRKKRRVEENISADGVVA 582 (918)
Q Consensus 503 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 582 (918)
+.+.+|.+.+. +.
T Consensus 504 -----------a~~~kP~s~~i------------------p~-------------------------------------- 516 (955)
T KOG1421|consen 504 -----------AISIKPASVSI------------------PS-------------------------------------- 516 (955)
T ss_pred -----------ccccCCccccC------------------CC--------------------------------------
Confidence 45667766663 11
Q ss_pred cCCCCCCCCccccccccccccCCCCCCCCCcccCCcccccccccCceEEEEEecCCccccCCcccceeEeEEEEEeccCC
Q 002474 583 DCSPHESGDARLEDSSTMENAGSRDYFGAPAATTNASFAESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQS 662 (918)
Q Consensus 583 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~s~V~v~~~~p~~~~~d~~~~~~~~G~G~Vvd~~~~ 662 (918)
+ .+.++..+++..|||.|+++|| +.+||+.+..++|+|+|+| .+
T Consensus 517 -----------i---------------------~~~~~~~~~i~~~~~~v~~~~~--~~l~g~s~~i~kgt~~i~d--~~ 560 (955)
T KOG1421|consen 517 -----------I---------------------GVNNFPSADISNCLVDVEPMMP--VNLDGVSSDIYKGTALIMD--TS 560 (955)
T ss_pred -----------c---------------------CcCCcchhHHhhhhhhheecee--eccccchhhhhcCceEEEE--cc
Confidence 1 1123456779999999999999 6999999999999999999 56
Q ss_pred CcEEEEecccccCCCccEEEEeeeCceeeeeEEEEeecccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEE
Q 002474 663 MGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLV 742 (918)
Q Consensus 663 ~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~v 742 (918)
+||++|||++||+++||.+||||+ |+.+||.|.||||+||+|++||||+++ .+++|.... ++|||+++|.
T Consensus 561 ~g~~vvsr~~vp~d~~d~~vt~~d-S~~i~a~~~fL~~t~n~a~~kydp~~~--------~~~kl~~~~-v~~gD~~~f~ 630 (955)
T KOG1421|consen 561 KGLGVVSRSVVPSDAKDQRVTEAD-SDGIPANVSFLHPTENVASFKYDPALE--------VQLKLTDTT-VLRGDECTFE 630 (955)
T ss_pred CCceeEecccCCchhhceEEeecc-cccccceeeEecCccceeEeccChhHh--------hhhccceee-EecCCceeEe
Confidence 999999999999999999999997 999999999999999999999999776 789999998 9999999999
Q ss_pred EecCCCceeEEeEEEecceeecccCCCCCccccccceeeEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCC
Q 002474 743 GLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSS 821 (918)
Q Consensus 743 G~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~ 821 (918)
|+++++|++++||+||+++ +..+|++.+|||||+|+|+|+++++++.+| +|+|+|+||.|+|||+++.||+. .+
T Consensus 631 g~~~~~r~ltaktsv~dvs-~~~~ps~~~pr~r~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~----~~ 705 (955)
T KOG1421|consen 631 GFTEDLRALTAKTSVTDVS-VVIIPSSVMPRFRATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDV----GG 705 (955)
T ss_pred cccccchhhcccceeeeeE-EEEecCCCCcceeecceEEEEEeccccccccceEEECCCCeEEEEEeeeecccc----CC
Confidence 9999999999999999976 899999999999999999999999999999 99999999999999999999953 57
Q ss_pred CCceEEeccchhhHHHHHHHHHcCCCCCCccccCccCCCCceEEEeeEEEEeehHhHHhcCCCHHHHHHhhcCCCCccee
Q 002474 822 EDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQVCFLPNASSFFF 901 (918)
Q Consensus 822 ~~~~~~~gl~~~~i~~v~~~l~~g~~~~~~~~~~~~~~~p~~~~l~~e~~~~~~~~ar~~g~~~~wi~~~~~~~~~~~~~ 901 (918)
+|..|++||++++|+++|++||.| +.|..+|++|||.+++|+|||++|||+|||.|+|.++..++-+
T Consensus 706 kd~~y~~gl~~~~~l~vl~rlk~g-------------~~~rp~i~~vef~~i~laqar~lglp~e~imk~e~es~~~~ql 772 (955)
T KOG1421|consen 706 KDYTYKYGLSMSYILPVLERLKLG-------------PSARPTIAGVEFSHITLAQARTLGLPSEFIMKSEEESTIPRQL 772 (955)
T ss_pred ceeEEEeccchHHHHHHHHHHhcC-------------CCCCceeeccceeeEEeehhhccCCCHHHHhhhhhcCCCcceE
Confidence 999999999999999999999999 4678999999999999999999999999999999988877654
Q ss_pred e
Q 002474 902 W 902 (918)
Q Consensus 902 ~ 902 (918)
+
T Consensus 773 ~ 773 (955)
T KOG1421|consen 773 Y 773 (955)
T ss_pred E
Confidence 3
No 2
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=2.9e-53 Score=490.05 Aligned_cols=377 Identities=20% Similarity=0.316 Sum_probs=308.6
Q ss_pred hhHHHHHHHhCCceEEEEEEeeec------------cCC------CCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEE
Q 002474 36 DDWRKALNKVVPAVVVLRTTACRA------------FDT------EAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAE 97 (918)
Q Consensus 36 ~~~~~~vekv~~SVV~I~~~~~~~------------fd~------~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~ 97 (918)
.+|.++++++.||||.|.+..... |.. .....+.||||||++++||||||+|||. +...+.
T Consensus 40 ~~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~-~a~~i~ 118 (455)
T PRK10139 40 PSLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN-QAQKIS 118 (455)
T ss_pred ccHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC-CCCEEE
Confidence 479999999999999999864311 111 1123578999999865799999999999 567899
Q ss_pred EEecCCeEEeEEEEEecCCCcEEEEEECC-CCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCC
Q 002474 98 AMFVNREEIPVYPIYRDPVHDFGFFRYDP-SAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAP 176 (918)
Q Consensus 98 v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~-~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p 176 (918)
|+|.|+++++|++++.|+.+|||+||++. ..++++.+++ ++.+++||+|+++|||+|...+++.|+||++.|...
T Consensus 119 V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~----s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~ 194 (455)
T PRK10139 119 IQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIAD----SDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGL 194 (455)
T ss_pred EEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecC----ccccCCCCEEEEEecCCCCCCceEEEEEcccccccc
Confidence 99999999999999999999999999974 5566555554 678999999999999999999999999999988633
Q ss_pred CCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccccCC----CCCcccccchhhHHHHHHHHHhcCCCcccc
Q 002474 177 HYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKS----SSASAFFLPLERVVRALRFLQERRDCNIHN 252 (918)
Q Consensus 177 ~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~----~~~~~faIPi~~i~~~L~~l~~g~~~~~~~ 252 (918)
.. .+|. +|||+|+++++|||||||||.+|+||||+++... ..+++|+||++.+++++++|.+++
T Consensus 195 ~~--~~~~----~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g------ 262 (455)
T PRK10139 195 NL--EGLE----NFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG------ 262 (455)
T ss_pred CC--CCcc----eEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC------
Confidence 22 1233 4699999999999999999999999999987543 267899999999999999999888
Q ss_pred ccccccCCCccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecC
Q 002474 253 WEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQ 331 (918)
Q Consensus 253 ~~~~~v~rg~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s 331 (918)
++.|+|||+.++.++.+..+.||++ ...|++| ..|.++|||++ ||++||+|++|||++|.+
T Consensus 263 ----~v~r~~LGv~~~~l~~~~~~~lgl~-------------~~~Gv~V-~~V~~~SpA~~AGL~~GDvIl~InG~~V~s 324 (455)
T PRK10139 263 ----EIKRGLLGIKGTEMSADIAKAFNLD-------------VQRGAFV-SEVLPNSGSAKAGVKAGDIITSLNGKPLNS 324 (455)
T ss_pred ----cccccceeEEEEECCHHHHHhcCCC-------------CCCceEE-EEECCCChHHHCCCCCCCEEEEECCEECCC
Confidence 7899999999999999999999985 4578777 69999999999 999999999999999999
Q ss_pred HHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEEeecCCCCCCCce---eeecceeecccchhhhcccCCCCCcEEEEc--
Q 002474 332 FLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQDLHSITPDYF---LEVSGAVIHPLSYQQARNFRFPCGLVYVAE-- 405 (918)
Q Consensus 332 ~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~---v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~-- 405 (918)
|.++...+.. .+|+++.++|.|+|+.+++++++........... ..+.|+.+.+. + .+. ...|++|..
T Consensus 325 ~~dl~~~l~~~~~g~~v~l~V~R~G~~~~l~v~~~~~~~~~~~~~~~~~~~~g~~l~~~--~-~~~---~~~Gv~V~~V~ 398 (455)
T PRK10139 325 FAELRSRIATTEPGTKVKLGLLRNGKPLEVEVTLDTSTSSSASAEMITPALQGATLSDG--Q-LKD---GTKGIKIDEVV 398 (455)
T ss_pred HHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEECCCCCcccccccccccccccEeccc--c-ccc---CCCceEEEEeC
Confidence 9999988854 7889999999999999999988754332111111 11234444331 1 010 124788884
Q ss_pred CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeec
Q 002474 406 PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 406 pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
++|||+++||++||+|++|||+++.+|++|.+++++.+ +.+.|+++|-
T Consensus 399 ~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~--~~v~l~v~R~ 446 (455)
T PRK10139 399 KGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP--AIIALQIVRG 446 (455)
T ss_pred CCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC--CeEEEEEEEC
Confidence 99999999999999999999999999999999998843 4566666553
No 3
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set proteins various designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=3.4e-53 Score=488.58 Aligned_cols=384 Identities=25% Similarity=0.377 Sum_probs=330.9
Q ss_pred hHHHHHHHhCCceEEEEEEeee---------------ccCC----------CCCCCceEEEEEEeCCCcEEEEcCcccCC
Q 002474 37 DWRKALNKVVPAVVVLRTTACR---------------AFDT----------EAAGASYATGFVVDKRRGIILTNRHVVKP 91 (918)
Q Consensus 37 ~~~~~vekv~~SVV~I~~~~~~---------------~fd~----------~~~~~~~GTGFVVd~~~G~ILTn~HVV~~ 91 (918)
++.++++++.||||.|.+.... .|.. .....+.||||+|++ +||||||+||+.
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~-~G~IlTn~Hvv~- 79 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISA-DGYILTNNHVVD- 79 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECC-CCEEEEcHHHcC-
Confidence 4789999999999999986421 1111 012457899999997 699999999999
Q ss_pred CCcEEEEEecCCeEEeEEEEEecCCCcEEEEEECCC-CcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEe
Q 002474 92 GPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPS-AIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLAR 170 (918)
Q Consensus 92 ~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~~-~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~ 170 (918)
++..+.|++.|+++++|++++.|+.+|||+||++.. .++++.+.+ ++.+++||+|+++|||++...+++.|+|++
T Consensus 80 ~~~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~----~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~ 155 (428)
T TIGR02037 80 GADEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGD----SDKLRVGDWVLAIGNPFGLGQTVTSGIVSA 155 (428)
T ss_pred CCCeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccC----CCCCCCCCEEEEEECCCcCCCcEEEEEEEe
Confidence 677899999999999999999999999999999864 566555554 678999999999999999999999999999
Q ss_pred ecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccccCC----CCCcccccchhhHHHHHHHHHhcC
Q 002474 171 LDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKS----SSASAFFLPLERVVRALRFLQERR 246 (918)
Q Consensus 171 ~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~----~~~~~faIPi~~i~~~L~~l~~g~ 246 (918)
..+... ....|.+ +||+|+.+++|+|||||+|.+|+||||+++... ..+.+|+||++.++++++++++++
T Consensus 156 ~~~~~~--~~~~~~~----~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g 229 (428)
T TIGR02037 156 LGRSGL--GIGDYEN----FIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG 229 (428)
T ss_pred cccCcc--CCCCccc----eEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC
Confidence 987632 1123333 699999999999999999999999999987544 257899999999999999999998
Q ss_pred CCccccccccccCCCccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEEC
Q 002474 247 DCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVN 325 (918)
Q Consensus 247 ~~~~~~~~~~~v~rg~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVN 325 (918)
.+.|+|||+.++.++.+.++.||++ ...|++| ..|.++|||++ ||++||+|++||
T Consensus 230 ----------~~~~~~lGi~~~~~~~~~~~~lgl~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Vn 285 (428)
T TIGR02037 230 ----------KVQRGWLGVTIQEVTSDLAKSLGLE-------------KQRGALV-AQVLPGSPAEKAGLKAGDVILSVN 285 (428)
T ss_pred ----------cCcCCcCceEeecCCHHHHHHcCCC-------------CCCceEE-EEccCCCChHHcCCCCCCEEEEEC
Confidence 7899999999999999999999996 4577776 69999999999 999999999999
Q ss_pred CEEecCHHHHHHHHh-ccCCCeEEEEEEECCeEEEEEEEeecCCCCCCCceeeecceeecccchhhhcccCCCC--CcEE
Q 002474 326 GEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPC--GLVY 402 (918)
Q Consensus 326 G~~I~s~~~l~~~L~-~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl~~~~ls~~~~~~~gl~~--~GV~ 402 (918)
|++|.++.++...+. ..+|+++++++.|+|+.+++++++...+.........++|+.+++++....+.++++. .|++
T Consensus 286 g~~i~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~ 365 (428)
T TIGR02037 286 GKPISSFADLRRAIGTLKPGKKVTLGILRKGKEKTITVTLGASPEEQASSSNPFLGLTVANLSPEIRKELRLKGDVKGVV 365 (428)
T ss_pred CEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEECcCCCccccccccccceEEecCCHHHHHHcCCCcCcCceE
Confidence 999999999998885 4678999999999999999999887665444445667899999999988888888765 6999
Q ss_pred EEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 403 VAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 403 Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
|.+ ++|||+++||++||+|++|||+++.++++|.+++++.+.++.+.|+++|-+
T Consensus 366 V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g 421 (428)
T TIGR02037 366 VTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGG 421 (428)
T ss_pred EEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECC
Confidence 984 999999999999999999999999999999999999877888888887654
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.5e-51 Score=478.00 Aligned_cols=378 Identities=20% Similarity=0.322 Sum_probs=308.9
Q ss_pred hhHHHHHHHhCCceEEEEEEeeec-------------cCC----------------------------CCCCCceEEEEE
Q 002474 36 DDWRKALNKVVPAVVVLRTTACRA-------------FDT----------------------------EAAGASYATGFV 74 (918)
Q Consensus 36 ~~~~~~vekv~~SVV~I~~~~~~~-------------fd~----------------------------~~~~~~~GTGFV 74 (918)
.++.++++++.||||.|.+..... |.. .....+.|||||
T Consensus 38 ~~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~i 117 (473)
T PRK10942 38 PSLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVI 117 (473)
T ss_pred ccHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEE
Confidence 369999999999999999865210 110 001246899999
Q ss_pred EeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCCCcEEEEEEC-CCCcccccccCCCCCCcccCCCCEEEEE
Q 002474 75 VDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYD-PSAIQFLNYDEIPLAPEAACVGLEIRVV 153 (918)
Q Consensus 75 Vd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd-~~~l~~~~l~~l~l~~~~l~vG~~V~vv 153 (918)
|++++||||||+|||. +.+.+.|+|.|+++++|++++.|+.+||||||++ ++.++++.+++ ++.+++||+|+++
T Consensus 118 i~~~~G~IlTn~HVv~-~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~----s~~l~~G~~V~ai 192 (473)
T PRK10942 118 IDADKGYVVTNNHVVD-NATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMAD----SDALRVGDYTVAI 192 (473)
T ss_pred EECCCCEEEeChhhcC-CCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecC----ccccCCCCEEEEE
Confidence 9975699999999999 6778999999999999999999999999999996 45566555554 6789999999999
Q ss_pred ecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccccCCC----CCcccc
Q 002474 154 GNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS----SASAFF 229 (918)
Q Consensus 154 G~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~~----~~~~fa 229 (918)
|||++...+++.|+|+++.+.... ...|. .|||+|+++++|||||||+|.+|+||||+++.... .+.+|+
T Consensus 193 G~P~g~~~tvt~GiVs~~~r~~~~--~~~~~----~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfa 266 (473)
T PRK10942 193 GNPYGLGETVTSGIVSALGRSGLN--VENYE----NFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFA 266 (473)
T ss_pred cCCCCCCcceeEEEEEEeecccCC--ccccc----ceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEE
Confidence 999999999999999999875321 12343 36999999999999999999999999999875532 468999
Q ss_pred cchhhHHHHHHHHHhcCCCccccccccccCCCccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCC
Q 002474 230 LPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGG 309 (918)
Q Consensus 230 IPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~s 309 (918)
||++.+++++++|++++ .+.|||||+.++.++.+.++.||++ ...|++| ..|.++|
T Consensus 267 IP~~~~~~v~~~l~~~g----------~v~rg~lGv~~~~l~~~~a~~~~l~-------------~~~GvlV-~~V~~~S 322 (473)
T PRK10942 267 IPSNMVKNLTSQMVEYG----------QVKRGELGIMGTELNSELAKAMKVD-------------AQRGAFV-SQVLPNS 322 (473)
T ss_pred EEHHHHHHHHHHHHhcc----------ccccceeeeEeeecCHHHHHhcCCC-------------CCCceEE-EEECCCC
Confidence 99999999999999988 7999999999999999999999985 4578887 6999999
Q ss_pred cccc-CCCCCCEEEEECCEEecCHHHHHHHHh-ccCCCeEEEEEEECCeEEEEEEEeecCCCCCCCceeeecceeecccc
Q 002474 310 PAHL-RLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLS 387 (918)
Q Consensus 310 pA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~-~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl~~~~ls 387 (918)
||++ ||++||+|++|||++|.++.++...+. ...|+++.+++.|+|+.+++.+++.............++|+....+.
T Consensus 323 pA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~~~~~~~~~lGl~g~~l~ 402 (473)
T PRK10942 323 SAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQNQVDSSNIFNGIEGAELS 402 (473)
T ss_pred hHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCcccccccccccccceeeecc
Confidence 9999 999999999999999999999998885 46788999999999999999888765422111121223444333222
Q ss_pred hhhhcccCCCCCcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 388 YQQARNFRFPCGLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 388 ~~~~~~~gl~~~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
... ...|++|.+ ++|+|+++||++||+|++|||+++.++++|.+++++.+ +.+.|+++|-.
T Consensus 403 ~~~------~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~--~~v~l~V~R~g 465 (473)
T PRK10942 403 NKG------GDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP--SVLALNIQRGD 465 (473)
T ss_pred ccc------CCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC--CeEEEEEEECC
Confidence 210 114788884 99999999999999999999999999999999999833 46666666543
No 5
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.9e-44 Score=403.68 Aligned_cols=300 Identities=19% Similarity=0.295 Sum_probs=256.2
Q ss_pred ccCchhHHHHHHHhCCceEEEEEEeeec-cCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEE
Q 002474 32 VATADDWRKALNKVVPAVVVLRTTACRA-FDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYP 110 (918)
Q Consensus 32 ~~~~~~~~~~vekv~~SVV~I~~~~~~~-fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~v 110 (918)
.+...++.++++++.||||.|+...... ........+.||||+|++ +||||||+|||. +...+.+.|.|++.++|++
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~-~G~IlTn~HVV~-~~~~i~V~~~dg~~~~a~v 118 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSK-EGYILTNYHVIK-KADQIVVALQDGRKFEAEL 118 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeC-CeEEEecccEeC-CCCEEEEEECCCCEEEEEE
Confidence 3445689999999999999999864321 111223467899999997 799999999999 5678999999999999999
Q ss_pred EEecCCCcEEEEEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeE
Q 002474 111 IYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFY 190 (918)
Q Consensus 111 v~~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~ 190 (918)
+++|+.+|||+||++...++++++.. +..+++||+|+++|||++...+++.|+|+++++.... ..++ .++
T Consensus 119 v~~d~~~DlAvlkv~~~~~~~~~l~~----s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~--~~~~----~~~ 188 (351)
T TIGR02038 119 VGSDPLTDLAVLKIEGDNLPTIPVNL----DRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS--SVGR----QNF 188 (351)
T ss_pred EEecCCCCEEEEEecCCCCceEeccC----cCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC--CCCc----ceE
Confidence 99999999999999987666555543 5689999999999999999999999999999886421 1122 346
Q ss_pred EEEeecCCCCCCCccEEcccceEEEeccccCC------CCCcccccchhhHHHHHHHHHhcCCCccccccccccCCCccC
Q 002474 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSKS------SSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQ 264 (918)
Q Consensus 191 Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~------~~~~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lg 264 (918)
||+|+.+++|||||||+|.+|+||||+++... ..+.+|+||++.++++++++++++ .+.|+|||
T Consensus 189 iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----------~~~r~~lG 258 (351)
T TIGR02038 189 IQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----------RVIRGYIG 258 (351)
T ss_pred EEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----------cccceEee
Confidence 99999999999999999999999999976432 157899999999999999999887 68899999
Q ss_pred eEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhc-c
Q 002474 265 VTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-G 342 (918)
Q Consensus 265 v~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~ 342 (918)
+.++.+.....+.||++ ...|++| ..|.++|||++ ||++||+|++|||++|.++.++.+.+.. .
T Consensus 259 v~~~~~~~~~~~~lgl~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~ 324 (351)
T TIGR02038 259 VSGEDINSVVAQGLGLP-------------DLRGIVI-TGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETR 324 (351)
T ss_pred eEEEECCHHHHHhcCCC-------------ccccceE-eecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcC
Confidence 99999998888899985 4467777 69999999999 9999999999999999999999988854 7
Q ss_pred CCCeEEEEEEECCeEEEEEEEeecC
Q 002474 343 VDKNIELLIERGGISMTVNLVVQDL 367 (918)
Q Consensus 343 ~G~~V~l~V~R~G~~~~~~I~l~~~ 367 (918)
.|+++.+++.|+|+.+++.+++...
T Consensus 325 ~g~~v~l~v~R~g~~~~~~v~l~~~ 349 (351)
T TIGR02038 325 PGSKVMVTVLRQGKQLELPVTIDEK 349 (351)
T ss_pred CCCEEEEEEEECCEEEEEEEEecCC
Confidence 8899999999999999988887643
No 6
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=6.3e-44 Score=400.92 Aligned_cols=300 Identities=18% Similarity=0.299 Sum_probs=254.4
Q ss_pred cCchhHHHHHHHhCCceEEEEEEeeeccC-CCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEE
Q 002474 33 ATADDWRKALNKVVPAVVVLRTTACRAFD-TEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPI 111 (918)
Q Consensus 33 ~~~~~~~~~vekv~~SVV~I~~~~~~~fd-~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv 111 (918)
+...++.++++++.||||.|.......+. ......+.||||+|++ +||||||+|||. +...+.|++.|++.++|+++
T Consensus 42 ~~~~~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~-~G~IlTn~HVv~-~a~~i~V~~~dg~~~~a~vv 119 (353)
T PRK10898 42 ETPASYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQ-RGYILTNKHVIN-DADQIIVALQDGRVFEALLV 119 (353)
T ss_pred cccchHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeC-CeEEEecccEeC-CCCEEEEEeCCCCEEEEEEE
Confidence 33458999999999999999997643322 2233457899999997 799999999999 57789999999999999999
Q ss_pred EecCCCcEEEEEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEE
Q 002474 112 YRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYM 191 (918)
Q Consensus 112 ~~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~I 191 (918)
++|+.+|||+||++...++++.+.+ ++.+++||+|+++|||++...+++.|+|++.++..... .++ .++|
T Consensus 120 ~~d~~~DlAvl~v~~~~l~~~~l~~----~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~--~~~----~~~i 189 (353)
T PRK10898 120 GSDSLTDLAVLKINATNLPVIPINP----KRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP--TGR----QNFL 189 (353)
T ss_pred EEcCCCCEEEEEEcCCCCCeeeccC----cCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC--ccc----cceE
Confidence 9999999999999987666655554 56789999999999999999999999999998764221 122 2469
Q ss_pred EEeecCCCCCCCccEEcccceEEEeccccCC-------CCCcccccchhhHHHHHHHHHhcCCCccccccccccCCCccC
Q 002474 192 QAASGTKGGSSGSPVIDWQGRAVALNAGSKS-------SSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQ 264 (918)
Q Consensus 192 q~da~i~~G~SGGPvvn~dG~VVGI~~~~~~-------~~~~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lg 264 (918)
|+|+.+++|||||||+|.+|+||||+++... ..+.+|+||++.++++++++.+++ .+.|+|||
T Consensus 190 qtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----------~~~~~~lG 259 (353)
T PRK10898 190 QTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----------RVIRGYIG 259 (353)
T ss_pred EeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----------cccccccc
Confidence 9999999999999999999999999986432 157899999999999999998887 78899999
Q ss_pred eEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhc-c
Q 002474 265 VTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-G 342 (918)
Q Consensus 265 v~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~ 342 (918)
+..+..+......++++ ...|++| ..|.++|||++ ||++||+|++|||++|.++.++.+.+.. .
T Consensus 260 i~~~~~~~~~~~~~~~~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~ 325 (353)
T PRK10898 260 IGGREIAPLHAQGGGID-------------QLQGIVV-NEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIR 325 (353)
T ss_pred eEEEECCHHHHHhcCCC-------------CCCeEEE-EEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcC
Confidence 99998876666666664 4478887 69999999999 9999999999999999999999888854 7
Q ss_pred CCCeEEEEEEECCeEEEEEEEeecCC
Q 002474 343 VDKNIELLIERGGISMTVNLVVQDLH 368 (918)
Q Consensus 343 ~G~~V~l~V~R~G~~~~~~I~l~~~~ 368 (918)
+|+.+.+++.|+|+.+++.+++..++
T Consensus 326 ~g~~v~l~v~R~g~~~~~~v~l~~~p 351 (353)
T PRK10898 326 PGSVIPVVVMRDDKQLTLQVTIQEYP 351 (353)
T ss_pred CCCEEEEEEEECCEEEEEEEEeccCC
Confidence 88999999999999999988886553
No 7
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1e-33 Score=318.38 Aligned_cols=295 Identities=24% Similarity=0.416 Sum_probs=253.7
Q ss_pred hhHHHHHHHhCCceEEEEEEeeecc----CCCCC---CCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeE
Q 002474 36 DDWRKALNKVVPAVVVLRTTACRAF----DTEAA---GASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPV 108 (918)
Q Consensus 36 ~~~~~~vekv~~SVV~I~~~~~~~f----d~~~~---~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a 108 (918)
..+...++++.|+||.+........ ..... ..+.||||++++ +|||+||.||+.. +..+.+.+.|++++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~-~g~ivTn~hVi~~-a~~i~v~l~dg~~~~a 110 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISS-DGYIVTNNHVIAG-AEEITVTLADGREVPA 110 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcC-CeEEEecceecCC-cceEEEEeCCCCEEEE
Confidence 6899999999999999999764321 11110 148899999996 8999999999995 8889999999999999
Q ss_pred EEEEecCCCcEEEEEECCCC-cccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccc
Q 002474 109 YPIYRDPVHDFGFFRYDPSA-IQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFN 187 (918)
Q Consensus 109 ~vv~~Dp~~DlAlLkvd~~~-l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn 187 (918)
++++.|+..|+|+||++... +++..+.. +..+++|++++++|+|++...+++.|+++.+.|. .+....+ .
T Consensus 111 ~~vg~d~~~dlavlki~~~~~~~~~~~~~----s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~---~ 181 (347)
T COG0265 111 KLVGKDPISDLAVLKIDGAGGLPVIALGD----SDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGG---Y 181 (347)
T ss_pred EEEecCCccCEEEEEeccCCCCceeeccC----CCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccc---c
Confidence 99999999999999999754 56555555 6788999999999999999999999999999996 2222111 2
Q ss_pred eeEEEEeecCCCCCCCccEEcccceEEEeccccCCCC----CcccccchhhHHHHHHHHHhcCCCccccccccccCCCcc
Q 002474 188 TFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSSS----ASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTL 263 (918)
Q Consensus 188 ~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~~~----~~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~L 263 (918)
.++||+|+.+++|+||||++|.+|++|||+++..... +.+|++|++.+++++.++.+.+ ++.|+++
T Consensus 182 ~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G----------~v~~~~l 251 (347)
T COG0265 182 VNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG----------KVVRGYL 251 (347)
T ss_pred cchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC----------Ccccccc
Confidence 4569999999999999999999999999998877653 4899999999999999999865 6999999
Q ss_pred CeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHh-c
Q 002474 264 QVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLD-D 341 (918)
Q Consensus 264 gv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~-~ 341 (918)
|+.+.++..+.+ +|++ ...|++| ..|.+++||++ |++.||+|+++||+++.+..++...+. .
T Consensus 252 gv~~~~~~~~~~--~g~~-------------~~~G~~V-~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~ 315 (347)
T COG0265 252 GVIGEPLTADIA--LGLP-------------VAAGAVV-LGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASN 315 (347)
T ss_pred ceEEEEcccccc--cCCC-------------CCCceEE-EecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhcc
Confidence 999998887777 7764 5678666 69999999999 999999999999999999999998884 5
Q ss_pred cCCCeEEEEEEECCeEEEEEEEeecC
Q 002474 342 GVDKNIELLIERGGISMTVNLVVQDL 367 (918)
Q Consensus 342 ~~G~~V~l~V~R~G~~~~~~I~l~~~ 367 (918)
.+|+.+.+++.|+|++.++.+++.+.
T Consensus 316 ~~g~~v~~~~~r~g~~~~~~v~l~~~ 341 (347)
T COG0265 316 RPGDEVALKLLRGGKERELAVTLGDR 341 (347)
T ss_pred CCCCEEEEEEEECCEEEEEEEEecCc
Confidence 68999999999999999999998763
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=5.2e-33 Score=314.27 Aligned_cols=414 Identities=19% Similarity=0.268 Sum_probs=346.4
Q ss_pred HHHhCCceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCCCcEEE
Q 002474 42 LNKVVPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGF 121 (918)
Q Consensus 42 vekv~~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAl 121 (918)
.+++..+.|.+....+...|+..+....|||.|++.+.|++++++.++..+.++..++++|...++|++.+.||.+++|+
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~n~a~ 603 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTENVAS 603 (955)
T ss_pred hhHHhhhhhhheeceeeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCccceeE
Confidence 67888899999999988888888777789999999989999999999999999999999999999999999999999999
Q ss_pred EEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCC-----ceEEEeE-EEeecCCCCCCCCCCccccceeEEEEee
Q 002474 122 FRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEK-----VSILAGT-LARLDRDAPHYKKDGYNDFNTFYMQAAS 195 (918)
Q Consensus 122 Lkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~-----~svt~G~-Vs~~~r~~p~~~~~~~~dfn~~~Iq~da 195 (918)
+|++|+... .+.|....++.||++...|+..... .+++.-. +.-.....|+|+.. |.+.|.+.+
T Consensus 604 ~kydp~~~~-----~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~-----n~e~Is~~~ 673 (955)
T KOG1421|consen 604 FKYDPALEV-----QLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRAT-----NLEVISFMD 673 (955)
T ss_pred eccChhHhh-----hhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeec-----ceEEEEEec
Confidence 999987532 3445567889999999999985544 3333311 22223345666644 678899999
Q ss_pred cCCCCCCCccEEcccceEEEeccccCCC--CC----cccccchhhHHHHHHHHHhcCCCccccccccccCCCccCeEEEE
Q 002474 196 GTKGGSSGSPVIDWQGRAVALNAGSKSS--SA----SAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVH 269 (918)
Q Consensus 196 ~i~~G~SGGPvvn~dG~VVGI~~~~~~~--~~----~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lgv~~~~ 269 (918)
....++-.|-+.|.||+|+|+|...... .+ .-|-+.+.+++..|++|+.+. ....-.+|++|.+
T Consensus 674 nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~----------~~rp~i~~vef~~ 743 (955)
T KOG1421|consen 674 NLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGP----------SARPTIAGVEFSH 743 (955)
T ss_pred cccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCC----------CCCceeeccceee
Confidence 9998888889999999999999654432 22 345588899999999999986 2333358999999
Q ss_pred cChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCccccCCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEE
Q 002474 270 KGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIEL 349 (918)
Q Consensus 270 ~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l 349 (918)
++...+|.|||+.||+.+.+...- .+..+++++.|.+.-+- -|..||+|+++||+.|+...++.++. .+..
T Consensus 744 i~laqar~lglp~e~imk~e~es~-~~~ql~~ishv~~~~~k--il~~gdiilsvngk~itr~~dl~d~~------eid~ 814 (955)
T KOG1421|consen 744 ITLAQARTLGLPSEFIMKSEEEST-IPRQLYVISHVRPLLHK--ILGVGDIILSVNGKMITRLSDLHDFE------EIDA 814 (955)
T ss_pred EEeehhhccCCCHHHHhhhhhcCC-CcceEEEEEeeccCccc--ccccccEEEEecCeEEeeehhhhhhh------hhhe
Confidence 999999999999999999998876 67888998898776443 49999999999999999999998643 5678
Q ss_pred EEEECCeEEEEEEEeecCCCCCCCceeeecceeecccchhhhcc-cCCCCCcEEEEc--CCChhHHcCCCCCCEEEEECC
Q 002474 350 LIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARN-FRFPCGLVYVAE--PGYMLFRAGVPRHAIIKKFAG 426 (918)
Q Consensus 350 ~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl~~~~ls~~~~~~-~gl~~~GV~Vs~--pgspA~~AGLk~GD~I~sVNG 426 (918)
+|.|+|.+++++|+.-... ..+|++-|.|+++|+.+..+... -.+| +|||+.. -||||.+ +|.+-.+|++|||
T Consensus 815 ~ilrdg~~~~ikipt~p~~--et~r~vi~~gailq~ph~av~~q~edlp-~gvyvt~rg~gspalq-~l~aa~fitavng 890 (955)
T KOG1421|consen 815 VILRDGIEMEIKIPTYPEY--ETSRAVIWMGAILQPPHSAVFEQVEDLP-EGVYVTSRGYGSPALQ-MLRAAHFITAVNG 890 (955)
T ss_pred eeeecCcEEEEEecccccc--ccceEEEEEeccccCchHHHHHHHhccC-CceEEeecccCChhHh-hcchheeEEEecc
Confidence 8999999999999876554 57899999999999988644322 2344 8999995 7899999 9999999999999
Q ss_pred eecCCHHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcCCCCCCCceeeecCCCCCceeee
Q 002474 427 EEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHEWYAPPQIYTRNDSSGLWSANP 493 (918)
Q Consensus 427 ~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd~w~~~~~~~~r~d~tg~W~~~~ 493 (918)
..++++++|...++++|++..|.++.+.+++ -+..++++.|.+|||+-++.||.. |.|-.+.
T Consensus 891 ~~t~~lddf~~~~~~ipdnsyv~v~~mtfd~----vp~~~s~k~n~hyfpt~~l~rd~~-~~wi~ke 952 (955)
T KOG1421|consen 891 HDTNTLDDFYHMLLEIPDNSYVQVKQMTFDG----VPSIVSVKPNPHYFPTCILERDSN-GRWITKE 952 (955)
T ss_pred cccCcHHHHHHHHhhCCCCceEEEEEeccCC----CceEEEeccCCccCceeEEEeccc-Cceeeee
Confidence 9999999999999999999999999999988 799999999999999999999874 4595443
No 9
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.90 E-value=1.5e-22 Score=229.61 Aligned_cols=316 Identities=19% Similarity=0.257 Sum_probs=232.6
Q ss_pred chhHHHHHHHhCCceEEEEEEe----eeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCc----------EEEEEe
Q 002474 35 ADDWRKALNKVVPAVVVLRTTA----CRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPV----------VAEAMF 100 (918)
Q Consensus 35 ~~~~~~~vekv~~SVV~I~~~~----~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~----------~i~v~f 100 (918)
.+..+...++...|+|.|+... ..+|....-....||||||+. +|+|+||+||+..... .+.+..
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~a 205 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGG-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDA 205 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcC-CcEEEEeeEEEEEEeccccCCCcceeeEEEEE
Confidence 4677889999999999999742 123444555678899999997 8999999999986433 366666
Q ss_pred cCC--eEEeEEEEEecCCCcEEEEEECCCC--cccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCC
Q 002474 101 VNR--EEIPVYPIYRDPVHDFGFFRYDPSA--IQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAP 176 (918)
Q Consensus 101 ~dg--~~~~a~vv~~Dp~~DlAlLkvd~~~--l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p 176 (918)
+++ ..+.+.+.+.|+..|+|+++++... ++.+++.- ...++.|+++..+|+|++...+.+.|+++...|...
T Consensus 206 a~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~----~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~ 281 (473)
T KOG1320|consen 206 AIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGV----SSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSF 281 (473)
T ss_pred eecCCccCCCeEEccccccceEEEEEecCCcccceeecce----eeeecccceeeccccCceeeeeeeeccccccccccc
Confidence 655 8889999999999999999996432 23333332 678999999999999999999999999999998765
Q ss_pred CCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccccCCC----CCcccccchhhHHHHHHHHHhcCCCcccc
Q 002474 177 HYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS----SASAFFLPLERVVRALRFLQERRDCNIHN 252 (918)
Q Consensus 177 ~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~~----~~~~faIPi~~i~~~L~~l~~g~~~~~~~ 252 (918)
..+.. ......+|+|+|++++.|+||+|++|.||++||+++..... .+.+|++|.+.++.++.+.-+.+- ...+
T Consensus 282 ~lg~~-~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~-~lr~ 359 (473)
T KOG1320|consen 282 KLGLE-TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQI-SLRP 359 (473)
T ss_pred ccCcc-cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhce-eecc
Confidence 43332 22335678999999999999999999999999999876653 788999999999998877633321 0001
Q ss_pred ccccccCCCccCeEEEEcChHHHH-HhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEec
Q 002474 253 WEAVSIPRGTLQVTFVHKGFDETR-RLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVIT 330 (918)
Q Consensus 253 ~~~~~v~rg~Lgv~~~~~~~d~~r-~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~ 330 (918)
.......+.++|.....+.....- .++.+ ..+|.......++..|.|++++.. ++++||+|++|||+++.
T Consensus 360 ~~~~~p~~~~~g~~s~~i~~g~vf~~~~~~--------~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~ 431 (473)
T KOG1320|consen 360 VKPLVPVHQYIGLPSYYIFAGLVFVPLTKS--------YIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVK 431 (473)
T ss_pred ccCcccccccCCceeEEEecceEEeecCCC--------ccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEee
Confidence 111112234555544332211110 01221 123323332344479999999999 99999999999999999
Q ss_pred CHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEEee
Q 002474 331 QFLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 331 s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
+..++..++.. ..++++.+..+|..+..++.+...
T Consensus 432 n~~~l~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~ 467 (473)
T KOG1320|consen 432 NLKHLYELIEECSTEDKVAVLDRRSAEDATLEILPE 467 (473)
T ss_pred chHHHHHHHHhcCcCceEEEEEecCccceeEEeccc
Confidence 99999999964 556788888888888888877654
No 10
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=99.83 E-value=4.4e-19 Score=200.06 Aligned_cols=237 Identities=16% Similarity=0.176 Sum_probs=175.6
Q ss_pred ccccccCceEEEEEecCCccccCCcccceeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeeeCceeeeeEEEEeec
Q 002474 621 AESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHP 700 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~~~d~~~~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~vvflhp 700 (918)
..++..+|+|.|...-.. .-..........|||||||. +||||||+|+|... -.+.|+|++ ...++|+|++.||
T Consensus 50 ~~~~~~psVV~I~~~~~~-~~~~~~~~~~~~GSG~vi~~---~G~IlTn~HVV~~~-~~i~V~~~d-g~~~~a~vv~~d~ 123 (351)
T TIGR02038 50 AVRRAAPAVVNIYNRSIS-QNSLNQLSIQGLGSGVIMSK---EGYILTNYHVIKKA-DQIVVALQD-GRKFEAELVGSDP 123 (351)
T ss_pred HHHhcCCcEEEEEeEecc-ccccccccccceEEEEEEeC---CeEEEecccEeCCC-CEEEEEECC-CCEEEEEEEEecC
Confidence 677899999999875431 00011112346799999993 79999999999765 469999997 8999999999999
Q ss_pred ccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCcccccccee
Q 002474 701 VHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNME 780 (918)
Q Consensus 701 ~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e 780 (918)
.+|+|+||+++..+ ..++|.+...+++||+|++||+..++..+...+.|+. +..... .++.+ .+
T Consensus 124 ~~DlAvlkv~~~~~--------~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~----~~r~~~-~~~~~---~~ 187 (351)
T TIGR02038 124 LTDLAVLKIEGDNL--------PTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISA----TGRNGL-SSVGR---QN 187 (351)
T ss_pred CCCEEEEEecCCCC--------ceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEe----ccCccc-CCCCc---ce
Confidence 99999999997544 4556665555899999999999988776655555554 323221 12222 36
Q ss_pred eEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHHHHHHHHHcCCCCCCccccCccCC
Q 002474 781 VIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVKRP 859 (918)
Q Consensus 781 ~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~~g~~~~~~~~~~~~~~ 859 (918)
+|++|+.++.++ ||+|+|.+|+|.|+|....... .........+.||++.+++++++|++++
T Consensus 188 ~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~----~~~~~~g~~faIP~~~~~~vl~~l~~~g------------- 250 (351)
T TIGR02038 188 FIQTDAAINAGNSGGALINTNGELVGINTASFQKG----GDEGGEGINFAIPIKLAHKIMGKIIRDG------------- 250 (351)
T ss_pred EEEECCccCCCCCcceEECCCCeEEEEEeeeeccc----CCCCccceEEEecHHHHHHHHHHHhhcC-------------
Confidence 899999999998 9999999999999998544320 0112234556799999999999998763
Q ss_pred CCceEEEeeEEEEeehHhHHhcCCCHH---HHHHhhcCCC
Q 002474 860 MPLVRILEVELYPTLLSKARSFGLSDD---WVQVCFLPNA 896 (918)
Q Consensus 860 ~p~~~~l~~e~~~~~~~~ar~~g~~~~---wi~~~~~~~~ 896 (918)
.+....|.+++..+....++.+|+++. .|.++...++
T Consensus 251 ~~~r~~lGv~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~sp 290 (351)
T TIGR02038 251 RVIRGYIGVSGEDINSVVAQGLGLPDLRGIVITGVDPNGP 290 (351)
T ss_pred cccceEeeeEEEECCHHHHHhcCCCccccceEeecCCCCh
Confidence 344557999999998889999999753 4555544443
No 11
>PRK10942 serine endoprotease; Provisional
Probab=99.80 E-value=1.5e-18 Score=202.46 Aligned_cols=199 Identities=19% Similarity=0.226 Sum_probs=154.7
Q ss_pred eeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeeeCceeeeeEEEEeecccceEEEEecCCCcCcccccceeecccC
Q 002474 649 HFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELL 728 (918)
Q Consensus 649 ~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~ 728 (918)
...|+||||| .+.||||||+|+|-.. ++|.|+|+| .-+++|+|++.||.+|+||||++. . .++..++|.
T Consensus 110 ~~~GSG~ii~--~~~G~IlTn~HVv~~a-~~i~V~~~d-g~~~~a~vv~~D~~~DlAvlki~~---~----~~l~~~~lg 178 (473)
T PRK10942 110 MALGSGVIID--ADKGYVVTNNHVVDNA-TKIKVQLSD-GRKFDAKVVGKDPRSDIALIQLQN---P----KNLTAIKMA 178 (473)
T ss_pred cceEEEEEEE--CCCCEEEeChhhcCCC-CEEEEEECC-CCEEEEEEEEecCCCCEEEEEecC---C----CCCceeEec
Confidence 4679999999 4579999999988755 799999997 999999999999999999999962 1 245678898
Q ss_pred CCcCCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCccccccceeeEEEecCcCCCC-cceEECCCccEEEEE
Q 002474 729 PEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIW 807 (918)
Q Consensus 729 ~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e~i~~d~~~~~~~-~Gvl~d~~G~v~alw 807 (918)
++..+++||.|++||+..++..++ .+++++++.......++|. ..|++|+.++.++ ||+|+|.+|+|.|||
T Consensus 179 ~s~~l~~G~~V~aiG~P~g~~~tv----t~GiVs~~~r~~~~~~~~~----~~iqtda~i~~GnSGGpL~n~~GeviGI~ 250 (473)
T PRK10942 179 DSDALRVGDYTVAIGNPYGLGETV----TSGIVSALGRSGLNVENYE----NFIQTDAAINRGNSGGALVNLNGELIGIN 250 (473)
T ss_pred CccccCCCCEEEEEcCCCCCCcce----eEEEEEEeecccCCccccc----ceEEeccccCCCCCcCccCCCCCeEEEEE
Confidence 776699999999999998775544 3444444444433445665 5799999999888 999999999999999
Q ss_pred eeeecceeccCCCCCCceEEeccchhhHHHHHHHHHcCCCCCCccccCccCCCCceEEEeeEEEEeehHhHHhcCCCH
Q 002474 808 GSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSD 885 (918)
Q Consensus 808 ~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~~g~~~~~~~~~~~~~~~p~~~~l~~e~~~~~~~~ar~~g~~~ 885 (918)
.++... +.....+.+.||+..+++++++|++++ ....-.|.+.+..+.-..|+.+|+++
T Consensus 251 t~~~~~------~g~~~g~gfaIP~~~~~~v~~~l~~~g-------------~v~rg~lGv~~~~l~~~~a~~~~l~~ 309 (473)
T PRK10942 251 TAILAP------DGGNIGIGFAIPSNMVKNLTSQMVEYG-------------QVKRGELGIMGTELNSELAKAMKVDA 309 (473)
T ss_pred EEEEcC------CCCcccEEEEEEHHHHHHHHHHHHhcc-------------ccccceeeeEeeecCHHHHHhcCCCC
Confidence 998765 112234455689999999999998763 22223577777776666677788764
No 12
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=99.78 E-value=8.2e-18 Score=194.79 Aligned_cols=226 Identities=22% Similarity=0.258 Sum_probs=174.8
Q ss_pred ccccccCceEEEEEecCCcc-----ccC------------Ccc------cceeEeEEEEEeccCCCcEEEEecccccCCC
Q 002474 621 AESVIEPTLVMFEVHVPPSC-----MID------------GVH------SQHFFGTGVIIYHSQSMGLVVVDKNTVAISA 677 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~-----~~d------------~~~------~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~ 677 (918)
..+++.||+|.|++.-.... .++ +.+ .....||||||+. .||||||+|+|-. +
T Consensus 6 ~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~---~G~IlTn~Hvv~~-~ 81 (428)
T TIGR02037 6 LVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISA---DGYILTNNHVVDG-A 81 (428)
T ss_pred HHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECC---CCEEEEcHHHcCC-C
Confidence 56778999999987531100 011 011 1346799999994 5999999998865 5
Q ss_pred ccEEEEeeeCceeeeeEEEEeecccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEE
Q 002474 678 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIV 757 (918)
Q Consensus 678 ~di~vtfa~~~~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~v 757 (918)
+++.|+|++ ..+++|+|++.||.+|+|+||+++. ..+..++|.+...+++||.|+++|+..++......+.|
T Consensus 82 ~~i~V~~~~-~~~~~a~vv~~d~~~DlAllkv~~~-------~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~v 153 (428)
T TIGR02037 82 DEITVTLSD-GREFKAKLVGKDPRTDIAVLKIDAK-------KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIV 153 (428)
T ss_pred CeEEEEeCC-CCEEEEEEEEecCCCCEEEEEecCC-------CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEE
Confidence 799999997 9999999999999999999999963 25678899876569999999999999887766656666
Q ss_pred ecceeecccCCCCCccccccceeeEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHH
Q 002474 758 TNPCAALNISSADCPRYRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTIS 836 (918)
Q Consensus 758 t~i~~~~~~~~~~~pryr~~n~e~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~ 836 (918)
+... ........|. ..|++|+.+..++ ||+|+|.+|+|.|+|...... +.....+.+.||+..++
T Consensus 154 s~~~----~~~~~~~~~~----~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~------~g~~~g~~faiP~~~~~ 219 (428)
T TIGR02037 154 SALG----RSGLGIGDYE----NFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSP------SGGNVGIGFAIPSNMAK 219 (428)
T ss_pred Eecc----cCccCCCCcc----ceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcC------CCCccceEEEEEhHHHH
Confidence 6432 2222233454 5899999999888 999999999999999876543 12234566779999999
Q ss_pred HHHHHHHcCCCCCCccccCccCCCCceEEEeeEEEEeehHhHHhcCCCH
Q 002474 837 RVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSD 885 (918)
Q Consensus 837 ~v~~~l~~g~~~~~~~~~~~~~~~p~~~~l~~e~~~~~~~~ar~~g~~~ 885 (918)
+++++|++++ .+..-.|++++..+....|+.+|++.
T Consensus 220 ~~~~~l~~~g-------------~~~~~~lGi~~~~~~~~~~~~lgl~~ 255 (428)
T TIGR02037 220 NVVDQLIEGG-------------KVQRGWLGVTIQEVTSDLAKSLGLEK 255 (428)
T ss_pred HHHHHHHhcC-------------cCcCCcCceEeecCCHHHHHHcCCCC
Confidence 9999999874 34456799999999999999999974
No 13
>PRK10898 serine endoprotease; Provisional
Probab=99.75 E-value=1.1e-16 Score=180.74 Aligned_cols=200 Identities=15% Similarity=0.208 Sum_probs=148.6
Q ss_pred ccccccCceEEEEEecCCccccCCcccceeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeeeCceeeeeEEEEeec
Q 002474 621 AESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHP 700 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~~~d~~~~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~vvflhp 700 (918)
..++..+|+|.|...-.. +..+........|||||||. +||||||+|+|... .++.|+|.+ ...++|+|++.||
T Consensus 50 ~~~~~~psvV~v~~~~~~-~~~~~~~~~~~~GSGfvi~~---~G~IlTn~HVv~~a-~~i~V~~~d-g~~~~a~vv~~d~ 123 (353)
T PRK10898 50 AVRRAAPAVVNVYNRSLN-STSHNQLEIRTLGSGVIMDQ---RGYILTNKHVINDA-DQIIVALQD-GRVFEALLVGSDS 123 (353)
T ss_pred HHHHhCCcEEEEEeEecc-ccCcccccccceeeEEEEeC---CeEEEecccEeCCC-CEEEEEeCC-CCEEEEEEEEEcC
Confidence 667899999999887652 11111122346899999993 79999999999855 789999997 8899999999999
Q ss_pred ccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCcccccccee
Q 002474 701 VHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNME 780 (918)
Q Consensus 701 ~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e 780 (918)
.+|+||||+|+..+ ..++|.++..+++||.|.+||+..++......+.|+... ......-++ .+
T Consensus 124 ~~DlAvl~v~~~~l--------~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~----r~~~~~~~~----~~ 187 (353)
T PRK10898 124 LTDLAVLKINATNL--------PVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATG----RIGLSPTGR----QN 187 (353)
T ss_pred CCCEEEEEEcCCCC--------CeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEecc----ccccCCccc----cc
Confidence 99999999997544 446666655589999999999998876655555555422 211110011 25
Q ss_pred eEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHHHHHHHHHcC
Q 002474 781 VIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISG 845 (918)
Q Consensus 781 ~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~~g 845 (918)
+|++|+.+..++ ||+|+|.+|+|.++-.....+.. .......+.+.||++.+++++++|+++
T Consensus 188 ~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~---~~~~~~g~~faIP~~~~~~~~~~l~~~ 250 (353)
T PRK10898 188 FLQTDASINHGNSGGALVNSLGELMGINTLSFDKSN---DGETPEGIGFAIPTQLATKIMDKLIRD 250 (353)
T ss_pred eEEeccccCCCCCcceEECCCCeEEEEEEEEecccC---CCCcccceEEEEchHHHHHHHHHHhhc
Confidence 899999999998 99999999999999876554310 001113345569999999999998765
No 14
>PRK10139 serine endoprotease; Provisional
Probab=99.73 E-value=2.1e-16 Score=183.77 Aligned_cols=226 Identities=18% Similarity=0.209 Sum_probs=167.9
Q ss_pred ccccccCceEEEEEecCCcc--ccC-------C--cc-----cceeEeEEEEEeccCCCcEEEEecccccCCCccEEEEe
Q 002474 621 AESVIEPTLVMFEVHVPPSC--MID-------G--VH-----SQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSF 684 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~--~~d-------~--~~-----~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtf 684 (918)
..+++.||+|.|.+.--... .+| | .+ .....|+||||| .+.||||||.|+|... -.|.|+|
T Consensus 45 ~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~--~~~g~IlTn~HVv~~a-~~i~V~~ 121 (455)
T PRK10139 45 MLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIID--AAKGYVLTNNHVINQA-QKISIQL 121 (455)
T ss_pred HHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEE--CCCCEEEeChHHhCCC-CEEEEEE
Confidence 67789999999987531100 011 1 11 123579999999 4579999999999766 5899999
Q ss_pred eeCceeeeeEEEEeecccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecceeec
Q 002474 685 AAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAAL 764 (918)
Q Consensus 685 a~~~~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~ 764 (918)
.| .-+++|+|++.||.+|+|+||.|.. .++..++|.+...+++||.|.+||+..++......+.|+. +
T Consensus 122 ~d-g~~~~a~vvg~D~~~DlAvlkv~~~-------~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~----~ 189 (455)
T PRK10139 122 ND-GREFDAKLIGSDDQSDIALLQIQNP-------SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISA----L 189 (455)
T ss_pred CC-CCEEEEEEEEEcCCCCEEEEEecCC-------CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEcc----c
Confidence 97 8899999999999999999999831 2456788887666999999999999988776655555554 3
Q ss_pred ccCCCCCccccccceeeEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHHHHHHHHH
Q 002474 765 NISSADCPRYRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKII 843 (918)
Q Consensus 765 ~~~~~~~pryr~~n~e~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~ 843 (918)
+......-.| ...|++|+.++.++ ||+|+|.+|+|.++....... ++......+.||+..+++++++|+
T Consensus 190 ~r~~~~~~~~----~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~------~~~~~gigfaIP~~~~~~v~~~l~ 259 (455)
T PRK10139 190 GRSGLNLEGL----ENFIQTDASINRGNSGGALLNLNGELIGINTAILAP------GGGSVGIGFAIPSNMARTLAQQLI 259 (455)
T ss_pred cccccCCCCc----ceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcC------CCCccceEEEEEhHHHHHHHHHHh
Confidence 3322111223 36899999999998 999999999999998876543 122234455599999999999998
Q ss_pred cCCCCCCccccCccCCCCceEEEeeEEEEeehHhHHhcCCC
Q 002474 844 SGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLS 884 (918)
Q Consensus 844 ~g~~~~~~~~~~~~~~~p~~~~l~~e~~~~~~~~ar~~g~~ 884 (918)
++. ....-.|.+.+..+.-..++.+|++
T Consensus 260 ~~g-------------~v~r~~LGv~~~~l~~~~~~~lgl~ 287 (455)
T PRK10139 260 DFG-------------EIKRGLLGIKGTEMSADIAKAFNLD 287 (455)
T ss_pred hcC-------------cccccceeEEEEECCHHHHHhcCCC
Confidence 763 1122357888888877777778875
No 15
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.72 E-value=4.5e-17 Score=185.40 Aligned_cols=376 Identities=18% Similarity=0.194 Sum_probs=257.7
Q ss_pred HHHHhCCceEEEEEEee-----eccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEe---cCCeEEeEEEEE
Q 002474 41 ALNKVVPAVVVLRTTAC-----RAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMF---VNREEIPVYPIY 112 (918)
Q Consensus 41 ~vekv~~SVV~I~~~~~-----~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f---~dg~~~~a~vv~ 112 (918)
.++....|++.+..... .+|+......+.|+||.+. ...++||+|++........+.+ ...+.+.+++..
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~--~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~ 132 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIY--GKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAA 132 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhc--ccceeecCccccccccccccccccCCCchhhhhhHHH
Confidence 44556667777776543 2355555677889999997 3599999999995433333333 234668888888
Q ss_pred ecCCCcEEEEEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEE
Q 002474 113 RDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQ 192 (918)
Q Consensus 113 ~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq 192 (918)
.-.+.|+|++.++..+++. .+.++.+ -+-+...+.++++| |....+|.|.|++........... ..-.+|
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~-~~~~~e~-~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~~-----~l~~vq 202 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWK-GMNPFEL-GDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSST-----VLLRVQ 202 (473)
T ss_pred hhhcccceEEEEeeccccC-CCccccc-CCCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCCc-----ceeeEE
Confidence 8889999999999766543 1112222 22345667899998 888999999999987664332221 334699
Q ss_pred EeecCCCCCCCccEEcccceEEEeccccCC-CCCcccccchhhHHHHHHHHHhcCCCccccccccccCCCccCeEEEEcC
Q 002474 193 AASGTKGGSSGSPVIDWQGRAVALNAGSKS-SSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVHKG 271 (918)
Q Consensus 193 ~da~i~~G~SGGPvvn~dG~VVGI~~~~~~-~~~~~faIPi~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lgv~~~~~~ 271 (918)
+++++++|+||+|++.-.+++.|+...... +.+..+.+|.-.+.++..-..... ...+.+.+.+..+.+-
T Consensus 203 i~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a---------~~~~f~~~nt~t~g~v 273 (473)
T KOG1320|consen 203 IDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSA---------IGNGFGLLNTLTQGMV 273 (473)
T ss_pred EEEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeec---------cccCceeeeeeeeccc
Confidence 999999999999999988999999987653 236789999988888764433221 1345566666655433
Q ss_pred hHHHH-HhccchhHHHHHHhcCCCCCCCceEEeEecCCCccccCCCCCCEEEEECCEEecC-HHH-----HHHHH-hccC
Q 002474 272 FDETR-RLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQ-FLK-----LETLL-DDGV 343 (918)
Q Consensus 272 ~d~~r-~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s-~~~-----l~~~L-~~~~ 343 (918)
....| .+.|. .+ +|.++ ..+.+-+.|.+-++.||.|+.+||+.|-- +.. +...+ ...+
T Consensus 274 s~~~R~~~~lg------------~~-~g~~i-~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p 339 (473)
T KOG1320|consen 274 SGQLRKSFKLG------------LE-TGVLI-SKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIP 339 (473)
T ss_pred ccccccccccC------------cc-cceee-eeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccC
Confidence 22222 12221 13 78887 68988888888999999999999999941 111 22333 3457
Q ss_pred CCeEEEEEEECCeEEEEEEEeecC----C-CCCCCceeeecceeecccchhhhcccCCC---CCcEEEEc--CCChhHHc
Q 002474 344 DKNIELLIERGGISMTVNLVVQDL----H-SITPDYFLEVSGAVIHPLSYQQARNFRFP---CGLVYVAE--PGYMLFRA 413 (918)
Q Consensus 344 G~~V~l~V~R~G~~~~~~I~l~~~----~-~~t~~~~v~~~Gl~~~~ls~~~~~~~gl~---~~GV~Vs~--pgspA~~A 413 (918)
++++.+.+.|.+ +.......... + ......+.-+.|+.+.++... |.++ ..+|+++. |++++..+
T Consensus 340 ~d~vl~~v~r~~-e~~~~lr~~~~~~p~~~~~g~~s~~i~~g~vf~~~~~~----~~~~~~~~q~v~is~Vlp~~~~~~~ 414 (473)
T KOG1320|consen 340 IDTVLVIVLRLG-EFQISLRPVKPLVPVHQYIGLPSYYIFAGLVFVPLTKS----YIFPSGVVQLVLVSQVLPGSINGGY 414 (473)
T ss_pred chHhhhhhhhhh-hhceeeccccCcccccccCCceeEEEecceEEeecCCC----ccccccceeEEEEEEeccCCCcccc
Confidence 788888888987 33322222211 1 122234556678888766532 2222 14788885 99999999
Q ss_pred CCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 414 GVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 414 GLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
++++||+|.+|||+++.|+.++.+.++....+.+|.+..++-.
T Consensus 415 ~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~~v~vl~~~~~ 457 (473)
T KOG1320|consen 415 GLKPGDQVVKVNGKPVKNLKHLYELIEECSTEDKVAVLDRRSA 457 (473)
T ss_pred cccCCCEEEEECCEEeechHHHHHHHHhcCcCceEEEEEecCc
Confidence 9999999999999999999999999999877666666665543
No 16
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.57 E-value=3.3e-14 Score=165.65 Aligned_cols=143 Identities=17% Similarity=0.206 Sum_probs=110.8
Q ss_pred EeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHh-ccCCCeEEEEEEECCeEEEEEEEeecCCCC-C--CCcee
Q 002474 302 VDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQDLHSI-T--PDYFL 376 (918)
Q Consensus 302 V~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~-~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~-t--~~~~v 376 (918)
|..|.++|||++ |||+||+|++|||++|.+|.++...+. ...|+++++++.|+|+.++.++++...+.. . .....
T Consensus 130 V~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~~~~~~~~~~~~~ 209 (449)
T PRK10779 130 VGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLDLRHWAFEPDKQDPV 209 (449)
T ss_pred ccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEecccccccCccccchh
Confidence 479999999999 999999999999999999999998874 567788999999999998888877533211 0 01111
Q ss_pred eecceeecccchhhhcccCCCCCcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEee
Q 002474 377 EVSGAVIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSS 454 (918)
Q Consensus 377 ~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~ 454 (918)
..+|+ .+..+. .++.|. .++|||++|||++||+|++|||+++.+|+++.+.++. ..++.+.+++.|
T Consensus 210 ~~lGl--~~~~~~---------~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R 277 (449)
T PRK10779 210 SSLGI--RPRGPQ---------IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIER 277 (449)
T ss_pred hcccc--cccCCC---------cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEE
Confidence 12332 222211 135555 4999999999999999999999999999999999988 456778888877
Q ss_pred cc
Q 002474 455 YT 456 (918)
Q Consensus 455 ~~ 456 (918)
-+
T Consensus 278 ~g 279 (449)
T PRK10779 278 QG 279 (449)
T ss_pred CC
Confidence 54
No 17
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.51 E-value=1.1e-13 Score=130.35 Aligned_cols=110 Identities=33% Similarity=0.569 Sum_probs=74.8
Q ss_pred EEEEEEeCCCcEEEEcCcccCC-------CCcEEEEEecCCeEEe--EEEEEecCC-CcEEEEEECCCCcccccccCCCC
Q 002474 70 ATGFVVDKRRGIILTNRHVVKP-------GPVVAEAMFVNREEIP--VYPIYRDPV-HDFGFFRYDPSAIQFLNYDEIPL 139 (918)
Q Consensus 70 GTGFVVd~~~G~ILTn~HVV~~-------~~~~i~v~f~dg~~~~--a~vv~~Dp~-~DlAlLkvd~~~l~~~~l~~l~l 139 (918)
||||+|++ +|+||||+||+.+ ....+.+.+.++..+. +++++.|+. +|+|||+++..
T Consensus 1 GTGf~i~~-~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~~------------ 67 (120)
T PF13365_consen 1 GTGFLIGP-DGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDPW------------ 67 (120)
T ss_dssp EEEEEEET-TTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESCE------------
T ss_pred CEEEEEcC-CceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEecc------------
Confidence 79999997 7899999999995 2456788888888888 999999999 99999999800
Q ss_pred CCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEe
Q 002474 140 APEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVAL 216 (918)
Q Consensus 140 ~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI 216 (918)
...+.. ....+........ ...+....++ +++.+.+|+|||||||.+|+||||
T Consensus 68 -----------~~~~~~-----~~~~~~~~~~~~~-------~~~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 68 -----------TGVGGG-----VRVPGSTSGVSPT-------STNDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -----------EEEEEE-----EEEEEEEEEEEEE-------EEEETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -----------cceeee-----eEeeeeccccccc-------cCcccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 000000 0000000011000 0000011224 899999999999999999999997
No 18
>PF12812 PDZ_1: PDZ-like domain
Probab=99.45 E-value=1.9e-13 Score=121.23 Aligned_cols=76 Identities=45% Similarity=0.752 Sum_probs=69.4
Q ss_pred CCCCceeeecceeecccchhhhcccCCCCCcEEEEcC-CChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCC
Q 002474 370 ITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAEP-GYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRG 445 (918)
Q Consensus 370 ~t~~~~v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~p-gspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g 445 (918)
++|+|++.++|+.||+++|+++|+|+++++|+|++.+ |+++...|+.+|++|++|||+||+|+++|+++|+++||+
T Consensus 2 itp~r~v~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ipd~ 78 (78)
T PF12812_consen 2 ITPSRFVEVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIPDN 78 (78)
T ss_pred ccCCEEEEEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCCCC
Confidence 6899999999999999999999999999999999975 555555559999999999999999999999999999973
No 19
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the peptidase family S2C. The matching region appears not to overlap the active site domain.
Probab=99.42 E-value=1.8e-12 Score=149.84 Aligned_cols=131 Identities=19% Similarity=0.242 Sum_probs=105.0
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEeecCCCCCCCce
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYF 375 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~ 375 (918)
.+.+| ..|.++|||++ ||++||+|+++||+++.++.++...+.... +++.+++.|+++..++.+++.
T Consensus 128 ~g~~V-~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r~g~~~~l~v~l~---------- 195 (420)
T TIGR00054 128 VGPVI-ELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILAERENWTFEVMKE---------- 195 (420)
T ss_pred CCcee-eccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEEecCceEeccccc----------
Confidence 56676 79999999999 999999999999999999999998886544 678899999988766543322
Q ss_pred eeecceeecccchhhhcccCCCCCcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEe
Q 002474 376 LEVSGAVIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYS 453 (918)
Q Consensus 376 v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~ 453 (918)
+.+..+ ..++.|. .++|||+++||++||+|++|||+++.+++++.+.+++. .++.+.++++
T Consensus 196 -------~~~~~~---------~~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~-~~~~v~l~v~ 258 (420)
T TIGR00054 196 -------LIPRGP---------KIEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKEN-PGKSMDIKVE 258 (420)
T ss_pred -------ceecCC---------CcCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhC-CCCceEEEEE
Confidence 111111 1245565 49999999999999999999999999999999999984 4567888887
Q ss_pred ecc
Q 002474 454 SYT 456 (918)
Q Consensus 454 ~~~ 456 (918)
|-+
T Consensus 259 R~g 261 (420)
T TIGR00054 259 RNG 261 (420)
T ss_pred ECC
Confidence 644
No 20
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.19 E-value=6.8e-10 Score=125.54 Aligned_cols=197 Identities=24% Similarity=0.339 Sum_probs=149.0
Q ss_pred ccccccCceEEEEEecCCcccc----CCc---ccceeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeeeCceeeee
Q 002474 621 AESVIEPTLVMFEVHVPPSCMI----DGV---HSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPG 693 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~~~----d~~---~~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~ 693 (918)
..+++.+++|.+...... .. ... ......|+||+++ +.|||+|+.|+|.. .-.+.++..| ..++++
T Consensus 38 ~~~~~~~~vV~~~~~~~~--~~~~~~~~~~~~~~~~~~gSg~i~~---~~g~ivTn~hVi~~-a~~i~v~l~d-g~~~~a 110 (347)
T COG0265 38 AVEKVAPAVVSIATGLTA--KLRSFFPSDPPLRSAEGLGSGFIIS---SDGYIVTNNHVIAG-AEEITVTLAD-GREVPA 110 (347)
T ss_pred HHHhcCCcEEEEEeeeee--cchhcccCCcccccccccccEEEEc---CCeEEEecceecCC-cceEEEEeCC-CCEEEE
Confidence 566788999999887652 21 111 1125789999999 48999999999999 6789999886 999999
Q ss_pred EEEEeecccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCcc
Q 002474 694 EVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPR 773 (918)
Q Consensus 694 ~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pr 773 (918)
+++..|+.++.|++|-|.... +..+.|.+...++.||.+..||+..++..+ ..++|.+++....-....
T Consensus 111 ~~vg~d~~~dlavlki~~~~~-------~~~~~~~~s~~l~vg~~v~aiGnp~g~~~t----vt~Givs~~~r~~v~~~~ 179 (347)
T COG0265 111 KLVGKDPISDLAVLKIDGAGG-------LPVIALGDSDKLRVGDVVVAIGNPFGLGQT----VTSGIVSALGRTGVGSAG 179 (347)
T ss_pred EEEecCCccCEEEEEeccCCC-------CceeeccCCCCcccCCEEEEecCCCCcccc----eeccEEeccccccccCcc
Confidence 999999999999999996431 355677777778999999999999884433 444555455554111111
Q ss_pred ccccceeeEEEecCcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHHHHHHHHHc
Q 002474 774 YRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIIS 844 (918)
Q Consensus 774 yr~~n~e~i~~d~~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~~ 844 (918)
+ ....|+.|+.++.++ ||.|.|.+|.+.++...-..... ..++..| .+|+..+.+++++|..
T Consensus 180 ~---~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~----~~~gigf--aiP~~~~~~v~~~l~~ 242 (347)
T COG0265 180 G---YVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSG----GSSGIGF--AIPVNLVAPVLDELIS 242 (347)
T ss_pred c---ccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCC----CcceeEE--EecHHHHHHHHHHHHH
Confidence 1 457999999999999 99999999999997766665511 0123334 4999999999999997
No 21
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.16 E-value=2.1e-10 Score=102.54 Aligned_cols=67 Identities=31% Similarity=0.591 Sum_probs=60.1
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHH-hccCCCeEEEEEEECCeEEEEEEEe
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLIERGGISMTVNLVV 364 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L-~~~~G~~V~l~V~R~G~~~~~~I~l 364 (918)
.|++| ..|.++|||++ ||++||+|++|||++|.++.++...+ ...+|++++|++.|+|+.+++++++
T Consensus 14 ~g~~V-~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 14 GGVVV-VSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp SSEEE-EEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEEEEEEE-
T ss_pred CeEEE-EEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEC
Confidence 57777 58999999999 99999999999999999999999888 5688999999999999999998874
No 22
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.90 E-value=1.2e-08 Score=92.01 Aligned_cols=87 Identities=37% Similarity=0.540 Sum_probs=71.0
Q ss_pred CccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHH
Q 002474 261 GTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL 339 (918)
Q Consensus 261 g~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L 339 (918)
+++|+.++.......++++++ ...|++| ..|.+++||++ ||++||+|++|||+++.++.++..++
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~~-------------~~~g~~V-~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l 66 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGLK-------------DTKGVLV-ASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRAL 66 (90)
T ss_pred CccceEEeECCHHHHHHcCCC-------------CCCEEEE-EEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHH
Confidence 478999998876666655553 3456554 79999999998 99999999999999999999999888
Q ss_pred hc-cCCCeEEEEEEECCeEEEEE
Q 002474 340 DD-GVDKNIELLIERGGISMTVN 361 (918)
Q Consensus 340 ~~-~~G~~V~l~V~R~G~~~~~~ 361 (918)
.. ..++.+.+++.|+|+...+.
T Consensus 67 ~~~~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 67 AELKPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred HhcCCCCEEEEEEEECCEEEEee
Confidence 65 45889999999999776543
No 23
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.79 E-value=3.3e-08 Score=88.39 Aligned_cols=71 Identities=30% Similarity=0.374 Sum_probs=59.4
Q ss_pred ecceeecccchhhhcccCCCCCcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeec
Q 002474 378 VSGAVIHPLSYQQARNFRFPCGLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 378 ~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
++|+.+..... ..|++|.+ ++|||+++||++||+|++|||+++.++.+|.+.+...+.|+.|+|+++|-
T Consensus 2 ~lGv~~~~~~~---------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~ 72 (82)
T PF13180_consen 2 GLGVTVQNLSD---------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRD 72 (82)
T ss_dssp E-SEEEEECSC---------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEET
T ss_pred EECeEEEEccC---------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEEC
Confidence 45666655443 24788884 99999999999999999999999999999999999999999999999995
Q ss_pred cc
Q 002474 456 TD 457 (918)
Q Consensus 456 ~~ 457 (918)
+.
T Consensus 73 g~ 74 (82)
T PF13180_consen 73 GE 74 (82)
T ss_dssp TE
T ss_pred CE
Confidence 53
No 24
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.77 E-value=5.4e-08 Score=86.27 Aligned_cols=70 Identities=24% Similarity=0.440 Sum_probs=62.2
Q ss_pred CCceEEeEecCCCccccCCCCCCEEEEECCEEecCHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEEeecC
Q 002474 297 TGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQDL 367 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~l~~~ 367 (918)
.|++| ..|.++|||+.+|++||+|++|||+++.+|.++..++.. ..|+.+.+++.|+|+.+.+++++..+
T Consensus 8 ~Gv~V-~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~~~ 78 (79)
T cd00986 8 HGVYV-TSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLILKTF 78 (79)
T ss_pred cCEEE-EEECCCCchhhCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEEecc
Confidence 56665 699999999889999999999999999999999998864 67889999999999999999888654
No 25
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.77 E-value=4.6e-08 Score=86.97 Aligned_cols=68 Identities=24% Similarity=0.324 Sum_probs=59.8
Q ss_pred CCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEE
Q 002474 295 GETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLV 363 (918)
Q Consensus 295 ~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~ 363 (918)
...|++| ..|.++|||++ ||++||+|++|||+++.+|.++...+.. .+|+.+.+++.|+|+..+++++
T Consensus 8 ~~~Gv~V-~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 8 AVAGVVI-VGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred cCCcEEE-EEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 3467666 69999999999 9999999999999999999999998865 4688999999999998887664
No 26
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.76 E-value=2e-07 Score=96.43 Aligned_cols=177 Identities=21% Similarity=0.265 Sum_probs=109.9
Q ss_pred CceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEec-------CC--eEEeEEEEEecC--
Q 002474 47 PAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFV-------NR--EEIPVYPIYRDP-- 115 (918)
Q Consensus 47 ~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~-------dg--~~~~a~vv~~Dp-- 115 (918)
|.+|.|.... ....++|++|+++ +|||++||+.. ...+.+.+. ++ ..+...-+..++
T Consensus 13 p~~v~i~~~~---------~~~~C~G~li~~~--~vLTaahC~~~-~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~ 80 (220)
T PF00089_consen 13 PWVVSIRYSN---------GRFFCTGTLISPR--WVLTAAHCVDG-ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKY 80 (220)
T ss_dssp TTEEEEEETT---------TEEEEEEEEEETT--EEEEEGGGHTS-GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTS
T ss_pred CeEEEEeeCC---------CCeeEeEEecccc--ccccccccccc-cccccccccccccccccccccccccccccccccc
Confidence 6677777631 1577999999964 99999999995 334444332 22 345555544433
Q ss_pred -----CCcEEEEEECCCCcccccccCCCCCC--cccCCCCEEEEEecCCCCC----ceEEEeEEEeecCC--CCCCCCCC
Q 002474 116 -----VHDFGFFRYDPSAIQFLNYDEIPLAP--EAACVGLEIRVVGNDSGEK----VSILAGTLARLDRD--APHYKKDG 182 (918)
Q Consensus 116 -----~~DlAlLkvd~~~l~~~~l~~l~l~~--~~l~vG~~V~vvG~p~g~~----~svt~G~Vs~~~r~--~p~~~~~~ 182 (918)
.+|+|||+++..-.....+.++.+.. ..++.|+.+.++|++.... ..+....+..+.+. ...+...
T Consensus 81 ~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~- 159 (220)
T PF00089_consen 81 DPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDN- 159 (220)
T ss_dssp BTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTT-
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc-
Confidence 46999999986521112334444443 4568999999999987532 23443333333221 0001111
Q ss_pred ccccceeEEEEee----cCCCCCCCccEEcccceEEEeccccCCCC---CcccccchhhHHHHH
Q 002474 183 YNDFNTFYMQAAS----GTKGGSSGSPVIDWQGRAVALNAGSKSSS---ASAFFLPLERVVRAL 239 (918)
Q Consensus 183 ~~dfn~~~Iq~da----~i~~G~SGGPvvn~dG~VVGI~~~~~~~~---~~~faIPi~~i~~~L 239 (918)
+....+.+.. ....|+|||||++.+++++||.+.+.... ...++.+++..++++
T Consensus 160 ---~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 160 ---LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp ---STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ---cccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 1223456555 77889999999998889999998864332 236778887766653
No 27
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.60 E-value=3e-07 Score=81.32 Aligned_cols=65 Identities=28% Similarity=0.260 Sum_probs=53.3
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEEee
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
.+++ |..|.++|||++ ||++||+|++|||+++.+|.++ +.. ..++.+.+++.|+|+..++.+++.
T Consensus 12 ~~~~-V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~---l~~~~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 12 GLGK-VTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDR---LKEYQAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred CcEE-EEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHH---HHhcCCCCEEEEEEEECCEEEEEEEEec
Confidence 3445 479999999999 9999999999999999986554 433 467889999999999888877654
No 28
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.60 E-value=2.1e-07 Score=81.97 Aligned_cols=63 Identities=33% Similarity=0.588 Sum_probs=55.4
Q ss_pred eEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEE
Q 002474 300 LVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNL 362 (918)
Q Consensus 300 lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I 362 (918)
++|..|.++|||++ ||++||+|++|||+++.++.++...+....++.+.+++.|+|+..++.+
T Consensus 14 ~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~~~~~~~~l 77 (79)
T cd00989 14 PVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGETITLTL 77 (79)
T ss_pred cEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEECCEEEEEEe
Confidence 34579999999999 9999999999999999999999988866567889999999998776655
No 29
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.55 E-value=3.6e-07 Score=82.30 Aligned_cols=78 Identities=23% Similarity=0.304 Sum_probs=65.6
Q ss_pred ecceeecccchhhhcccCCCC-CcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEee
Q 002474 378 VSGAVIHPLSYQQARNFRFPC-GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSS 454 (918)
Q Consensus 378 ~~Gl~~~~ls~~~~~~~gl~~-~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~ 454 (918)
++|+.+++++.+..+.+.++. .|++|.. +++||+++||++||+|++|||+++.++.++.+++.....++.+.+++.|
T Consensus 2 ~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~i~l~v~r 81 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDKVTLTVLR 81 (90)
T ss_pred ccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEE
Confidence 578889888877666555543 5888884 9999999999999999999999999999999999987767788887765
Q ss_pred c
Q 002474 455 Y 455 (918)
Q Consensus 455 ~ 455 (918)
-
T Consensus 82 ~ 82 (90)
T cd00987 82 G 82 (90)
T ss_pred C
Confidence 3
No 30
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.54 E-value=9.9e-07 Score=95.87 Aligned_cols=98 Identities=17% Similarity=0.120 Sum_probs=80.1
Q ss_pred hhHHHHHHHHHhcCCCccccccccccCCCccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCccc
Q 002474 233 ERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAH 312 (918)
Q Consensus 233 ~~i~~~L~~l~~g~~~~~~~~~~~~v~rg~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~ 312 (918)
..++++++++.+.. .+-++++|+.-.... | ...|.++ ..+.+++||+
T Consensus 159 ~~~~~v~~~l~~~g----------~~~~~~lgi~p~~~~-------g---------------~~~G~~v-~~v~~~s~a~ 205 (259)
T TIGR01713 159 VVSRRIIEELTKDP----------QKMFDYIRLSPVMKN-------D---------------KLEGYRL-NPGKDPSLFY 205 (259)
T ss_pred hhHHHHHHHHHHCH----------HhhhheEeEEEEEeC-------C---------------ceeEEEE-EecCCCCHHH
Confidence 45677788888766 677888888764321 1 2357666 6899999999
Q ss_pred c-CCCCCCEEEEECCEEecCHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEE
Q 002474 313 L-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLV 363 (918)
Q Consensus 313 ~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~ 363 (918)
+ ||++||+|++|||+++.++.++.+++.. ..++.++|+|+|+|+.+++.+.
T Consensus 206 ~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~ 258 (259)
T TIGR01713 206 KSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYVR 258 (259)
T ss_pred HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEE
Confidence 9 9999999999999999999999998865 6778999999999998888765
No 31
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.52 E-value=5.3e-07 Score=80.57 Aligned_cols=66 Identities=32% Similarity=0.617 Sum_probs=56.6
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCH--HHHHHHHhccCCCeEEEEEEEC-CeEEEEEEE
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERG-GISMTVNLV 363 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~--~~l~~~L~~~~G~~V~l~V~R~-G~~~~~~I~ 363 (918)
.+++ |..|.+++||++ ||++||+|++|||+++.++ .++..++....++.+.+++.|+ |+..++++.
T Consensus 13 ~~~~-V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~ 82 (85)
T cd00988 13 GGLV-ITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLT 82 (85)
T ss_pred CeEE-EEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEE
Confidence 4445 479999999999 9999999999999999999 8888887666788999999998 877776654
No 32
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=98.48 E-value=5e-07 Score=103.22 Aligned_cols=62 Identities=26% Similarity=0.302 Sum_probs=55.8
Q ss_pred cEEEE--cCCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccccccc
Q 002474 400 LVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTDRHRR 461 (918)
Q Consensus 400 GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~~~~~ 461 (918)
|+||+ ..|+||+..||+.||.|+.||.++..|+ ++.+..|-.+|+|+.|+|..++-.|.+++
T Consensus 430 GIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~k~Dvyr~ 495 (1027)
T KOG3580|consen 430 GIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQSKADVYRD 495 (1027)
T ss_pred eEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhhhhHHHHH
Confidence 88998 4899999999999999999999999998 78889999999999999998887775443
No 33
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.44 E-value=4.1e-06 Score=87.29 Aligned_cols=162 Identities=17% Similarity=0.176 Sum_probs=93.6
Q ss_pred CCceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCC-cEEEEEecC---------CeEEeEEEEEecC
Q 002474 46 VPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGP-VVAEAMFVN---------REEIPVYPIYRDP 115 (918)
Q Consensus 46 ~~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~-~~i~v~f~d---------g~~~~a~vv~~Dp 115 (918)
.|.+|.|.... ....++|.+|++ .+|||++|++.... ....+.+.. ...+.++-+..+|
T Consensus 12 ~Pw~v~i~~~~---------~~~~C~GtlIs~--~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp 80 (232)
T cd00190 12 FPWQVSLQYTG---------GRHFCGGSLISP--RWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHP 80 (232)
T ss_pred CCCEEEEEccC---------CcEEEEEEEeeC--CEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECC
Confidence 46677776531 356799999996 59999999998531 234444321 2334455555554
Q ss_pred -------CCcEEEEEECCCCcccccccCCCCCCc--ccCCCCEEEEEecCCCCCc-----eEEEeEEEeecC--CCCCCC
Q 002474 116 -------VHDFGFFRYDPSAIQFLNYDEIPLAPE--AACVGLEIRVVGNDSGEKV-----SILAGTLARLDR--DAPHYK 179 (918)
Q Consensus 116 -------~~DlAlLkvd~~~l~~~~l~~l~l~~~--~l~vG~~V~vvG~p~g~~~-----svt~G~Vs~~~r--~~p~~~ 179 (918)
.+|+|||+++...-....+.++.|... .+..|+.+.++|+...... ......+..+.+ ....+.
T Consensus 81 ~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~ 160 (232)
T cd00190 81 NYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYS 160 (232)
T ss_pred CCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhcc
Confidence 479999999843211112455555544 6788999999998654321 122222222221 111111
Q ss_pred C-CCccccceeEEEE-----eecCCCCCCCccEEccc---ceEEEeccccC
Q 002474 180 K-DGYNDFNTFYMQA-----ASGTKGGSSGSPVIDWQ---GRAVALNAGSK 221 (918)
Q Consensus 180 ~-~~~~dfn~~~Iq~-----da~i~~G~SGGPvvn~d---G~VVGI~~~~~ 221 (918)
. .... ...+-. ......|.|||||+... ++++||.+.+.
T Consensus 161 ~~~~~~---~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 161 YGGTIT---DNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred CcccCC---CceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 0 0000 111111 23355699999999865 89999998754
No 34
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.34 E-value=2.5e-06 Score=75.84 Aligned_cols=58 Identities=21% Similarity=0.224 Sum_probs=51.3
Q ss_pred CcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
.|++|.. ++|||+++||++||+|++|||+++.+|++|.+.+.....|+.+.+++.|-+
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g 69 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPST 69 (79)
T ss_pred CcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECC
Confidence 4788874 899999999999999999999999999999999998766778888888643
No 35
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.33 E-value=1.2e-06 Score=102.04 Aligned_cols=150 Identities=17% Similarity=0.221 Sum_probs=89.8
Q ss_pred EeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHHHH-HhccCCCeEEEEEEECCeEEE----------EEEEee---
Q 002474 302 VDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIERGGISMT----------VNLVVQ--- 365 (918)
Q Consensus 302 V~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~~~-L~~~~G~~V~l~V~R~G~~~~----------~~I~l~--- 365 (918)
|..|.++|||++ .|+.||+|++|||+.|.+..+-+-+ |.+.+|-+|+|+|.-..+.-. ..++..
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKdaGlsVtLtIip~ee~~~~~~~~sa~~~s~~t~~~~~ 861 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKDAGLSVTLTIIPPEEAGPPTSMTSAEKQSPFTQNGPY 861 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHhcCceEEEEEcChhccCCCCCCcchhhcCcccccCCH
Confidence 478999999999 6999999999999999998876644 345678899999874322110 001000
Q ss_pred -cCCCCCCCc---------eeeecceeecccch-----------hhhcccCCCC-------CcEEEE--cCCChhHHcC-
Q 002474 366 -DLHSITPDY---------FLEVSGAVIHPLSY-----------QQARNFRFPC-------GLVYVA--EPGYMLFRAG- 414 (918)
Q Consensus 366 -~~~~~t~~~---------~v~~~Gl~~~~ls~-----------~~~~~~gl~~-------~GV~Vs--~pgspA~~AG- 414 (918)
..-.....+ ...+.|..+.+... .-++.|||+. .+.||- ..++||.+.|
T Consensus 862 ~q~~glp~~~~s~~~~~pqpdt~~~~~~~~r~~qn~~~~~VelErG~kGFGFSiRGGreynM~LfVLRlAeDGPA~rdGr 941 (984)
T KOG3209|consen 862 EQQYGLPGPRPSVYEEHPQPDTFQGLSINDRMSQNGDLYTVELERGAKGFGFSIRGGREYNMDLFVLRLAEDGPAIRDGR 941 (984)
T ss_pred hHccCCCCCCccccccCCCCccccceeccccccccCCeeEEEeeccccccceEeecccccccceEEEEeccCCCccccCc
Confidence 000000000 00111111111110 1122344433 257777 4789999998
Q ss_pred CCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEE
Q 002474 415 VPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEY 452 (918)
Q Consensus 415 Lk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~ 452 (918)
++.||.|++|||+++.++ ....+++-|+.|.++.+.+
T Consensus 942 m~VGDqi~eINGesTkgm-tH~rAIelIk~gg~~vll~ 978 (984)
T KOG3209|consen 942 MRVGDQITEINGESTKGM-THDRAIELIKQGGRRVLLL 978 (984)
T ss_pred eeecceEEEecCcccCCC-cHHHHHHHHHhCCeEEEEE
Confidence 999999999999999988 4444444444444444333
No 36
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=98.30 E-value=7.8e-06 Score=76.80 Aligned_cols=55 Identities=31% Similarity=0.503 Sum_probs=47.0
Q ss_pred eEEEEEeccCCCcEEEEecccccC-------CCccEEEEeeeCceeee--eEEEEeecc-cceEEEEec
Q 002474 652 GTGVIIYHSQSMGLVVVDKNTVAI-------SASDVMLSFAAFPIEIP--GEVVFLHPV-HNFALIAYD 710 (918)
Q Consensus 652 G~G~Vvd~~~~~GlV~v~r~~V~~-------~~~di~vtfa~~~~~vp--~~vvflhp~-~n~aiv~yd 710 (918)
|||++|+. +|+|||++|+|.. ....+.+.+.+ .-..+ +++++.++. +|+|||+.+
T Consensus 1 GTGf~i~~---~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~D~All~v~ 65 (120)
T PF13365_consen 1 GTGFLIGP---DGYILTAAHVVEDWNDGKQPDNSSVEVVFPD-GRRVPPVAEVVYFDPDDYDLALLKVD 65 (120)
T ss_dssp EEEEEEET---TTEEEEEHHHHTCCTT--G-TCSEEEEEETT-SCEEETEEEEEEEETT-TTEEEEEES
T ss_pred CEEEEEcC---CceEEEchhheecccccccCCCCEEEEEecC-CCEEeeeEEEEEECCccccEEEEEEe
Confidence 89999993 6799999999974 33567777776 66688 999999999 999999999
No 37
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=98.24 E-value=3.2e-05 Score=80.89 Aligned_cols=164 Identities=16% Similarity=0.156 Sum_probs=93.5
Q ss_pred CCceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCC-cEEEEEecCC--------eEEeEEEEEec--
Q 002474 46 VPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGP-VVAEAMFVNR--------EEIPVYPIYRD-- 114 (918)
Q Consensus 46 ~~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~-~~i~v~f~dg--------~~~~a~vv~~D-- 114 (918)
-|-+|.|.... ....++|.+|++ .+|||++|++.... ....+.+... ..+.+.-+..+
T Consensus 13 ~Pw~~~i~~~~---------~~~~C~GtlIs~--~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~ 81 (229)
T smart00020 13 FPWQVSLQYRG---------GRHFCGGSLISP--RWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPN 81 (229)
T ss_pred CCcEEEEEEcC---------CCcEEEEEEecC--CEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCC
Confidence 35566665421 357799999995 59999999999542 3455655432 33445545544
Q ss_pred -----CCCcEEEEEECCCC-cccccccCCCCCC--cccCCCCEEEEEecCCCCC------ceEEEeEEEeecC--CCCCC
Q 002474 115 -----PVHDFGFFRYDPSA-IQFLNYDEIPLAP--EAACVGLEIRVVGNDSGEK------VSILAGTLARLDR--DAPHY 178 (918)
Q Consensus 115 -----p~~DlAlLkvd~~~-l~~~~l~~l~l~~--~~l~vG~~V~vvG~p~g~~------~svt~G~Vs~~~r--~~p~~ 178 (918)
..+|+|||+++..- +.. .+.++.|.. ..+..|+.+.+.|+..... .......+..+.+ ....+
T Consensus 82 ~~~~~~~~DiAll~L~~~i~~~~-~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~ 160 (229)
T smart00020 82 YNPSTYDNDIALLKLKSPVTLSD-NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAY 160 (229)
T ss_pred CCCCCCcCCEEEEEECcccCCCC-ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhh
Confidence 35799999997531 111 344544443 3677899999999876542 1222222222221 11111
Q ss_pred CCC-CccccceeEEE--EeecCCCCCCCccEEcccc--eEEEeccccC
Q 002474 179 KKD-GYNDFNTFYMQ--AASGTKGGSSGSPVIDWQG--RAVALNAGSK 221 (918)
Q Consensus 179 ~~~-~~~dfn~~~Iq--~da~i~~G~SGGPvvn~dG--~VVGI~~~~~ 221 (918)
... ........... ....+.+|.||||++...+ +++||.+.+.
T Consensus 161 ~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 161 SGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred ccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 100 01110000111 1244567999999998664 9999988754
No 38
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.21 E-value=1.3e-05 Score=93.88 Aligned_cols=147 Identities=22% Similarity=0.246 Sum_probs=83.1
Q ss_pred eEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHH--HHHh-ccCCCeEEEEEEECCeEEEE-EEEeecCCCCCCC
Q 002474 300 LVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLE--TLLD-DGVDKNIELLIERGGISMTV-NLVVQDLHSITPD 373 (918)
Q Consensus 300 lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~--~~L~-~~~G~~V~l~V~R~G~~~~~-~I~l~~~~~~t~~ 373 (918)
+.|..|.+.+.|++ .|++||.|+.|||.+|..-.+-+ .++. ....+.|.|+|.|.-..-.- .-+........+.
T Consensus 676 i~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~vv~Lm~~AArnghV~LtVRRkv~~~~~~rsp~~s~~~~~~y 755 (984)
T KOG3209|consen 676 IYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSEVVDLMEAAARNGHVNLTVRRKVRTGPARRSPRNSAAPSGPY 755 (984)
T ss_pred eEEeeeeecccccccCcccCCCeEEEecCeeccCccHHHHHHHHHHHHhcCceEEEEeeeeeeccccCCcccccCCCCCe
Confidence 34489999999999 59999999999999998665543 3342 23446788988872110000 0000000000011
Q ss_pred cee----eecceeecccchhhhcccCCCCCcEEEEcCCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCC
Q 002474 374 YFL----EVSGAVIHPLSYQQARNFRFPCGLVYVAEPGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGA 446 (918)
Q Consensus 374 ~~v----~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~ 446 (918)
+.+ +--|+.|.-++-+-. |..||=---+||||++.| |+.||+|++|||+.|-++ .+.++++|. .|-
T Consensus 756 DV~lhR~ENeGFGFVi~sS~~k-----p~sgiGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGl 828 (984)
T KOG3209|consen 756 DVVLHRKENEGFGFVIMSSQNK-----PESGIGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGL 828 (984)
T ss_pred eeEEecccCCceeEEEEecccC-----CCCCccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh--cCc
Confidence 100 111222222221111 111210002899999998 999999999999999988 355555555 234
Q ss_pred eEEEEEe
Q 002474 447 RVPIEYS 453 (918)
Q Consensus 447 rV~l~~~ 453 (918)
.|+|++.
T Consensus 829 sVtLtIi 835 (984)
T KOG3209|consen 829 SVTLTII 835 (984)
T ss_pred eEEEEEc
Confidence 4444443
No 39
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.18 E-value=1.1e-05 Score=71.49 Aligned_cols=57 Identities=19% Similarity=0.207 Sum_probs=49.8
Q ss_pred CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 399 GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 399 ~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
.|++|. .++|||+. ||++||+|++|||+++.+++++.+.++..+.|..+.|++.|-+
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g 66 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREE 66 (79)
T ss_pred cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECC
Confidence 367777 38999987 8999999999999999999999999998777888999998743
No 40
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.17 E-value=7.2e-06 Score=70.44 Aligned_cols=53 Identities=36% Similarity=0.619 Sum_probs=46.9
Q ss_pred CceEEeEecCCCcccc-CCCCCCEEEEECCEEecCH--HHHHHHHhccCCCeEEEEE
Q 002474 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLI 351 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~--~~l~~~L~~~~G~~V~l~V 351 (918)
+++| ..|.+++||+. ||++||+|++|||+++.++ .++..++....|++++|++
T Consensus 14 ~~~V-~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 14 GVVV-LSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CEEE-EEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 5555 79999999999 9999999999999999999 8888888776688888876
No 41
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.11 E-value=8.4e-06 Score=94.79 Aligned_cols=67 Identities=22% Similarity=0.504 Sum_probs=60.9
Q ss_pred CceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEee
Q 002474 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
+..| ..|.++|||++ ||++||+|++|||+++.+|.++.+.+....++++.++++|+|+..++++++.
T Consensus 204 g~vV-~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~~ 271 (420)
T TIGR00054 204 EPVL-SDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERNGETLSISLTPE 271 (420)
T ss_pred CcEE-EEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEECCEEEEEEEEEc
Confidence 4554 79999999999 9999999999999999999999999977788889999999999988888774
No 42
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.05 E-value=6.2e-05 Score=80.85 Aligned_cols=158 Identities=16% Similarity=0.126 Sum_probs=90.9
Q ss_pred CceEEEEEEeeeccCCC-----CCCCceEEEEEEeCCCcEEEEcCcccCCCCcE-EEE-Eec-----CCe---EEeEEEE
Q 002474 47 PAVVVLRTTACRAFDTE-----AAGASYATGFVVDKRRGIILTNRHVVKPGPVV-AEA-MFV-----NRE---EIPVYPI 111 (918)
Q Consensus 47 ~SVV~I~~~~~~~fd~~-----~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~-i~v-~f~-----dg~---~~~a~vv 111 (918)
..+..|.-+...+|... ..+...+++|+|.++ .+||++||+-..... ..+ .+. ++. .+.....
T Consensus 38 d~r~~V~dt~~~Py~av~~~~~~tG~~~~~~~lI~pn--tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~ 115 (251)
T COG3591 38 DDRTQVTDTTQFPYSAVVQFEAATGRLCTAATLIGPN--TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEI 115 (251)
T ss_pred CCeeecccCCCCCcceeEEeecCCCcceeeEEEEcCc--eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEE
Confidence 45555554443333321 112233455999974 999999999854321 111 111 111 1222222
Q ss_pred Eec-C---CCcEEEEEECCCCcc----cc---cccCCCCCCcccCCCCEEEEEecCCCCCceE----EEeEEEeecCCCC
Q 002474 112 YRD-P---VHDFGFFRYDPSAIQ----FL---NYDEIPLAPEAACVGLEIRVVGNDSGEKVSI----LAGTLARLDRDAP 176 (918)
Q Consensus 112 ~~D-p---~~DlAlLkvd~~~l~----~~---~l~~l~l~~~~l~vG~~V~vvG~p~g~~~sv----t~G~Vs~~~r~~p 176 (918)
... . ..|.+...+.+..+. .. .+...++ ....+.++.+.++|||.+....- ..+.+....
T Consensus 116 ~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~-~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~---- 190 (251)
T COG3591 116 RVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNT-ASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK---- 190 (251)
T ss_pred EecCCceeccCCceeeccHHHhccCCCcccccccccccc-ccccccCceeEEEeccCCCCcceeEeeecceeEEEe----
Confidence 122 2 346666666543222 11 1112222 35678999999999998765332 233333331
Q ss_pred CCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccccCC
Q 002474 177 HYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKS 222 (918)
Q Consensus 177 ~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~ 222 (918)
..+++.++.+.+|+||+||++.+.++||+++.+..
T Consensus 191 -----------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~ 225 (251)
T COG3591 191 -----------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNGPG 225 (251)
T ss_pred -----------cceEEEEecccCCCCCCceEecCceEEEEEecCCC
Confidence 22589999999999999999999999999987655
No 43
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.05 E-value=1.3e-05 Score=94.01 Aligned_cols=65 Identities=34% Similarity=0.540 Sum_probs=59.7
Q ss_pred EEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEee
Q 002474 301 VVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 301 VV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
+|..|.++|||++ ||++||+|++|||+++.+|.++.+.+....++.+.+++.|+|+..++++++.
T Consensus 224 vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 224 VLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred EEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEECCEEEEEEEEee
Confidence 4479999999999 9999999999999999999999999877788899999999999988888775
No 44
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.01 E-value=5.6e-05 Score=82.35 Aligned_cols=58 Identities=10% Similarity=0.179 Sum_probs=53.3
Q ss_pred CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 399 GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 399 ~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
.|+.|. .++++|+++||++||+|++|||+++.+++++.+++.+++.+..+.|++.|-+
T Consensus 191 ~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G 250 (259)
T TIGR01713 191 EGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDG 250 (259)
T ss_pred eEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECC
Confidence 588888 4899999999999999999999999999999999999988889999998754
No 45
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.00 E-value=1.8e-05 Score=68.04 Aligned_cols=52 Identities=31% Similarity=0.439 Sum_probs=46.8
Q ss_pred cEEEEc--CCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEE
Q 002474 400 LVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEY 452 (918)
Q Consensus 400 GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~ 452 (918)
+++|.. +++||+.+||++||+|++|||+++.++ +++.+.++..+ |+.|+|++
T Consensus 14 ~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~-g~~v~l~v 69 (70)
T cd00136 14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEV-GEKVTLTV 69 (70)
T ss_pred CEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCC-CCeEEEEE
Confidence 677774 899999999999999999999999999 99999999966 78888776
No 46
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.98 E-value=5.3e-05 Score=66.56 Aligned_cols=54 Identities=20% Similarity=0.324 Sum_probs=46.8
Q ss_pred EEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeec
Q 002474 401 VYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 401 V~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
+.|+ .++|+|+++||++||+|++|||+++.+++++...++... +..+.+++.|-
T Consensus 14 ~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~-~~~~~l~v~r~ 69 (79)
T cd00989 14 PVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENP-GKPLTLTVERN 69 (79)
T ss_pred cEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCC-CceEEEEEEEC
Confidence 5565 489999999999999999999999999999999998854 66788888763
No 47
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.90 E-value=0.00019 Score=76.45 Aligned_cols=165 Identities=20% Similarity=0.244 Sum_probs=91.0
Q ss_pred HHhCCceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEE-----EEEecCCC
Q 002474 43 NKVVPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVY-----PIYRDPVH 117 (918)
Q Consensus 43 ekv~~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~-----vv~~Dp~~ 117 (918)
.-+...|+.|.... ......=-||.... +|+||+|........+.+...-|. |... -+..=+..
T Consensus 14 n~Ia~~ic~l~n~s-------~~~~~~l~gigyG~---~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~i~~~ 82 (235)
T PF00863_consen 14 NPIASNICRLTNES-------DGGTRSLYGIGYGS---YIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHPIEGR 82 (235)
T ss_dssp HHHHTTEEEEEEEE-------TTEEEEEEEEEETT---EEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE-TCS
T ss_pred chhhheEEEEEEEe-------CCCeEEEEEEeECC---EEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEEeCCc
Confidence 34556777777542 11223335777765 999999999866666777765553 3322 24445678
Q ss_pred cEEEEEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecC
Q 002474 118 DFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGT 197 (918)
Q Consensus 118 DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i 197 (918)
|+.++|+. +++|+.+- .+.-..++.||+|++||.-+..+...+ .||......|. . +..+...-..+
T Consensus 83 DiviirmP-kDfpPf~~---kl~FR~P~~~e~v~mVg~~fq~k~~~s--~vSesS~i~p~----~----~~~fWkHwIsT 148 (235)
T PF00863_consen 83 DIVIIRMP-KDFPPFPQ---KLKFRAPKEGERVCMVGSNFQEKSISS--TVSESSWIYPE----E----NSHFWKHWIST 148 (235)
T ss_dssp SEEEEE---TTS----S------B----TT-EEEEEEEECSSCCCEE--EEEEEEEEEEE----T----TTTEEEE-C--
T ss_pred cEEEEeCC-cccCCcch---hhhccCCCCCCEEEEEEEEEEcCCeeE--EECCceEEeec----C----CCCeeEEEecC
Confidence 99999995 44443211 111245789999999998665543222 22322222221 1 12247777888
Q ss_pred CCCCCCccEEc-ccceEEEeccccCCCCCcccccch
Q 002474 198 KGGSSGSPVID-WQGRAVALNAGSKSSSASAFFLPL 232 (918)
Q Consensus 198 ~~G~SGGPvvn-~dG~VVGI~~~~~~~~~~~faIPi 232 (918)
..|+=|+|+++ .||++|||++........+|+.|+
T Consensus 149 k~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f 184 (235)
T PF00863_consen 149 KDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPF 184 (235)
T ss_dssp -TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE-
T ss_pred CCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcC
Confidence 99999999998 689999999987777788888666
No 48
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.90 E-value=5.3e-05 Score=67.60 Aligned_cols=56 Identities=27% Similarity=0.424 Sum_probs=48.2
Q ss_pred CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeec
Q 002474 399 GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 399 ~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
.+++|. .+++||+++||++||+|++|||+++.++ +++.+.++.. .++.+.+++.|-
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~-~~~~i~l~v~r~ 72 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGK-AGTKVRLTLKRG 72 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCC-CCCEEEEEEEcC
Confidence 367777 4899999999999999999999999999 9999988774 467888888765
No 49
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.84 E-value=5.3e-05 Score=67.35 Aligned_cols=55 Identities=29% Similarity=0.527 Sum_probs=43.1
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHH-hccCCCeEEEEEE
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLIE 352 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L-~~~~G~~V~l~V~ 352 (918)
.+++| ..|.++|||++ ||++||+|++|||+.+.++...+... ....++.++|+|+
T Consensus 25 ~~~~V-~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~~v~L~V~ 81 (81)
T PF00595_consen 25 KGVFV-SSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASNPVTLTVQ 81 (81)
T ss_dssp EEEEE-EEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTSEEEEEEE
T ss_pred CCEEE-EEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCCcEEEEEC
Confidence 46666 79999999999 99999999999999999886555432 2333448888774
No 50
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.80 E-value=0.00013 Score=66.48 Aligned_cols=60 Identities=23% Similarity=0.367 Sum_probs=45.5
Q ss_pred CCceEEeEecCC--------Ccccc-C--CCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCe
Q 002474 297 TGLLVVDSVVPG--------GPAHL-R--LEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGI 356 (918)
Q Consensus 297 ~G~lVV~~V~p~--------spA~~-G--Lq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~ 356 (918)
.+.+.+..|.++ ||..+ | +++||.|++|||+++..-.++..+|....|+.+.|+|.+.+.
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~ 81 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPG 81 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCC
Confidence 466777888886 77777 5 569999999999999999999999999999999999999764
No 51
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.79 E-value=5.3e-05 Score=85.60 Aligned_cols=69 Identities=22% Similarity=0.409 Sum_probs=56.6
Q ss_pred CceEEeEecCCCcccc-CCCCCCEEEEECCEEecCH--HHHHHHHhccCCCeEEEEEEECCeEEEEEEEeecC
Q 002474 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERGGISMTVNLVVQDL 367 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~--~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~~~ 367 (918)
+++ |..|.++|||++ ||++||+|++|||+++.+| .++...+....|+++.+++.|+|+...+++++...
T Consensus 63 ~~~-V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~ 134 (334)
T TIGR00225 63 EIV-IVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRAGKSKPLTFTLKRD 134 (334)
T ss_pred EEE-EEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeCCCCceEEEEEEEE
Confidence 444 479999999999 9999999999999999987 45555665667889999999998877777766543
No 52
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.77 E-value=0.00011 Score=64.91 Aligned_cols=54 Identities=19% Similarity=0.157 Sum_probs=43.5
Q ss_pred CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeec
Q 002474 399 GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 399 ~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
.++.|. .++|+|+++||++||+|++|||+++.+|. ++++..+.+..+.+++.|-
T Consensus 12 ~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~---~~l~~~~~~~~v~l~v~r~ 67 (80)
T cd00990 12 GLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQ---DRLKEYQAGDPVELTVFRD 67 (80)
T ss_pred CcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHH---HHHHhcCCCCEEEEEEEEC
Confidence 357777 49999999999999999999999999854 4556655667888888763
No 53
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.77 E-value=6.8e-05 Score=66.31 Aligned_cols=58 Identities=38% Similarity=0.615 Sum_probs=47.2
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHh-ccCCCeEEEEEEECC
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIERGG 355 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~-~~~G~~V~l~V~R~G 355 (918)
.+++| ..|.+++||++ ||++||+|++|||+.+.++.+...... ...+..+.+++.|++
T Consensus 26 ~~~~i-~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVV-SSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred CCEEE-EEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEeCC
Confidence 46555 79999999999 999999999999999998877665542 334568899988864
No 54
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.71 E-value=0.00012 Score=83.72 Aligned_cols=69 Identities=28% Similarity=0.455 Sum_probs=58.8
Q ss_pred CCceEEeE--ec-----CCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEee
Q 002474 297 TGLLVVDS--VV-----PGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 297 ~G~lVV~~--V~-----p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
.|++|+.. |. .++||++ ||++||+|++|||+++.+|.++.+++....++.+.+++.|+|+..++.+++.
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~tv~V~Pv 181 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKIIETVIKPV 181 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEEEEEEEEe
Confidence 68888743 21 2589999 9999999999999999999999999977668899999999999888887643
No 55
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.69 E-value=9e-05 Score=85.21 Aligned_cols=60 Identities=30% Similarity=0.500 Sum_probs=51.5
Q ss_pred eEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEE-ECCeEEEEEEEee
Q 002474 303 DSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIE-RGGISMTVNLVVQ 365 (918)
Q Consensus 303 ~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~-R~G~~~~~~I~l~ 365 (918)
..|.|+|+|++ ||++||+|++|||+++.+|.++...+. ++.+.++|. |+|+..++++...
T Consensus 3 ~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 3 SAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred CCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcCCCeEEEEEEecC
Confidence 57999999999 999999999999999999999887773 357889887 7888777776653
No 56
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.65 E-value=0.00015 Score=83.74 Aligned_cols=66 Identities=30% Similarity=0.527 Sum_probs=54.7
Q ss_pred CceEEeEecCCCcccc-CCCCCCEEEEECCEEecCH--HHHHHHHhccCCCeEEEEEEECCeEEEEEEEe
Q 002474 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERGGISMTVNLVV 364 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~--~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l 364 (918)
+++| ..|.++|||++ ||++||+|++|||+++.++ .++...+....|+.+.|+|.|+|+..+++++-
T Consensus 103 g~~V-~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 103 GLVV-VAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred cEEE-EEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 5555 69999999999 9999999999999999864 56666666667889999999999877666553
No 57
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.64 E-value=0.00014 Score=64.10 Aligned_cols=53 Identities=34% Similarity=0.577 Sum_probs=43.4
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEec--CHHHHHHHHhccCCCeEEEEE
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVIT--QFLKLETLLDDGVDKNIELLI 351 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~--s~~~l~~~L~~~~G~~V~l~V 351 (918)
.+++| ..|.++|||++ ||++||+|++|||+++. ++.++...+....+ .+++++
T Consensus 26 ~~~~V-~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 26 GGIFV-SRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CCeEE-EEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 45555 79999999999 99999999999999999 88888888765433 666654
No 58
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.60 E-value=0.0002 Score=63.17 Aligned_cols=52 Identities=21% Similarity=0.336 Sum_probs=44.1
Q ss_pred CcEEEEc--CCChhHHcCCCCCCEEEEECCeecC--CHHHHHHHHHhcCCCCeEEEEE
Q 002474 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEIS--RLEDLISVLSKLSRGARVPIEY 452 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~--~l~efi~vl~~~~~g~rV~l~~ 452 (918)
.|++|.. +++||+++||++||+|++|||+++. +++++.+.++.... .++|++
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~--~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD--EVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC--eEEEEE
Confidence 4788884 8999999999999999999999999 89999999987443 555554
No 59
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.59 E-value=0.00019 Score=63.83 Aligned_cols=52 Identities=23% Similarity=0.388 Sum_probs=42.9
Q ss_pred CcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEE
Q 002474 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEY 452 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~ 452 (918)
.++||++ ++|+|+++||+.||+|++|||+++.++ ++..+.++..++ .|+|++
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~--~v~L~V 80 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN--PVTLTV 80 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS--EEEEEE
T ss_pred CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC--cEEEEE
Confidence 4799984 999999999999999999999999976 666777777443 666665
No 60
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.50 E-value=0.00046 Score=78.96 Aligned_cols=57 Identities=23% Similarity=0.266 Sum_probs=49.7
Q ss_pred CcEEEE---c-------CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeecc
Q 002474 399 GLVYVA---E-------PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (918)
Q Consensus 399 ~GV~Vs---~-------pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~ 456 (918)
.||+|. + .++||+.+||++||+|++|||+++.++++|.+++++.. ++.+.|++.|-.
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV~R~G 171 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAG-GEKLTLTIERGG 171 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEEEECC
Confidence 578775 1 36899999999999999999999999999999999975 788999998744
No 61
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.49 E-value=0.00076 Score=78.02 Aligned_cols=58 Identities=28% Similarity=0.369 Sum_probs=47.2
Q ss_pred ceEEeEecCCCccccCCCCCCEEEEECCEEecCHHHHHHHH-hccCCCeEEEEEEECCe
Q 002474 299 LLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLIERGGI 356 (918)
Q Consensus 299 ~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s~~~l~~~L-~~~~G~~V~l~V~R~G~ 356 (918)
-+||+.|+|++||+.-||.||.|+-|||....+..+...+. ....|+...|+|.|-.+
T Consensus 41 SiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkRprk 99 (1027)
T KOG3580|consen 41 SIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKRPRK 99 (1027)
T ss_pred eEEEeeccCCCCcccccccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecccce
Confidence 35558999999999999999999999999999887766554 35678888888877544
No 62
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.41 E-value=0.00047 Score=60.90 Aligned_cols=56 Identities=25% Similarity=0.297 Sum_probs=42.4
Q ss_pred CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEee
Q 002474 399 GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSS 454 (918)
Q Consensus 399 ~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~ 454 (918)
.|++|. .++++|+++||++||+|++|||+++.++.+..........+..+.|++.|
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r 83 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLR 83 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEe
Confidence 478888 48999999999999999999999999875554444433334466666654
No 63
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.27 E-value=0.00052 Score=80.37 Aligned_cols=116 Identities=21% Similarity=0.305 Sum_probs=75.8
Q ss_pred eEecCCCcccc--CCCCCCEEEEECCEEecCH--HHHHHHHhccCC-CeEEEEEEECCeEEEEEEEeecCCCCCCCceee
Q 002474 303 DSVVPGGPAHL--RLEPGDVLVRVNGEVITQF--LKLETLLDDGVD-KNIELLIERGGISMTVNLVVQDLHSITPDYFLE 377 (918)
Q Consensus 303 ~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~--~~l~~~L~~~~G-~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~ 377 (918)
.....++||++ .|-.||.|++|||..+-.. ..-+.++..... ..|+|+|.+=--..++.|.-.+
T Consensus 678 Anmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd----------- 746 (829)
T KOG3605|consen 678 ANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD----------- 746 (829)
T ss_pred HhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc-----------
Confidence 35667899999 6999999999999877653 333455544322 3577777653333333322111
Q ss_pred ecceeecccchhhhcccCCCC-CcEEEE-cCCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHh
Q 002474 378 VSGAVIHPLSYQQARNFRFPC-GLVYVA-EPGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSK 441 (918)
Q Consensus 378 ~~Gl~~~~ls~~~~~~~gl~~-~GV~Vs-~pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~ 441 (918)
....+||.+ .||+++ -.|+-|+|-|++.|-+|++|||+.|--. +-.+++|..
T Consensus 747 ------------~kyQLGFSVQNGiICSLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~ 802 (829)
T KOG3605|consen 747 ------------LRYQLGFSVQNGIICSLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSN 802 (829)
T ss_pred ------------chhhccceeeCcEeehhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHH
Confidence 111234433 578877 4899999999999999999999987422 345555554
No 64
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.22 E-value=0.00097 Score=75.42 Aligned_cols=68 Identities=19% Similarity=0.366 Sum_probs=53.7
Q ss_pred cEEEEc--CCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcC
Q 002474 400 LVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRH 471 (918)
Q Consensus 400 GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd 471 (918)
+++|.. ++|||+++||++||+|++|||+++.+| +++...++. +.|..+.|++.|-+. .....+++.|.
T Consensus 63 ~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g~~v~l~v~R~g~---~~~~~v~l~~~ 134 (334)
T TIGR00225 63 EIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRG-KKGTKVSLEILRAGK---SKPLTFTLKRD 134 (334)
T ss_pred EEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccC-CCCCEEEEEEEeCCC---CceEEEEEEEE
Confidence 566663 999999999999999999999999986 466666655 457899999987543 35667777773
No 65
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.19 E-value=0.0011 Score=76.84 Aligned_cols=65 Identities=29% Similarity=0.496 Sum_probs=50.9
Q ss_pred CceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHH--HHHHHHhccCCCeEEEEEEECC--eEEEEEE
Q 002474 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFL--KLETLLDDGVDKNIELLIERGG--ISMTVNL 362 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~--~l~~~L~~~~G~~V~l~V~R~G--~~~~~~I 362 (918)
+.+.|.++.+++||++ ||++||+|++|||+++.... +....+....|..++|++.|.+ +..++++
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l 181 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTL 181 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEE
Confidence 4444469999999999 99999999999999998763 3445556678999999999974 4444444
No 66
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.14 E-value=0.0013 Score=67.95 Aligned_cols=73 Identities=27% Similarity=0.449 Sum_probs=61.9
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHH---HhccCCCeEEEEEEECCeEEEEEEEeecCCC
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETL---LDDGVDKNIELLIERGGISMTVNLVVQDLHS 369 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~---L~~~~G~~V~l~V~R~G~~~~~~I~l~~~~~ 369 (918)
+...+|+.|.|+|||+. ||+.||.|+++....--+|..++.+ .....++.+.+++.|.|+...+.+++..|..
T Consensus 138 ~~Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~W~G 214 (231)
T KOG3129|consen 138 RPFAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLTPKKWQG 214 (231)
T ss_pred cceEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeCcccccC
Confidence 44566799999999999 9999999999988877777766543 3567889999999999999999999888864
No 67
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.13 E-value=0.0018 Score=64.04 Aligned_cols=90 Identities=17% Similarity=0.211 Sum_probs=55.1
Q ss_pred eecceeecccchhhhcccCCCCCcEEEE--cCCChhHHcCCCC-CCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEe
Q 002474 377 EVSGAVIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPR-HAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYS 453 (918)
Q Consensus 377 ~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs--~pgspA~~AGLk~-GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~ 453 (918)
..+|++++.-...-... .+.-|. .|+|||+.|||++ .|.|+.+++....+.++|.+.++. ..++.+.|.++
T Consensus 26 g~LG~sv~~~~~~~~~~-----~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~l~L~Vy 99 (138)
T PF04495_consen 26 GLLGISVRFESFEGAEE-----EGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEA-NENKPLQLYVY 99 (138)
T ss_dssp SSS-EEEEEEE-TTGCC-----CEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHH-TTTS-EEEEEE
T ss_pred CCCcEEEEEeccccccc-----ceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHH-cCCCcEEEEEE
Confidence 45677765444331211 233344 3999999999998 699999999999999999999998 55688999998
Q ss_pred eccccccceEEEEEEEcCCCC
Q 002474 454 SYTDRHRRKSVLVTIDRHEWY 474 (918)
Q Consensus 454 ~~~~~~~~k~~~ltIdRd~w~ 474 (918)
+.... ..+.+.|+-.| .|-
T Consensus 100 ns~~~-~vR~V~i~P~~-~Wg 118 (138)
T PF04495_consen 100 NSKTD-SVREVTITPSR-NWG 118 (138)
T ss_dssp ETTTT-CEEEEEE---T-TSS
T ss_pred ECCCC-eEEEEEEEcCC-CCC
Confidence 76542 22333343355 664
No 68
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.06 E-value=0.0015 Score=75.69 Aligned_cols=72 Identities=21% Similarity=0.295 Sum_probs=58.2
Q ss_pred cEEEEc--CCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcCCCCC
Q 002474 400 LVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHEWYA 475 (918)
Q Consensus 400 GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd~w~~ 475 (918)
++.|.+ +++||++||+++||+|++|||+++.+. ++.++.|+. +.|..|+|++.|-. ..++..+++.|+.-.-
T Consensus 113 ~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG-~~Gt~V~L~i~r~~---~~k~~~v~l~Re~i~l 188 (406)
T COG0793 113 GVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRG-KPGTKVTLTILRAG---GGKPFTVTLTREEIEL 188 (406)
T ss_pred CcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCC-CCCCeEEEEEEEcC---CCceeEEEEEEEEEec
Confidence 566664 899999999999999999999999988 456666666 66899999999973 2378999998854443
No 69
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.01 E-value=0.0024 Score=73.80 Aligned_cols=65 Identities=25% Similarity=0.306 Sum_probs=51.5
Q ss_pred cEEEE--cCCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEc
Q 002474 400 LVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDR 470 (918)
Q Consensus 400 GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdR 470 (918)
|+.|. .++|||+++||++||+|++|||+++.++ +++...++. +.|+.|.|+++|-. ++..+++.|
T Consensus 103 g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~ltv~r~g-----~~~~~~l~r 171 (389)
T PLN00049 103 GLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELTLRRGP-----ETRLVTLTR 171 (389)
T ss_pred cEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEEEEECC-----EEEEEEEEe
Confidence 56666 3899999999999999999999999864 677777765 56788999988643 456666766
No 70
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.00 E-value=0.0028 Score=69.23 Aligned_cols=67 Identities=21% Similarity=0.325 Sum_probs=53.8
Q ss_pred CCceEEeEecCCCcc---cc-CCCCCCEEEEECCEEecCHHHHHHHHh-ccCCCeEEEEEEECCeEEEEEEEe
Q 002474 297 TGLLVVDSVVPGGPA---HL-RLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVV 364 (918)
Q Consensus 297 ~G~lVV~~V~p~spA---~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~-~~~G~~V~l~V~R~G~~~~~~I~l 364 (918)
.| ++=-.+.|+..+ .+ |||+||++++|||..+++..+..+++. .+....++|+|+|+|+.+++.+.+
T Consensus 204 ~G-l~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 204 EG-IVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred CC-ceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence 45 432356777544 34 999999999999999999998888885 466788999999999999887764
No 71
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=96.99 E-value=0.0086 Score=67.91 Aligned_cols=161 Identities=21% Similarity=0.218 Sum_probs=98.7
Q ss_pred CCceEEeEecCCCcccc-CCCC-CCEEEEECCEEecCHHHHHHH-HhccCCCeEEEEEEECCeE--EEEEEEeecCCCCC
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEP-GDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIERGGIS--MTVNLVVQDLHSIT 371 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~-GDiIlsVNG~~I~s~~~l~~~-L~~~~G~~V~l~V~R~G~~--~~~~I~l~~~~~~t 371 (918)
+-.+-|-.|..+|+|++ ||++ -|-|++|||..+..-.+..+. |+....+ |++++...... +.++|+..+.-.
T Consensus 14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps~~wg-- 90 (462)
T KOG3834|consen 14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPSNNWG-- 90 (462)
T ss_pred ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccccccc--
Confidence 33344568999999999 9998 589999999999876655544 4444444 99998754332 233333222111
Q ss_pred CCceeeecceeecccchhhhcccCCCCCc---EEEEcCCChhHHcCCC-CCCEEEEECCeecCCHHHHHHHHHhcCCCCe
Q 002474 372 PDYFLEVSGAVIHPLSYQQARNFRFPCGL---VYVAEPGYMLFRAGVP-RHAIIKKFAGEEISRLEDLISVLSKLSRGAR 447 (918)
Q Consensus 372 ~~~~v~~~Gl~~~~ls~~~~~~~gl~~~G---V~Vs~pgspA~~AGLk-~GD~I~sVNG~~v~~l~efi~vl~~~~~g~r 447 (918)
. .++|++++--+.. .++.- |+--.++|||++|||. -+|.|+-+-+.-..+-+++...|.. ..++-
T Consensus 91 -g---qllGvsvrFcsf~------~A~~~vwHvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIes-he~kp 159 (462)
T KOG3834|consen 91 -G---QLLGVSVRFCSFD------GAVESVWHVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIES-HEGKP 159 (462)
T ss_pred -c---cccceEEEeccCc------cchhheeeeeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHh-ccCCC
Confidence 1 1345544321111 11111 2222489999999999 7899999944444556677777777 45677
Q ss_pred EEEEEeeccccccceEEEEEEEcC-CCC
Q 002474 448 VPIEYSSYTDRHRRKSVLVTIDRH-EWY 474 (918)
Q Consensus 448 V~l~~~~~~~~~~~k~~~ltIdRd-~w~ 474 (918)
+.|-++..+.. .+..++|..| .|=
T Consensus 160 LklyVYN~D~d---~~ReVti~pn~awG 184 (462)
T KOG3834|consen 160 LKLYVYNHDTD---SCREVTITPNSAWG 184 (462)
T ss_pred cceeEeecCCC---ccceEEeecccccc
Confidence 88887776553 4455555432 554
No 72
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.95 E-value=0.0022 Score=63.42 Aligned_cols=67 Identities=22% Similarity=0.304 Sum_probs=48.1
Q ss_pred ceEEeEecCCCcccc-CCCC-CCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeE--EEEEEEee
Q 002474 299 LLVVDSVVPGGPAHL-RLEP-GDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGIS--MTVNLVVQ 365 (918)
Q Consensus 299 ~lVV~~V~p~spA~~-GLq~-GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~--~~~~I~l~ 365 (918)
.+-|-.|.|+|||++ ||++ .|.|+.+|+..+.+.++|.+.+..+.++.+.|.|...... +.+++.+.
T Consensus 44 ~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 44 GWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp EEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---
T ss_pred eEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcC
Confidence 344469999999999 9999 6999999999999999999999888899999999975544 44444443
No 73
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=96.94 E-value=0.0024 Score=69.99 Aligned_cols=70 Identities=24% Similarity=0.367 Sum_probs=62.2
Q ss_pred CCceEEeEecCCCccccCCCCCCEEEEECCEEecCHHHHHHHHh-ccCCCeEEEEEEE-CCeEEEEEEEeecC
Q 002474 297 TGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIER-GGISMTVNLVVQDL 367 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s~~~l~~~L~-~~~G~~V~l~V~R-~G~~~~~~I~l~~~ 367 (918)
.|++++ .+..++|+...|+.||.|++|||+++.+..++...+. ..+|+++++++.| +++....++++...
T Consensus 130 ~gvyv~-~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~~~ 201 (342)
T COG3480 130 AGVYVL-SVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITLIKN 201 (342)
T ss_pred eeEEEE-EccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEEEee
Confidence 688886 8889999999999999999999999999999999885 6899999999997 77877788877655
No 74
>PF12812 PDZ_1: PDZ-like domain
Probab=96.68 E-value=0.0049 Score=55.02 Aligned_cols=64 Identities=19% Similarity=0.336 Sum_probs=55.0
Q ss_pred ccCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHh
Q 002474 262 TLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLD 340 (918)
Q Consensus 262 ~Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~ 340 (918)
+.|..|+.+++..+|.++++ -+.+++ ....++++.. ++..|-+|.+|||+++.+.+++.+++.
T Consensus 10 ~~Ga~f~~Ls~q~aR~~~~~---------------~~gv~v-~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk 73 (78)
T PF12812_consen 10 VCGAVFHDLSYQQARQYGIP---------------VGGVYV-AVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVK 73 (78)
T ss_pred EcCeecccCCHHHHHHhCCC---------------CCEEEE-EecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHH
Confidence 58899999999999999985 234443 6778899998 699999999999999999999998885
Q ss_pred c
Q 002474 341 D 341 (918)
Q Consensus 341 ~ 341 (918)
+
T Consensus 74 ~ 74 (78)
T PF12812_consen 74 K 74 (78)
T ss_pred h
Confidence 4
No 75
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=96.68 E-value=0.0052 Score=71.00 Aligned_cols=57 Identities=16% Similarity=0.201 Sum_probs=45.6
Q ss_pred CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEc
Q 002474 406 PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDR 470 (918)
Q Consensus 406 pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdR 470 (918)
|+|+|+++||++||+|++|||+++.+|.++...+. ++.+.+++..-++ +...+.+.+
T Consensus 7 pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdG----e~~~l~Ie~ 63 (433)
T TIGR03279 7 PGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANG----ESHQIEIEK 63 (433)
T ss_pred CCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCC----eEEEEEEec
Confidence 89999999999999999999999999999887774 3568888864222 455556655
No 76
>PRK11186 carboxy-terminal protease; Provisional
Probab=96.61 E-value=0.0053 Score=75.05 Aligned_cols=64 Identities=30% Similarity=0.436 Sum_probs=47.7
Q ss_pred eEEeEecCCCcccc--CCCCCCEEEEEC--CEEecC-----HHHHHHHHhccCCCeEEEEEEEC---CeEEEEEEE
Q 002474 300 LVVDSVVPGGPAHL--RLEPGDVLVRVN--GEVITQ-----FLKLETLLDDGVDKNIELLIERG---GISMTVNLV 363 (918)
Q Consensus 300 lVV~~V~p~spA~~--GLq~GDiIlsVN--G~~I~s-----~~~l~~~L~~~~G~~V~l~V~R~---G~~~~~~I~ 363 (918)
++|..|.|+|||++ ||++||+|++|| |+++.+ ..++..++....|.+|.|+|.|+ ++..+++++
T Consensus 257 ~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~ 332 (667)
T PRK11186 257 TVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLT 332 (667)
T ss_pred EEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEE
Confidence 44569999999998 899999999999 554433 23566666667899999999984 344555443
No 77
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.31 E-value=0.0024 Score=58.53 Aligned_cols=48 Identities=21% Similarity=0.304 Sum_probs=37.1
Q ss_pred cCCCCCcEEEE--cCCChhHHcCCCCCCEEEEECCeecC--CHHHHHHHHHh
Q 002474 394 FRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEIS--RLEDLISVLSK 441 (918)
Q Consensus 394 ~gl~~~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~--~l~efi~vl~~ 441 (918)
|+-+..|+||. +.||||+.|||+.+|.|+.|||-... +-+..++.+++
T Consensus 54 f~ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k 105 (124)
T KOG3553|consen 54 FSYTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITK 105 (124)
T ss_pred CCcCCccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhhH
Confidence 44456899999 49999999999999999999998765 33444444444
No 78
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.29 E-value=0.0087 Score=69.68 Aligned_cols=85 Identities=31% Similarity=0.403 Sum_probs=66.2
Q ss_pred cCeEEEEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHH-h
Q 002474 263 LQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-D 340 (918)
Q Consensus 263 Lgv~~~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L-~ 340 (918)
.|+.+..++.+ .-.||+. .- .+.|..++..|.++|||++ ||.+||.|++|||. ...+ +
T Consensus 439 ~gL~~~~~~~~-~~~LGl~----------v~-~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~ 498 (558)
T COG3975 439 FGLTFTPKPRE-AYYLGLK----------VK-SEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDR 498 (558)
T ss_pred cceEEEecCCC-CcccceE----------ec-ccCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccc
Confidence 78888887765 5567774 11 4455666679999999999 99999999999999 2233 3
Q ss_pred ccCCCeEEEEEEECCeEEEEEEEeecC
Q 002474 341 DGVDKNIELLIERGGISMTVNLVVQDL 367 (918)
Q Consensus 341 ~~~G~~V~l~V~R~G~~~~~~I~l~~~ 367 (918)
..++..+++++.|.|..+++.+++...
T Consensus 499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~ 525 (558)
T COG3975 499 YKVNDKIQVHVFREGRLREFLVKLGGD 525 (558)
T ss_pred cccccceEEEEccCCceEEeecccCCC
Confidence 567889999999999999988876544
No 79
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=96.16 E-value=0.0096 Score=62.86 Aligned_cols=63 Identities=21% Similarity=0.186 Sum_probs=50.9
Q ss_pred EEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhc-cCCCeEEEEEEECCeEEEEEEE
Q 002474 301 VVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLV 363 (918)
Q Consensus 301 VV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~-~~G~~V~l~V~R~G~~~~~~I~ 363 (918)
.++-..+++..+. |||+||+.++||+..+++.+++..++.. ..-..++++|.|+|+...+.|.
T Consensus 210 r~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 210 RFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 3334444566677 9999999999999999999999988853 4557899999999998887664
No 80
>PRK11186 carboxy-terminal protease; Provisional
Probab=96.14 E-value=0.015 Score=71.21 Aligned_cols=70 Identities=19% Similarity=0.322 Sum_probs=52.4
Q ss_pred cEEEEc--CCChhHHc-CCCCCCEEEEEC--CeecCC-----HHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEE
Q 002474 400 LVYVAE--PGYMLFRA-GVPRHAIIKKFA--GEEISR-----LEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTID 469 (918)
Q Consensus 400 GV~Vs~--pgspA~~A-GLk~GD~I~sVN--G~~v~~-----l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltId 469 (918)
+++|.+ |||||+++ ||++||+|++|| |+++.+ +++.++.++. +.|..|+|++.+-.. ..++..+++.
T Consensus 256 ~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG-~~Gt~V~LtV~r~~~--~~~~~~vtl~ 332 (667)
T PRK11186 256 YTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKG-PKGSKVRLEILPAGK--GTKTRIVTLT 332 (667)
T ss_pred eEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcC-CCCCEEEEEEEeCCC--CCceEEEEEE
Confidence 455554 99999998 999999999999 555433 4577777776 678999999987421 2256778887
Q ss_pred cCC
Q 002474 470 RHE 472 (918)
Q Consensus 470 Rd~ 472 (918)
|+.
T Consensus 333 R~~ 335 (667)
T PRK11186 333 RDK 335 (667)
T ss_pred eee
Confidence 743
No 81
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=95.98 E-value=0.13 Score=52.97 Aligned_cols=172 Identities=18% Similarity=0.139 Sum_probs=99.8
Q ss_pred eeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeee------C--ceeeeeEEEEeec-------ccceEEEEecCCC
Q 002474 649 HFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAA------F--PIEIPGEVVFLHP-------VHNFALIAYDPSS 713 (918)
Q Consensus 649 ~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~------~--~~~vp~~vvflhp-------~~n~aiv~ydp~~ 713 (918)
.+.++|.+|. .-+|||..|.+.. -.++.+.+.. + ...+..+=++.|| .+|+|||+.|...
T Consensus 24 ~~~C~G~li~----~~~vLTaahC~~~-~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~ 98 (220)
T PF00089_consen 24 RFFCTGTLIS----PRWVLTAAHCVDG-ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPI 98 (220)
T ss_dssp EEEEEEEEEE----TTEEEEEGGGHTS-GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSS
T ss_pred CeeEeEEecc----ccccccccccccc-cccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 7889999999 4499999999988 3355554332 0 1456666677887 5799999999873
Q ss_pred cCcccccceeecccCCCc-CCCCCCeEEEEEecCCCce----eEEeEEEecceeecccCCCCCccccccceeeEEEec--
Q 002474 714 LGVAGASVVRAAELLPEP-ALRRGDSVYLVGLSRSLQA----TSRKSIVTNPCAALNISSADCPRYRAMNMEVIELDT-- 786 (918)
Q Consensus 714 ~~~~~~~~v~~~~l~~~~-~l~~Gd~v~~vG~~~~~~~----~~~~t~vt~i~~~~~~~~~~~pryr~~n~e~i~~d~-- 786 (918)
-.. ..+..+.+.... .++.|+.+.++|+...... ..++..+.-+. .-.-... .+++ .....+..+.
T Consensus 99 ~~~---~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~-~~~c~~~-~~~~--~~~~~~c~~~~~ 171 (220)
T PF00089_consen 99 TFG---DNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVS-RKTCRSS-YNDN--LTPNMICAGSSG 171 (220)
T ss_dssp EHB---SSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEE-HHHHHHH-TTTT--STTTEEEEETTS
T ss_pred ccc---ccccccccccccccccccccccccccccccccccccccccccccccc-ccccccc-cccc--cccccccccccc
Confidence 321 467777887732 3589999999999765222 12222222111 0000000 1111 1122333333
Q ss_pred --CcCCCC-cceEECCCccEEEEEeeeecceeccCCCCCCceEEeccchhhHHHHH
Q 002474 787 --DFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVL 839 (918)
Q Consensus 787 --~~~~~~-~Gvl~d~~G~v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~ 839 (918)
+...+. ||+|+..++.+.|+...- .. |..... .....++...+++|
T Consensus 172 ~~~~~~g~sG~pl~~~~~~lvGI~s~~-~~----c~~~~~--~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 172 SGDACQGDSGGPLICNNNYLVGIVSFG-EN----CGSPNY--PGVYTRVSSYLDWI 220 (220)
T ss_dssp SSBGGTTTTTSEEEETTEEEEEEEEEE-SS----SSBTTS--EEEEEEGGGGHHHH
T ss_pred cccccccccccccccceeeecceeeec-CC----CCCCCc--CEEEEEHHHhhccC
Confidence 334444 888888888888877543 21 111112 23336777766664
No 82
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=95.92 E-value=0.022 Score=62.75 Aligned_cols=69 Identities=22% Similarity=0.292 Sum_probs=55.6
Q ss_pred CcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEc
Q 002474 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDR 470 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdR 470 (918)
.|||+.+ .++|+. .-|+.||.|++|||+++.+.++|++.++..+.|+.|+|.|.|..+ ..+...+++..
T Consensus 130 ~gvyv~~v~~~~~~~-gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~--~~~~~~~tl~~ 200 (342)
T COG3480 130 AGVYVLSVIDNSPFK-GKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNE--TPEIVTITLIK 200 (342)
T ss_pred eeEEEEEccCCcchh-ceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccC--CCceEEEEEEe
Confidence 5999885 456654 349999999999999999999999999999999999999997655 23344445444
No 83
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=95.89 E-value=0.03 Score=61.40 Aligned_cols=50 Identities=8% Similarity=0.150 Sum_probs=45.8
Q ss_pred ChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccc
Q 002474 408 YMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTD 457 (918)
Q Consensus 408 spA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~ 457 (918)
..+.++||++||++++|||.++.+.++..++++.+++..+++|++.|-+.
T Consensus 218 ~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq 267 (276)
T PRK09681 218 SLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGA 267 (276)
T ss_pred HHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCE
Confidence 46888999999999999999999999999999999998888888888664
No 84
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=95.84 E-value=0.023 Score=51.92 Aligned_cols=51 Identities=16% Similarity=0.186 Sum_probs=38.1
Q ss_pred CCChhHHcC--CCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccc
Q 002474 406 PGYMLFRAG--VPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTD 457 (918)
Q Consensus 406 pgspA~~AG--Lk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~ 457 (918)
..||..+.| ++.||.|++|||+++..-.++.+++.. +.|+.|.|++.+-..
T Consensus 29 ~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~-~agk~V~Ltv~~~~~ 81 (88)
T PF14685_consen 29 ARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEG-KAGKQVLLTVNRKPG 81 (88)
T ss_dssp -B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHT-TTTSEEEEEEE-STT
T ss_pred ccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcc-cCCCEEEEEEecCCC
Confidence 358888888 559999999999999998899999998 567999999988765
No 85
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.20 E-value=0.06 Score=66.33 Aligned_cols=42 Identities=19% Similarity=0.270 Sum_probs=33.0
Q ss_pred CcEEEEEECCC-----------CcccccccCCCCCCcccCCCCEEEEEecCCC
Q 002474 117 HDFGFFRYDPS-----------AIQFLNYDEIPLAPEAACVGLEIRVVGNDSG 158 (918)
Q Consensus 117 ~DlAlLkvd~~-----------~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g 158 (918)
-||+++|+-.. +.|+-+-.-++++.+.++.||.|+++|||..
T Consensus 200 gDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~ 252 (698)
T PF10459_consen 200 GDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGR 252 (698)
T ss_pred CceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCc
Confidence 39999999432 4565555567788899999999999999953
No 86
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=95.08 E-value=0.31 Score=50.49 Aligned_cols=94 Identities=21% Similarity=0.212 Sum_probs=68.0
Q ss_pred ceeEeEEEEEeccCCCcEEEEecccccCC-CccEEEEeee--------CceeeeeEEEEeec-------ccceEEEEecC
Q 002474 648 QHFFGTGVIIYHSQSMGLVVVDKNTVAIS-ASDVMLSFAA--------FPIEIPGEVVFLHP-------VHNFALIAYDP 711 (918)
Q Consensus 648 ~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~-~~di~vtfa~--------~~~~vp~~vvflhp-------~~n~aiv~ydp 711 (918)
..+.++|.+|+ .-+|||..|.+... ..++.+.+.. .....+.+-++.|| -+|+|||+.+.
T Consensus 23 ~~~~C~GtlIs----~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~ 98 (232)
T cd00190 23 GRHFCGGSLIS----PRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKR 98 (232)
T ss_pred CcEEEEEEEee----CCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECC
Confidence 45789999999 67999999999764 2345555442 13455677788997 47889999995
Q ss_pred CCcCcccccceeecccCCCc-CCCCCCeEEEEEecCCC
Q 002474 712 SSLGVAGASVVRAAELLPEP-ALRRGDSVYLVGLSRSL 748 (918)
Q Consensus 712 ~~~~~~~~~~v~~~~l~~~~-~l~~Gd~v~~vG~~~~~ 748 (918)
.+-.. ..++++.|.... .+..|+.+.++|+....
T Consensus 99 ~~~~~---~~v~picl~~~~~~~~~~~~~~~~G~g~~~ 133 (232)
T cd00190 99 PVTLS---DNVRPICLPSSGYNLPAGTTCTVSGWGRTS 133 (232)
T ss_pred cccCC---CcccceECCCccccCCCCCEEEEEeCCcCC
Confidence 44321 357777777661 27899999999996543
No 87
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.91 E-value=0.025 Score=69.50 Aligned_cols=54 Identities=24% Similarity=0.357 Sum_probs=42.2
Q ss_pred EEEeecCCCCCCCccEEcccceEEEeccccCC-----------CCCcccccchhhHHHHHHHHHh
Q 002474 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSKS-----------SSASAFFLPLERVVRALRFLQE 244 (918)
Q Consensus 191 Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~-----------~~~~~faIPi~~i~~~L~~l~~ 244 (918)
+.++..|.+|||||||+|.+|++|||+.-+.- ....+..+.+..|+.+|+++-.
T Consensus 624 FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~g 688 (698)
T PF10459_consen 624 FLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVYG 688 (698)
T ss_pred EEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHhC
Confidence 57788899999999999999999999975431 1234556788888888877653
No 88
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=94.83 E-value=0.29 Score=51.60 Aligned_cols=167 Identities=15% Similarity=0.076 Sum_probs=85.0
Q ss_pred CCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCC----------------CcEEEEEEC----
Q 002474 66 GASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPV----------------HDFGFFRYD---- 125 (918)
Q Consensus 66 ~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~----------------~DlAlLkvd---- 125 (918)
..+.||=-+++++++..--=-|.+.+......+.+.+|+.+++++....+. .-++-+.-+
T Consensus 18 ~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt~~G 97 (218)
T PF05580_consen 18 TAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNTQFG 97 (218)
T ss_pred CcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEeccccc
Confidence 456788888887666666667777754444556667787777776554322 112222221
Q ss_pred ------CCC-cccccccCCCCC-CcccCCCCEEEEEecCCCCC-ceEEEeEEEeecCCC-CCCCCCCccccceeEEEEee
Q 002474 126 ------PSA-IQFLNYDEIPLA-PEAACVGLEIRVVGNDSGEK-VSILAGTLARLDRDA-PHYKKDGYNDFNTFYMQAAS 195 (918)
Q Consensus 126 ------~~~-l~~~~l~~l~l~-~~~l~vG~~V~vvG~p~g~~-~svt~G~Vs~~~r~~-p~~~~~~~~dfn~~~Iq~da 195 (918)
... .....-.++|++ .+.++.|..-..--. .|.. -.... .|..+.++. +..+..-..-...+++....
T Consensus 98 I~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~i-eI~~v~~~~~~~~k~~vi~vtd~~Ll~~TG 175 (218)
T PF05580_consen 98 IYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDI-EIEKVLPQSSPSGKGMVIKVTDPRLLEKTG 175 (218)
T ss_pred eeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEE-EEEEEccCCCCCCCcEEEEECCcchhhhhC
Confidence 110 111122334444 356777763322111 1211 11111 222232221 11111101001122344455
Q ss_pred cCCCCCCCccEEcccceEEEeccccCC-CCCcccccchhhH
Q 002474 196 GTKGGSSGSPVIDWQGRAVALNAGSKS-SSASAFFLPLERV 235 (918)
Q Consensus 196 ~i~~G~SGGPvvn~dG~VVGI~~~~~~-~~~~~faIPi~~i 235 (918)
.+-.||||+|++ .||++||=++.... +...+|.++++..
T Consensus 176 GIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 176 GIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred CEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 577899999998 59999998876554 3677888887544
No 89
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=94.78 E-value=0.71 Score=48.03 Aligned_cols=94 Identities=20% Similarity=0.224 Sum_probs=70.3
Q ss_pred ceeEeEEEEEeccCCCcEEEEecccccCCC-ccEEEEeeeCc-------eeeeeEEEEeec-------ccceEEEEecCC
Q 002474 648 QHFFGTGVIIYHSQSMGLVVVDKNTVAISA-SDVMLSFAAFP-------IEIPGEVVFLHP-------VHNFALIAYDPS 712 (918)
Q Consensus 648 ~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~-~di~vtfa~~~-------~~vp~~vvflhp-------~~n~aiv~ydp~ 712 (918)
..+..+|.+|+ .-+|||..|.+...- .++.|.+.... ......=+++|| .+|+|||+.+-.
T Consensus 24 ~~~~C~GtlIs----~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~ 99 (229)
T smart00020 24 GRHFCGGSLIS----PRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSP 99 (229)
T ss_pred CCcEEEEEEec----CCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcc
Confidence 46889999999 789999999997653 56777766422 556677788887 578999999876
Q ss_pred CcCcccccceeecccCCC-cCCCCCCeEEEEEecCCC
Q 002474 713 SLGVAGASVVRAAELLPE-PALRRGDSVYLVGLSRSL 748 (918)
Q Consensus 713 ~~~~~~~~~v~~~~l~~~-~~l~~Gd~v~~vG~~~~~ 748 (918)
+--. ..++.+.|... ..+..|+.+.++|+....
T Consensus 100 i~~~---~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~ 133 (229)
T smart00020 100 VTLS---DNVRPICLPSSNYNVPAGTTCTVSGWGRTS 133 (229)
T ss_pred cCCC---CceeeccCCCcccccCCCCEEEEEeCCCCC
Confidence 4321 35777777664 126789999999997665
No 90
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=94.63 E-value=0.3 Score=52.84 Aligned_cols=116 Identities=22% Similarity=0.234 Sum_probs=57.5
Q ss_pred CCceEEEEEEeCC-CcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCCCcEEEEEECC--CCcccccccCCCCCCc
Q 002474 66 GASYATGFVVDKR-RGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDP--SAIQFLNYDEIPLAPE 142 (918)
Q Consensus 66 ~~~~GTGFVVd~~-~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~--~~l~~~~l~~l~l~~~ 142 (918)
+++.|||=+..-+ +-.|+|+.||+..+ ..++... +... ..-++..-|||.-.+++ ...|..++.. .
T Consensus 110 Gss~Gsggvft~~~~~vvvTAtHVlg~~--~a~v~~~-g~~~---~~tF~~~GDfA~~~~~~~~G~~P~~k~a~----~- 178 (297)
T PF05579_consen 110 GSSVGSGGVFTIGGNTVVVTATHVLGGN--TARVSGV-GTRR---MLTFKKNGDFAEADITNWPGAAPKYKFAQ----N- 178 (297)
T ss_dssp SSSEEEEEEEECTTEEEEEEEHHHCBTT--EEEEEET-TEEE---EEEEEEETTEEEEEETTS-S---B--B-T----T-
T ss_pred eecccccceEEECCeEEEEEEEEEcCCC--eEEEEec-ceEE---EEEEeccCcEEEEECCCCCCCCCceeecC----C-
Confidence 3455665555431 34999999999943 3444432 2222 23445566999999953 2223222221 0
Q ss_pred ccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEecccc
Q 002474 143 AACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGS 220 (918)
Q Consensus 143 ~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~ 220 (918)
..|---+. . ..-+..|.|..- ..=+=+.+|.||+|++..||.+||++++.
T Consensus 179 --~~GrAyW~---t---~tGvE~G~ig~~--------------------~~~~fT~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 179 --YTGRAYWL---T---STGVEPGFIGGG--------------------GAVCFTGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp ---SEEEEEE---E---TTEEEEEEEETT--------------------EEEESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred --cccceEEE---c---ccCcccceecCc--------------------eEEEEcCCCCCCCccCcCCCCEEEEEecC
Confidence 01100000 0 011223332211 11123568999999999999999999974
No 91
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.08 E-value=0.014 Score=61.37 Aligned_cols=143 Identities=22% Similarity=0.265 Sum_probs=53.1
Q ss_pred CceEEEEEEeCCCc--EEEEcCcccCCCCcEEEEEecCCeEEeE---EEEEecCCCcEEEEEECCCCcccccccCCCCCC
Q 002474 67 ASYATGFVVDKRRG--IILTNRHVVKPGPVVAEAMFVNREEIPV---YPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAP 141 (918)
Q Consensus 67 ~~~GTGFVVd~~~G--~ILTn~HVV~~~~~~i~v~f~dg~~~~a---~vv~~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~ 141 (918)
.++++. |...+| .++|++||... +..+ ..+.+++.++. +.++.+...|++||+..++-...+....+.+..
T Consensus 29 vGya~c--v~l~~g~~~L~ta~Hv~~~-~~~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~ 104 (203)
T PF02122_consen 29 VGYATC--VRLFDGEDALLTARHVWSR-PSKV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQ 104 (203)
T ss_dssp -----E--EEE----EEEEE-HHHHTS-SS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B--
T ss_pred cccceE--EECcCCccceecccccCCC-ccce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccc
Confidence 444555 432245 99999999995 4433 34566766654 356778999999999985322222223322221
Q ss_pred -cccCCCCEEEEEecCCCCCceEEEe-EEEeecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEeccc
Q 002474 142 -EAACVGLEIRVVGNDSGEKVSILAG-TLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAG 219 (918)
Q Consensus 142 -~~l~vG~~V~vvG~p~g~~~svt~G-~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~ 219 (918)
+.+.. -.+..+ ....+ ..+.-..-.+. .++ ++..-+.+.+|.||.|.++.+ +++|++.+
T Consensus 105 ~~~~~~----g~~~~y-----~~~~~~~~~~sa~i~g~------~~~---~~~vls~T~~G~SGtp~y~g~-~vvGvH~G 165 (203)
T PF02122_consen 105 NSQLAK----GPVSFY-----GFSSGEWPCSSAKIPGT------EGK---FASVLSNTSPGWSGTPYYSGK-NVVGVHTG 165 (203)
T ss_dssp --SEEE----EESSTT-----SEEEEEEEEEE-S----------STT---EEEE-----TT-TT-EEE-SS--EEEEEEE
T ss_pred hhhhCC----CCeeee-----eecCCCceeccCccccc------cCc---CCceEcCCCCCCCCCCeEECC-CceEeecC
Confidence 11100 001111 11111 11111111111 111 477788999999999999999 99999998
Q ss_pred c---CCCCCcccccch
Q 002474 220 S---KSSSASAFFLPL 232 (918)
Q Consensus 220 ~---~~~~~~~faIPi 232 (918)
. ....+.++..|+
T Consensus 166 ~~~~~~~~n~n~~spi 181 (203)
T PF02122_consen 166 SPSGSNRENNNRMSPI 181 (203)
T ss_dssp E---------------
T ss_pred cccccccccccccccc
Confidence 4 223444444333
No 92
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and DEAD family of RNA helicase [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=93.68 E-value=0.055 Score=53.10 Aligned_cols=31 Identities=29% Similarity=0.359 Sum_probs=22.3
Q ss_pred EEEeecCCCCCCCccEEcccceEEEeccccC
Q 002474 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSK 221 (918)
Q Consensus 191 Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~ 221 (918)
...+..+.+|+||+|+||.+|++|||...+.
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~ 118 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGV 118 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccce
Confidence 4555667899999999999999999987644
No 93
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.63 E-value=0.17 Score=49.96 Aligned_cols=54 Identities=28% Similarity=0.508 Sum_probs=38.9
Q ss_pred CCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHH--HHHhccCCCeEEEEEE
Q 002474 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLE--TLLDDGVDKNIELLIE 352 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~--~~L~~~~G~~V~l~V~ 352 (918)
..+++ +.+.|++.|++ ||+.||.+++|||..+..-.+-. ++|.... .+++|.|.
T Consensus 115 spiyi-sriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~-gsvklvvr 172 (207)
T KOG3550|consen 115 SPIYI-SRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAV-GSVKLVVR 172 (207)
T ss_pred CceEE-EeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhc-CcEEEEEe
Confidence 34555 89999999999 89999999999999998644332 3333333 45666553
No 94
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=93.33 E-value=0.019 Score=52.82 Aligned_cols=36 Identities=33% Similarity=0.512 Sum_probs=31.9
Q ss_pred CCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCH
Q 002474 296 ETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF 332 (918)
Q Consensus 296 ~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~ 332 (918)
..|++| ..|..+|||+. ||+.+|.|+.+||-..+=.
T Consensus 58 D~GiYv-T~V~eGsPA~~AGLrihDKIlQvNG~DfTMv 94 (124)
T KOG3553|consen 58 DKGIYV-TRVSEGSPAEIAGLRIHDKILQVNGWDFTMV 94 (124)
T ss_pred CccEEE-EEeccCChhhhhcceecceEEEecCceeEEE
Confidence 579998 69999999999 9999999999999776643
No 95
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=93.17 E-value=0.22 Score=57.67 Aligned_cols=87 Identities=20% Similarity=0.269 Sum_probs=64.0
Q ss_pred eeecceeecccchhhhcccCCCCCcEEEEc--CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEE
Q 002474 376 LEVSGAVIHPLSYQQARNFRFPCGLVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPI 450 (918)
Q Consensus 376 v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l 450 (918)
+.++|+++..-+.. -..+|+||.+ +|++-+..| +.+||.|+.||....+|+ ++.+++|+++-
T Consensus 260 vnfLGiSivgqsn~------rgDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV------- 326 (626)
T KOG3571|consen 260 VNFLGISIVGQSNA------RGDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAV------- 326 (626)
T ss_pred cccceeEeecccCc------CCCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHh-------
Confidence 45678777543321 1236999995 888888887 999999999999999988 78899999864
Q ss_pred EEeeccccccceEEEEEEEcCCCCCCCceeeecC
Q 002474 451 EYSSYTDRHRRKSVLVTIDRHEWYAPPQIYTRND 484 (918)
Q Consensus 451 ~~~~~~~~~~~k~~~ltIdRd~w~~~~~~~~r~d 484 (918)
|+..++.|+|-. .|-+..+-+.+.+
T Consensus 327 --------~~~gPi~ltvAk-~~DP~~q~~fTip 351 (626)
T KOG3571|consen 327 --------SRPGPIKLTVAK-CWDPNPQSYFTIP 351 (626)
T ss_pred --------ccCCCeEEEEee-ccCCCCcccccCC
Confidence 344677888877 7876665555554
No 96
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=92.93 E-value=0.22 Score=59.35 Aligned_cols=51 Identities=24% Similarity=0.244 Sum_probs=43.9
Q ss_pred CCCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCe
Q 002474 296 ETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKN 346 (918)
Q Consensus 296 ~~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~ 346 (918)
...++-|..|.+++||.+ .|++||++++|||.+|++..+..+.+....|..
T Consensus 396 ~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~ 447 (1051)
T KOG3532|consen 396 TNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDL 447 (1051)
T ss_pred CceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccce
Confidence 345566789999999999 999999999999999999999998887665543
No 97
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=92.43 E-value=0.11 Score=61.70 Aligned_cols=53 Identities=21% Similarity=0.299 Sum_probs=43.7
Q ss_pred CcEEEEc--CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEE
Q 002474 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEY 452 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~ 452 (918)
-|+||.+ ||+.|.++|||.||.|++|||+..+++ .+.+++.-+.++..++|++
T Consensus 562 fgifV~~V~pgskAa~~GlKRgDqilEVNgQnfeni-s~~KA~eiLrnnthLtltv 616 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENI-SAKKAEEILRNNTHLTLTV 616 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhhhhhhhhccccchhhh-hHHHHHHHhcCCceEEEEE
Confidence 3799984 999999999999999999999999999 7777777766655555544
No 98
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=92.18 E-value=0.57 Score=56.46 Aligned_cols=121 Identities=20% Similarity=0.322 Sum_probs=73.8
Q ss_pred cCCCcEEEEEECCCC---------cc------cccccCCCCC--CcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCC
Q 002474 114 DPVHDFGFFRYDPSA---------IQ------FLNYDEIPLA--PEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAP 176 (918)
Q Consensus 114 Dp~~DlAlLkvd~~~---------l~------~~~l~~l~l~--~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p 176 (918)
....|+||+|+++.- +. .+.+..+-.. -..+..|.+|+=+|...|. |.|.+.+..-.
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgy----T~G~lNg~klv-- 613 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGY----TTGILNGIKLV-- 613 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCc----cceEecceEEE--
Confidence 445699999998532 11 1112111111 2356789999999987665 55555544210
Q ss_pred CCCCCCccccceeEEEEe----ecCCCCCCCccEEcccce------EEEeccccCCC-CCcccccchhhHHHHHHHH
Q 002474 177 HYKKDGYNDFNTFYMQAA----SGTKGGSSGSPVIDWQGR------AVALNAGSKSS-SASAFFLPLERVVRALRFL 242 (918)
Q Consensus 177 ~~~~~~~~dfn~~~Iq~d----a~i~~G~SGGPvvn~dG~------VVGI~~~~~~~-~~~~faIPi~~i~~~L~~l 242 (918)
|...+-.. -.+++... .=..+|.||+-|++.-+. |+||..+.... ..++++.|+..|+.=|+.+
T Consensus 614 -yw~dG~i~-s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 614 -YWADGKIQ-SSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred -EecCCCeE-EEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence 11111111 13445555 334679999999997555 99999885544 5788899998888766654
No 99
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=91.89 E-value=1.8 Score=44.47 Aligned_cols=150 Identities=15% Similarity=0.144 Sum_probs=82.3
Q ss_pred CceEEEEEEeeeccCCCCCCCceEEEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEE--EEEecCC---CcEEE
Q 002474 47 PAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVY--PIYRDPV---HDFGF 121 (918)
Q Consensus 47 ~SVV~I~~~~~~~fd~~~~~~~~GTGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~--vv~~Dp~---~DlAl 121 (918)
+.++.|.+ ..+...++++-|.. .++|.++|--. ...+.+ ++..++.. +...+.. .|+++
T Consensus 13 ~N~~~v~~---------~~g~~t~l~~gi~~--~~~lvp~H~~~----~~~i~i-~g~~~~~~d~~~lv~~~~~~~Dl~~ 76 (172)
T PF00548_consen 13 KNVVPVTT---------GKGEFTMLALGIYD--RYFLVPTHEEP----EDTIYI-DGVEYKVDDSVVLVDRDGVDTDLTL 76 (172)
T ss_dssp HHEEEEEE---------TTEEEEEEEEEEEB--TEEEEEGGGGG----CSEEEE-TTEEEEEEEEEEEEETTSSEEEEEE
T ss_pred ccEEEEEe---------CCceEEEecceEee--eEEEEECcCCC----cEEEEE-CCEEEEeeeeEEEecCCCcceeEEE
Confidence 35566665 23566788888874 59999999222 222333 45554332 2334444 59999
Q ss_pred EEECCCCcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecCCCCC
Q 002474 122 FRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGS 201 (918)
Q Consensus 122 Lkvd~~~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~ 201 (918)
++++.. -++-++.. -+........+...++-++......+..+.+...+.- ... + ......+...+++..|.
T Consensus 77 v~l~~~-~kfrDIrk-~~~~~~~~~~~~~l~v~~~~~~~~~~~v~~v~~~~~i-~~~---g--~~~~~~~~Y~~~t~~G~ 148 (172)
T PF00548_consen 77 VKLPRN-PKFRDIRK-FFPESIPEYPECVLLVNSTKFPRMIVEVGFVTNFGFI-NLS---G--TTTPRSLKYKAPTKPGM 148 (172)
T ss_dssp EEEESS-S-B--GGG-GSBSSGGTEEEEEEEEESSSSTCEEEEEEEEEEEEEE-EET---T--EEEEEEEEEESEEETTG
T ss_pred EEccCC-cccCchhh-hhccccccCCCcEEEEECCCCccEEEEEEEEeecCcc-ccC---C--CEeeEEEEEccCCCCCc
Confidence 999642 12222221 0111122445555555554433334444444443321 000 0 01134688889999999
Q ss_pred CCccEEc---ccceEEEecccc
Q 002474 202 SGSPVID---WQGRAVALNAGS 220 (918)
Q Consensus 202 SGGPvvn---~dG~VVGI~~~~ 220 (918)
-||||+. ..++++||+.++
T Consensus 149 CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 149 CGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCeEEEeeccCccEEEEEecc
Confidence 9999985 358999999875
No 100
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=91.25 E-value=0.72 Score=48.27 Aligned_cols=57 Identities=18% Similarity=0.108 Sum_probs=41.9
Q ss_pred EEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHH--HHHHhcCCCCeEEEEEeeccc
Q 002474 401 VYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLI--SVLSKLSRGARVPIEYSSYTD 457 (918)
Q Consensus 401 V~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi--~vl~~~~~g~rV~l~~~~~~~ 457 (918)
++|+ .|+|||+.|||+.||.|+++.+....|...|. ..+.....++.+.+++.|-+.
T Consensus 141 a~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~ 201 (231)
T KOG3129|consen 141 AVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQ 201 (231)
T ss_pred EEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCC
Confidence 4455 39999999999999999999888776655443 334444667888888877554
No 101
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=91.10 E-value=0.23 Score=61.09 Aligned_cols=53 Identities=28% Similarity=0.541 Sum_probs=39.5
Q ss_pred eEEeEecCCCccccCCCCCCEEEEECCEEecC--HHHHHHHHhccCCCeEEEEEEE
Q 002474 300 LVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQ--FLKLETLLDDGVDKNIELLIER 353 (918)
Q Consensus 300 lVV~~V~p~spA~~GLq~GDiIlsVNG~~I~s--~~~l~~~L~~~~G~~V~l~V~R 353 (918)
+||..|.+|||+..+|++||.|+.|||+++.. |+++..++.. ....|.|+|.+
T Consensus 77 viVr~VT~GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRa-ce~sv~ltV~q 131 (1298)
T KOG3552|consen 77 VIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRA-CESSVNLTVCQ 131 (1298)
T ss_pred eEEEEecCCCCccccccCCCeEEEecCcccccccHHHHHHHHHH-HhhhcceEEec
Confidence 45579999999999999999999999999986 4444444432 23456666654
No 102
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=90.32 E-value=0.55 Score=46.49 Aligned_cols=46 Identities=20% Similarity=0.341 Sum_probs=37.3
Q ss_pred CcEEEEc--CCChhHHc-CCCCCCEEEEECCeecCC--HHHHHHHHHhcCC
Q 002474 399 GLVYVAE--PGYMLFRA-GVPRHAIIKKFAGEEISR--LEDLISVLSKLSR 444 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~A-GLk~GD~I~sVNG~~v~~--l~efi~vl~~~~~ 444 (918)
+-+||+. ||+-|++- ||+.||.+++|||..+.. -+-.+++++...+
T Consensus 115 spiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g 165 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG 165 (207)
T ss_pred CceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC
Confidence 3588985 99999988 599999999999998864 4667778887554
No 103
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=90.03 E-value=0.63 Score=49.65 Aligned_cols=54 Identities=19% Similarity=0.368 Sum_probs=45.9
Q ss_pred CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccccc
Q 002474 406 PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRH 459 (918)
Q Consensus 406 pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~ 459 (918)
+++.+...||+.||+-+++|+..+.+.+++..+++.++.-....+++.|-+.+|
T Consensus 216 d~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rh 269 (275)
T COG3031 216 DGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRH 269 (275)
T ss_pred CcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccc
Confidence 567889999999999999999999999999999999987666666666655543
No 104
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=89.08 E-value=1.3 Score=52.98 Aligned_cols=63 Identities=21% Similarity=0.214 Sum_probs=53.4
Q ss_pred CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcCCC
Q 002474 406 PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHEW 473 (918)
Q Consensus 406 pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd~w 473 (918)
+++||++.| |-.||.|++|||...-.| .+-..++|..+.-..|++.+.+.-. ++.+.|.|-+-
T Consensus 682 ~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~cpP-----V~~V~I~RPd~ 747 (829)
T KOG3605|consen 682 HGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSCPP-----VTTVLIRRPDL 747 (829)
T ss_pred cCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEecCCC-----ceEEEeecccc
Confidence 899999998 999999999999988654 6788899999988999999998875 56677888443
No 105
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=88.05 E-value=1.3 Score=50.65 Aligned_cols=55 Identities=31% Similarity=0.514 Sum_probs=47.7
Q ss_pred EecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCe---EEEEEEE-CCeEE
Q 002474 304 SVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKN---IELLIER-GGISM 358 (918)
Q Consensus 304 ~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~---V~l~V~R-~G~~~ 358 (918)
.+..++++.. ++++||.++++|++++.+|.+....+....+.. +.+.+.| ++...
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~ 194 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEAH 194 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEeccceee
Confidence 6888999999 999999999999999999999998887666655 7888899 66553
No 106
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=87.46 E-value=0.66 Score=57.35 Aligned_cols=60 Identities=32% Similarity=0.486 Sum_probs=48.2
Q ss_pred CCCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHHH-HHhccCCCeEEEEEEECCe
Q 002474 296 ETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLET-LLDDGVDKNIELLIERGGI 356 (918)
Q Consensus 296 ~~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~~-~L~~~~G~~V~l~V~R~G~ 356 (918)
.-|++| +.|++|++|+. .|+.||.+++|||+.+-...+=+. .|..+.|..|.|.|...|-
T Consensus 959 klGIYv-KsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 959 KLGIYV-KSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEVAKQGA 1021 (1629)
T ss_pred ccceEE-EEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEehhhhhh
Confidence 457777 89999999998 599999999999998877655443 3566788999999876553
No 107
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=87.37 E-value=3.7 Score=44.33 Aligned_cols=150 Identities=14% Similarity=0.198 Sum_probs=74.0
Q ss_pred CcEEEEecccccCCCccEEEEeeeCceeeeeE---EEEeecccc--eEEEEecCCCcCcccccceeecccCCCcCCCCCC
Q 002474 663 MGLVVVDKNTVAISASDVMLSFAAFPIEIPGE---VVFLHPVHN--FALIAYDPSSLGVAGASVVRAAELLPEPALRRGD 737 (918)
Q Consensus 663 ~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~---vvflhp~~n--~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd 737 (918)
-.++++++|..-..-+++.|.-. .-+..-+ -.-+||+.+ ..||+- |+.+++ -=+.++|. .+++||
T Consensus 40 G~~iItn~HLf~~nng~L~i~s~--hG~f~v~nt~~lkv~~i~~~Diviirm-PkDfpP----f~~kl~FR---~P~~~e 109 (235)
T PF00863_consen 40 GSYIITNAHLFKRNNGELTIKSQ--HGEFTVPNTTQLKVHPIEGRDIVIIRM-PKDFPP----FPQKLKFR---APKEGE 109 (235)
T ss_dssp TTEEEEEGGGGSSTTCEEEEEET--TEEEEECEGGGSEEEE-TCSSEEEEE---TTS--------S---B-------TT-
T ss_pred CCEEEEChhhhccCCCeEEEEeC--ceEEEcCCccccceEEeCCccEEEEeC-CcccCC----cchhhhcc---CCCCCC
Confidence 46999999998777677555533 1111111 123455544 344443 555542 11222332 359999
Q ss_pred eEEEEEecCCCceeEEeEEEecceeecccCCCCCccccccceeeEEEecCcCCCCcceEEC-CCccEEEEEeeeecceec
Q 002474 738 SVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNMEVIELDTDFGSTFSGVLTD-EHGRVQAIWGSFSTQVKF 816 (918)
Q Consensus 738 ~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e~i~~d~~~~~~~~Gvl~d-~~G~v~alw~s~~~~~~~ 816 (918)
+|.+||.+......+ ++||+ ++...|....-+|+ =-+++.-| .||.+|++ .||.|+||=..-...
T Consensus 110 ~v~mVg~~fq~k~~~--s~vSe--sS~i~p~~~~~fWk------HwIsTk~G-~CG~PlVs~~Dg~IVGiHsl~~~~--- 175 (235)
T PF00863_consen 110 RVCMVGSNFQEKSIS--STVSE--SSWIYPEENSHFWK------HWISTKDG-DCGLPLVSTKDGKIVGIHSLTSNT--- 175 (235)
T ss_dssp EEEEEEEECSSCCCE--EEEEE--EEEEEEETTTTEEE------E-C---TT--TT-EEEETTT--EEEEEEEEETT---
T ss_pred EEEEEEEEEEcCCee--EEECC--ceEEeecCCCCeeE------EEecCCCC-ccCCcEEEcCCCcEEEEEcCccCC---
Confidence 999999987776544 66776 34445544434555 24455445 69666655 899999987744433
Q ss_pred cCCCCCCceEEeccchhhHHHHHHH
Q 002474 817 GCSSSEDHQFVRGIPIYTISRVLDK 841 (918)
Q Consensus 817 ~~~~~~~~~~~~gl~~~~i~~v~~~ 841 (918)
...-|+..++-+.+..+++.
T Consensus 176 -----~~~N~F~~f~~~f~~~~l~~ 195 (235)
T PF00863_consen 176 -----SSRNYFTPFPDDFEEFYLEN 195 (235)
T ss_dssp -----TSSEEEEE--TTHHHHHCC-
T ss_pred -----CCeEEEEcCCHHHHHHHhcc
Confidence 44557767776655444433
No 108
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=87.29 E-value=0.61 Score=54.92 Aligned_cols=73 Identities=18% Similarity=0.230 Sum_probs=45.2
Q ss_pred eeeecceeecccchhhhcccCCCC----CcEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCCeE
Q 002474 375 FLEVSGAVIHPLSYQQARNFRFPC----GLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARV 448 (918)
Q Consensus 375 ~v~~~Gl~~~~ls~~~~~~~gl~~----~GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV 448 (918)
.+...|+++.+...+ ...+|+.. ++..|+ .++|||.+|||.+||.|++|||. ...+...+-+..+
T Consensus 435 ~l~~~gL~~~~~~~~-~~~LGl~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i 505 (558)
T COG3975 435 LLERFGLTFTPKPRE-AYYLGLKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKI 505 (558)
T ss_pred hhhhcceEEEecCCC-CcccceEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccccccccce
Confidence 334456666554433 22333322 234455 49999999999999999999999 2223444555566
Q ss_pred EEEEeecc
Q 002474 449 PIEYSSYT 456 (918)
Q Consensus 449 ~l~~~~~~ 456 (918)
.+.+.+-+
T Consensus 506 ~v~~~~~~ 513 (558)
T COG3975 506 QVHVFREG 513 (558)
T ss_pred EEEEccCC
Confidence 66665533
No 109
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=86.79 E-value=17 Score=38.87 Aligned_cols=147 Identities=18% Similarity=0.141 Sum_probs=73.7
Q ss_pred eEEEEEEeCCCcEEEEcCcccCCCC-cEEEEEec---------CC---eEEeE-EEEEecC-------C-CcEEEEEECC
Q 002474 69 YATGFVVDKRRGIILTNRHVVKPGP-VVAEAMFV---------NR---EEIPV-YPIYRDP-------V-HDFGFFRYDP 126 (918)
Q Consensus 69 ~GTGFVVd~~~G~ILTn~HVV~~~~-~~i~v~f~---------dg---~~~~a-~vv~~Dp-------~-~DlAlLkvd~ 126 (918)
.+-|.+|++ .+|||++|++.... ....|.+. ++ ..... +++ .|+ . .|+|+|+++.
T Consensus 39 ~Cggsli~~--~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~ 115 (256)
T KOG3627|consen 39 LCGGSLISP--RWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSE 115 (256)
T ss_pred eeeeEEeeC--CEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECC
Confidence 455667764 49999999999531 03333332 11 11111 222 332 2 7999999985
Q ss_pred C-CcccccccCCCCCCcc----cCCCCEEEEEecCCCC------CceEEEeEEEeecC--CCCCCCCC-CccccceeEEE
Q 002474 127 S-AIQFLNYDEIPLAPEA----ACVGLEIRVVGNDSGE------KVSILAGTLARLDR--DAPHYKKD-GYNDFNTFYMQ 192 (918)
Q Consensus 127 ~-~l~~~~l~~l~l~~~~----l~vG~~V~vvG~p~g~------~~svt~G~Vs~~~r--~~p~~~~~-~~~dfn~~~Iq 192 (918)
. .+. ..+.++.|.... ...+....+.|+.... ........+..+.. ....|+.. ...+ ..+-
T Consensus 116 ~v~~~-~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~---~~~C 191 (256)
T KOG3627|consen 116 PVTFS-SHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITD---TMLC 191 (256)
T ss_pred CcccC-CcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCC---CEEe
Confidence 3 221 234455553221 3444788888865321 11222222222221 21222211 0001 1122
Q ss_pred Ee-----ecCCCCCCCccEEccc---ceEEEeccccCC
Q 002474 193 AA-----SGTKGGSSGSPVIDWQ---GRAVALNAGSKS 222 (918)
Q Consensus 193 ~d-----a~i~~G~SGGPvvn~d---G~VVGI~~~~~~ 222 (918)
+. ..+-.|.|||||+-.+ ..++||.+.+..
T Consensus 192 a~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 192 AGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred eCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 22 2245699999998765 699999987654
No 110
>PF03761 DUF316: Domain of unknown function (DUF316); InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=85.45 E-value=32 Score=37.72 Aligned_cols=107 Identities=10% Similarity=0.111 Sum_probs=56.9
Q ss_pred CCCcEEEEEECCC---CcccccccCCCCCCcccCCCCEEEEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEE
Q 002474 115 PVHDFGFFRYDPS---AIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYM 191 (918)
Q Consensus 115 p~~DlAlLkvd~~---~l~~~~l~~l~l~~~~l~vG~~V~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~I 191 (918)
...++.||.++.. ...++=+. =.+..+..|+.+.+.|+..........-.+..... ....+
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~---~~~~~~~~~~~~~~yg~~~~~~~~~~~~~i~~~~~-------------~~~~~ 222 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLA---DSSTNWEKGDEVDVYGFNSTGKLKHRKLKITNCTK-------------CAYSI 222 (282)
T ss_pred cccceEEEEEcccccccCCCEEeC---CCccccccCceEEEeecCCCCeEEEEEEEEEEeec-------------cceeE
Confidence 3468888888755 11111111 12345778999999998322221111111111111 11234
Q ss_pred EEeecCCCCCCCccEE-ccc--ceEEEeccccCCC--CCcccccchhhHHH
Q 002474 192 QAASGTKGGSSGSPVI-DWQ--GRAVALNAGSKSS--SASAFFLPLERVVR 237 (918)
Q Consensus 192 q~da~i~~G~SGGPvv-n~d--G~VVGI~~~~~~~--~~~~faIPi~~i~~ 237 (918)
........|.+|||++ +.+ -.|||+.+.+... ....+++.+...++
T Consensus 223 ~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 223 CTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQD 273 (282)
T ss_pred ecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhhh
Confidence 5555667899999997 334 4599998765432 22455666555443
No 111
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=80.96 E-value=2.5 Score=45.89 Aligned_cols=46 Identities=15% Similarity=0.325 Sum_probs=40.1
Q ss_pred CcEEEEc--CCChhHHcC-CCCCCEEEEECCeec--CCHHHHHHHHHhcCC
Q 002474 399 GLVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEI--SRLEDLISVLSKLSR 444 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v--~~l~efi~vl~~~~~ 444 (918)
.|+||+. ||+.|+.-| |...|.|++|||.++ .++|+....|-+...
T Consensus 194 pGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANsh 244 (358)
T KOG3606|consen 194 PGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSH 244 (358)
T ss_pred CceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhccc
Confidence 4899995 999999999 678999999999999 588999998888554
No 112
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=80.72 E-value=3.7 Score=49.55 Aligned_cols=45 Identities=13% Similarity=0.085 Sum_probs=40.1
Q ss_pred cEEEE--cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCC
Q 002474 400 LVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSR 444 (918)
Q Consensus 400 GV~Vs--~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~ 444 (918)
-|-|+ .++++|.++.+++||++++|||.|+.+..+..++++....
T Consensus 399 ~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~ 445 (1051)
T KOG3532|consen 399 AVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTG 445 (1051)
T ss_pred EEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhccc
Confidence 34455 4999999999999999999999999999999999999754
No 113
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=80.42 E-value=1.3 Score=52.88 Aligned_cols=37 Identities=32% Similarity=0.549 Sum_probs=32.2
Q ss_pred CCceEEeEecCCCcccc-CCCCCCEEEEECCEEecCHHH
Q 002474 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLK 334 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~ 334 (918)
.|++| ..|.|++.|+. ||+.||.|++|||+...+...
T Consensus 562 fgifV-~~V~pgskAa~~GlKRgDqilEVNgQnfenis~ 599 (1283)
T KOG3542|consen 562 FGIFV-AEVFPGSKAAREGLKRGDQILEVNGQNFENISA 599 (1283)
T ss_pred ceeEE-eeecCCchHHHhhhhhhhhhhhccccchhhhhH
Confidence 46676 79999999999 999999999999998876543
No 114
>PF00944 Peptidase_S3: Alphavirus core protein; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=77.89 E-value=1.9 Score=42.17 Aligned_cols=32 Identities=34% Similarity=0.509 Sum_probs=25.8
Q ss_pred EEEeecCCCCCCCccEEcccceEEEeccccCC
Q 002474 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSKS 222 (918)
Q Consensus 191 Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~~~ 222 (918)
..-...-.+|.||-|++|..|+||||+.++..
T Consensus 97 tip~g~g~~GDSGRpi~DNsGrVVaIVLGG~n 128 (158)
T PF00944_consen 97 TIPTGVGKPGDSGRPIFDNSGRVVAIVLGGAN 128 (158)
T ss_dssp EEETTS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred EeccCCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence 34455668899999999999999999988654
No 115
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=77.36 E-value=5.3 Score=44.79 Aligned_cols=55 Identities=15% Similarity=0.212 Sum_probs=43.3
Q ss_pred EEEEc--CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccc
Q 002474 401 VYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTD 457 (918)
Q Consensus 401 V~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~ 457 (918)
|.|+. .+..|+..| |-.||-|+.|||..+..- ++.+.++++ .|..|+|++.++..
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~~lr~ 141 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVKHLRA 141 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeHhhhc
Confidence 55552 677788887 779999999999999764 788888887 56788888877654
No 116
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=76.07 E-value=11 Score=39.37 Aligned_cols=81 Identities=22% Similarity=0.161 Sum_probs=51.7
Q ss_pred HHHHHHHHh-ccCCCeEEEEEEE---CCeEEEEE--EEeecCCCCCCCceeeecceeecccchhhhcccCCCCCcEEEEc
Q 002474 332 FLKLETLLD-DGVDKNIELLIER---GGISMTVN--LVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAE 405 (918)
Q Consensus 332 ~~~l~~~L~-~~~G~~V~l~V~R---~G~~~~~~--I~l~~~~~~t~~~~v~~~Gl~~~~ls~~~~~~~gl~~~GV~Vs~ 405 (918)
..++.+.+. ..+|+.+.++|.+ .|+..+.+ +++.+.. +...-+.-.|+.+.+.. ..+.|.+
T Consensus 62 ~~~~~~~~~~~~~g~~lrl~V~G~~~~G~~~~k~v~lpl~~~~--~g~eRL~~~GL~l~~e~-----------~~~~Vd~ 128 (183)
T PF11874_consen 62 PSELVQVAEQLPPGSSLRLRVEGPDFEGDPVTKTVLLPLGDGA--DGEERLEAAGLTLMEEG-----------GKVIVDE 128 (183)
T ss_pred HHHHHHHHhcCCCCCEEEEEEEccCCCCCceEEEEEEEcCCCC--CHHHHHHhCCCEEEeeC-----------CEEEEEe
Confidence 456666664 4678999999987 45554444 4444322 22222334466654322 3466664
Q ss_pred --CCChhHHcCCCCCCEEEEEC
Q 002474 406 --PGYMLFRAGVPRHAIIKKFA 425 (918)
Q Consensus 406 --pgspA~~AGLk~GD~I~sVN 425 (918)
.||+|+++|+.-++.|++|-
T Consensus 129 v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 129 VEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred cCCCCHHHHcCCCCCcEEEEEE
Confidence 79999999999999898773
No 117
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=74.57 E-value=2.7 Score=47.71 Aligned_cols=52 Identities=23% Similarity=0.305 Sum_probs=41.6
Q ss_pred eEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHHHHH-hccCCCeEEEEE
Q 002474 300 LVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLI 351 (918)
Q Consensus 300 lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~~~L-~~~~G~~V~l~V 351 (918)
++++.+.++-.|++ .|..||.|++|||+.+.+..+-+..- .++.|+.|.++|
T Consensus 112 IlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKraGkeV~lev 166 (506)
T KOG3551|consen 112 ILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRAGKEVLLEV 166 (506)
T ss_pred eehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHhhCceeeeee
Confidence 33489999999998 69999999999999999887766554 467788766554
No 118
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=74.12 E-value=3.5 Score=46.10 Aligned_cols=53 Identities=21% Similarity=0.354 Sum_probs=42.1
Q ss_pred eEEeEecCCCcccc-C-CCCCCEEEEECCEEecCHHHHHHH-HhccCCCeEEEEEE
Q 002474 300 LVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIE 352 (918)
Q Consensus 300 lVV~~V~p~spA~~-G-Lq~GDiIlsVNG~~I~s~~~l~~~-L~~~~G~~V~l~V~ 352 (918)
+|++.+..+-.|+. | |-.||-|++|||..|+.-.+-+.. +..++|+.|+|+|.
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtlTV~ 137 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTLTVK 137 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEEEeH
Confidence 55689999999998 5 889999999999999976554432 34667999888875
No 119
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=73.74 E-value=6.8 Score=46.50 Aligned_cols=54 Identities=30% Similarity=0.396 Sum_probs=42.5
Q ss_pred ceEEeEecCCCcccc-C-CCCCCEEEEECCEEecC--HHHHHHHHhccCCCeEEEEEEE
Q 002474 299 LLVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQ--FLKLETLLDDGVDKNIELLIER 353 (918)
Q Consensus 299 ~lVV~~V~p~spA~~-G-Lq~GDiIlsVNG~~I~s--~~~l~~~L~~~~G~~V~l~V~R 353 (918)
-++|..+..|+.+++ | |..||.|++|||..+.+ ..+++.+|.... ..+++++.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC-CcEEEEEcc
Confidence 344579999999999 5 99999999999999985 567778886655 466776653
No 120
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=73.32 E-value=12 Score=42.80 Aligned_cols=50 Identities=18% Similarity=0.300 Sum_probs=38.9
Q ss_pred CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeeccc
Q 002474 406 PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTD 457 (918)
Q Consensus 406 pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~~~ 457 (918)
+|-.|++.+ |..||.|++|||....+. |+.++++|. .|+.|.++++.+.+
T Consensus 119 kGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr--aGkeV~levKy~RE 171 (506)
T KOG3551|consen 119 KGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKR--AGKEVLLEVKYMRE 171 (506)
T ss_pred cccccccccceeeccEEEEecchhhhhcchHHHHHHHHh--hCceeeeeeeeehh
Confidence 777777776 899999999999999765 666666666 57788877776544
No 121
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=72.42 E-value=6.7 Score=42.61 Aligned_cols=50 Identities=20% Similarity=0.270 Sum_probs=40.5
Q ss_pred CCChhHHc-CCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeec
Q 002474 406 PGYMLFRA-GVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 406 pgspA~~A-GLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
+||-..+- -+..||.|.+|||+.+-.+ .+..+.|++++.|+..+|++.-.
T Consensus 158 egsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLieP 210 (334)
T KOG3938|consen 158 EGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIEP 210 (334)
T ss_pred CCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeecc
Confidence 55555444 3789999999999999776 57889999999999988887654
No 122
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=72.16 E-value=2.4 Score=48.88 Aligned_cols=30 Identities=33% Similarity=0.489 Sum_probs=25.1
Q ss_pred EEEEeecCCCCCCCccEEcccceEEEeccc
Q 002474 190 YMQAASGTKGGSSGSPVIDWQGRAVALNAG 219 (918)
Q Consensus 190 ~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~ 219 (918)
++.-.....+|+||+.|+|.+|++|||..|
T Consensus 345 y~~~~~~l~gGaSGS~V~n~~~~lvGIy~g 374 (374)
T PF01732_consen 345 YLIDNYSLGGGASGSMVINQNNELVGIYFG 374 (374)
T ss_pred hcccccCCCCCCCcCeEECCCCCEEEEeCC
Confidence 444555778999999999999999999764
No 123
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=71.22 E-value=9.5 Score=44.88 Aligned_cols=80 Identities=26% Similarity=0.518 Sum_probs=51.0
Q ss_pred EEcChHHHHHhccchhHHHHHHhcCCCCCCCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHH--HHHhccC
Q 002474 268 VHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLE--TLLDDGV 343 (918)
Q Consensus 268 ~~~~~d~~r~LGL~~e~e~~~r~~~p~~~~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~--~~L~~~~ 343 (918)
..++.+..-.||++.- -...-.+..|++| ..|.+++..+. .+++||.||.||.....++..-+ +.|.+.+
T Consensus 253 V~LnMe~vnfLGiSiv-----gqsn~rgDggIYV-gsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV 326 (626)
T KOG3571|consen 253 VTLNMETVNFLGISIV-----GQSNARGDGGIYV-GSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAV 326 (626)
T ss_pred EEecccccccceeEee-----cccCcCCCCceEE-eeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHh
Confidence 3455566666887621 1111113456676 89999998887 59999999999999887764333 3343321
Q ss_pred --CCeEEEEEEE
Q 002474 344 --DKNIELLIER 353 (918)
Q Consensus 344 --G~~V~l~V~R 353 (918)
-..++|+|-.
T Consensus 327 ~~~gPi~ltvAk 338 (626)
T KOG3571|consen 327 SRPGPIKLTVAK 338 (626)
T ss_pred ccCCCeEEEEee
Confidence 1247777765
No 124
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=70.98 E-value=7.6 Score=42.81 Aligned_cols=53 Identities=23% Similarity=0.355 Sum_probs=39.2
Q ss_pred CceEEeEecCCCcccc-C-CCCCCEEEEECCEEecCHHHH--HHHHhccCCCeEEEEEE
Q 002474 298 GLLVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQFLKL--ETLLDDGVDKNIELLIE 352 (918)
Q Consensus 298 G~lVV~~V~p~spA~~-G-Lq~GDiIlsVNG~~I~s~~~l--~~~L~~~~G~~V~l~V~ 352 (918)
-++|| .|..++||++ | ++.||.|++|||..|..-..+ ..+++... +.|++++-
T Consensus 31 ClYiV-QvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~-~eV~IhyN 87 (429)
T KOG3651|consen 31 CLYIV-QVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSL-NEVKIHYN 87 (429)
T ss_pred eEEEE-EeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhc-cceEEEeh
Confidence 45776 8999999999 4 999999999999999865544 45554433 34566553
No 125
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=69.09 E-value=11 Score=43.14 Aligned_cols=49 Identities=18% Similarity=0.188 Sum_probs=41.3
Q ss_pred CCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcCCCC--eEEEEEee
Q 002474 406 PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGA--RVPIEYSS 454 (918)
Q Consensus 406 pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~~g~--rV~l~~~~ 454 (918)
.++++..+|++.||.|+++|++++.++++..+.+....... .+.+.+.|
T Consensus 138 ~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~ 188 (375)
T COG0750 138 PKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIR 188 (375)
T ss_pred CCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEe
Confidence 78999999999999999999999999999999888865533 25666655
No 126
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=69.05 E-value=6.5 Score=44.97 Aligned_cols=45 Identities=22% Similarity=0.274 Sum_probs=39.1
Q ss_pred CCCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHHHHHhc
Q 002474 296 ETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETLLDD 341 (918)
Q Consensus 296 ~~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~~~L~~ 341 (918)
..|+.| .+|...||+.. ||.+||+|.++||-++.+.++..+-++.
T Consensus 219 g~gV~V-tev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 219 GEGVTV-TEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred CceEEE-EeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHh
Confidence 456666 69999999998 9999999999999999999998877754
No 127
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=63.55 E-value=14 Score=40.30 Aligned_cols=55 Identities=22% Similarity=0.437 Sum_probs=40.1
Q ss_pred CCCceEEeEecCCCcccc-C-CCCCCEEEEECCEEec--CHHHHHHHHhccCCCeEEEEEE
Q 002474 296 ETGLLVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVIT--QFLKLETLLDDGVDKNIELLIE 352 (918)
Q Consensus 296 ~~G~lVV~~V~p~spA~~-G-Lq~GDiIlsVNG~~I~--s~~~l~~~L~~~~G~~V~l~V~ 352 (918)
.-|+++ +...|++-|+. | |...|.|++|||.++. +..++..++..+. ..+-++|.
T Consensus 193 vpGIFI-SRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs-hNLIiTVk 251 (358)
T KOG3606|consen 193 VPGIFI-SRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS-HNLIITVK 251 (358)
T ss_pred cCceEE-EeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc-cceEEEec
Confidence 457787 79999999999 6 6789999999999996 5666666664322 23444443
No 128
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=59.52 E-value=9.9 Score=48.75 Aligned_cols=52 Identities=25% Similarity=0.413 Sum_probs=41.2
Q ss_pred cCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhc-CCCCeEEEEEeeccc
Q 002474 405 EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKL-SRGARVPIEYSSYTD 457 (918)
Q Consensus 405 ~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~-~~g~rV~l~~~~~~~ 457 (918)
++|+||..+|++.+|.|+.|||+++..+ .+.++|+-+ +.|.+|.+++.-+..
T Consensus 666 ~egsPA~~agls~~DlIthvnge~v~gl-~H~ev~~Lll~~gn~v~~~ttplen 718 (1205)
T KOG0606|consen 666 EEGSPAFEAGLSAGDLITHVNGEPVHGL-VHTEVMELLLKSGNKVTLRTTPLEN 718 (1205)
T ss_pred cCCCCccccCCCccceeEeccCcccchh-hHHHHHHHHHhcCCeeEEEeecccc
Confidence 4899999999999999999999999988 444454443 467888888766644
No 129
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=58.80 E-value=12 Score=47.03 Aligned_cols=47 Identities=15% Similarity=0.133 Sum_probs=38.0
Q ss_pred CCChhHHcCCCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeec
Q 002474 406 PGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 406 pgspA~~AGLk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
+|+|+.- .|.+||.|++|||+++.+. +..+++++..++ .|.|++.+.
T Consensus 84 ~GGps~G-KL~PGDQIl~vN~Epv~daprervIdlvRace~--sv~ltV~qP 132 (1298)
T KOG3552|consen 84 EGGPSIG-KLQPGDQILAVNGEPVKDAPRERVIDLVRACES--SVNLTVCQP 132 (1298)
T ss_pred CCCCccc-cccCCCeEEEecCcccccccHHHHHHHHHHHhh--hcceEEecc
Confidence 8888763 3999999999999999875 888899988553 677777763
No 130
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=57.58 E-value=15 Score=47.18 Aligned_cols=49 Identities=35% Similarity=0.495 Sum_probs=36.8
Q ss_pred EeEecCCCcccc-CCCCCCEEEEECCEEecCHHHHH--HHHhccCCCeEEEEE
Q 002474 302 VDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLE--TLLDDGVDKNIELLI 351 (918)
Q Consensus 302 V~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~--~~L~~~~G~~V~l~V 351 (918)
|..|.++|||.. ||++||.|+.+||+++....+-+ +.|. +.|..+.+.+
T Consensus 662 v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll-~~gn~v~~~t 713 (1205)
T KOG0606|consen 662 VGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLL-KSGNKVTLRT 713 (1205)
T ss_pred eeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHH-hcCCeeEEEe
Confidence 368999999998 99999999999999998765433 3333 3455555544
No 131
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=53.52 E-value=27 Score=44.15 Aligned_cols=54 Identities=17% Similarity=0.239 Sum_probs=41.9
Q ss_pred CcEEEEc--CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEee
Q 002474 399 GLVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSS 454 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~ 454 (918)
-|+||.+ +|++|+..| |..||.+++|||+..-.+ +...++|.. -|..|.|++..
T Consensus 960 lGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtr--tg~vV~leVaK 1018 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTR--TGNVVHLEVAK 1018 (1629)
T ss_pred cceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhc--cCCeEEEehhh
Confidence 4799985 999999998 999999999999998776 444555554 45667776653
No 132
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=52.40 E-value=35 Score=32.38 Aligned_cols=52 Identities=10% Similarity=0.103 Sum_probs=32.2
Q ss_pred EEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCCCcEEEEEECCCCccccccc
Q 002474 72 GFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYD 135 (918)
Q Consensus 72 GFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~~~l~~~~l~ 135 (918)
++=|. +|..+|+.||.+.. ..+ ++.++ +++ ...-|+++++.+...++.++++
T Consensus 3 avHIG--nG~~vt~tHva~~~-~~v-----~g~~f--~~~--~~~ge~~~v~~~~~~~p~~~ig 54 (105)
T PF03510_consen 3 AVHIG--NGRYVTVTHVAKSS-DSV-----DGQPF--KIV--KTDGELCWVQSPLVHLPAAQIG 54 (105)
T ss_pred eEEeC--CCEEEEEEEEeccC-ceE-----cCcCc--EEE--EeccCEEEEECCCCCCCeeEec
Confidence 34444 79999999999943 111 22222 222 3445999999987666665554
No 133
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=46.19 E-value=13 Score=36.74 Aligned_cols=114 Identities=19% Similarity=0.222 Sum_probs=56.5
Q ss_pred EEEEEeCCCcEEEEcCcccCCCCcEEEEEecCCeEEeEEEEEecCCCcEEEEEECCCCcccccccCCCCCCcccCCCCEE
Q 002474 71 TGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEI 150 (918)
Q Consensus 71 TGFVVd~~~G~ILTn~HVV~~~~~~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~~~l~~~~l~~l~l~~~~l~vG~~V 150 (918)
-|+.|+ |..-|-+|=-... .+.-..| +..-.+.+...|+..-...+..- .+.+..-.+ +.+
T Consensus 15 mgt~vn---GV~wT~~HGagsr----tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~---SL~pCtCg~------~dl 75 (148)
T PF02907_consen 15 MGTCVN---GVMWTVYHGAGSR----TLAGPKG---PVNQMYTSVDDDLVGWPAPPGAR---SLTPCTCGS------SDL 75 (148)
T ss_dssp EEEEET---TEEEEEHHHHTTS----EEEBTTS---EB-ESEEETTTTEEEEE-STTB-----BBB-SSSS------SEE
T ss_pred ehhEEc---cEEEEEEecCCcc----cccCCCC---cceEeEEcCCCCCcccccccccc---cCCccccCC------ccE
Confidence 466774 7999999976631 1111112 33456778888998887765321 222222111 356
Q ss_pred EEEecCCCCCceEEEeEEEeecCCCCCCCCCCccccceeEEEEeecCCCCCCCccEEcccceEEEecccc
Q 002474 151 RVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGS 220 (918)
Q Consensus 151 ~vvG~p~g~~~svt~G~Vs~~~r~~p~~~~~~~~dfn~~~Iq~da~i~~G~SGGPvvn~dG~VVGI~~~~ 220 (918)
++|-+.... ..+ .+.+. .+... ..-.-...-.|+||||++=.+|.+|||..+.
T Consensus 76 ylVtr~~~v----~p~--rr~gd--------~~~~L---~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa 128 (148)
T PF02907_consen 76 YLVTRDADV----IPV--RRRGD--------SRASL---LSPRPISDLKGSSGGPILCPSGHAVGMFRAA 128 (148)
T ss_dssp EEE-TTS-E----EEE--EEEST--------TEEEE---EEEEEHHHHTT-TT-EEEETTSEEEEEEEEE
T ss_pred EEEeccCcE----eee--EEcCC--------CceEe---cCCceeEEEecCCCCcccCCCCCEEEEEEEE
Confidence 666554321 111 11111 00000 0111123346999999999999999998653
No 134
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=44.53 E-value=47 Score=36.96 Aligned_cols=54 Identities=13% Similarity=0.191 Sum_probs=39.2
Q ss_pred cEEEEc--CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEeec
Q 002474 400 LVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYSSY 455 (918)
Q Consensus 400 GV~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~~~ 455 (918)
=+||.. .++||.+.| ++.||.|++|||..+..- -+..+.++... ..|.|.|-.+
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~--~eV~IhyNKL 89 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSL--NEVKIHYNKL 89 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhc--cceEEEehhc
Confidence 355553 789999998 999999999999999754 45556666543 3577766555
No 135
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=41.27 E-value=30 Score=41.31 Aligned_cols=51 Identities=24% Similarity=0.259 Sum_probs=40.7
Q ss_pred EEEEc--CCChhHHcC-CCCCCEEEEECCeecCCH--HHHHHHHHhcCCCCeEEEEEe
Q 002474 401 VYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPIEYS 453 (918)
Q Consensus 401 V~Vs~--pgspA~~AG-Lk~GD~I~sVNG~~v~~l--~efi~vl~~~~~g~rV~l~~~ 453 (918)
++|+. .|+++++.| |..||.|.+|||..+.+. +++.+.+++.. ..+++++.
T Consensus 148 ~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--G~itfkii 203 (542)
T KOG0609|consen 148 VVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR--GSITFKII 203 (542)
T ss_pred cEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC--CcEEEEEc
Confidence 66773 899999998 889999999999999765 78899998866 34444443
No 136
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=37.87 E-value=43 Score=35.72 Aligned_cols=56 Identities=20% Similarity=0.322 Sum_probs=44.1
Q ss_pred eeEEEEeecCCCCCCCccEEcc----cceEEEeccccCCCCCcccccch--hhHHHHHHHHH
Q 002474 188 TFYMQAASGTKGGSSGSPVIDW----QGRAVALNAGSKSSSASAFFLPL--ERVVRALRFLQ 243 (918)
Q Consensus 188 ~~~Iq~da~i~~G~SGGPvvn~----dG~VVGI~~~~~~~~~~~faIPi--~~i~~~L~~l~ 243 (918)
..-++..++...|+=|||++=. --+++||+.++..+.+.+||-++ +.+++++++|.
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANHAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccccceehhhhhHHHHHHHHHhhc
Confidence 3456778899999999998732 26899999999888889999554 67777777665
No 137
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=35.20 E-value=46 Score=38.45 Aligned_cols=44 Identities=23% Similarity=0.319 Sum_probs=35.1
Q ss_pred CcEEEEc--CCChhHHc-CCCCCCEEEEECCeecCCHHHHHHHHHhc
Q 002474 399 GLVYVAE--PGYMLFRA-GVPRHAIIKKFAGEEISRLEDLISVLSKL 442 (918)
Q Consensus 399 ~GV~Vs~--pgspA~~A-GLk~GD~I~sVNG~~v~~l~efi~vl~~~ 442 (918)
.||.|.+ ..||+... ||.+||+|+++||-|+.+.+++.+.++..
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~ts 266 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLATS 266 (484)
T ss_pred ceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHhh
Confidence 4677774 44554322 89999999999999999999999999883
No 138
>KOG0460 consensus Mitochondrial translation elongation factor Tu [Translation, ribosomal structure and biogenesis]
Probab=32.69 E-value=37 Score=38.71 Aligned_cols=36 Identities=31% Similarity=0.383 Sum_probs=28.1
Q ss_pred cceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecce
Q 002474 720 SVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPC 761 (918)
Q Consensus 720 ~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~ 761 (918)
-.|.+=++---- ||+||++-++|++++ .||+||+|-
T Consensus 269 GTVvtGrlERG~-lKkG~e~eivG~~~~-----lkttvtgie 304 (449)
T KOG0460|consen 269 GTVVTGRLERGV-LKKGDEVEIVGHNKT-----LKTTVTGIE 304 (449)
T ss_pred ceEEEEEEeecc-cccCCEEEEeccCcc-----eeeEeehHH
Confidence 355555555554 999999999999998 459999976
No 139
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=31.07 E-value=38 Score=37.07 Aligned_cols=56 Identities=20% Similarity=0.291 Sum_probs=42.8
Q ss_pred CCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHHHHH--HHHhc-cCCCeEEEEEE
Q 002474 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLE--TLLDD-GVDKNIELLIE 352 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~~l~--~~L~~-~~G~~V~l~V~ 352 (918)
.|--.+..+.++|.-+. -++.||.|-+|||+.+-.+.+.+ ++|.+ ..|++.++.+.
T Consensus 148 ~GyAFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 148 AGYAFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI 208 (334)
T ss_pred cceeeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence 44444588999999998 79999999999999999998776 45543 55666665543
No 140
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=30.57 E-value=66 Score=37.58 Aligned_cols=62 Identities=24% Similarity=0.297 Sum_probs=49.7
Q ss_pred EecCCCcccc-CCC-CCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEee
Q 002474 304 SVVPGGPAHL-RLE-PGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQ 365 (918)
Q Consensus 304 ~V~p~spA~~-GLq-~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~ 365 (918)
+|.++|||+. ||+ -+|-|+-+-.......+|+...+..+-++.+++.|+.-......++++.
T Consensus 115 ~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~ 178 (462)
T KOG3834|consen 115 SVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTIT 178 (462)
T ss_pred ecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEee
Confidence 8999999999 999 6799999955566677888888888888999999987666555555544
No 141
>KOG4407 consensus Predicted Rho GTPase-activating protein [General function prediction only]
Probab=30.08 E-value=36 Score=44.46 Aligned_cols=87 Identities=9% Similarity=0.007 Sum_probs=58.8
Q ss_pred eEecCCCcccc-CCCCCCEEEEECCEEecCHHHHHHHHhccCCCeEEEEEEECCeEEEEEEEeecCCCCCCCceeeecce
Q 002474 303 DSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGA 381 (918)
Q Consensus 303 ~~V~p~spA~~-GLq~GDiIlsVNG~~I~s~~~l~~~L~~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl 381 (918)
..+..++++.. |+-.||.|+.|||..+++...+--++.++- +++-+
T Consensus 101 ~Q~~s~~~~~nsG~~s~~~v~~itG~e~~~~TS~~~~~vk~~-eT~~~-------------------------------- 147 (1973)
T KOG4407|consen 101 PQEASSAAGSNSGSSSSVGVAGITGLEPTSPTSLPPYQVKAM-ETIFI-------------------------------- 147 (1973)
T ss_pred chhcccCcccccCcccccceeeecccccCCCccccHHHHhhh-hhhhh--------------------------------
Confidence 45666788888 999999999999998877553332221110 00000
Q ss_pred eecccchhhhcccCCCCCcEEEEcCCChhHHcCCCCCCEEEEECCeecCCHHHHHHHHHhcC
Q 002474 382 VIHPLSYQQARNFRFPCGLVYVAEPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLS 443 (918)
Q Consensus 382 ~~~~ls~~~~~~~gl~~~GV~Vs~pgspA~~AGLk~GD~I~sVNG~~v~~l~efi~vl~~~~ 443 (918)
..|- +++|+..+.|+-||.|+.||++++..+ ..-.++..++
T Consensus 148 -----------------~eV~---~n~~~~~a~LQ~~~~V~~v~~q~~A~i-~~s~~~S~~~ 188 (1973)
T KOG4407|consen 148 -----------------KEVQ---ANGPAHYANLQTGDRVLMVNNQPIAGI-AYSTIVSMIK 188 (1973)
T ss_pred -----------------hhhc---cCChhHHHhhhccceeEEeecCcccch-hhhhhhhhhc
Confidence 0122 778999999999999999999999988 4444444443
No 142
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=28.71 E-value=1.2e+02 Score=27.42 Aligned_cols=47 Identities=32% Similarity=0.628 Sum_probs=35.0
Q ss_pred eeeeeEEEEeecccceEEEEecCCCcCcccccceeecc--cCCCcCCCCCCeEEE-EEec
Q 002474 689 IEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAE--LLPEPALRRGDSVYL-VGLS 745 (918)
Q Consensus 689 ~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~--l~~~~~l~~Gd~v~~-vG~~ 745 (918)
+-|||+|+=.++..++|+|.|= + --|.++ |-++. .+.||-|.+ +||.
T Consensus 3 laiPgqI~~I~~~~~~A~Vd~g----G-----vkreV~l~Lv~~~-v~~GdyVLVHvGfA 52 (82)
T COG0298 3 LAIPGQIVEIDDNNHLAIVDVG----G-----VKREVNLDLVGEE-VKVGDYVLVHVGFA 52 (82)
T ss_pred cccccEEEEEeCCCceEEEEec----c-----EeEEEEeeeecCc-cccCCEEEEEeeEE
Confidence 3489999999998889999886 2 223333 34544 799999988 8884
No 143
>PF12857 TOBE_3: TOBE-like domain; InterPro: IPR024765 The TOBE (transport-associated OB) domain [] always occurs as a dimer and it is found in ABC transporters immediately after the ATPase domain. This entry represents a TOBE-like domain, found in the C terminus of ATPase subunit CysA. CysA is part of the CysATWP ABC transporter complex, involved in sulphate/thiosulphate import [, ].
Probab=27.19 E-value=1.5e+02 Score=24.80 Aligned_cols=49 Identities=27% Similarity=0.253 Sum_probs=38.6
Q ss_pred eeeeeEEEEeecccceEEEEecCCCcCcccccceeecccCCCc---CCCCCCeEEEE
Q 002474 689 IEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEP---ALRRGDSVYLV 742 (918)
Q Consensus 689 ~~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~---~l~~Gd~v~~v 742 (918)
--++|+|...|++-..+-|-..+..= -....++|+.+. .++.||.|+|.
T Consensus 5 ~~l~a~V~~v~~~G~~vRlEl~~~~~-----~~~iEvel~~~~~~l~l~~G~~V~l~ 56 (58)
T PF12857_consen 5 GGLPARVRRVRPVGPEVRLELKRLDD-----GEPIEVELPRERRQLGLQPGDRVYLR 56 (58)
T ss_pred CcEeEEEEEEEecCCeEEEEEEECCC-----CCEEEEEeCHhHHhcCCCCCCEEEEE
Confidence 35899999999999998888876411 357778888766 78899999974
No 144
>KOG4371 consensus Membrane-associated protein tyrosine phosphatase PTP-BAS and related proteins, contain FERM domain [Signal transduction mechanisms]
Probab=26.61 E-value=1.5e+02 Score=38.32 Aligned_cols=116 Identities=14% Similarity=0.125 Sum_probs=57.5
Q ss_pred CCCCCCEEEEECCEEecCHHHHHHH-HhccCCCeEEEEEEECCeEEEEEEEeecCCCCCCCceeeecceeecccchh-hh
Q 002474 314 RLEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQ-QA 391 (918)
Q Consensus 314 GLq~GDiIlsVNG~~I~s~~~l~~~-L~~~~G~~V~l~V~R~G~~~~~~I~l~~~~~~t~~~~v~~~Gl~~~~ls~~-~~ 391 (918)
.|+.||.++.+||..+.--.+.... .....|+.+.|-++|.-... ........ ...+..+--.-+...+...+ +.
T Consensus 1186 d~~~g~~l~~~n~i~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~-~d~~~~s~--~~~~~~l~~~~~~~~p~~~~~~~ 1262 (1332)
T KOG4371|consen 1186 DIRVGDVLLYVNGIAVEGKVHQEVVAMLRGGGDRVVLGVQRPPPAY-SDQHHASS--TSASAPLISVMLLKKPMATLGLS 1262 (1332)
T ss_pred CcchhhhhhhccceeeechhhHHHHHHHhccCceEEEEeecCCccc-ccchhhhh--hcccchhhhheeeeccccccccc
Confidence 7999999999999777654333322 23456788999999854221 11110000 00000000000001111000 00
Q ss_pred cccCCCCCcEEEEc--CCChhH-HcCCCCCCEEEEECCeecCCH
Q 002474 392 RNFRFPCGLVYVAE--PGYMLF-RAGVPRHAIIKKFAGEEISRL 432 (918)
Q Consensus 392 ~~~gl~~~GV~Vs~--pgspA~-~AGLk~GD~I~sVNG~~v~~l 432 (918)
-.-.-+..|+|+.. ..+.|. ...+++||.+.+.+|+++...
T Consensus 1263 ~~~~~~s~~~~~~~~~~~~~a~~~~~~r~g~~~~~~~~~~~~~~ 1306 (1332)
T KOG4371|consen 1263 LAKRTMSDGIFIRNIAQDSAASSEGTLRVGDRLVSLDGEPVDGF 1306 (1332)
T ss_pred ccccCcCCceeeecccccccccccccccccceeeccCCccCCCC
Confidence 00001225788763 222222 224999999999999998655
No 145
>PF14275 DUF4362: Domain of unknown function (DUF4362)
Probab=23.06 E-value=2.2e+02 Score=26.86 Aligned_cols=53 Identities=19% Similarity=0.315 Sum_probs=37.2
Q ss_pred CCCCEEEEECCeecCCHHHHHHHHHhcCCCCeEEEEEeeccccccceEEEEEEEcCC
Q 002474 416 PRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHE 472 (918)
Q Consensus 416 k~GD~I~sVNG~~v~~l~efi~vl~~~~~g~rV~l~~~~~~~~~~~k~~~ltIdRd~ 472 (918)
+.||+|.+ +-.+.|++.|.+.+.....++.=.|++.+++..+. |+..++.-++
T Consensus 1 ~~~DVi~~--~~~i~Nl~kl~~Fi~nv~~~k~d~IrIv~yT~EGd--PI~~~L~~~G 53 (98)
T PF14275_consen 1 KNNDVINK--HGEIENLDKLDQFIENVEQGKPDKIRIVQYTIEGD--PIFQDLEYDG 53 (98)
T ss_pred CCCCEEEe--CCeEEeHHHHHHHHHHHhcCCCCEEEEEEecCCCC--CEEEEEEECC
Confidence 57898888 44588998888888888777777777777765444 4555554444
No 146
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=22.93 E-value=2.6e+02 Score=24.44 Aligned_cols=43 Identities=30% Similarity=0.544 Sum_probs=29.5
Q ss_pred eeeeEEEEeecccceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEE
Q 002474 690 EIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYL 741 (918)
Q Consensus 690 ~vp~~vvflhp~~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~ 741 (918)
-+|+||+-.++-.+.|.+.|. +. ..-....|-++ ++.||-|..
T Consensus 4 ~iP~~Vv~v~~~~~~A~v~~~----G~---~~~V~~~lv~~--v~~Gd~VLV 46 (68)
T PF01455_consen 4 AIPGRVVEVDEDGGMAVVDFG----GV---RREVSLALVPD--VKVGDYVLV 46 (68)
T ss_dssp CEEEEEEEEETTTTEEEEEET----TE---EEEEEGTTCTS--B-TT-EEEE
T ss_pred cccEEEEEEeCCCCEEEEEcC----Cc---EEEEEEEEeCC--CCCCCEEEE
Confidence 489999999988999999998 31 12223344454 699998865
No 147
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=22.56 E-value=8.5e+02 Score=24.98 Aligned_cols=154 Identities=15% Similarity=0.158 Sum_probs=79.0
Q ss_pred ccccccCceEEEEEecCCccccCCcccceeEeEEEEEeccCCCcEEEEecccccCCCccEEEEeeeCceeeeeEEEEeec
Q 002474 621 AESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHP 700 (918)
Q Consensus 621 ~~~~l~~s~V~v~~~~p~~~~~d~~~~~~~~G~G~Vvd~~~~~GlV~v~r~~V~~~~~di~vtfa~~~~~vp~~vvflhp 700 (918)
+.+.+++-++.|.+ .++.+.++++=|. .-+.|+.+|- -.+..+.+......+.-++.....
T Consensus 7 ~~~~~~~N~~~v~~-----------~~g~~t~l~~gi~----~~~~lvp~H~----~~~~~i~i~g~~~~~~d~~~lv~~ 67 (172)
T PF00548_consen 7 ERSLIKKNVVPVTT-----------GKGEFTMLALGIY----DRYFLVPTHE----EPEDTIYIDGVEYKVDDSVVLVDR 67 (172)
T ss_dssp HHHHHHHHEEEEEE-----------TTEEEEEEEEEEE----BTEEEEEGGG----GGCSEEEETTEEEEEEEEEEEEET
T ss_pred HHHHHhccEEEEEe-----------CCceEEEecceEe----eeEEEEECcC----CCcEEEEECCEEEEeeeeEEEecC
Confidence 34455666666655 3577889988898 6789999992 223344444323333445555555
Q ss_pred cc---ceEEEEecCCCcCcccccceeecccCCCcCCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCcccccc
Q 002474 701 VH---NFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAM 777 (918)
Q Consensus 701 ~~---n~aiv~ydp~~~~~~~~~~v~~~~l~~~~~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~ 777 (918)
-+ +.++++.+..--= .+++.- |.+.. -+..+.+.+|=.+.-.+.+..-+.|+.-- .++... .+-+|
T Consensus 68 ~~~~~Dl~~v~l~~~~kf----rDIrk~-~~~~~-~~~~~~~l~v~~~~~~~~~~~v~~v~~~~-~i~~~g--~~~~~-- 136 (172)
T PF00548_consen 68 DGVDTDLTLVKLPRNPKF----RDIRKF-FPESI-PEYPECVLLVNSTKFPRMIVEVGFVTNFG-FINLSG--TTTPR-- 136 (172)
T ss_dssp TSSEEEEEEEEEESSS-B------GGGG-SBSSG-GTEEEEEEEEESSSSTCEEEEEEEEEEEE-EEEETT--EEEEE--
T ss_pred CCcceeEEEEEccCCccc----Cchhhh-hcccc-ccCCCcEEEEECCCCccEEEEEEEEeecC-ccccCC--CEeeE--
Confidence 44 8888888541110 122211 11221 13444444444444455666666666522 332222 22222
Q ss_pred ceeeEEEec-CcCCCCcceEECC---CccEEEEE
Q 002474 778 NMEVIELDT-DFGSTFSGVLTDE---HGRVQAIW 807 (918)
Q Consensus 778 n~e~i~~d~-~~~~~~~Gvl~d~---~G~v~alw 807 (918)
++.=+. ...+.|||+|+-+ .+++.|+=
T Consensus 137 ---~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiH 167 (172)
T PF00548_consen 137 ---SLKYKAPTKPGMCGSPLVSRIGGQGKIIGIH 167 (172)
T ss_dssp ---EEEEESEEETTGTTEEEEESCGGTTEEEEEE
T ss_pred ---EEEEccCCCCCccCCeEEEeeccCccEEEEE
Confidence 223332 3456898888653 34555543
No 148
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=22.07 E-value=52 Score=40.11 Aligned_cols=37 Identities=19% Similarity=0.427 Sum_probs=32.0
Q ss_pred CCceEEeEecCCCcccc--CCCCCCEEEEECCEEecCHH
Q 002474 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFL 333 (918)
Q Consensus 297 ~G~lVV~~V~p~spA~~--GLq~GDiIlsVNG~~I~s~~ 333 (918)
.|.-++..+.+++||+. .|.+||.++.||++.+-.|+
T Consensus 224 dg~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 224 DGPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred CCceeccccccCChHHHhhcccCccceeeecccccccch
Confidence 45555678899999998 79999999999999988884
No 149
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=21.67 E-value=7.3e+02 Score=27.00 Aligned_cols=92 Identities=18% Similarity=0.118 Sum_probs=53.2
Q ss_pred cceEEEEecCCCcCcccccceeecccCCCc-CCCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCcccccccee
Q 002474 702 HNFALIAYDPSSLGVAGASVVRAAELLPEP-ALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNME 780 (918)
Q Consensus 702 ~n~aiv~ydp~~~~~~~~~~v~~~~l~~~~-~l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e 780 (918)
+...||..+.. .. ..+..+=|+.+. .+..||.+.+-|++.+..+.+++..+++.. . .-.
T Consensus 161 ~~~mIlEl~~~-~~----~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~~~~~~~~~i~~~~-~--------------~~~ 220 (282)
T PF03761_consen 161 YSPMILELEED-FS----KNVSPPCLADSSTNWEKGDEVDVYGFNSTGKLKHRKLKITNCT-K--------------CAY 220 (282)
T ss_pred cceEEEEEccc-cc----ccCCCEEeCCCccccccCceEEEeecCCCCeEEEEEEEEEEee-c--------------cce
Confidence 34455555554 11 133334444332 267899999999988888999999999854 1 112
Q ss_pred eEEEecCcCCCC-cceE-ECCCccEEEEEeeeecc
Q 002474 781 VIELDTDFGSTF-SGVL-TDEHGRVQAIWGSFSTQ 813 (918)
Q Consensus 781 ~i~~d~~~~~~~-~Gvl-~d~~G~v~alw~s~~~~ 813 (918)
.+..+.....+- ||.| ...+|+...+-....+.
T Consensus 221 ~~~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~ 255 (282)
T PF03761_consen 221 SICTKQYSCKGDRGGPLVKNINGRWTLIGVGASGN 255 (282)
T ss_pred eEecccccCCCCccCeEEEEECCCEEEEEEEccCC
Confidence 344444444333 5655 45666655555544444
No 150
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=21.51 E-value=2.2e+02 Score=24.47 Aligned_cols=32 Identities=3% Similarity=-0.106 Sum_probs=27.7
Q ss_pred EEEEEecCCeEEeEEEEEecCCCcEEEEEECC
Q 002474 95 VAEAMFVNREEIPVYPIYRDPVHDFGFFRYDP 126 (918)
Q Consensus 95 ~i~v~f~dg~~~~a~vv~~Dp~~DlAlLkvd~ 126 (918)
.+.++.-.|+++.++|+.+|....+.+||-..
T Consensus 8 ~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 8 QVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred EEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 46677778999999999999999999998654
No 151
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=20.60 E-value=5.7e+02 Score=31.96 Aligned_cols=90 Identities=21% Similarity=0.253 Sum_probs=51.2
Q ss_pred CCCCCeEEEEEecCCCceeEEeEEEecceeecccCCCCCccccccceeeEE--EecCcCCCC-cceEE-CCCc------c
Q 002474 733 LRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNMEVIE--LDTDFGSTF-SGVLT-DEHG------R 802 (918)
Q Consensus 733 l~~Gd~v~~vG~~~~~~~~~~~t~vt~i~~~~~~~~~~~pryr~~n~e~i~--~d~~~~~~~-~Gvl~-d~~G------~ 802 (918)
++.|..|+=+|-+.+.. ++.|..+. ........+- .+ .-+|. -+..|+.+. ||-++ +.-+ -
T Consensus 587 ~~~G~~VfK~GrTTgyT----~G~lNg~k-lvyw~dG~i~-s~---efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLg 657 (695)
T PF08192_consen 587 LVPGMEVFKVGRTTGYT----TGILNGIK-LVYWADGKIQ-SS---EFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLG 657 (695)
T ss_pred cCCCCeEEEecccCCcc----ceEecceE-EEEecCCCeE-EE---EEEEecCCCccccCCCCcccEEEecccccccCce
Confidence 57799999999876654 34566654 2222222221 11 11332 245677777 77654 4322 4
Q ss_pred EEEEEeeeecceeccCCCCCCceEEeccchhhHHHHHHHHH
Q 002474 803 VQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKII 843 (918)
Q Consensus 803 v~alw~s~~~~~~~~~~~~~~~~~~~gl~~~~i~~v~~~l~ 843 (918)
|.|.--||-|+.+ + +|| .+-+-.|+++|+
T Consensus 658 vvGMlhsydge~k---------q--fgl-ftPi~~il~rl~ 686 (695)
T PF08192_consen 658 VVGMLHSYDGEQK---------Q--FGL-FTPINEILDRLE 686 (695)
T ss_pred eeEEeeecCCccc---------e--eec-cCcHHHHHHHHH
Confidence 8898889988842 3 354 344445556665
Done!