Query 007357
Match_columns 606
No_of_seqs 623 out of 3636
Neff 7.3
Searched_HMMs 46136
Date Thu Mar 28 22:29:21 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007357.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007357hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 2E-57 4.4E-62 495.7 42.3 383 109-539 45-446 (455)
2 TIGR02037 degP_htrA_DO peripla 100.0 9.7E-55 2.1E-59 474.4 43.8 389 110-540 7-421 (428)
3 PRK10942 serine endoprotease; 100.0 4.5E-54 9.7E-59 471.5 42.1 349 140-539 110-464 (473)
4 KOG1320 Serine protease [Postt 100.0 7.6E-52 1.7E-56 440.3 18.3 416 109-565 55-473 (473)
5 TIGR02038 protease_degS peripl 100.0 2.9E-46 6.3E-51 396.7 32.7 293 109-424 50-349 (351)
6 PRK10898 serine endoprotease; 100.0 7.7E-46 1.7E-50 393.2 33.4 294 109-425 50-351 (353)
7 KOG1421 Predicted signaling-as 100.0 3.1E-37 6.7E-42 329.6 27.0 382 105-536 53-457 (955)
8 COG0265 DegQ Trypsin-like seri 100.0 6.2E-35 1.3E-39 310.5 29.6 300 106-424 35-341 (347)
9 KOG1320 Serine protease [Postt 99.9 9.8E-22 2.1E-26 210.2 21.6 298 111-423 135-468 (473)
10 KOG1421 Predicted signaling-as 99.8 5.7E-18 1.2E-22 182.6 27.4 367 110-536 524-916 (955)
11 PRK10779 zinc metallopeptidase 99.7 1.3E-16 2.8E-21 175.4 14.6 149 347-539 129-278 (449)
12 PF00089 Trypsin: Trypsin; In 99.6 4.7E-14 1E-18 138.5 15.8 187 121-307 2-220 (220)
13 PF13365 Trypsin_2: Trypsin-li 99.6 4.8E-14 1E-18 125.7 14.1 108 143-281 1-120 (120)
14 TIGR00054 RIP metalloprotease 99.6 1.2E-14 2.6E-19 158.4 12.2 132 344-539 128-260 (420)
15 cd00190 Tryp_SPc Trypsin-like 99.5 7.8E-13 1.7E-17 130.8 15.8 165 122-286 3-208 (232)
16 PF13180 PDZ_2: PDZ domain; PD 99.4 5E-13 1.1E-17 112.3 9.5 81 320-421 1-82 (82)
17 smart00020 Tryp_SPc Trypsin-li 99.4 7.2E-12 1.6E-16 124.2 16.7 167 120-286 2-208 (229)
18 cd00987 PDZ_serine_protease PD 99.3 1.6E-11 3.5E-16 104.3 9.8 88 320-418 1-89 (90)
19 cd00991 PDZ_archaeal_metallopr 99.1 6.3E-10 1.4E-14 92.8 9.4 70 341-420 7-77 (79)
20 cd00986 PDZ_LON_protease PDZ d 99.1 7.3E-10 1.6E-14 92.2 9.4 71 344-424 8-78 (79)
21 cd00990 PDZ_glycyl_aminopeptid 99.0 3.3E-09 7.2E-14 88.2 10.7 77 320-422 1-78 (80)
22 TIGR01713 typeII_sec_gspC gene 99.0 2.8E-09 6E-14 108.7 11.8 100 301-421 159-259 (259)
23 cd00989 PDZ_metalloprotease PD 98.8 2.6E-08 5.6E-13 82.4 8.5 66 344-420 12-78 (79)
24 TIGR02037 degP_htrA_DO peripla 98.8 1.9E-08 4.2E-13 110.3 9.7 89 319-418 337-427 (428)
25 cd00988 PDZ_CTP_protease PDZ d 98.8 4.7E-08 1E-12 82.1 9.5 66 344-420 13-82 (85)
26 KOG3627 Trypsin [Amino acid tr 98.7 3.9E-07 8.4E-12 92.5 16.8 168 119-287 12-229 (256)
27 COG3591 V8-like Glu-specific e 98.7 7.5E-08 1.6E-12 96.5 10.6 158 143-310 66-249 (251)
28 PF13180 PDZ_2: PDZ domain; PD 98.6 5.7E-08 1.2E-12 81.5 6.4 59 481-539 13-72 (82)
29 cd00991 PDZ_archaeal_metallopr 98.6 2.6E-07 5.6E-12 77.0 8.2 60 481-540 9-69 (79)
30 KOG3209 WW domain-containing p 98.5 6.4E-07 1.4E-11 98.5 12.7 153 343-538 673-836 (984)
31 cd00987 PDZ_serine_protease PD 98.5 5.9E-07 1.3E-11 76.1 10.0 81 442-540 2-83 (90)
32 cd00136 PDZ PDZ domain, also c 98.5 2.8E-07 6.1E-12 74.4 7.5 67 321-408 2-69 (70)
33 smart00228 PDZ Domain present 98.4 1.5E-06 3.2E-11 72.5 9.0 73 320-412 12-85 (85)
34 PF12812 PDZ_1: PDZ-like domai 98.4 6.3E-07 1.4E-11 74.4 6.5 74 437-530 5-78 (78)
35 cd00989 PDZ_metalloprotease PD 98.3 1.3E-06 2.9E-11 72.1 6.8 56 484-539 14-69 (79)
36 cd00136 PDZ PDZ domain, also c 98.3 1.3E-06 2.7E-11 70.5 6.3 54 483-536 14-69 (70)
37 cd00986 PDZ_LON_protease PDZ d 98.3 1.9E-06 4.2E-11 71.5 7.4 57 482-539 8-65 (79)
38 TIGR00054 RIP metalloprotease 98.3 1.7E-06 3.6E-11 94.7 8.7 68 344-422 203-271 (420)
39 cd00988 PDZ_CTP_protease PDZ d 98.3 2.1E-06 4.5E-11 72.0 7.0 58 482-539 13-72 (85)
40 KOG3209 WW domain-containing p 98.3 5.3E-06 1.1E-10 91.6 11.2 177 348-539 782-982 (984)
41 PRK10779 zinc metallopeptidase 98.2 3.2E-06 7E-11 93.4 9.3 67 345-422 222-289 (449)
42 PF00595 PDZ: PDZ domain (Also 98.2 5.5E-06 1.2E-10 69.1 7.8 72 319-409 9-81 (81)
43 TIGR00225 prc C-terminal pepti 98.2 7.4E-06 1.6E-10 87.1 10.1 72 344-424 62-134 (334)
44 PRK10139 serine endoprotease; 98.1 6.9E-06 1.5E-10 90.7 9.2 80 442-539 268-348 (455)
45 TIGR03279 cyano_FeS_chp putati 98.1 4E-06 8.7E-11 90.3 6.9 62 347-422 1-64 (433)
46 PLN00049 carboxyl-terminal pro 98.1 2.3E-05 5.1E-10 85.0 12.0 69 344-421 102-171 (389)
47 PF00595 PDZ: PDZ domain (Also 98.0 1.1E-05 2.3E-10 67.3 6.6 55 482-537 25-81 (81)
48 TIGR02038 protease_degS peripl 98.0 1.7E-05 3.7E-10 84.9 9.2 80 442-539 256-336 (351)
49 cd00990 PDZ_glycyl_aminopeptid 98.0 1.3E-05 2.9E-10 66.4 6.6 56 482-539 12-67 (80)
50 cd00992 PDZ_signaling PDZ doma 98.0 1.9E-05 4.2E-10 65.4 7.0 54 482-536 26-81 (82)
51 PRK10942 serine endoprotease; 98.0 2.2E-05 4.8E-10 87.1 9.3 80 442-539 289-369 (473)
52 cd00992 PDZ_signaling PDZ doma 98.0 2.3E-05 5E-10 65.0 7.2 50 319-378 11-61 (82)
53 KOG3580 Tight junction protein 97.9 0.0001 2.2E-09 80.2 13.0 74 337-420 212-287 (1027)
54 PRK10898 serine endoprotease; 97.9 4.3E-05 9.3E-10 81.8 10.0 59 481-539 278-337 (353)
55 PF14685 Tricorn_PDZ: Tricorn 97.9 0.00012 2.7E-09 62.1 10.6 64 344-418 12-87 (88)
56 TIGR02860 spore_IV_B stage IV 97.9 3.4E-05 7.4E-10 82.7 8.9 67 344-421 105-180 (402)
57 PRK09681 putative type II secr 97.8 4.7E-05 1E-09 77.7 7.6 67 344-421 204-275 (276)
58 PF00863 Peptidase_C4: Peptida 97.8 0.002 4.4E-08 64.4 18.4 169 110-310 13-192 (235)
59 TIGR01713 typeII_sec_gspC gene 97.7 7E-05 1.5E-09 76.6 7.7 63 481-543 190-254 (259)
60 COG0793 Prc Periplasmic protea 97.7 0.00012 2.6E-09 79.6 9.0 81 319-420 99-182 (406)
61 COG3480 SdrC Predicted secrete 97.7 0.0001 2.2E-09 75.5 7.1 71 344-424 130-201 (342)
62 COG5640 Secreted trypsin-like 97.6 0.00035 7.6E-09 72.5 10.8 81 119-199 32-135 (413)
63 smart00228 PDZ Domain present 97.6 0.00013 2.9E-09 60.5 6.1 57 482-539 26-84 (85)
64 PF05579 Peptidase_S32: Equine 97.5 0.0009 1.9E-08 67.0 11.4 135 104-286 92-229 (297)
65 KOG3580 Tight junction protein 97.5 0.00057 1.2E-08 74.5 10.6 61 343-413 39-99 (1027)
66 PF03761 DUF316: Domain of unk 97.5 0.0025 5.5E-08 65.9 15.0 89 187-285 160-254 (282)
67 TIGR00225 prc C-terminal pepti 97.5 0.00021 4.5E-09 76.0 6.6 58 482-539 62-121 (334)
68 KOG3129 26S proteasome regulat 97.4 0.00044 9.4E-09 66.8 7.4 73 346-426 141-214 (231)
69 PF04495 GRASP55_65: GRASP55/6 97.4 0.001 2.3E-08 61.4 9.7 87 318-422 24-114 (138)
70 PLN00049 carboxyl-terminal pro 97.3 0.00035 7.6E-09 75.8 6.7 58 482-539 102-161 (389)
71 TIGR02860 spore_IV_B stage IV 97.3 0.00048 1E-08 74.0 7.4 48 492-539 123-170 (402)
72 KOG3605 Beta amyloid precursor 97.3 0.0005 1.1E-08 75.8 7.3 122 351-529 680-805 (829)
73 PRK11186 carboxy-terminal prot 97.3 0.0013 2.7E-08 75.6 10.7 83 319-420 243-332 (667)
74 TIGR03279 cyano_FeS_chp putati 97.2 0.0005 1.1E-08 74.4 6.0 57 486-545 2-60 (433)
75 COG3975 Predicted protease wit 97.0 0.0016 3.5E-08 71.0 7.4 84 322-425 439-526 (558)
76 COG3031 PulC Type II secretory 97.0 0.00096 2.1E-08 65.7 5.1 66 345-420 208-274 (275)
77 KOG3834 Golgi reassembly stack 97.0 0.0058 1.3E-07 65.0 10.9 147 342-538 13-166 (462)
78 KOG3553 Tax interaction protei 96.7 0.0021 4.6E-08 54.7 4.5 74 299-376 16-92 (124)
79 PF04495 GRASP55_65: GRASP55/6 96.7 0.003 6.5E-08 58.4 5.9 58 481-538 42-100 (138)
80 PF00548 Peptidase_C3: 3C cyst 96.7 0.042 9.1E-07 52.8 13.9 138 140-285 24-170 (172)
81 COG0793 Prc Periplasmic protea 96.6 0.0035 7.5E-08 68.4 6.7 57 482-538 112-170 (406)
82 COG0265 DegQ Trypsin-like seri 96.5 0.0056 1.2E-07 65.4 7.2 60 480-539 268-328 (347)
83 PF14685 Tricorn_PDZ: Tricorn 96.3 0.016 3.5E-07 49.3 7.1 58 482-539 12-79 (88)
84 PF12812 PDZ_1: PDZ-like domai 96.0 0.0096 2.1E-07 49.5 4.6 58 320-380 9-67 (78)
85 PRK11186 carboxy-terminal prot 95.8 0.014 3E-07 67.2 6.3 57 482-538 255-319 (667)
86 PF09342 DUF1986: Domain of un 95.6 0.15 3.1E-06 51.1 11.4 100 127-227 12-132 (267)
87 PF02122 Peptidase_S39: Peptid 95.5 0.011 2.3E-07 58.2 3.4 137 150-302 41-183 (203)
88 KOG3129 26S proteasome regulat 95.3 0.042 9E-07 53.4 6.5 63 482-544 139-205 (231)
89 PF08192 Peptidase_S64: Peptid 95.3 0.11 2.4E-06 58.6 10.7 117 187-310 542-688 (695)
90 PF10459 Peptidase_S46: Peptid 95.3 0.012 2.7E-07 67.9 3.5 18 143-160 49-67 (698)
91 PF10459 Peptidase_S46: Peptid 95.0 0.087 1.9E-06 61.1 9.3 43 187-229 199-254 (698)
92 PF00949 Peptidase_S7: Peptida 94.9 0.046 9.9E-07 50.0 5.2 31 258-288 90-120 (132)
93 KOG3550 Receptor targeting pro 94.9 0.14 3E-06 47.2 8.1 66 482-549 115-183 (207)
94 PRK09681 putative type II secr 94.7 0.05 1.1E-06 55.9 5.5 44 496-539 221-265 (276)
95 KOG3550 Receptor targeting pro 94.6 0.086 1.9E-06 48.5 6.2 37 343-379 114-152 (207)
96 COG3480 SdrC Predicted secrete 94.5 0.096 2.1E-06 54.2 6.8 57 481-538 129-186 (342)
97 KOG3553 Tax interaction protei 94.2 0.038 8.3E-07 47.3 2.7 45 481-525 58-104 (124)
98 KOG3532 Predicted protein kina 94.0 0.098 2.1E-06 58.6 6.3 57 479-535 395-451 (1051)
99 KOG3552 FERM domain protein FR 93.9 0.082 1.8E-06 61.1 5.5 57 344-410 75-131 (1298)
100 COG3975 Predicted protease wit 93.3 0.082 1.8E-06 58.1 4.1 60 437-512 433-492 (558)
101 KOG3552 FERM domain protein FR 92.4 0.18 3.9E-06 58.4 5.3 55 482-538 75-131 (1298)
102 KOG3532 Predicted protein kina 91.8 0.37 7.9E-06 54.2 6.7 38 343-380 397-435 (1051)
103 PF00944 Peptidase_S3: Alphavi 91.7 0.45 9.8E-06 43.1 6.0 28 259-286 100-127 (158)
104 KOG2921 Intramembrane metallop 91.6 0.28 6.1E-06 52.0 5.3 51 479-529 217-268 (484)
105 KOG3542 cAMP-regulated guanine 91.4 0.21 4.5E-06 56.0 4.3 36 343-378 561-597 (1283)
106 KOG3542 cAMP-regulated guanine 90.8 0.27 6E-06 55.1 4.4 57 481-539 561-619 (1283)
107 PF00947 Pico_P2A: Picornaviru 88.9 1.6 3.4E-05 39.5 7.0 45 240-285 65-109 (127)
108 KOG2921 Intramembrane metallop 88.3 0.31 6.8E-06 51.7 2.5 38 343-380 219-258 (484)
109 KOG3571 Dishevelled 3 and rela 87.4 1.4 3.1E-05 48.2 6.7 38 343-380 276-315 (626)
110 KOG3605 Beta amyloid precursor 86.9 0.63 1.4E-05 52.3 3.9 104 264-377 679-790 (829)
111 COG0750 Predicted membrane-ass 86.2 1.6 3.6E-05 46.8 6.7 58 348-416 133-195 (375)
112 PF02907 Peptidase_S29: Hepati 85.4 1.1 2.4E-05 40.8 4.0 38 263-301 106-144 (148)
113 COG0750 Predicted membrane-ass 85.4 1.8 4E-05 46.5 6.6 53 486-538 133-188 (375)
114 KOG1892 Actin filament-binding 85.1 1.2 2.5E-05 52.2 4.9 59 344-412 960-1020(1629)
115 PF02395 Peptidase_S6: Immunog 84.8 3 6.4E-05 49.2 8.3 160 143-310 67-266 (769)
116 KOG3549 Syntrophins (type gamm 83.8 0.87 1.9E-05 47.5 3.0 87 483-577 81-170 (505)
117 PF05580 Peptidase_S55: SpoIVB 80.9 1.5 3.3E-05 43.2 3.4 40 260-302 175-214 (218)
118 COG3031 PulC Type II secretory 80.9 3 6.5E-05 41.7 5.3 49 491-539 216-265 (275)
119 KOG3651 Protein kinase C, alph 76.4 2.9 6.2E-05 43.0 3.8 55 345-409 31-87 (429)
120 KOG0609 Calcium/calmodulin-dep 75.8 5.3 0.00011 44.4 5.9 56 345-410 147-204 (542)
121 KOG0609 Calcium/calmodulin-dep 75.0 3.9 8.5E-05 45.4 4.7 55 484-539 148-205 (542)
122 KOG3571 Dishevelled 3 and rela 75.0 11 0.00024 41.6 8.0 57 480-536 275-336 (626)
123 KOG3606 Cell polarity protein 73.0 6.9 0.00015 39.8 5.5 49 481-529 193-244 (358)
124 KOG3551 Syntrophins (type beta 72.4 3.6 7.8E-05 43.8 3.5 54 344-408 110-166 (506)
125 KOG3606 Cell polarity protein 71.6 4.2 9E-05 41.3 3.6 41 338-378 188-230 (358)
126 PF11874 DUF3394: Domain of un 69.9 17 0.00037 35.2 7.2 29 482-510 122-150 (183)
127 KOG0606 Microtubule-associated 69.9 4.5 9.7E-05 48.6 3.9 47 333-379 645-694 (1205)
128 KOG3549 Syntrophins (type gamm 69.7 9.2 0.0002 40.2 5.6 55 345-409 81-137 (505)
129 KOG3551 Syntrophins (type beta 69.2 4.2 9E-05 43.3 3.1 82 483-576 111-195 (506)
130 KOG3651 Protein kinase C, alph 67.5 11 0.00024 38.9 5.7 52 483-535 31-85 (429)
131 KOG0606 Microtubule-associated 67.1 7.3 0.00016 46.9 4.9 45 485-529 661-707 (1205)
132 PF01732 DUF31: Putative pepti 66.5 4.5 9.7E-05 43.8 2.9 23 261-283 351-373 (374)
133 PF05416 Peptidase_C37: Southa 64.7 36 0.00079 37.0 9.0 138 141-288 379-529 (535)
134 KOG3834 Golgi reassembly stack 61.4 14 0.0003 40.2 5.2 60 480-540 13-74 (462)
135 KOG3938 RGS-GAIP interacting p 60.5 20 0.00043 36.5 5.9 50 486-535 153-206 (334)
136 PF03510 Peptidase_C24: 2C end 59.0 74 0.0016 28.0 8.4 53 145-211 3-55 (105)
137 KOG3938 RGS-GAIP interacting p 54.7 14 0.0003 37.6 3.7 57 345-409 150-208 (334)
138 KOG1892 Actin filament-binding 34.9 62 0.0013 38.8 5.3 56 481-537 959-1017(1629)
139 cd01720 Sm_D2 The eukaryotic S 33.4 76 0.0017 26.9 4.5 36 159-195 10-45 (87)
140 KOG1738 Membrane-associated gu 27.8 57 0.0012 37.2 3.5 34 346-379 227-262 (638)
141 cd00600 Sm_like The eukaryotic 27.0 1.6E+02 0.0034 22.6 5.1 32 164-196 7-38 (63)
142 PF11874 DUF3394: Domain of un 26.5 69 0.0015 31.1 3.4 28 343-370 121-149 (183)
143 TIGR03000 plancto_dom_1 Planct 25.5 1.5E+02 0.0032 24.5 4.6 47 363-418 10-60 (75)
144 PF00571 CBS: CBS domain CBS d 25.3 61 0.0013 24.0 2.3 21 264-284 28-48 (57)
145 PF10055 DUF2292: Uncharacteri 25.2 2.2E+02 0.0048 20.3 4.9 32 517-548 3-34 (38)
146 PF12419 DUF3670: SNF2 Helicas 24.5 2.8E+02 0.0061 25.5 7.0 53 502-559 72-124 (141)
147 cd01728 LSm1 The eukaryotic Sm 21.4 2.6E+02 0.0056 22.9 5.4 32 164-196 13-44 (74)
148 cd01735 LSm12_N LSm12 belongs 21.0 3.1E+02 0.0068 21.7 5.5 34 163-197 6-39 (61)
149 PF09465 LBR_tudor: Lamin-B re 20.7 4E+02 0.0086 20.7 5.8 37 162-198 8-44 (55)
150 cd01731 archaeal_Sm1 The archa 20.5 2.3E+02 0.0049 22.5 4.9 33 164-197 11-43 (68)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=2e-57 Score=495.66 Aligned_cols=383 Identities=22% Similarity=0.348 Sum_probs=313.1
Q ss_pred chhhccCCcEEEEeeEecCC-------------CCCCccccccccceEEEEEEc--CCEEEecccccCCCCeEEEEEecC
Q 007357 109 QDAAFLNAVVKVYCTHTAPD-------------YSLPWQKQRQYTSTGSAFMIG--DGKLLTNAHCVEHYTQVKVKRRGD 173 (606)
Q Consensus 109 ~~~~~~~SVV~I~~~~~~~~-------------~~~Pw~~~~~~~~~GSGfvI~--~g~ILTnaHvV~~~~~v~V~~~~~ 173 (606)
..+++.+|||.|.+...... ...||+...+..+.||||||+ +||||||+|||.++..+.|++ .+
T Consensus 45 ~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~~-~d 123 (455)
T PRK10139 45 MLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQL-ND 123 (455)
T ss_pred HHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEEE-CC
Confidence 36888999999998654321 112454444456889999996 699999999999999999998 78
Q ss_pred CcEEEEEEEEeecCCCeEEEEecccccccCCcccccCCCC--CCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCc
Q 007357 174 DTKYVAKVLARGVDCDIALLSVESEEFWKDAEPLCLGHLP--RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSS 251 (606)
Q Consensus 174 ~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~--~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~ 251 (606)
++.++|++++.|+.+||||||++.+. .+++++|+++. .+|++|+++|||++.. .+++.|+||+..+..+....
T Consensus 124 g~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~-~tvt~GivS~~~r~~~~~~~- 198 (455)
T PRK10139 124 GREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLG-QTATSGIISALGRSGLNLEG- 198 (455)
T ss_pred CCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCC-CceEEEEEccccccccCCCC-
Confidence 99999999999999999999998643 67899999755 5699999999999866 58999999998775432211
Q ss_pred ceeEEEEccCcCCCCCCCceEcCCCeEEEEEEeeecc-cccceeeeeecccccchhhhHHhhcCcccCccccceeeeecc
Q 007357 252 ELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQVYRS-EEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLE 330 (606)
Q Consensus 252 ~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~-~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e 330 (606)
...+||+|+++++|||||||||.+|+||||+++.+.. ++..+++|+||++.++++++++.++|++. ++|||+.++.+
T Consensus 199 ~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~~~l- 276 (455)
T PRK10139 199 LENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKGTEM- 276 (455)
T ss_pred cceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEEEEC-
Confidence 2246999999999999999999999999999987643 34578999999999999999999999998 99999999999
Q ss_pred chhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEE
Q 007357 331 NPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGII 409 (606)
Q Consensus 331 ~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~ 409 (606)
++++++.+|++...|++|..|.++|||++ |||+||+|++|||++|.+|.+ +...+....+|+++.++|+
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~d----------l~~~l~~~~~g~~v~l~V~ 346 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAE----------LRSRIATTEPGTKVKLGLL 346 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHH----------HHHHHHhcCCCCEEEEEEE
Confidence 89999999998778999999999999999 999999999999999999987 4466666668999999999
Q ss_pred ECCEEEEEEEEecccccccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEe
Q 007357 410 RAGTFMKVKVVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQV 489 (606)
Q Consensus 410 R~G~~~~v~v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~V 489 (606)
|+|+.+++++++...... +.......| .+.|+.+.+. . ......+++|..|
T Consensus 347 R~G~~~~l~v~~~~~~~~-~~~~~~~~~---~~~g~~l~~~-----------------------~--~~~~~~Gv~V~~V 397 (455)
T PRK10139 347 RNGKPLEVEVTLDTSTSS-SASAEMITP---ALQGATLSDG-----------------------Q--LKDGTKGIKIDEV 397 (455)
T ss_pred ECCEEEEEEEEECCCCCc-ccccccccc---cccccEeccc-----------------------c--cccCCCceEEEEe
Confidence 999999999987543211 000000000 0123222220 0 0112357899999
Q ss_pred cccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 490 LANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 490 l~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
.++++|..+||++||+|++|||++|.+|++|.+++++.+ +.+.|.+.|+
T Consensus 398 ~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R~ 446 (455)
T PRK10139 398 VKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVRG 446 (455)
T ss_pred CCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEEC
Confidence 999999999999999999999999999999999998865 7888998886
No 2
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=9.7e-55 Score=474.44 Aligned_cols=389 Identities=28% Similarity=0.385 Sum_probs=323.8
Q ss_pred hhhccCCcEEEEeeEecCCCC------CC----cc----------ccccccceEEEEEEc-CCEEEecccccCCCCeEEE
Q 007357 110 DAAFLNAVVKVYCTHTAPDYS------LP----WQ----------KQRQYTSTGSAFMIG-DGKLLTNAHCVEHYTQVKV 168 (606)
Q Consensus 110 ~~~~~~SVV~I~~~~~~~~~~------~P----w~----------~~~~~~~~GSGfvI~-~g~ILTnaHvV~~~~~v~V 168 (606)
.+++.+|||.|.+........ .+ |. ......+.||||+|+ +|+||||+|||.++..+.|
T Consensus 7 ~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~~i~V 86 (428)
T TIGR02037 7 VEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGADEITV 86 (428)
T ss_pred HHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCCeEEE
Confidence 578899999999865332100 00 00 112245789999998 7899999999999999999
Q ss_pred EEecCCcEEEEEEEEeecCCCeEEEEecccccccCCcccccCCCC--CCCCeEEEEeecCCCCcceEEeeEEeeeeeeee
Q 007357 169 KRRGDDTKYVAKVLARGVDCDIALLSVESEEFWKDAEPLCLGHLP--RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSY 246 (606)
Q Consensus 169 ~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~--~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~ 246 (606)
++ .+++.++|++++.|+.+|||||+++... .+++++|+++. .+|++|+++|||.+.. .+++.|+|+...+...
T Consensus 87 ~~-~~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~-~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 87 TL-SDGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLG-QTVTSGIVSALGRSGL 161 (428)
T ss_pred Ee-CCCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCC-CcEEEEEEEecccCcc
Confidence 98 7899999999999999999999998752 68899998644 6799999999999865 5899999999876532
Q ss_pred cCCCcceeEEEEccCcCCCCCCCceEcCCCeEEEEEEeeecc-cccceeeeeecccccchhhhHHhhcCcccCcccccee
Q 007357 247 AHGSSELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQVYRS-EEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVL 325 (606)
Q Consensus 247 ~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~-~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~ 325 (606)
... ....++|+|+++++|||||||||.+|+||||+++.+.. ++..+++|+||++.+++++++++++|++. +||||+.
T Consensus 162 ~~~-~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~~lGi~ 239 (428)
T TIGR02037 162 GIG-DYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RGWLGVT 239 (428)
T ss_pred CCC-CccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CCcCceE
Confidence 111 12246999999999999999999999999999887643 34568999999999999999999999988 9999999
Q ss_pred eeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEE
Q 007357 326 LQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVA 404 (606)
Q Consensus 326 ~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v 404 (606)
++.+ ++.+++.+|++...|++|.+|.++|||++ |||+||+|++|||++|.++.+ +...+....+|+++
T Consensus 240 ~~~~-~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~----------~~~~l~~~~~g~~v 308 (428)
T TIGR02037 240 IQEV-TSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFAD----------LRRAIGTLKPGKKV 308 (428)
T ss_pred eecC-CHHHHHHcCCCCCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEE
Confidence 9999 88999999999888999999999999999 999999999999999999876 45667676789999
Q ss_pred EEEEEECCEEEEEEEEecccccccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEE
Q 007357 405 ELGIIRAGTFMKVKVVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMV 484 (606)
Q Consensus 405 ~l~V~R~G~~~~v~v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~V 484 (606)
+++|.|+|+.+++++++...+.. ..++...+.|+.+.++++.+.+.+. + .....++
T Consensus 309 ~l~v~R~g~~~~~~v~l~~~~~~-------~~~~~~~~lGi~~~~l~~~~~~~~~---------------l--~~~~~Gv 364 (428)
T TIGR02037 309 TLGILRKGKEKTITVTLGASPEE-------QASSSNPFLGLTVANLSPEIRKELR---------------L--KGDVKGV 364 (428)
T ss_pred EEEEEECCEEEEEEEEECcCCCc-------cccccccccceEEecCCHHHHHHcC---------------C--CcCcCce
Confidence 99999999999999988765321 1133455789999999876543211 0 0123688
Q ss_pred EEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhc-CCceEEEEEecce
Q 007357 485 ILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSC-KDKYLVFEFEDNY 540 (606)
Q Consensus 485 vIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~-~~~~v~l~v~r~~ 540 (606)
+|.+|.++++|..+||+.||+|++|||++|.++++|.+++++. .++.++|++.|+.
T Consensus 365 ~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g 421 (428)
T TIGR02037 365 VVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGG 421 (428)
T ss_pred EEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECC
Confidence 9999999999999999999999999999999999999999986 5789999998874
No 3
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=4.5e-54 Score=471.51 Aligned_cols=349 Identities=28% Similarity=0.398 Sum_probs=293.9
Q ss_pred cceEEEEEEc--CCEEEecccccCCCCeEEEEEecCCcEEEEEEEEeecCCCeEEEEecccccccCCcccccCCCC--CC
Q 007357 140 TSTGSAFMIG--DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVLARGVDCDIALLSVESEEFWKDAEPLCLGHLP--RL 215 (606)
Q Consensus 140 ~~~GSGfvI~--~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~--~l 215 (606)
.+.||||||+ +||||||+|||.++..+.|++ .++++|+|++++.|+.+||||||++... .+++++|+++. .+
T Consensus 110 ~~~GSG~ii~~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~ 185 (473)
T PRK10942 110 MALGSGVIIDADKGYVVTNNHVVDNATKIKVQL-SDGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRV 185 (473)
T ss_pred cceEEEEEEECCCCEEEeChhhcCCCCEEEEEE-CCCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCC
Confidence 5789999997 489999999999999999999 7899999999999999999999997543 57899998755 56
Q ss_pred CCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCCCCCCceEcCCCeEEEEEEeeecc-ccccee
Q 007357 216 QDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQVYRS-EEVENI 294 (606)
Q Consensus 216 G~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~-~~~~~~ 294 (606)
|++|+++|+|.+.. .+++.|+||++.+..+.... ...+||+|+++++|||||||+|.+|+||||+++.+.. ++..++
T Consensus 186 G~~V~aiG~P~g~~-~tvt~GiVs~~~r~~~~~~~-~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~ 263 (473)
T PRK10942 186 GDYTVAIGNPYGLG-ETVTSGIVSALGRSGLNVEN-YENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGI 263 (473)
T ss_pred CCEEEEEcCCCCCC-cceeEEEEEEeecccCCccc-ccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccE
Confidence 99999999999865 48999999998765322211 1246999999999999999999999999999987643 344679
Q ss_pred eeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCE
Q 007357 295 GYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDV 373 (606)
Q Consensus 295 ~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~ 373 (606)
+|+||+++++++++++.++|++. ++|||+..+.+ ++.+++.++++...|++|..|.++|||++ |||+||+|++|||+
T Consensus 264 gfaIP~~~~~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~ 341 (473)
T PRK10942 264 GFAIPSNMVKNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGK 341 (473)
T ss_pred EEEEEHHHHHHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCCCCceEEEEECCCChHHHcCCCCCCEEEEECCE
Confidence 99999999999999999999998 99999999999 78899999999889999999999999999 99999999999999
Q ss_pred EeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEecccccccccccCCCCCcceeecceEEecCChH
Q 007357 374 CVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEP 453 (606)
Q Consensus 374 ~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~ 453 (606)
+|.++.+ |...+....+|+++.++|+|+|+.+++.+++....... .++...+.|+...+++..
T Consensus 342 ~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~~-------~~~~~~~lGl~g~~l~~~ 404 (473)
T PRK10942 342 PISSFAA----------LRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQNQ-------VDSSNIFNGIEGAELSNK 404 (473)
T ss_pred ECCCHHH----------HHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCcccc-------cccccccccceeeecccc
Confidence 9999977 45667676789999999999999999999876532100 011111234433332211
Q ss_pred HHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEE
Q 007357 454 LIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLV 533 (606)
Q Consensus 454 ~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~ 533 (606)
....+++|.+|.++++|..+||++||+|++|||++|.++++|.+++++.+ +.+.
T Consensus 405 -------------------------~~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~ 458 (473)
T PRK10942 405 -------------------------GGDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLA 458 (473)
T ss_pred -------------------------cCCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEE
Confidence 01247899999999999999999999999999999999999999999854 7888
Q ss_pred EEEecc
Q 007357 534 FEFEDN 539 (606)
Q Consensus 534 l~v~r~ 539 (606)
|+++|+
T Consensus 459 l~V~R~ 464 (473)
T PRK10942 459 LNIQRG 464 (473)
T ss_pred EEEEEC
Confidence 999886
No 4
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=7.6e-52 Score=440.26 Aligned_cols=416 Identities=42% Similarity=0.601 Sum_probs=388.5
Q ss_pred chhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEcCCEEEecccccC---CCCeEEEEEecCCcEEEEEEEEee
Q 007357 109 QDAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIGDGKLLTNAHCVE---HYTQVKVKRRGDDTKYVAKVLARG 185 (606)
Q Consensus 109 ~~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~~g~ILTnaHvV~---~~~~v~V~~~~~~~~~~a~vv~~d 185 (606)
..+...+|++++++..+.+.+..||+...+..+.|+||.+....+|||+|+++ ++..+.|+.+++.++|.|++...-
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 44556789999999999999999999999999999999999999999999999 788899988888999999999999
Q ss_pred cCCCeEEEEecccccccCCcccccCCCCCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCC
Q 007357 186 VDCDIALLSVESEEFWKDAEPLCLGHLPRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPG 265 (606)
Q Consensus 186 ~~~DlAlLkv~~~~~~~~v~pl~l~~~~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G 265 (606)
.+||+|+|.++..+||..+.|+++++.+.+.+.++++| ++.+++|.|+|++++...|.++...+..+|+++++++|
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~----gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~ 210 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG----GDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG 210 (473)
T ss_pred hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc----CCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence 99999999999999999999999999999999999999 57789999999999999999998888899999999999
Q ss_pred CCCCceEcCCCeEEEEEEeeecccccceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhccccCCCCCc
Q 007357 266 NSGGPAFNDKGECIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLKVPSNEG 345 (606)
Q Consensus 266 ~SGGPl~n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lgl~~~~G 345 (606)
+||+|++...+++.|++++..+..+ +++|.||+..+.+|+......+.+.+|+.+++.+|.+++.++|+.++|...+|
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~g 288 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLETG 288 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcccc
Confidence 9999999977899999999876433 89999999999999999999999999999999999999999999999987799
Q ss_pred EEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEecccc
Q 007357 346 VLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNPRV 425 (606)
Q Consensus 346 v~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~~~ 425 (606)
+++.++.+.+.|.+-++.||.|+++||+.|. |+|+..+|+.|.++++.+.+++++.+.|+|.+ ++.+.+....
T Consensus 289 ~~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig----Vn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~---e~~~~lr~~~ 361 (473)
T KOG1320|consen 289 VLISKINQTDAAINPGNSGGPLLNLDGEVIG----VNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG---EFQISLRPVK 361 (473)
T ss_pred eeeeeecccchhhhcccCCCcEEEecCcEee----eeeeeeEEeeccccceeccCchHhhhhhhhhh---hhceeecccc
Confidence 9999999999999999999999999999998 89999999999999999999999999999998 6677888888
Q ss_pred cccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCce
Q 007357 426 HLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQ 505 (606)
Q Consensus 426 ~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~ 505 (606)
.++|.+++...|.|++++||+|.+++.+|..+++ ..|+|++++|+++++|.+|+++.||+
T Consensus 362 ~~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~--------------------~~q~v~is~Vlp~~~~~~~~~~~g~~ 421 (473)
T KOG1320|consen 362 PLVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSG--------------------VVQLVLVSQVLPGSINGGYGLKPGDQ 421 (473)
T ss_pred CcccccccCCceeEEEecceEEeecCCCcccccc--------------------ceeEEEEEEeccCCCcccccccCCCE
Confidence 8999999999999999999999999987654311 13799999999999999999999999
Q ss_pred EEeeCCeecCCHHHHHHHHHhcCCceEEEEEecceEEEEehHHHHHhHHHHHHHcCCCCC
Q 007357 506 VLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDNYLAVLEREAAVAASSCILKDYGIPSE 565 (606)
Q Consensus 506 I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~~~ivl~~~~~~~~~~~il~~~~i~~~ 565 (606)
|.+|||++|+|+.|+.++++.|..+ ++++|||++.++.++-.||.+|.||++
T Consensus 422 V~~vng~~V~n~~~l~~~i~~~~~~--------~~v~vl~~~~~e~~tl~Il~~~~~p~~ 473 (473)
T KOG1320|consen 422 VVKVNGKPVKNLKHLYELIEECSTE--------DKVAVLDRRSAEDATLEILPEHKIPSA 473 (473)
T ss_pred EEEECCEEeechHHHHHHHHhcCcC--------ceEEEEEecCccceeEEecccccCCCC
Confidence 9999999999999999999999877 899999999999999999999999974
No 5
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP, which has two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.9e-46 Score=396.72 Aligned_cols=293 Identities=24% Similarity=0.386 Sum_probs=250.4
Q ss_pred chhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc-CCEEEecccccCCCCeEEEEEecCCcEEEEEEEEeecC
Q 007357 109 QDAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG-DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVLARGVD 187 (606)
Q Consensus 109 ~~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~-~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv~~d~~ 187 (606)
..+++.+|||.|++.....+. +. .....+.||||+|+ +||||||+|||.++..+.|++ .+++.++|++++.|+.
T Consensus 50 ~~~~~~psVV~I~~~~~~~~~---~~-~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~-~dg~~~~a~vv~~d~~ 124 (351)
T TIGR02038 50 AVRRAAPAVVNIYNRSISQNS---LN-QLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVAL-QDGRKFEAELVGSDPL 124 (351)
T ss_pred HHHhcCCcEEEEEeEeccccc---cc-cccccceEEEEEEeCCeEEEecccEeCCCCEEEEEE-CCCCEEEEEEEEecCC
Confidence 367889999999987654432 21 12345789999997 789999999999999999998 7899999999999999
Q ss_pred CCeEEEEecccccccCCcccccCCC--CCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCC
Q 007357 188 CDIALLSVESEEFWKDAEPLCLGHL--PRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPG 265 (606)
Q Consensus 188 ~DlAlLkv~~~~~~~~v~pl~l~~~--~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G 265 (606)
+||||||++... +++++|+++ ..+|++|+++|||.+.. .+++.|+||.+.+..+... ....+||+|+++++|
T Consensus 125 ~DlAvlkv~~~~----~~~~~l~~s~~~~~G~~V~aiG~P~~~~-~s~t~GiIs~~~r~~~~~~-~~~~~iqtda~i~~G 198 (351)
T TIGR02038 125 TDLAVLKIEGDN----LPTIPVNLDRPPHVGDVVLAIGNPYNLG-QTITQGIISATGRNGLSSV-GRQNFIQTDAAINAG 198 (351)
T ss_pred CCEEEEEecCCC----CceEeccCcCccCCCCEEEEEeCCCCCC-CcEEEEEEEeccCcccCCC-CcceEEEECCccCCC
Confidence 999999999763 577788753 47799999999999865 5899999999887544221 123569999999999
Q ss_pred CCCCceEcCCCeEEEEEEeeecc---cccceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhccccCCC
Q 007357 266 NSGGPAFNDKGECIGVAFQVYRS---EEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLKVPS 342 (606)
Q Consensus 266 ~SGGPl~n~~G~vVGI~~~~~~~---~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lgl~~ 342 (606)
||||||||.+|+||||+++.+.. ....+++|+||++.+++++++++++|++. +||||+.++.+ ++..++.+|++.
T Consensus 199 nSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~~lgl~~ 276 (351)
T TIGR02038 199 NSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQGLGLPD 276 (351)
T ss_pred CCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHHhcCCCc
Confidence 99999999999999999876532 22368999999999999999999999988 89999999998 788888999987
Q ss_pred CCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEe
Q 007357 343 NEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVL 421 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l 421 (606)
..|++|.+|.++|||++ ||++||+|++|||++|.++.+ |...+....+|+++.++|.|+|+.+++++++
T Consensus 277 ~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 346 (351)
T TIGR02038 277 LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE----------LMDRIAETRPGSKVMVTVLRQGKQLELPVTI 346 (351)
T ss_pred cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 78999999999999999 999999999999999999877 4566766678999999999999999999988
Q ss_pred ccc
Q 007357 422 NPR 424 (606)
Q Consensus 422 ~~~ 424 (606)
..+
T Consensus 347 ~~~ 349 (351)
T TIGR02038 347 DEK 349 (351)
T ss_pred cCC
Confidence 654
No 6
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=7.7e-46 Score=393.25 Aligned_cols=294 Identities=22% Similarity=0.328 Sum_probs=248.3
Q ss_pred chhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc-CCEEEecccccCCCCeEEEEEecCCcEEEEEEEEeecC
Q 007357 109 QDAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG-DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVLARGVD 187 (606)
Q Consensus 109 ~~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~-~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv~~d~~ 187 (606)
..+++.+|||.|++...... +.......+.||||+|+ +|+||||+|||.++..+.|++ .+++.++|++++.|+.
T Consensus 50 ~~~~~~psvV~v~~~~~~~~----~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~d~~ 124 (353)
T PRK10898 50 AVRRAAPAVVNVYNRSLNST----SHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVAL-QDGRVFEALLVGSDSL 124 (353)
T ss_pred HHHHhCCcEEEEEeEecccc----CcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEe-CCCCEEEEEEEEEcCC
Confidence 46888999999999765432 12223345789999997 789999999999999999998 7889999999999999
Q ss_pred CCeEEEEecccccccCCcccccCCC--CCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCC
Q 007357 188 CDIALLSVESEEFWKDAEPLCLGHL--PRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPG 265 (606)
Q Consensus 188 ~DlAlLkv~~~~~~~~v~pl~l~~~--~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G 265 (606)
+||||||++.. .+++++|+++ ..+|+.|+++|||.+.. .+++.|+|+...+..+.... ...+||+|+++++|
T Consensus 125 ~DlAvl~v~~~----~l~~~~l~~~~~~~~G~~V~aiG~P~g~~-~~~t~Giis~~~r~~~~~~~-~~~~iqtda~i~~G 198 (353)
T PRK10898 125 TDLAVLKINAT----NLPVIPINPKRVPHIGDVVLAIGNPYNLG-QTITQGIISATGRIGLSPTG-RQNFLQTDASINHG 198 (353)
T ss_pred CCEEEEEEcCC----CCCeeeccCcCcCCCCCEEEEEeCCCCcC-CCcceeEEEeccccccCCcc-ccceEEeccccCCC
Confidence 99999999875 4677788754 46799999999999865 47999999988765432221 12469999999999
Q ss_pred CCCCceEcCCCeEEEEEEeeeccc----ccceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhccccCC
Q 007357 266 NSGGPAFNDKGECIGVAFQVYRSE----EVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLKVP 341 (606)
Q Consensus 266 ~SGGPl~n~~G~vVGI~~~~~~~~----~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lgl~ 341 (606)
||||||+|.+|+||||+++.+... ...+++|+||++.+++++++++++|++. ++|||+..+.+ ++..+..++++
T Consensus 199 nSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~~~~~~ 276 (353)
T PRK10898 199 NSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQGGGID 276 (353)
T ss_pred CCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHHhcCCC
Confidence 999999999999999999876422 1357999999999999999999999998 89999999988 56666777887
Q ss_pred CCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEE
Q 007357 342 SNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVV 420 (606)
Q Consensus 342 ~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~ 420 (606)
...|++|.+|.++|||++ ||++||+|++|||++|.++.+ +...+....+|+++.|+|+|+|+.+++.++
T Consensus 277 ~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~----------l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 346 (353)
T PRK10898 277 QLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALE----------TMDQVAEIRPGSVIPVVVMRDDKQLTLQVT 346 (353)
T ss_pred CCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 778999999999999999 999999999999999999876 456666667899999999999999999998
Q ss_pred ecccc
Q 007357 421 LNPRV 425 (606)
Q Consensus 421 l~~~~ 425 (606)
+..++
T Consensus 347 l~~~p 351 (353)
T PRK10898 347 IQEYP 351 (353)
T ss_pred eccCC
Confidence 86553
No 7
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=3.1e-37 Score=329.57 Aligned_cols=382 Identities=18% Similarity=0.245 Sum_probs=315.2
Q ss_pred ccccchhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc--CCEEEecccccCCCCeEEEEEecCCcEEEEEEE
Q 007357 105 SGNLQDAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG--DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVL 182 (606)
Q Consensus 105 ~~~~~~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~--~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv 182 (606)
.|+-....+.+|||.|.+..... |.......+.|+||+++ .|+||||+|+|.....+--..+.+..+.+...+
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~~-----fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pv 127 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVRA-----FDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPV 127 (955)
T ss_pred hhhhhhhhhcccEEEEEehheee-----cccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCcccc
Confidence 44455899999999999988765 44444567889999998 799999999999887765555588889999999
Q ss_pred EeecCCCeEEEEeccccc-ccCCcccccC-CCCCCCCeEEEEeecCCCCcceEEeeEEeeeeeee--ecC-CCccee--E
Q 007357 183 ARGVDCDIALLSVESEEF-WKDAEPLCLG-HLPRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTS--YAH-GSSELL--G 255 (606)
Q Consensus 183 ~~d~~~DlAlLkv~~~~~-~~~v~pl~l~-~~~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~--~~~-~~~~~~--~ 255 (606)
+.||.+|+.++|.++... +..+..++|+ +..++|.++.++|+..+ +.+++..|.++++++.. |.. .+.+++ +
T Consensus 128 yrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDag-EklsIlagflSrldr~apdyg~~~yndfnTfy 206 (955)
T KOG1421|consen 128 YRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAG-EKLSILAGFLSRLDRNAPDYGEDTYNDFNTFY 206 (955)
T ss_pred cCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCcc-ceEEeehhhhhhccCCCcccccccccccccee
Confidence 999999999999998754 4567788887 46688999999999876 66899999999999765 222 233333 6
Q ss_pred EEEccCcCCCCCCCceEcCCCeEEEEEEeeecccccceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhh
Q 007357 256 IQIDAAINPGNSGGPAFNDKGECIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALR 335 (606)
Q Consensus 256 iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~ 335 (606)
+|..+....|.||+||++.+|..|.++..+.. ....+|++|++.+.+.|..++.+..++ ++.|.++|-.. .-+.+
T Consensus 207 ~QaasstsggssgspVv~i~gyAVAl~agg~~---ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~ 281 (955)
T KOG1421|consen 207 IQAASSTSGGSSGSPVVDIPGYAVALNAGGSI---SSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDEC 281 (955)
T ss_pred eeehhcCCCCCCCCceecccceEEeeecCCcc---cccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHH
Confidence 88888889999999999999999999976543 345799999999999999999888887 99999998876 56778
Q ss_pred ccccCC------------CCCcEEEE-EeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCC
Q 007357 336 TCLKVP------------SNEGVLVR-RVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGD 402 (606)
Q Consensus 336 ~~lgl~------------~~~Gv~V~-~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~ 402 (606)
+.+||+ ..+|++|. .|.++|||++.|++||++++||+.-+.++.. +..++.. ..|+
T Consensus 282 rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~----------l~~iLDe-gvgk 350 (955)
T KOG1421|consen 282 RRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEA----------LEQILDE-GVGK 350 (955)
T ss_pred HhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHH----------HHHHHhh-ccCc
Confidence 888884 35788865 5999999999999999999999888877644 3445544 4899
Q ss_pred EEEEEEEECCEEEEEEEEecccccccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCc
Q 007357 403 VAELGIIRAGTFMKVKVVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQ 482 (606)
Q Consensus 403 ~v~l~V~R~G~~~~v~v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~ 482 (606)
.+.|+|+|+|++.+++++.+..+...| .+|+.+.|.+|+++++++++.+.-. .+
T Consensus 351 ~l~LtI~Rggqelel~vtvqdlh~itp-------~R~levcGav~hdlsyq~ar~y~lP-------------------~~ 404 (955)
T KOG1421|consen 351 NLELTIQRGGQELELTVTVQDLHGITP-------DRFLEVCGAVFHDLSYQLARLYALP-------------------VE 404 (955)
T ss_pred eEEEEEEeCCEEEEEEEEeccccCCCC-------ceEEEEcceEecCCCHHHHhhcccc-------------------cC
Confidence 999999999999999999999887666 6789999999999999987754322 22
Q ss_pred EEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCC-ceEEEEE
Q 007357 483 MVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKD-KYLVFEF 536 (606)
Q Consensus 483 ~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~-~~v~l~v 536 (606)
+|++++-- ++++.+++.. +.+|.+|||+++.+++.|++++++.++ +++.+.+
T Consensus 405 GvyVa~~~-gsf~~~~~~y-~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~vry 457 (955)
T KOG1421|consen 405 GVYVASPG-GSFRHRGPRY-GQIIDSVANKPTPDLDAFIDVLKELPDGARVPVRY 457 (955)
T ss_pred cEEEccCC-CCccccCCcc-eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeEEE
Confidence 66777766 8999999987 999999999999999999999998764 3555544
No 8
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6.2e-35 Score=310.52 Aligned_cols=300 Identities=29% Similarity=0.416 Sum_probs=250.6
Q ss_pred cccchhhccCCcEEEEeeEecCC-CCCCcccccc-ccceEEEEEEc-CCEEEecccccCCCCeEEEEEecCCcEEEEEEE
Q 007357 106 GNLQDAAFLNAVVKVYCTHTAPD-YSLPWQKQRQ-YTSTGSAFMIG-DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVL 182 (606)
Q Consensus 106 ~~~~~~~~~~SVV~I~~~~~~~~-~~~Pw~~~~~-~~~~GSGfvI~-~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv 182 (606)
+.-..+++.++||.|........ ..+|-..... ..+.||||+++ +++|+||.|++.++..+.|.+ .++++++++++
T Consensus 35 ~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l-~dg~~~~a~~v 113 (347)
T COG0265 35 FATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTL-ADGREVPAKLV 113 (347)
T ss_pred HHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEe-CCCCEEEEEEE
Confidence 33446778999999999876553 0001000011 14789999998 999999999999999999999 89999999999
Q ss_pred EeecCCCeEEEEecccccccCCcccccCCCC--CCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEcc
Q 007357 183 ARGVDCDIALLSVESEEFWKDAEPLCLGHLP--RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDA 260 (606)
Q Consensus 183 ~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~--~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da 260 (606)
+.|+..|||+|+++... .++.+.++++. .+|+.+.++|+|.+.. .+++.|+|+.+.+..+........+||+|+
T Consensus 114 g~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~-~tvt~Givs~~~r~~v~~~~~~~~~IqtdA 189 (347)
T COG0265 114 GKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLG-QTVTSGIVSALGRTGVGSAGGYVNFIQTDA 189 (347)
T ss_pred ecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCcc-cceeccEEeccccccccCcccccchhhccc
Confidence 99999999999999764 26677787655 4599999999999955 589999999998863332222445799999
Q ss_pred CcCCCCCCCceEcCCCeEEEEEEeeecccc-cceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhcccc
Q 007357 261 AINPGNSGGPAFNDKGECIGVAFQVYRSEE-VENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLK 339 (606)
Q Consensus 261 ~i~~G~SGGPl~n~~G~vVGI~~~~~~~~~-~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lg 339 (606)
++|+||||||++|.+|++|||.++.+.... ..+++|+||.+.+..++.++..+|++. ++++|+..+.+ ++.+. +|
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~--~g 265 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA--LG 265 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--cC
Confidence 999999999999999999999999875433 456999999999999999999988887 99999999988 56555 88
Q ss_pred CCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEE
Q 007357 340 VPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVK 418 (606)
Q Consensus 340 l~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~ 418 (606)
++...|++|..|.+++||++ |++.||+|+++||+++.+..+ +...+....+|+.+.+++.|+|+++++.
T Consensus 266 ~~~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~----------l~~~v~~~~~g~~v~~~~~r~g~~~~~~ 335 (347)
T COG0265 266 LPVAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSD----------LVAAVASNRPGDEVALKLLRGGKERELA 335 (347)
T ss_pred CCCCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHH----------HHHHHhccCCCCEEEEEEEECCEEEEEE
Confidence 88888999999999999999 999999999999999999866 4566666668999999999999999999
Q ss_pred EEeccc
Q 007357 419 VVLNPR 424 (606)
Q Consensus 419 v~l~~~ 424 (606)
+++...
T Consensus 336 v~l~~~ 341 (347)
T COG0265 336 VTLGDR 341 (347)
T ss_pred EEecCc
Confidence 998763
No 9
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.89 E-value=9.8e-22 Score=210.24 Aligned_cols=298 Identities=25% Similarity=0.248 Sum_probs=218.8
Q ss_pred hhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc-CCEEEecccccCCCC-----------eEEEEEe-cCCcEE
Q 007357 111 AAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG-DGKLLTNAHCVEHYT-----------QVKVKRR-GDDTKY 177 (606)
Q Consensus 111 ~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~-~g~ILTnaHvV~~~~-----------~v~V~~~-~~~~~~ 177 (606)
++-..|||.|....- +....|+.........|||||++ +++|+||+||+.... .+.+... +.+..+
T Consensus 135 ~~cd~Avv~Ie~~~f-~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~~s~ 213 (473)
T KOG1320|consen 135 EECDLAVVYIESEEF-WKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPGNSG 213 (473)
T ss_pred hcccceEEEEeeccc-cCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCCccC
Confidence 444577888876332 22223555544556789999998 999999999986432 2555552 234788
Q ss_pred EEEEEEeecCCCeEEEEecccccccCCcccccCCCC--CCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCc----
Q 007357 178 VAKVLARGVDCDIALLSVESEEFWKDAEPLCLGHLP--RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSS---- 251 (606)
Q Consensus 178 ~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~--~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~---- 251 (606)
.+.+...|+..|+|+++++.++ .-..+++++-+. ..|+++..+|.|++..+ ..+.|+++...|..+..+..
T Consensus 214 ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~lg~~~g~~ 290 (473)
T KOG1320|consen 214 EPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFKLGLETGVL 290 (473)
T ss_pred CCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccccCccccee
Confidence 9999999999999999997654 235666676433 55999999999999875 79999999988877654333
Q ss_pred ceeEEEEccCcCCCCCCCceEcCCCeEEEEEEeeecc-cccceeeeeecccccchhhhHHhhcC---cc-----cCcccc
Q 007357 252 ELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQVYRS-EEVENIGYVIPTTVVSHFLSDYERNG---KY-----TGFPCL 322 (606)
Q Consensus 252 ~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~-~~~~~~~~~IP~~~i~~~L~~l~~~g---~~-----~g~~~L 322 (606)
...++|+|++++.|+||||+++.+|++||+.++.... +=...++|++|.+.++.++.+..+.. +. ....++
T Consensus 291 i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~~~~ 370 (473)
T KOG1320|consen 291 ISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVHQYI 370 (473)
T ss_pred eeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccccccC
Confidence 3357999999999999999999999999998876421 11356899999999988888763222 11 112356
Q ss_pred ceeeeeccchhhh-----ccccCCC--CCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHH
Q 007357 323 GVLLQKLENPALR-----TCLKVPS--NEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYL 394 (606)
Q Consensus 323 Gi~~q~~e~~~l~-----~~lgl~~--~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~ 394 (606)
|+..-.+ ++.+. +.+-.+. ..+++|.+|.|++++.. ++++||+|++|||++|.+..+ +..+
T Consensus 371 g~~s~~i-~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~----------l~~~ 439 (473)
T KOG1320|consen 371 GLPSYYI-FAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKH----------LYEL 439 (473)
T ss_pred CceeEEE-ecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHH----------HHHH
Confidence 6554444 22221 1111121 24899999999999999 999999999999999999987 4578
Q ss_pred HhhcCCCCEEEEEEEECCEEEEEEEEecc
Q 007357 395 ISQKFAGDVAELGIIRAGTFMKVKVVLNP 423 (606)
Q Consensus 395 l~~~~~g~~v~l~V~R~G~~~~v~v~l~~ 423 (606)
+.....++++.+...|..|..++.+....
T Consensus 440 i~~~~~~~~v~vl~~~~~e~~tl~Il~~~ 468 (473)
T KOG1320|consen 440 IEECSTEDKVAVLDRRSAEDATLEILPEH 468 (473)
T ss_pred HHhcCcCceEEEEEecCccceeEEecccc
Confidence 88877788898888888888888876543
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.82 E-value=5.7e-18 Score=182.63 Aligned_cols=367 Identities=17% Similarity=0.181 Sum_probs=266.2
Q ss_pred hhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc--CCEEEecccccC-CCCeEEEEEecCCcEEEEEEEEeec
Q 007357 110 DAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG--DGKLLTNAHCVE-HYTQVKVKRRGDDTKYVAKVLARGV 186 (606)
Q Consensus 110 ~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~--~g~ILTnaHvV~-~~~~v~V~~~~~~~~~~a~vv~~d~ 186 (606)
.+++..+.|.+.+.+..+-.+.- ...-.|||.|++ .|++++.+.+|. ++...+|+. .+...++|.+.+.++
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g~s-----~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~-~dS~~i~a~~~fL~~ 597 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDGVS-----SDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTE-ADSDGIPANVSFLHP 597 (955)
T ss_pred hhHHhhhhhhheeceeeccccch-----hhhhcCceEEEEccCCceeEecccCCchhhceEEee-cccccccceeeEecC
Confidence 56777888888887765543221 123459999997 899999999996 567888888 677789999999999
Q ss_pred CCCeEEEEecccccccCCcccccCCCC-CCCCeEEEEeecCCCCcc----eEEeeEEeeeeeeee-cCCCcceeEEEEcc
Q 007357 187 DCDIALLSVESEEFWKDAEPLCLGHLP-RLQDAVTVVGYPLGGDTI----SVTKGVVSRIEVTSY-AHGSSELLGIQIDA 260 (606)
Q Consensus 187 ~~DlAlLkv~~~~~~~~v~pl~l~~~~-~lG~~V~viG~p~g~~~~----svt~GvVs~~~~~~~-~~~~~~~~~iq~da 260 (606)
..++|.+|.++.- ...+.|.+.. .-|++|...|+.....-+ +++.-.+.-+..... ......+..|.+++
T Consensus 598 t~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~ 673 (955)
T KOG1421|consen 598 TENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMD 673 (955)
T ss_pred ccceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEec
Confidence 9999999999763 2445565433 459999999987553321 222211111111111 11234456677777
Q ss_pred CcCCCCCCCceEcCCCeEEEEEEeeecc---cccceeeeeecccccchhhhHHhhcCcccCccccceeeeeccchhhhcc
Q 007357 261 AINPGNSGGPAFNDKGECIGVAFQVYRS---EEVENIGYVIPTTVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTC 337 (606)
Q Consensus 261 ~i~~G~SGGPl~n~~G~vVGI~~~~~~~---~~~~~~~~~IP~~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~ 337 (606)
.+..++--|-+.+.+|+|+|++-..+.. +..--.-|.+.+.+++..|+.++..+... .-.+|++|..+ +-+-++.
T Consensus 674 nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~ 751 (955)
T KOG1421|consen 674 NLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQART 751 (955)
T ss_pred cccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhc
Confidence 7666655667888899999998765521 11223456788899999999998655554 44579999988 6666788
Q ss_pred ccCCCC-------------CcEEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEE
Q 007357 338 LKVPSN-------------EGVLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVA 404 (606)
Q Consensus 338 lgl~~~-------------~Gv~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v 404 (606)
+|++.+ .=.+|.+|.+.-+ +-|..||+|+++||+.|+...++ .+ + ..+
T Consensus 752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~--kil~~gdiilsvngk~itr~~dl----------~d-~------~ei 812 (955)
T KOG1421|consen 752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLH--KILGVGDIILSVNGKMITRLSDL----------HD-F------EEI 812 (955)
T ss_pred cCCCHHHHhhhhhcCCCcceEEEEEeeccCcc--cccccccEEEEecCeEEeeehhh----------hh-h------hhh
Confidence 888641 1345678877533 36999999999999999988774 22 1 156
Q ss_pred EEEEEECCEEEEEEEEecccccccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEE
Q 007357 405 ELGIIRAGTFMKVKVVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMV 484 (606)
Q Consensus 405 ~l~V~R~G~~~~v~v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~V 484 (606)
..+|+|+|.++++++++-+.. ...++.+|.|..+++.+...+++-. .-.++|
T Consensus 813 d~~ilrdg~~~~ikipt~p~~---------et~r~vi~~gailq~ph~av~~q~e-------------------dlp~gv 864 (955)
T KOG1421|consen 813 DAVILRDGIEMEIKIPTYPEY---------ETSRAVIWMGAILQPPHSAVFEQVE-------------------DLPEGV 864 (955)
T ss_pred heeeeecCcEEEEEecccccc---------ccceEEEEEeccccCchHHHHHHHh-------------------ccCCce
Confidence 789999999999998865432 3478899999999998877655310 113688
Q ss_pred EEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCc-eEEEEE
Q 007357 485 ILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDK-YLVFEF 536 (606)
Q Consensus 485 vIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~-~v~l~v 536 (606)
++....-+|||.. ++.+--.|++|||..+.++++|..++.+.++. |+++..
T Consensus 865 yvt~rg~gspalq-~l~aa~fitavng~~t~~lddf~~~~~~ipdnsyv~v~~ 916 (955)
T KOG1421|consen 865 YVTSRGYGSPALQ-MLRAAHFITAVNGHDTNTLDDFYHMLLEIPDNSYVQVKQ 916 (955)
T ss_pred EEeecccCChhHh-hcchheeEEEecccccCcHHHHHHHHhhCCCCceEEEEE
Confidence 9999999999999 99999999999999999999999999987755 777654
No 11
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.70 E-value=1.3e-16 Score=175.36 Aligned_cols=149 Identities=17% Similarity=0.239 Sum_probs=117.3
Q ss_pred EEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEecccc
Q 007357 347 LVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNPRV 425 (606)
Q Consensus 347 ~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~~~ 425 (606)
+|.+|.++|||++ |||+||+|++|||++|.+|.++ ...+....+|++++++|+|+|+++++++++...+
T Consensus 129 lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l----------~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~~ 198 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAV----------RLALVSKIGDESTTITVAPFGSDQRRDKTLDLRH 198 (449)
T ss_pred cccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHH----------HHHHHhhccCCceEEEEEeCCccceEEEEecccc
Confidence 6899999999999 9999999999999999999874 4566677788999999999999998888885432
Q ss_pred cccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCce
Q 007357 426 HLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQ 505 (606)
Q Consensus 426 ~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~ 505 (606)
... ......-....|+ .++++ +..++|.+|.++|+|..+|+++||+
T Consensus 199 ~~~----~~~~~~~~~~lGl--~~~~~----------------------------~~~~vV~~V~~~SpA~~AGL~~GDv 244 (449)
T PRK10779 199 WAF----EPDKQDPVSSLGI--RPRGP----------------------------QIEPVLAEVQPNSAASKAGLQAGDR 244 (449)
T ss_pred ccc----Cccccchhhcccc--cccCC----------------------------CcCcEEEeeCCCCHHHHcCCCCCCE
Confidence 110 0000000011222 22221 1235899999999999999999999
Q ss_pred EEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 506 VLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 506 I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
|++|||++|.+|+++.++++..+++.+.+++.|+
T Consensus 245 Il~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~ 278 (449)
T PRK10779 245 IVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQ 278 (449)
T ss_pred EEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEEC
Confidence 9999999999999999999988888999999886
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.57 E-value=4.7e-14 Score=138.54 Aligned_cols=187 Identities=21% Similarity=0.284 Sum_probs=124.8
Q ss_pred EeeEecCCCCCCccccccc---cceEEEEEEcCCEEEecccccCCCCeEEEEEec------CC--cEEEEEEEEee----
Q 007357 121 YCTHTAPDYSLPWQKQRQY---TSTGSAFMIGDGKLLTNAHCVEHYTQVKVKRRG------DD--TKYVAKVLARG---- 185 (606)
Q Consensus 121 ~~~~~~~~~~~Pw~~~~~~---~~~GSGfvI~~g~ILTnaHvV~~~~~v~V~~~~------~~--~~~~a~vv~~d---- 185 (606)
.++.....+.+||+..... ...|+|++|++.+|||+|||+.....+.+.+.. .+ ..+..+-+..+
T Consensus 2 ~~g~~~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~ 81 (220)
T PF00089_consen 2 VGGDPASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYD 81 (220)
T ss_dssp BSSEECGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSB
T ss_pred CCCEECCCCCCCeEEEEeeCCCCeeEeEEecccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 4566777888999887654 346999999999999999999996677775521 12 23444333332
Q ss_pred c---CCCeEEEEeccc-ccccCCcccccCCC---CCCCCeEEEEeecCCCCcc---eEEeeEEeeee---eeeecCCCcc
Q 007357 186 V---DCDIALLSVESE-EFWKDAEPLCLGHL---PRLQDAVTVVGYPLGGDTI---SVTKGVVSRIE---VTSYAHGSSE 252 (606)
Q Consensus 186 ~---~~DlAlLkv~~~-~~~~~v~pl~l~~~---~~lG~~V~viG~p~g~~~~---svt~GvVs~~~---~~~~~~~~~~ 252 (606)
+ .+|||||+++.+ .+...+.|+.+... ...|+.+.++||+...... .+....+.-+. +.........
T Consensus 82 ~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~ 161 (220)
T PF00089_consen 82 PSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLT 161 (220)
T ss_dssp TTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTST
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 2 479999999987 55678889999863 2579999999998753221 33333333322 2221111111
Q ss_pred eeEEEEcc----CcCCCCCCCceEcCCCeEEEEEEeeecccccceeeeeecccccchhh
Q 007357 253 LLGIQIDA----AINPGNSGGPAFNDKGECIGVAFQVYRSEEVENIGYVIPTTVVSHFL 307 (606)
Q Consensus 253 ~~~iq~da----~i~~G~SGGPl~n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L 307 (606)
...+++.. ..+.|+|||||++.++.++||++.+...+.....++++++...++|+
T Consensus 162 ~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 162 PNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 22466665 78999999999997666999999874333333357788887666654
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.57 E-value=4.8e-14 Score=125.66 Aligned_cols=108 Identities=38% Similarity=0.554 Sum_probs=71.8
Q ss_pred EEEEEEcCC-EEEecccccC--------CCCeEEEEEecCCcEEE--EEEEEeecC-CCeEEEEecccccccCCcccccC
Q 007357 143 GSAFMIGDG-KLLTNAHCVE--------HYTQVKVKRRGDDTKYV--AKVLARGVD-CDIALLSVESEEFWKDAEPLCLG 210 (606)
Q Consensus 143 GSGfvI~~g-~ILTnaHvV~--------~~~~v~V~~~~~~~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~v~pl~l~ 210 (606)
||||+|++. +||||+|||. ....+.+.. .++..+. +++++.++. .|+|||+++
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~D~All~v~-------------- 65 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVF-PDGRRVPPVAEVVYFDPDDYDLALLKVD-------------- 65 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEE-TTSCEEETEEEEEEEETT-TTEEEEEES--------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEe-cCCCEEeeeEEEEEECCccccEEEEEEe--------------
Confidence 799999855 9999999998 456677777 5666677 999999999 999999998
Q ss_pred CCCCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCCCCCCceEcCCCeEEEE
Q 007357 211 HLPRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFNDKGECIGV 281 (606)
Q Consensus 211 ~~~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~vVGI 281 (606)
.....+.. ....+.......... .......+ +++.+.+|+|||||||.+|+||||
T Consensus 66 -------~~~~~~~~------~~~~~~~~~~~~~~~--~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 66 -------PWTGVGGG------VRVPGSTSGVSPTST--NDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -------CEEEEEEE------EEEEEEEEEEEEEEE--EETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -------cccceeee------eEeeeeccccccccC--cccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00011100 000001111100000 00111124 799999999999999999999997
No 14
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.57 E-value=1.2e-14 Score=158.39 Aligned_cols=132 Identities=17% Similarity=0.176 Sum_probs=108.7
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEec
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLN 422 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~ 422 (606)
.|.+|.+|.++|||++ |||+||+|++|||+++.++.++ ...+.... +++.+++.|+|+..++.+++.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl----------~~~ia~~~--~~v~~~I~r~g~~~~l~v~l~ 195 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDV----------RQQIADIA--GEPMVEILAERENWTFEVMKE 195 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHH----------HHHHHhhc--ccceEEEEEecCceEeccccc
Confidence 5889999999999999 9999999999999999999874 34444443 678999999988765433321
Q ss_pred ccccccccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCC
Q 007357 423 PRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMS 502 (606)
Q Consensus 423 ~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~ 502 (606)
+.+..+ +.+++|.+|.+++||..+|+++
T Consensus 196 ------------------------~~~~~~----------------------------~~g~vV~~V~~~SpA~~aGL~~ 223 (420)
T TIGR00054 196 ------------------------LIPRGP----------------------------KIEPVLSDVTPNSPAEKAGLKE 223 (420)
T ss_pred ------------------------ceecCC----------------------------CcCcEEEEECCCCHHHHcCCCC
Confidence 111111 1245899999999999999999
Q ss_pred CceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 503 NQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 503 gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
||+|++|||++|.+|+++.+.+++.+++.+.++++|+
T Consensus 224 GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~ 260 (420)
T TIGR00054 224 GDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERN 260 (420)
T ss_pred CCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEEC
Confidence 9999999999999999999999988888899999886
No 15
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.48 E-value=7.8e-13 Score=130.80 Aligned_cols=165 Identities=24% Similarity=0.270 Sum_probs=102.9
Q ss_pred eeEecCCCCCCccccccc---cceEEEEEEcCCEEEecccccCCC--CeEEEEEec--------CCcEEEEEEEEee---
Q 007357 122 CTHTAPDYSLPWQKQRQY---TSTGSAFMIGDGKLLTNAHCVEHY--TQVKVKRRG--------DDTKYVAKVLARG--- 185 (606)
Q Consensus 122 ~~~~~~~~~~Pw~~~~~~---~~~GSGfvI~~g~ILTnaHvV~~~--~~v~V~~~~--------~~~~~~a~vv~~d--- 185 (606)
.+.......+||...... ...|+|++|++.+|||+|||+.+. ..+.|.+.. ....+..+-+..+
T Consensus 3 ~G~~~~~~~~Pw~v~i~~~~~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y 82 (232)
T cd00190 3 GGSEAKIGSFPWQVSLQYTGGRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNY 82 (232)
T ss_pred CCeECCCCCCCCEEEEEccCCcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCC
Confidence 344555667888876543 467999999999999999999875 566666521 1122333333343
Q ss_pred ----cCCCeEEEEeccc-ccccCCcccccCCC---CCCCCeEEEEeecCCCCc----ceEE---eeEEeeeeeeeecC--
Q 007357 186 ----VDCDIALLSVESE-EFWKDAEPLCLGHL---PRLQDAVTVVGYPLGGDT----ISVT---KGVVSRIEVTSYAH-- 248 (606)
Q Consensus 186 ----~~~DlAlLkv~~~-~~~~~v~pl~l~~~---~~lG~~V~viG~p~g~~~----~svt---~GvVs~~~~~~~~~-- 248 (606)
..+|||||+++.+ .+...+.|++|... ...++.+.++||...... .... ..+++...+.....
T Consensus 83 ~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~ 162 (232)
T cd00190 83 NPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYG 162 (232)
T ss_pred CCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCc
Confidence 3589999999976 44456899999754 345899999999765331 1112 22222222221111
Q ss_pred CCcceeEEEE-----ccCcCCCCCCCceEcCC---CeEEEEEEeee
Q 007357 249 GSSELLGIQI-----DAAINPGNSGGPAFNDK---GECIGVAFQVY 286 (606)
Q Consensus 249 ~~~~~~~iq~-----da~i~~G~SGGPl~n~~---G~vVGI~~~~~ 286 (606)
.......++. ....|+|+|||||+... +.++||++.+.
T Consensus 163 ~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 163 GTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred ccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 0111112333 33568999999999854 78999998764
No 16
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.44 E-value=5e-13 Score=112.27 Aligned_cols=81 Identities=33% Similarity=0.481 Sum_probs=69.1
Q ss_pred cccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhc
Q 007357 320 PCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQK 398 (606)
Q Consensus 320 ~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~ 398 (606)
|+||+.++...+ ..|++|..|.++|||++ ||++||+|++|||++|.++.+ |...+...
T Consensus 1 ~~lGv~~~~~~~-----------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~----------~~~~l~~~ 59 (82)
T PF13180_consen 1 GGLGVTVQNLSD-----------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSED----------LVNILSKG 59 (82)
T ss_dssp -E-SEEEEECSC-----------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHH----------HHHHHHCS
T ss_pred CEECeEEEEccC-----------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHH----------HHHHHHhC
Confidence 689999998732 46999999999999999 999999999999999988866 56777778
Q ss_pred CCCCEEEEEEEECCEEEEEEEEe
Q 007357 399 FAGDVAELGIIRAGTFMKVKVVL 421 (606)
Q Consensus 399 ~~g~~v~l~V~R~G~~~~v~v~l 421 (606)
.+|++++|+|+|+|+.++++++|
T Consensus 60 ~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 60 KPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp STTSEEEEEEEETTEEEEEEEE-
T ss_pred CCCCEEEEEEEECCEEEEEEEEC
Confidence 89999999999999999999875
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.40 E-value=7.2e-12 Score=124.24 Aligned_cols=167 Identities=23% Similarity=0.300 Sum_probs=104.4
Q ss_pred EEeeEecCCCCCCccccccc---cceEEEEEEcCCEEEecccccCCC--CeEEEEEecCC-------cEEEEEEEEee--
Q 007357 120 VYCTHTAPDYSLPWQKQRQY---TSTGSAFMIGDGKLLTNAHCVEHY--TQVKVKRRGDD-------TKYVAKVLARG-- 185 (606)
Q Consensus 120 I~~~~~~~~~~~Pw~~~~~~---~~~GSGfvI~~g~ILTnaHvV~~~--~~v~V~~~~~~-------~~~~a~vv~~d-- 185 (606)
|..+.......+||...... ...|+|++|++.+|||+|||+.+. ..+.|.+.... ..+.+.-+..+
T Consensus 2 ~~~G~~~~~~~~Pw~~~i~~~~~~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~ 81 (229)
T smart00020 2 IVGGSEANIGSFPWQVSLQYRGGRHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPN 81 (229)
T ss_pred ccCCCcCCCCCCCcEEEEEEcCCCcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCC
Confidence 44555666677888875432 356999999999999999999875 37777772211 22333333322
Q ss_pred -----cCCCeEEEEeccc-ccccCCcccccCCC---CCCCCeEEEEeecCCCCc-----ceEEeeEEeee---eeee-ec
Q 007357 186 -----VDCDIALLSVESE-EFWKDAEPLCLGHL---PRLQDAVTVVGYPLGGDT-----ISVTKGVVSRI---EVTS-YA 247 (606)
Q Consensus 186 -----~~~DlAlLkv~~~-~~~~~v~pl~l~~~---~~lG~~V~viG~p~g~~~-----~svt~GvVs~~---~~~~-~~ 247 (606)
...|||||+++.+ .+...+.|+.|... ...+..+.+.||+...+. .......+..+ .+.. +.
T Consensus 82 ~~~~~~~~DiAll~L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~ 161 (229)
T smart00020 82 YNPSTYDNDIALLKLKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYS 161 (229)
T ss_pred CCCCCCcCCEEEEEECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhc
Confidence 4579999999976 34456889998753 455899999999766430 01112222211 1111 10
Q ss_pred CC-CcceeEEEE-----ccCcCCCCCCCceEcCCC--eEEEEEEeee
Q 007357 248 HG-SSELLGIQI-----DAAINPGNSGGPAFNDKG--ECIGVAFQVY 286 (606)
Q Consensus 248 ~~-~~~~~~iq~-----da~i~~G~SGGPl~n~~G--~vVGI~~~~~ 286 (606)
.. ......++. ....|+|+||||++...+ .++||++.+.
T Consensus 162 ~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 162 GGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred cccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 00 001112333 345789999999998543 8999998764
No 18
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.29 E-value=1.6e-11 Score=104.29 Aligned_cols=88 Identities=36% Similarity=0.523 Sum_probs=75.1
Q ss_pred cccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhc
Q 007357 320 PCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQK 398 (606)
Q Consensus 320 ~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~ 398 (606)
+++|+.++.+ ++..++.++++...|++|.+|.++|||+. ||++||+|++|||++|.++.+ +..++...
T Consensus 1 ~~~G~~~~~~-~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~----------~~~~l~~~ 69 (90)
T cd00987 1 PWLGVTVQDL-TPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVAD----------LRRALAEL 69 (90)
T ss_pred CccceEEeEC-CHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHH----------HHHHHHhc
Confidence 5899999999 67777777777678999999999999998 999999999999999999876 44566665
Q ss_pred CCCCEEEEEEEECCEEEEEE
Q 007357 399 FAGDVAELGIIRAGTFMKVK 418 (606)
Q Consensus 399 ~~g~~v~l~V~R~G~~~~v~ 418 (606)
..++.+.+++.|+|+..++.
T Consensus 70 ~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 70 KPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred CCCCEEEEEEEECCEEEEee
Confidence 56899999999999876654
No 19
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.08 E-value=6.3e-10 Score=92.79 Aligned_cols=70 Identities=23% Similarity=0.288 Sum_probs=60.6
Q ss_pred CCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEE
Q 007357 341 PSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKV 419 (606)
Q Consensus 341 ~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v 419 (606)
+...|++|.+|.++|||++ |||+||+|++|||++|.+|.+ |...+....+|+.+.+++.|+|+..++++
T Consensus 7 ~~~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d----------~~~~l~~~~~g~~v~l~v~r~g~~~~~~~ 76 (79)
T cd00991 7 EAVAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLED----------FMEALKPTKPGEVITVTVLPSTTKLTNVS 76 (79)
T ss_pred ccCCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHH----------HHHHHhcCCCCCEEEEEEEECCEEEEEEE
Confidence 4457999999999999998 999999999999999999876 45666665568999999999999888776
Q ss_pred E
Q 007357 420 V 420 (606)
Q Consensus 420 ~ 420 (606)
+
T Consensus 77 ~ 77 (79)
T cd00991 77 T 77 (79)
T ss_pred E
Confidence 5
No 20
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.07 E-value=7.3e-10 Score=92.22 Aligned_cols=71 Identities=25% Similarity=0.282 Sum_probs=62.6
Q ss_pred CcEEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEecc
Q 007357 344 EGVLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNP 423 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~ 423 (606)
.|++|.+|.++|||+.+|++||+|++|||+++.+|.+ +..++.....|+.+.+++.|+|+.++++++|..
T Consensus 8 ~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~----------~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~~ 77 (79)
T cd00986 8 HGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEE----------LIDYIQSKKEGDTVKLKVKREEKELPEDLILKT 77 (79)
T ss_pred cCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHH----------HHHHHHhCCCCCEEEEEEEECCEEEEEEEEEec
Confidence 5899999999999988999999999999999999876 456666556788999999999999999998875
Q ss_pred c
Q 007357 424 R 424 (606)
Q Consensus 424 ~ 424 (606)
+
T Consensus 78 ~ 78 (79)
T cd00986 78 F 78 (79)
T ss_pred c
Confidence 4
No 21
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.00 E-value=3.3e-09 Score=88.21 Aligned_cols=77 Identities=29% Similarity=0.315 Sum_probs=63.2
Q ss_pred cccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhc
Q 007357 320 PCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQK 398 (606)
Q Consensus 320 ~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~ 398 (606)
|++|+.++.- ..|++|.+|.++|||+. ||++||+|++|||+++.+|. .++...
T Consensus 1 ~~~G~~~~~~-------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~-------------~~l~~~ 54 (80)
T cd00990 1 PYLGLTLDKE-------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQ-------------DRLKEY 54 (80)
T ss_pred CcccEEEEcc-------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHH-------------HHHHhc
Confidence 5788877532 35799999999999999 99999999999999998853 334444
Q ss_pred CCCCEEEEEEEECCEEEEEEEEec
Q 007357 399 FAGDVAELGIIRAGTFMKVKVVLN 422 (606)
Q Consensus 399 ~~g~~v~l~V~R~G~~~~v~v~l~ 422 (606)
..++.+.+++.|+|+..++.+++.
T Consensus 55 ~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 55 QAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred CCCCEEEEEEEECCEEEEEEEEec
Confidence 578899999999999988887764
No 22
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.99 E-value=2.8e-09 Score=108.69 Aligned_cols=100 Identities=14% Similarity=0.119 Sum_probs=85.1
Q ss_pred cccchhhhHHhhcCcccCccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCC
Q 007357 301 TVVSHFLSDYERNGKYTGFPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEG 379 (606)
Q Consensus 301 ~~i~~~L~~l~~~g~~~g~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~ 379 (606)
..++++++++.++|+.. ++++|+..... + +...|++|..+.++++|++ |||+||+|++|||+++.++.
T Consensus 159 ~~~~~v~~~l~~~g~~~-~~~lgi~p~~~-~---------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~ 227 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKM-FDYIRLSPVMK-N---------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPE 227 (259)
T ss_pred hhHHHHHHHHHHCHHhh-hheEeEEEEEe-C---------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHH
Confidence 45678899999999888 89999987543 1 2246999999999999999 99999999999999999987
Q ss_pred CccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEe
Q 007357 380 TVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVL 421 (606)
Q Consensus 380 ~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l 421 (606)
+ +..++.....++.++|+|+|+|+.+++.+.+
T Consensus 228 ~----------~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 228 Q----------AFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred H----------HHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 7 4567777778899999999999998888764
No 23
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.78 E-value=2.6e-08 Score=82.44 Aligned_cols=66 Identities=21% Similarity=0.268 Sum_probs=55.0
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEE
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVV 420 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~ 420 (606)
..++|..|.++|||++ ||++||+|++|||+++.++.+ +...+... .++.+.+++.|+|+..++.+.
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~----------~~~~l~~~-~~~~~~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWED----------LVDAVQEN-PGKPLTLTVERNGETITLTLT 78 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHHC-CCceEEEEEEECCEEEEEEec
Confidence 3578999999999998 999999999999999999876 44555554 478999999999987766653
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.78 E-value=1.9e-08 Score=110.35 Aligned_cols=89 Identities=31% Similarity=0.472 Sum_probs=78.1
Q ss_pred ccccceeeeeccchhhhccccCCCC-CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHh
Q 007357 319 FPCLGVLLQKLENPALRTCLKVPSN-EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLIS 396 (606)
Q Consensus 319 ~~~LGi~~q~~e~~~l~~~lgl~~~-~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~ 396 (606)
..++|+.++.+ ++.+++.++++.. .|++|.+|.++|||++ ||++||+|++|||++|.++.+ |..++.
T Consensus 337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d----------~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAE----------LRKVLD 405 (428)
T ss_pred ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHH
Confidence 57899999999 7888888898753 7999999999999999 999999999999999999876 567777
Q ss_pred hcCCCCEEEEEEEECCEEEEEE
Q 007357 397 QKFAGDVAELGIIRAGTFMKVK 418 (606)
Q Consensus 397 ~~~~g~~v~l~V~R~G~~~~v~ 418 (606)
....|+.+.|+|.|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 6667899999999999977653
No 25
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.76 E-value=4.7e-08 Score=82.12 Aligned_cols=66 Identities=29% Similarity=0.362 Sum_probs=55.4
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCC--CCccccchhhHHHHHHHhhcCCCCEEEEEEEEC-CEEEEEEE
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSE--GTVPFRSNERIAFRYLISQKFAGDVAELGIIRA-GTFMKVKV 419 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~--~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~-G~~~~v~v 419 (606)
.+++|..|.++|||++ ||++||+|++|||+++.+| .+ +..++.. ..|+.+.+++.|+ |+..++++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~----------~~~~l~~-~~~~~i~l~v~r~~~~~~~~~~ 81 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLED----------VVKLLRG-KAGTKVRLTLKRGDGEPREVTL 81 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHH----------HHHHhcC-CCCCEEEEEEEcCCCCEEEEEE
Confidence 5899999999999999 9999999999999999998 44 3344443 3688999999998 88877776
Q ss_pred E
Q 007357 420 V 420 (606)
Q Consensus 420 ~ 420 (606)
.
T Consensus 82 ~ 82 (85)
T cd00988 82 T 82 (85)
T ss_pred E
Confidence 5
No 26
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.73 E-value=3.9e-07 Score=92.54 Aligned_cols=168 Identities=22% Similarity=0.272 Sum_probs=102.4
Q ss_pred EEEeeEecCCCCCCccccccccc----eEEEEEEcCCEEEecccccCCCC--eEEEEEec---C-----C---cEE-EEE
Q 007357 119 KVYCTHTAPDYSLPWQKQRQYTS----TGSAFMIGDGKLLTNAHCVEHYT--QVKVKRRG---D-----D---TKY-VAK 180 (606)
Q Consensus 119 ~I~~~~~~~~~~~Pw~~~~~~~~----~GSGfvI~~g~ILTnaHvV~~~~--~v~V~~~~---~-----~---~~~-~a~ 180 (606)
+|..+.......+||+....... .+.|.+|++.||||+|||+.... .+.|.+.. . + ... ..+
T Consensus 12 ~i~~g~~~~~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~ 91 (256)
T KOG3627|consen 12 RIVGGTEAEPGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEK 91 (256)
T ss_pred CEeCCccCCCCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeE
Confidence 45555566666899998765433 67787889889999999999865 66666521 0 0 111 112
Q ss_pred EEEeec-------C-CCeEEEEeccc-ccccCCcccccCCCC----CC-CCeEEEEeecCCCC-----cceE---EeeEE
Q 007357 181 VLARGV-------D-CDIALLSVESE-EFWKDAEPLCLGHLP----RL-QDAVTVVGYPLGGD-----TISV---TKGVV 238 (606)
Q Consensus 181 vv~~d~-------~-~DlAlLkv~~~-~~~~~v~pl~l~~~~----~l-G~~V~viG~p~g~~-----~~sv---t~GvV 238 (606)
++ .++ . +|||||++..+ .|...+.|++|.... .. +..+.+.||..... ...+ ...++
T Consensus 92 ~i-~H~~y~~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~ 170 (256)
T KOG3627|consen 92 II-VHPNYNPRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPII 170 (256)
T ss_pred EE-ECCCCCCCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEc
Confidence 22 221 3 79999999975 567789999986322 22 48888899854311 1111 22233
Q ss_pred eeeeeee-ecCC-CcceeEEEEc-----cCcCCCCCCCceEcCC---CeEEEEEEeeec
Q 007357 239 SRIEVTS-YAHG-SSELLGIQID-----AAINPGNSGGPAFNDK---GECIGVAFQVYR 287 (606)
Q Consensus 239 s~~~~~~-~~~~-~~~~~~iq~d-----a~i~~G~SGGPl~n~~---G~vVGI~~~~~~ 287 (606)
+...+.. +... ......+++. ...|.|+|||||+..+ ..++||++++..
T Consensus 171 ~~~~C~~~~~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 171 SNSECRRAYGGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred ChhHhcccccCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 3322222 1111 0111125543 2458999999999754 589999998753
No 27
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.71 E-value=7.5e-08 Score=96.52 Aligned_cols=158 Identities=23% Similarity=0.253 Sum_probs=88.7
Q ss_pred EEEEEEcCCEEEecccccCCCC----eEEEEEe---cC-CcEEE--EEEEEee-c---CCCeEEEEeccccc------cc
Q 007357 143 GSAFMIGDGKLLTNAHCVEHYT----QVKVKRR---GD-DTKYV--AKVLARG-V---DCDIALLSVESEEF------WK 202 (606)
Q Consensus 143 GSGfvI~~g~ILTnaHvV~~~~----~v~V~~~---~~-~~~~~--a~vv~~d-~---~~DlAlLkv~~~~~------~~ 202 (606)
+++|+|.++.||||+||+-... .+.+... .+ +..+. ....... . +.|.+...+.+-.+ ..
T Consensus 66 ~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~ 145 (251)
T COG3591 66 TAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIGD 145 (251)
T ss_pred eeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCccc
Confidence 4569999999999999985432 1222110 11 11111 1111111 2 34666666654322 11
Q ss_pred CCc--ccccCCCCCCCCeEEEEeecCCCC---cceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCCCCCCceEcCCCe
Q 007357 203 DAE--PLCLGHLPRLQDAVTVVGYPLGGD---TISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFNDKGE 277 (606)
Q Consensus 203 ~v~--pl~l~~~~~lG~~V~viG~p~g~~---~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~ 277 (606)
... ...+.....+++.+.++|||.+.. ......+.|..+.. ..++.++.+.+|+||+|+++.+.+
T Consensus 146 ~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~~~~ 215 (251)
T COG3591 146 VVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLISKDE 215 (251)
T ss_pred cccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEecCce
Confidence 122 223334446688899999998755 22334444444321 137788889999999999998889
Q ss_pred EEEEEEeeecccccceeeee-ecccccchhhhHH
Q 007357 278 CIGVAFQVYRSEEVENIGYV-IPTTVVSHFLSDY 310 (606)
Q Consensus 278 vVGI~~~~~~~~~~~~~~~~-IP~~~i~~~L~~l 310 (606)
++|+++.+....+....+++ .-...++.|+.++
T Consensus 216 vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~ 249 (251)
T COG3591 216 VIGVHYNGPGANGGSLANNAVRLTPEILNFIQQN 249 (251)
T ss_pred EEEEEecCCCcccccccCcceEecHHHHHHHHHh
Confidence 99999887542222222222 2234455555543
No 28
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.64 E-value=5.7e-08 Score=81.47 Aligned_cols=59 Identities=10% Similarity=0.168 Sum_probs=53.0
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHH-hcCCceEEEEEecc
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVD-SCKDKYLVFEFEDN 539 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~-~~~~~~v~l~v~r~ 539 (606)
..+++|.+|.+++||.++|++.||+|++|||++|.++.+|.+++. ..+++.+.|++.|+
T Consensus 13 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~ 72 (82)
T PF13180_consen 13 TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRD 72 (82)
T ss_dssp SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEET
T ss_pred CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEEC
Confidence 347789999999999999999999999999999999999999995 45788999999886
No 29
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.56 E-value=2.6e-07 Score=76.98 Aligned_cols=60 Identities=10% Similarity=0.128 Sum_probs=54.5
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhc-CCceEEEEEecce
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSC-KDKYLVFEFEDNY 540 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~-~~~~v~l~v~r~~ 540 (606)
.++++|..|.++++|..+|++.||+|++|||++|.+|++|.+++... +++.+.+++.|+.
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g 69 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPST 69 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECC
Confidence 56789999999999999999999999999999999999999999876 4778999988764
No 30
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.54 E-value=6.4e-07 Score=98.55 Aligned_cols=153 Identities=18% Similarity=0.272 Sum_probs=99.2
Q ss_pred CCcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEE
Q 007357 343 NEGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVV 420 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~ 420 (606)
.+-++|..|.+.+.|+. -|++||.|+.|||.+|.....- + ...++........|.|+|.|.-..
T Consensus 673 ~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~-----~---vv~Lm~~AArnghV~LtVRRkv~~------ 738 (984)
T KOG3209|consen 673 GQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHS-----E---VVDLMEAAARNGHVNLTVRRKVRT------ 738 (984)
T ss_pred CCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHH-----H---HHHHHHHHHhcCceEEEEeeeeee------
Confidence 35789999999999998 5999999999999999876541 1 234555555567899999873110
Q ss_pred ecccccccccccCCCCCcceee------cceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEeccccc
Q 007357 421 LNPRVHLVPYHIDGGQPSYLII------AGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEV 494 (606)
Q Consensus 421 l~~~~~l~p~~~~~~~p~y~~~------~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~ 494 (606)
.. ....|.......+.|=++ -||.|+-++.. .+.+.+ |.+|.++||
T Consensus 739 -~~-~~rsp~~s~~~~~~yDV~lhR~ENeGFGFVi~sS~------------------------~kp~sg--iGrIieGSP 790 (984)
T KOG3209|consen 739 -GP-ARRSPRNSAAPSGPYDVVLHRKENEGFGFVIMSSQ------------------------NKPESG--IGRIIEGSP 790 (984)
T ss_pred -cc-ccCCcccccCCCCCeeeEEecccCCceeEEEEecc------------------------cCCCCC--ccccccCCh
Confidence 00 011111111111111111 23333332211 011122 788999999
Q ss_pred ccccC-CCCCceEEeeCCeecCCHHH--HHHHHHhcCCceEEEEEec
Q 007357 495 SIGYE-DMSNQQVLKFNGTRIKNIHH--LAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 495 a~~ag-l~~gD~I~~VNG~~V~~l~~--l~~~v~~~~~~~v~l~v~r 538 (606)
|.+-| |+.||+|++|||+.|-|+.| .+.+|+.. +-.++|++.-
T Consensus 791 AdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKda-GlsVtLtIip 836 (984)
T KOG3209|consen 791 ADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKDA-GLSVTLTIIP 836 (984)
T ss_pred hHhhccccccceEEEecCeeeeccCchhHHHHHHhc-CceEEEEEcC
Confidence 99886 89999999999999999966 78888874 6688888743
No 31
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.54 E-value=5.9e-07 Score=76.07 Aligned_cols=81 Identities=16% Similarity=0.189 Sum_probs=65.1
Q ss_pred ecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHH
Q 007357 442 IAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLA 521 (606)
Q Consensus 442 ~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~ 521 (606)
+.|+.+.++++..... .......+++|..|.++++|..+|++.||+|++|||++|.++.++.
T Consensus 2 ~~G~~~~~~~~~~~~~------------------~~~~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~ 63 (90)
T cd00987 2 WLGVTVQDLTPDLAEE------------------LGLKDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLR 63 (90)
T ss_pred ccceEEeECCHHHHHH------------------cCCCCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHH
Confidence 4788888887653221 0112345889999999999999999999999999999999999999
Q ss_pred HHHHhcC-CceEEEEEecce
Q 007357 522 HLVDSCK-DKYLVFEFEDNY 540 (606)
Q Consensus 522 ~~v~~~~-~~~v~l~v~r~~ 540 (606)
+++.... ++.+.+.+.|+.
T Consensus 64 ~~l~~~~~~~~i~l~v~r~g 83 (90)
T cd00987 64 RALAELKPGDKVTLTVLRGG 83 (90)
T ss_pred HHHHhcCCCCEEEEEEEECC
Confidence 9998754 778899987753
No 32
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.54 E-value=2.8e-07 Score=74.39 Aligned_cols=67 Identities=28% Similarity=0.368 Sum_probs=52.0
Q ss_pred ccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcC
Q 007357 321 CLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKF 399 (606)
Q Consensus 321 ~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~ 399 (606)
.+|+.++..++ .|++|..|.++|||+. ||++||+|++|||+++.++.. -.+..++...
T Consensus 2 ~~G~~~~~~~~------------~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~--------~~~~~~l~~~- 60 (70)
T cd00136 2 GLGFSIRGGTE------------GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTL--------EDVAELLKKE- 60 (70)
T ss_pred CccEEEecCCC------------CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCH--------HHHHHHHhhC-
Confidence 57777765421 4899999999999999 999999999999999999832 1144566554
Q ss_pred CCCEEEEEE
Q 007357 400 AGDVAELGI 408 (606)
Q Consensus 400 ~g~~v~l~V 408 (606)
.|+.++|+|
T Consensus 61 ~g~~v~l~v 69 (70)
T cd00136 61 VGEKVTLTV 69 (70)
T ss_pred CCCeEEEEE
Confidence 378888876
No 33
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.41 E-value=1.5e-06 Score=72.48 Aligned_cols=73 Identities=32% Similarity=0.352 Sum_probs=54.4
Q ss_pred cccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhc
Q 007357 320 PCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQK 398 (606)
Q Consensus 320 ~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~ 398 (606)
+.+|+.++...++ ..|++|..|.++|||+. ||++||+|++|||+++.++.+. .......
T Consensus 12 ~~~G~~~~~~~~~----------~~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~----------~~~~~~~ 71 (85)
T smart00228 12 GGLGFSLVGGKDE----------GGGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHL----------EAVDLLK 71 (85)
T ss_pred CcccEEEECCCCC----------CCCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHH----------HHHHHHH
Confidence 5678877654211 16899999999999999 9999999999999999987652 1222222
Q ss_pred CCCCEEEEEEEECC
Q 007357 399 FAGDVAELGIIRAG 412 (606)
Q Consensus 399 ~~g~~v~l~V~R~G 412 (606)
..++.+.|++.|++
T Consensus 72 ~~~~~~~l~i~r~~ 85 (85)
T smart00228 72 KAGGKVTLTVLRGG 85 (85)
T ss_pred hCCCeEEEEEEeCC
Confidence 24568999998864
No 34
>PF12812 PDZ_1: PDZ-like domain
Probab=98.41 E-value=6.3e-07 Score=74.40 Aligned_cols=74 Identities=14% Similarity=0.153 Sum_probs=61.4
Q ss_pred CcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCC
Q 007357 437 PSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKN 516 (606)
Q Consensus 437 p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~ 516 (606)
.+|+.++|.+|++|+.+..+.+.... .+ ++..+..++++..+|+..|.+|.+|||++++|
T Consensus 5 ~r~v~~~Ga~f~~Ls~q~aR~~~~~~-------------------~g-v~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~ 64 (78)
T PF12812_consen 5 SRFVEVCGAVFHDLSYQQARQYGIPV-------------------GG-VYVAVSGGSLAFAGGISKGFIITSVNGKPTPD 64 (78)
T ss_pred CEEEEEcCeecccCCHHHHHHhCCCC-------------------CE-EEEEecCCChhhhCCCCCCeEEEeECCcCCcC
Confidence 57889999999999998777543221 13 44456789999999999999999999999999
Q ss_pred HHHHHHHHHhcCCc
Q 007357 517 IHHLAHLVDSCKDK 530 (606)
Q Consensus 517 l~~l~~~v~~~~~~ 530 (606)
+++|.+++++.++.
T Consensus 65 Ld~f~~vvk~ipd~ 78 (78)
T PF12812_consen 65 LDDFIKVVKKIPDN 78 (78)
T ss_pred HHHHHHHHHhCCCC
Confidence 99999999998863
No 35
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.33 E-value=1.3e-06 Score=72.14 Aligned_cols=56 Identities=21% Similarity=0.310 Sum_probs=51.6
Q ss_pred EEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 484 VILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 484 VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
++|+.|.++++|..+|++.||+|++|||+++.+++++..++....++.+.+++.|+
T Consensus 14 ~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~ 69 (79)
T cd00989 14 PVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERN 69 (79)
T ss_pred cEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEEC
Confidence 58899999999999999999999999999999999999999887677888888775
No 36
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.32 E-value=1.3e-06 Score=70.54 Aligned_cols=54 Identities=19% Similarity=0.220 Sum_probs=50.7
Q ss_pred EEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEE
Q 007357 483 MVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEF 536 (606)
Q Consensus 483 ~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v 536 (606)
+++|..|.++++|..+|+++||+|++|||+++.++ +++.++++...++.++|++
T Consensus 14 ~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 67999999999999999999999999999999999 9999999998888888876
No 37
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.31 E-value=1.9e-06 Score=71.49 Aligned_cols=57 Identities=12% Similarity=0.252 Sum_probs=51.1
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHh-cCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDS-CKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~-~~~~~v~l~v~r~ 539 (606)
.+++|..|.++++|.. |++.||+|++|||+++.+|++|.+++.. ..+..+.+++.|+
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~ 65 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKRE 65 (79)
T ss_pred cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEEC
Confidence 4689999999999986 8999999999999999999999999986 4577889988875
No 38
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.30 E-value=1.7e-06 Score=94.74 Aligned_cols=68 Identities=24% Similarity=0.304 Sum_probs=59.6
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEec
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLN 422 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~ 422 (606)
.|++|.+|.++|||++ |||+||+|++|||++|.+|.+ +...+.. ..++++.++|.|+|+..++++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 271 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTD----------FVSAVKE-NPGKSMDIKVERNGETLSISLTPE 271 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHh-CCCCceEEEEEECCEEEEEEEEEc
Confidence 5789999999999999 999999999999999999987 4455655 468889999999999988888774
No 39
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.28 E-value=2.1e-06 Score=72.04 Aligned_cols=58 Identities=7% Similarity=0.186 Sum_probs=53.4
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
..++|+.|.++++|.++|+++||+|++|||+++.+| .++.++++...++.+.+++.|+
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~ 72 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRG 72 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcC
Confidence 467899999999999999999999999999999999 9999999877788899999886
No 40
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.26 E-value=5.3e-06 Score=91.56 Aligned_cols=177 Identities=18% Similarity=0.156 Sum_probs=97.0
Q ss_pred EEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEE-------
Q 007357 348 VRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVK------- 418 (606)
Q Consensus 348 V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~------- 418 (606)
|.+|.++|||+. .|+.||.|++|||+.|.+...- +...++.. .|-+|+|+|.-..+.-...
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHa--------div~LIKd--aGlsVtLtIip~ee~~~~~~~~sa~~ 851 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHA--------DIVSLIKD--AGLSVTLTIIPPEEAGPPTSMTSAEK 851 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCch--------hHHHHHHh--cCceEEEEEcChhccCCCCCCcchhh
Confidence 788999999999 7999999999999999987653 12344433 6889999997533221110
Q ss_pred ---EEecccccccccccCCCCCcceeecceEEecCChHHH-----hhhcccccCeeehhhhhc-hhhh---hcCCcEEEE
Q 007357 419 ---VVLNPRVHLVPYHIDGGQPSYLIIAGLVFTPLSEPLI-----EEECDDSIGLKLLAKARY-SLAR---FEGEQMVIL 486 (606)
Q Consensus 419 ---v~l~~~~~l~p~~~~~~~p~y~~~~Glvf~pl~~~~~-----~~~~~~~~g~~l~~~~~~-~~~~---~~~~~~VvI 486 (606)
++.... .-..++..+..|+++..-+ +|.+..++ .....+.+.+.|-.-.+. ++.- .++.=..+|
T Consensus 852 ~s~~t~~~~-~~q~~glp~~~~s~~~~~p---qpdt~~~~~~~~r~~qn~~~~~VelErG~kGFGFSiRGGreynM~LfV 927 (984)
T KOG3209|consen 852 QSPFTQNGP-YEQQYGLPGPRPSVYEEHP---QPDTFQGLSINDRMSQNGDLYTVELERGAKGFGFSIRGGREYNMDLFV 927 (984)
T ss_pred cCcccccCC-HhHccCCCCCCccccccCC---CCccccceeccccccccCCeeEEEeeccccccceEeecccccccceEE
Confidence 000000 0000011111222222111 22111111 000111122222111110 0000 011224577
Q ss_pred EEecccccccccC-CCCCceEEeeCCeecCCHHHH--HHHHHhcCCceEEEEEecc
Q 007357 487 SQVLANEVSIGYE-DMSNQQVLKFNGTRIKNIHHL--AHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 487 s~Vl~~s~a~~ag-l~~gD~I~~VNG~~V~~l~~l--~~~v~~~~~~~v~l~v~r~ 539 (606)
-.+..++||.+-| ++.||+|++|||++.+++.|- +++|++. +..+.+.+.|+
T Consensus 928 LRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH~rAIelIk~g-g~~vll~Lr~g 982 (984)
T KOG3209|consen 928 LRLAEDGPAIRDGRMRVGDQITEINGESTKGMTHDRAIELIKQG-GRRVLLLLRRG 982 (984)
T ss_pred EEeccCCCccccCceeecceEEEecCcccCCCcHHHHHHHHHhC-CeEEEEEeccC
Confidence 8888999999887 678999999999999999884 5567653 44555555554
No 41
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.24 E-value=3.2e-06 Score=93.37 Aligned_cols=67 Identities=27% Similarity=0.341 Sum_probs=58.8
Q ss_pred cEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEec
Q 007357 345 GVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLN 422 (606)
Q Consensus 345 Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~ 422 (606)
+++|.+|.++|||++ |||+||+|++|||++|.+|.+ +...+.. ..|+.+.++|.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQT----------FVTLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 578999999999999 999999999999999999987 4455555 478899999999999988888875
No 42
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.20 E-value=5.5e-06 Score=69.07 Aligned_cols=72 Identities=25% Similarity=0.266 Sum_probs=53.0
Q ss_pred ccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhh
Q 007357 319 FPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQ 397 (606)
Q Consensus 319 ~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~ 397 (606)
...+|+++....+.. ..+++|.+|.|+|||+. ||++||+|++|||+++.++... ....++..
T Consensus 9 ~~~lG~~l~~~~~~~---------~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~--------~~~~~l~~ 71 (81)
T PF00595_consen 9 NGPLGFTLRGGSDND---------EKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHD--------EVVQLLKS 71 (81)
T ss_dssp TSBSSEEEEEESTSS---------SEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHH--------HHHHHHHH
T ss_pred CCCcCEEEEecCCCC---------cCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHH--------HHHHHHHC
Confidence 456888887652100 36999999999999999 9999999999999999998542 12334444
Q ss_pred cCCCCEEEEEEE
Q 007357 398 KFAGDVAELGII 409 (606)
Q Consensus 398 ~~~g~~v~l~V~ 409 (606)
. +..++|+|+
T Consensus 72 ~--~~~v~L~V~ 81 (81)
T PF00595_consen 72 A--SNPVTLTVQ 81 (81)
T ss_dssp S--TSEEEEEEE
T ss_pred C--CCcEEEEEC
Confidence 3 448888764
No 43
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.17 E-value=7.4e-06 Score=87.07 Aligned_cols=72 Identities=22% Similarity=0.218 Sum_probs=57.0
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEec
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLN 422 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~ 422 (606)
.+++|..|.++|||++ ||++||+|++|||++|.+|..- .+..++.. ..|+.+.++|.|+|+..++++++.
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~--------~~~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~ 132 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLD--------DAVALIRG-KKGTKVSLEILRAGKSKPLTFTLK 132 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHH--------HHHHhccC-CCCCEEEEEEEeCCCCceEEEEEE
Confidence 4789999999999999 9999999999999999987310 12233332 468999999999988777777766
Q ss_pred cc
Q 007357 423 PR 424 (606)
Q Consensus 423 ~~ 424 (606)
..
T Consensus 133 ~~ 134 (334)
T TIGR00225 133 RD 134 (334)
T ss_pred EE
Confidence 53
No 44
>PRK10139 serine endoprotease; Provisional
Probab=98.14 E-value=6.9e-06 Score=90.73 Aligned_cols=80 Identities=15% Similarity=0.169 Sum_probs=65.2
Q ss_pred ecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHH
Q 007357 442 IAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLA 521 (606)
Q Consensus 442 ~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~ 521 (606)
+.|+.+.++++...+.+. .+...+++|.+|.++++|.++|++.||+|++|||++|.+|.+|.
T Consensus 268 ~LGv~~~~l~~~~~~~lg------------------l~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~ 329 (455)
T PRK10139 268 LLGIKGTEMSADIAKAFN------------------LDVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELR 329 (455)
T ss_pred ceeEEEEECCHHHHHhcC------------------CCCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHH
Confidence 567777777765433211 12346889999999999999999999999999999999999999
Q ss_pred HHHHh-cCCceEEEEEecc
Q 007357 522 HLVDS-CKDKYLVFEFEDN 539 (606)
Q Consensus 522 ~~v~~-~~~~~v~l~v~r~ 539 (606)
+.+.. ..++.+.+++.|+
T Consensus 330 ~~l~~~~~g~~v~l~V~R~ 348 (455)
T PRK10139 330 SRIATTEPGTKVKLGLLRN 348 (455)
T ss_pred HHHHhcCCCCEEEEEEEEC
Confidence 99986 5677899999875
No 45
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.13 E-value=4e-06 Score=90.32 Aligned_cols=62 Identities=24% Similarity=0.342 Sum_probs=50.7
Q ss_pred EEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEE-ECCEEEEEEEEec
Q 007357 347 LVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGII-RAGTFMKVKVVLN 422 (606)
Q Consensus 347 ~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~-R~G~~~~v~v~l~ 422 (606)
+|..|.|+|+|++ ||++||+|++|||++|.+|.++ ...+ .++.+.++|. |+|+..++++...
T Consensus 1 ~I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~----------~~~l----~~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 1 LISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDY----------QFLC----ADEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred CcCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHH----------HHHh----cCCcEEEEEEcCCCeEEEEEEecC
Confidence 3678999999999 9999999999999999999774 2333 2467899997 8998887777654
No 46
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.09 E-value=2.3e-05 Score=84.96 Aligned_cols=69 Identities=20% Similarity=0.219 Sum_probs=55.1
Q ss_pred CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEe
Q 007357 344 EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVL 421 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l 421 (606)
.|++|..|.++|||++ ||++||+|++|||++|.++.. ..+..++. ...|+.+.|+|.|+|+..+++++-
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~--------~~~~~~l~-g~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSL--------YEAADRLQ-GPEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCH--------HHHHHHHh-cCCCCEEEEEEEECCEEEEEEEEe
Confidence 4899999999999999 999999999999999987632 11233443 346889999999999887776653
No 47
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.05 E-value=1.1e-05 Score=67.32 Aligned_cols=55 Identities=15% Similarity=0.289 Sum_probs=48.0
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEe
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFE 537 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~ 537 (606)
.+++|+.|.++++|..+|++.||+|++|||+++.++ .+..++++.+.+ .++|+++
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~-~v~L~V~ 81 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN-PVTLTVQ 81 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-EEEEEEE
T ss_pred CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-cEEEEEC
Confidence 588999999999999999999999999999999977 556777887766 8888764
No 48
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.03 E-value=1.7e-05 Score=84.89 Aligned_cols=80 Identities=9% Similarity=0.117 Sum_probs=64.8
Q ss_pred ecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHH
Q 007357 442 IAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLA 521 (606)
Q Consensus 442 ~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~ 521 (606)
+.|+.+.++++...+.. ..+..++++|.+|.+++||.++|++.||+|++|||++|.++++|.
T Consensus 256 ~lGv~~~~~~~~~~~~l------------------gl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~ 317 (351)
T TIGR02038 256 YIGVSGEDINSVVAQGL------------------GLPDLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELM 317 (351)
T ss_pred EeeeEEEECCHHHHHhc------------------CCCccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHH
Confidence 57777877766533211 112346889999999999999999999999999999999999999
Q ss_pred HHHHh-cCCceEEEEEecc
Q 007357 522 HLVDS-CKDKYLVFEFEDN 539 (606)
Q Consensus 522 ~~v~~-~~~~~v~l~v~r~ 539 (606)
+.+.. .+++.+.|++.|+
T Consensus 318 ~~l~~~~~g~~v~l~v~R~ 336 (351)
T TIGR02038 318 DRIAETRPGSKVMVTVLRQ 336 (351)
T ss_pred HHHHhcCCCCEEEEEEEEC
Confidence 99986 5677899999885
No 49
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.02 E-value=1.3e-05 Score=66.35 Aligned_cols=56 Identities=7% Similarity=0.026 Sum_probs=46.5
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
.+++|++|.++++|..+|++.||+|++|||+++.+|.++.+.+ ..++.+.+++.|+
T Consensus 12 ~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r~ 67 (80)
T cd00990 12 GLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFRD 67 (80)
T ss_pred CcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEEC
Confidence 3578999999999999999999999999999999966654333 2566888888775
No 50
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.98 E-value=1.9e-05 Score=65.44 Aligned_cols=54 Identities=13% Similarity=0.161 Sum_probs=48.0
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecC--CHHHHHHHHHhcCCceEEEEE
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIK--NIHHLAHLVDSCKDKYLVFEF 536 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~--~l~~l~~~v~~~~~~~v~l~v 536 (606)
.+++|..|.++++|..+|+++||+|++|||+++. +++++.++++.... .+++.+
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 4679999999999999999999999999999999 99999999997655 666654
No 51
>PRK10942 serine endoprotease; Provisional
Probab=97.97 E-value=2.2e-05 Score=87.11 Aligned_cols=80 Identities=19% Similarity=0.221 Sum_probs=65.3
Q ss_pred ecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHH
Q 007357 442 IAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLA 521 (606)
Q Consensus 442 ~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~ 521 (606)
+.|+.+.++++...+.+ ..+...+++|..|.++++|.++|++.||+|++|||++|.++++|.
T Consensus 289 ~lGv~~~~l~~~~a~~~------------------~l~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~ 350 (473)
T PRK10942 289 ELGIMGTELNSELAKAM------------------KVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALR 350 (473)
T ss_pred eeeeEeeecCHHHHHhc------------------CCCCCCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHHH
Confidence 57888888887643321 112356889999999999999999999999999999999999999
Q ss_pred HHHHhc-CCceEEEEEecc
Q 007357 522 HLVDSC-KDKYLVFEFEDN 539 (606)
Q Consensus 522 ~~v~~~-~~~~v~l~v~r~ 539 (606)
+.+... +++.+.+++.|+
T Consensus 351 ~~l~~~~~g~~v~l~v~R~ 369 (473)
T PRK10942 351 AQVGTMPVGSKLTLGLLRD 369 (473)
T ss_pred HHHHhcCCCCEEEEEEEEC
Confidence 999764 466888888775
No 52
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.97 E-value=2.3e-05 Score=65.01 Aligned_cols=50 Identities=32% Similarity=0.415 Sum_probs=40.9
Q ss_pred ccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCC
Q 007357 319 FPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSE 378 (606)
Q Consensus 319 ~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~ 378 (606)
...+|+.++...+ . ..|++|..|.++|||+. ||++||+|++|||+++.++
T Consensus 11 ~~~~G~~~~~~~~-~---------~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~ 61 (82)
T cd00992 11 GGGLGFSLRGGKD-S---------GGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGL 61 (82)
T ss_pred CCCcCEEEeCccc-C---------CCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCcc
Confidence 3557888775421 1 35899999999999999 9999999999999999953
No 53
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.94 E-value=0.0001 Score=80.21 Aligned_cols=74 Identities=18% Similarity=0.249 Sum_probs=53.8
Q ss_pred cccCCCCCcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEE
Q 007357 337 CLKVPSNEGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTF 414 (606)
Q Consensus 337 ~lgl~~~~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~ 414 (606)
.|||.-..-+.|.++...+-|+. +||+||+||+|||.-..|..- ..-..+|... ..++.|.|+||...
T Consensus 212 EyGlrLgSqIFvKeit~~gLAardgnlqEGDiiLkINGtvteNmSL--------tDar~LIEkS--~GKL~lvVlRD~~q 281 (1027)
T KOG3580|consen 212 EYGLRLGSQIFVKEITRTGLAARDGNLQEGDIILKINGTVTENMSL--------TDARKLIEKS--RGKLQLVVLRDSQQ 281 (1027)
T ss_pred hhcccccchhhhhhhcccchhhccCCcccccEEEEECcEeeccccc--------hhHHHHHHhc--cCceEEEEEecCCc
Confidence 45665455678888988888887 899999999999998887532 1123455443 34689999998776
Q ss_pred EEEEEE
Q 007357 415 MKVKVV 420 (606)
Q Consensus 415 ~~v~v~ 420 (606)
.-+.|+
T Consensus 282 tLiNiP 287 (1027)
T KOG3580|consen 282 TLINIP 287 (1027)
T ss_pred eeeecC
Confidence 666654
No 54
>PRK10898 serine endoprotease; Provisional
Probab=97.92 E-value=4.3e-05 Score=81.81 Aligned_cols=59 Identities=8% Similarity=0.017 Sum_probs=54.2
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHh-cCCceEEEEEecc
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDS-CKDKYLVFEFEDN 539 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~-~~~~~v~l~v~r~ 539 (606)
.++++|.+|.+++||.++|++.||+|++|||++|.++.+|.+.+.. .+++.+.+++.|+
T Consensus 278 ~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~ 337 (353)
T PRK10898 278 LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRD 337 (353)
T ss_pred CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEEC
Confidence 4789999999999999999999999999999999999999999976 5677899999876
No 55
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.92 E-value=0.00012 Score=62.08 Aligned_cols=64 Identities=25% Similarity=0.280 Sum_probs=43.1
Q ss_pred CcEEEEEeCCC--------Chhhc-C--CCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECC
Q 007357 344 EGVLVRRVEPT--------SDANN-I--LKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAG 412 (606)
Q Consensus 344 ~Gv~V~~V~p~--------spA~~-g--Lq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G 412 (606)
.+..|.+|.++ ||..+ | +++||+|++|||+++....+ +..++..+ .|+.+.|+|.+.+
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~----------~~~lL~~~-agk~V~Ltv~~~~ 80 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADAN----------PYRLLEGK-AGKQVLLTVNRKP 80 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-----------HHHHHHTT-TTSEEEEEEE-ST
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCC----------HHHHhccc-CCCEEEEEEecCC
Confidence 57788888875 77777 5 56999999999999998766 34555554 7999999999865
Q ss_pred -EEEEEE
Q 007357 413 -TFMKVK 418 (606)
Q Consensus 413 -~~~~v~ 418 (606)
+.+++.
T Consensus 81 ~~~R~v~ 87 (88)
T PF14685_consen 81 GGARTVV 87 (88)
T ss_dssp T-EEEEE
T ss_pred CCceEEE
Confidence 555554
No 56
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.91 E-value=3.4e-05 Score=82.72 Aligned_cols=67 Identities=22% Similarity=0.318 Sum_probs=55.3
Q ss_pred CcEEEEEeC--------CCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEE
Q 007357 344 EGVLVRRVE--------PTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTF 414 (606)
Q Consensus 344 ~Gv~V~~V~--------p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~ 414 (606)
.||+|.... .+|||++ |||+||+|++|||++|.+|.+ +..++... .++.+.++|.|+|+.
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~D----------L~~iL~~~-~g~~V~LtV~R~Ge~ 173 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDD----------LANLINKA-GGEKLTLTIERGGKI 173 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHhC-CCCeEEEEEEECCEE
Confidence 588886542 2589998 999999999999999999987 44666655 488999999999998
Q ss_pred EEEEEEe
Q 007357 415 MKVKVVL 421 (606)
Q Consensus 415 ~~v~v~l 421 (606)
.++.+.+
T Consensus 174 ~tv~V~P 180 (402)
T TIGR02860 174 IETVIKP 180 (402)
T ss_pred EEEEEEE
Confidence 8888763
No 57
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.81 E-value=4.7e-05 Score=77.72 Aligned_cols=67 Identities=25% Similarity=0.415 Sum_probs=54.4
Q ss_pred CcEEEE-EeCCCChh---hc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEE
Q 007357 344 EGVLVR-RVEPTSDA---NN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVK 418 (606)
Q Consensus 344 ~Gv~V~-~V~p~spA---~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~ 418 (606)
.| +++ .+.|+... .+ |||+||++++|||.++.+.++ ...++........++|+|+|||+.+++.
T Consensus 204 ~G-l~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~q----------a~~l~~~L~~~tei~ltVeRdGq~~~i~ 272 (276)
T PRK09681 204 EG-IVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRA----------MIALMRQLPSMDSIQLTVLRKGARHDIS 272 (276)
T ss_pred CC-ceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHH----------HHHHHHHhccCCeEEEEEEECCEEEEEE
Confidence 46 553 58887543 44 999999999999999998765 3467777778899999999999999998
Q ss_pred EEe
Q 007357 419 VVL 421 (606)
Q Consensus 419 v~l 421 (606)
+.|
T Consensus 273 i~l 275 (276)
T PRK09681 273 IAL 275 (276)
T ss_pred EEc
Confidence 876
No 58
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.78 E-value=0.002 Score=64.44 Aligned_cols=169 Identities=18% Similarity=0.282 Sum_probs=83.6
Q ss_pred hhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEEc-CCEEEecccccCC-CCeEEEEEecCCcEEEEE-----EE
Q 007357 110 DAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMIG-DGKLLTNAHCVEH-YTQVKVKRRGDDTKYVAK-----VL 182 (606)
Q Consensus 110 ~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI~-~g~ILTnaHvV~~-~~~v~V~~~~~~~~~~a~-----vv 182 (606)
...+...|.+|....... ..+=+-|. ..+|+||+|.... ...++|+. ..-.|... -+
T Consensus 13 yn~Ia~~ic~l~n~s~~~--------------~~~l~gigyG~~iItn~HLf~~nng~L~i~s--~hG~f~v~nt~~lkv 76 (235)
T PF00863_consen 13 YNPIASNICRLTNESDGG--------------TRSLYGIGYGSYIITNAHLFKRNNGELTIKS--QHGEFTVPNTTQLKV 76 (235)
T ss_dssp -HHHHTTEEEEEEEETTE--------------EEEEEEEEETTEEEEEGGGGSSTTCEEEEEE--TTEEEEECEGGGSEE
T ss_pred cchhhheEEEEEEEeCCC--------------eEEEEEEeECCEEEEChhhhccCCCeEEEEe--CceEEEcCCccccce
Confidence 456677888887633111 11222222 7899999999854 45577765 33334332 12
Q ss_pred EeecCCCeEEEEecccccccCCcccccC---CCCCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEc
Q 007357 183 ARGVDCDIALLSVESEEFWKDAEPLCLG---HLPRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQID 259 (606)
Q Consensus 183 ~~d~~~DlAlLkv~~~~~~~~v~pl~l~---~~~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~d 259 (606)
..=+..||.++|++.+ ++|.+-. ..|..++.|.+||.-+....++. .||..... +......+ ..--
T Consensus 77 ~~i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~~s---~vSesS~i-~p~~~~~f--WkHw 145 (235)
T PF00863_consen 77 HPIEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVCMVGSNFQEKSISS---TVSESSWI-YPEENSHF--WKHW 145 (235)
T ss_dssp EE-TCSSEEEEE--TT-----S----S---B----TT-EEEEEEEECSSCCCEE---EEEEEEEE-EEETTTTE--EEE-
T ss_pred EEeCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEEEEEEEEEcCCeeE---EECCceEE-eecCCCCe--eEEE
Confidence 2335789999999753 3443221 34677999999997554333222 22222111 11122222 3344
Q ss_pred cCcCCCCCCCceEc-CCCeEEEEEEeeecccccceeeeeecccccchhhhHH
Q 007357 260 AAINPGNSGGPAFN-DKGECIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDY 310 (606)
Q Consensus 260 a~i~~G~SGGPl~n-~~G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l 310 (606)
.....|+=|+|+++ .+|++|||++.... ....+|+.|++ ..|++.+
T Consensus 146 IsTk~G~CG~PlVs~~Dg~IVGiHsl~~~---~~~~N~F~~f~--~~f~~~~ 192 (235)
T PF00863_consen 146 ISTKDGDCGLPLVSTKDGKIVGIHSLTSN---TSSRNYFTPFP--DDFEEFY 192 (235)
T ss_dssp C---TT-TT-EEEETTT--EEEEEEEEET---TTSSEEEEE----TTHHHHH
T ss_pred ecCCCCccCCcEEEcCCCcEEEEEcCccC---CCCeEEEEcCC--HHHHHHH
Confidence 45568999999998 68999999997542 23456777763 2444444
No 59
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=97.75 E-value=7e-05 Score=76.61 Aligned_cols=63 Identities=8% Similarity=0.022 Sum_probs=56.1
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcC-CceEEEEEecc-eEEE
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCK-DKYLVFEFEDN-YLAV 543 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~-~~~v~l~v~r~-~~iv 543 (606)
.+|+.|..+.+++++.++|++.||+|++|||+++.+++++.+++.+.+ ++.+.|+++|+ +.+.
T Consensus 190 ~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~ 254 (259)
T TIGR01713 190 LEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQRED 254 (259)
T ss_pred eeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEE
Confidence 468999999999999999999999999999999999999999999864 46899999986 4433
No 60
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.69 E-value=0.00012 Score=79.62 Aligned_cols=81 Identities=32% Similarity=0.348 Sum_probs=60.1
Q ss_pred ccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhh
Q 007357 319 FPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQ 397 (606)
Q Consensus 319 ~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~ 397 (606)
++.+|++++.-+ ..++.|.++.+++||++ ||++||+|++|||+++....-- .....+ .
T Consensus 99 ~~GiG~~i~~~~------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~--------~av~~i-r 157 (406)
T COG0793 99 FGGIGIELQMED------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLD--------EAVKLI-R 157 (406)
T ss_pred ccceeEEEEEec------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHH--------HHHHHh-C
Confidence 677899887541 16888999999999999 9999999999999999876310 011223 3
Q ss_pred cCCCCEEEEEEEECC--EEEEEEEE
Q 007357 398 KFAGDVAELGIIRAG--TFMKVKVV 420 (606)
Q Consensus 398 ~~~g~~v~l~V~R~G--~~~~v~v~ 420 (606)
-.+|..|+|++.|.| +..++++.
T Consensus 158 G~~Gt~V~L~i~r~~~~k~~~v~l~ 182 (406)
T COG0793 158 GKPGTKVTLTILRAGGGKPFTVTLT 182 (406)
T ss_pred CCCCCeEEEEEEEcCCCceeEEEEE
Confidence 347999999999974 44444443
No 61
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.66 E-value=0.0001 Score=75.46 Aligned_cols=71 Identities=27% Similarity=0.250 Sum_probs=64.3
Q ss_pred CcEEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEE-CCEEEEEEEEec
Q 007357 344 EGVLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIR-AGTFMKVKVVLN 422 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R-~G~~~~v~v~l~ 422 (606)
.||++..|..++|+...|+.||.|++|||+++.+.++ |..+++...+|++|++++.| ++++..+++++.
T Consensus 130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e----------~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~ 199 (342)
T COG3480 130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDE----------LIDYVSSKKPGDEVTIDYERHNETPEIVTITLI 199 (342)
T ss_pred eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHH----------HHHHHhccCCCCeEEEEEEeccCCCceEEEEEE
Confidence 6999999999999999999999999999999999877 66888888899999999997 888888888877
Q ss_pred cc
Q 007357 423 PR 424 (606)
Q Consensus 423 ~~ 424 (606)
..
T Consensus 200 ~~ 201 (342)
T COG3480 200 KN 201 (342)
T ss_pred ee
Confidence 65
No 62
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.64 E-value=0.00035 Score=72.54 Aligned_cols=81 Identities=17% Similarity=0.183 Sum_probs=46.8
Q ss_pred EEEeeEecCCCCCCcccccc-------ccceEEEEEEcCCEEEecccccCCCC-----eEEEEEe-cC---CcEEEEEEE
Q 007357 119 KVYCTHTAPDYSLPWQKQRQ-------YTSTGSAFMIGDGKLLTNAHCVEHYT-----QVKVKRR-GD---DTKYVAKVL 182 (606)
Q Consensus 119 ~I~~~~~~~~~~~Pw~~~~~-------~~~~GSGfvI~~g~ILTnaHvV~~~~-----~v~V~~~-~~---~~~~~a~vv 182 (606)
+|.+...+....+|++.... ....+.|-++..+||||+|||+.... .+.|... .| ++...++.+
T Consensus 32 rIigGs~Anag~~P~~VaLv~~isd~~s~tfCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr~i 111 (413)
T COG5640 32 RIIGGSNANAGEYPSLVALVDRISDYVSGTFCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVRTI 111 (413)
T ss_pred eEecCcccccccCchHHHHHhhcccccceeEeccceecceEEeeehhhccCCCCccccceEEEecccccccccCcceEEE
Confidence 55666666666677765432 12235578888889999999997643 1222220 11 122222322
Q ss_pred Ee-------ecCCCeEEEEecccc
Q 007357 183 AR-------GVDCDIALLSVESEE 199 (606)
Q Consensus 183 ~~-------d~~~DlAlLkv~~~~ 199 (606)
.. ....|+|++++....
T Consensus 112 ~~~efY~~~n~~ND~Av~~l~~~a 135 (413)
T COG5640 112 YVHEFYSPGNLGNDIAVLELARAA 135 (413)
T ss_pred eeecccccccccCcceeecccccc
Confidence 22 235799999998753
No 63
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.61 E-value=0.00013 Score=60.54 Aligned_cols=57 Identities=16% Similarity=0.237 Sum_probs=47.0
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHH--HHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHL--VDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~--v~~~~~~~v~l~v~r~ 539 (606)
.+++|..|.+++++..+|++.||+|++|||+++.++.+.... +... +..++|.+.|+
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~-~~~~~l~i~r~ 84 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKA-GGKVTLTVLRG 84 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhC-CCeEEEEEEeC
Confidence 468999999999999999999999999999999988665444 3333 44888888774
No 64
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.52 E-value=0.0009 Score=67.00 Aligned_cols=135 Identities=20% Similarity=0.253 Sum_probs=68.1
Q ss_pred cccccchhhccCCcEEEEeeEecCCCCCCccccccccceEEEEEE--c-CCEEEecccccCCCCeEEEEEecCCcEEEEE
Q 007357 104 ESGNLQDAAFLNAVVKVYCTHTAPDYSLPWQKQRQYTSTGSAFMI--G-DGKLLTNAHCVEHYTQVKVKRRGDDTKYVAK 180 (606)
Q Consensus 104 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~Pw~~~~~~~~~GSGfvI--~-~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~ 180 (606)
-...+|..+....+|.|++. +.|||-+. + +-.|||+.||+. ...-.|.. .+....
T Consensus 92 lEGa~Rt~~~~~N~v~V~Gs-----------------s~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~~~-- 149 (297)
T PF05579_consen 92 LEGAFRTNKPASNVVNVFGS-----------------SVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTRRM-- 149 (297)
T ss_dssp -------SS--TTEEEEESS-----------------SEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEEEE--
T ss_pred eechhcCCCCCCCeeEEEee-----------------cccccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceEEE--
Confidence 33445566666778888772 34555444 4 457899999998 55666655 333332
Q ss_pred EEEeecCCCeEEEEecccccccCCcccccCCCCCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEcc
Q 007357 181 VLARGVDCDIALLSVESEEFWKDAEPLCLGHLPRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDA 260 (606)
Q Consensus 181 vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da 260 (606)
..+...-|+|.-.+++ +....+.+++++. ..| .-+..-..-+..|.|..-.+ ++.
T Consensus 150 -~tF~~~GDfA~~~~~~--~~G~~P~~k~a~~-~~G-------rAyW~t~tGvE~G~ig~~~~------------~~f-- 204 (297)
T PF05579_consen 150 -LTFKKNGDFAEADITN--WPGAAPKYKFAQN-YTG-------RAYWLTSTGVEPGFIGGGGA------------VCF-- 204 (297)
T ss_dssp -EEEEEETTEEEEEETT--S-S---B--B-TT--SE-------EEEEEETTEEEEEEEETTEE------------EES--
T ss_pred -EEEeccCcEEEEECCC--CCCCCCceeecCC-ccc-------ceEEEcccCcccceecCceE------------EEE--
Confidence 2345567999998843 2235566666521 111 11100111234555542111 222
Q ss_pred CcCCCCCCCceEcCCCeEEEEEEeee
Q 007357 261 AINPGNSGGPAFNDKGECIGVAFQVY 286 (606)
Q Consensus 261 ~i~~G~SGGPl~n~~G~vVGI~~~~~ 286 (606)
..+||||+|++..+|.+|||++..-
T Consensus 205 -T~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 205 -TGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp -S-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred -cCCCCCCCccCcCCCCEEEEEecCC
Confidence 3589999999999999999998753
No 65
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.50 E-value=0.00057 Score=74.53 Aligned_cols=61 Identities=21% Similarity=0.351 Sum_probs=47.7
Q ss_pred CCcEEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCE
Q 007357 343 NEGVLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGT 413 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~ 413 (606)
.+.++|..|.|++||+.-||.||.|+-|||....+.... | .+-.....|+...|+|.|..+
T Consensus 39 etSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~ha---------F-AvQqLrksgK~A~ItvkRprk 99 (1027)
T KOG3580|consen 39 ETSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLHA---------F-AVQQLRKSGKVAAITVKRPRK 99 (1027)
T ss_pred ceeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHHH---------H-HHHHHHhhccceeEEecccce
Confidence 467899999999999999999999999999999886442 2 222223468888999988544
No 66
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.48 E-value=0.0025 Score=65.88 Aligned_cols=89 Identities=24% Similarity=0.337 Sum_probs=56.3
Q ss_pred CCCeEEEEecccccccCCcccccCCCC---CCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcC
Q 007357 187 DCDIALLSVESEEFWKDAEPLCLGHLP---RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAIN 263 (606)
Q Consensus 187 ~~DlAlLkv~~~~~~~~v~pl~l~~~~---~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~ 263 (606)
..+++||+++.+ +.....|++|++.. ..++.+.+.|+... .. +....+.-..... ....+......+
T Consensus 160 ~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~-~~--~~~~~~~i~~~~~------~~~~~~~~~~~~ 229 (282)
T PF03761_consen 160 PYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNST-GK--LKHRKLKITNCTK------CAYSICTKQYSC 229 (282)
T ss_pred ccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCC-Ce--EEEEEEEEEEeec------cceeEecccccC
Confidence 468999999987 55588999998754 34888999998222 21 2222222111110 112255566678
Q ss_pred CCCCCCceEc-CCC--eEEEEEEee
Q 007357 264 PGNSGGPAFN-DKG--ECIGVAFQV 285 (606)
Q Consensus 264 ~G~SGGPl~n-~~G--~vVGI~~~~ 285 (606)
.|++|||++. .+| .||||.+.+
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccC
Confidence 9999999983 344 599998754
No 67
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.45 E-value=0.00021 Score=76.04 Aligned_cols=58 Identities=10% Similarity=0.181 Sum_probs=51.5
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
..++|..|.+++||..+|++.||+|++|||++|.+| .++..++....++.+.|++.|+
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~ 121 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRA 121 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeC
Confidence 367899999999999999999999999999999987 5788888777788899998885
No 68
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.41 E-value=0.00044 Score=66.75 Aligned_cols=73 Identities=22% Similarity=0.264 Sum_probs=58.8
Q ss_pred EEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEeccc
Q 007357 346 VLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNPR 424 (606)
Q Consensus 346 v~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~~ 424 (606)
++|.+|.|+|||+. ||+.||.|+++....-.++..+ ..+. .+.+...++.+.++|.|.|+...+.++...|
T Consensus 141 a~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~l-------q~i~-~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~W 212 (231)
T KOG3129|consen 141 AVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPL-------QNIA-AVVQSNEDQIVSVTVIREGQKVVLSLTPKKW 212 (231)
T ss_pred EEEeecCCCChhhhhCcccCceEEEecccccccchhH-------HHHH-HHHHhccCcceeEEEecCCCEEEEEeCcccc
Confidence 57899999999999 9999999999998877776542 1222 2334557899999999999999999998887
Q ss_pred cc
Q 007357 425 VH 426 (606)
Q Consensus 425 ~~ 426 (606)
.-
T Consensus 213 ~G 214 (231)
T KOG3129|consen 213 QG 214 (231)
T ss_pred cC
Confidence 53
No 69
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.40 E-value=0.001 Score=61.40 Aligned_cols=87 Identities=22% Similarity=0.240 Sum_probs=56.2
Q ss_pred CccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCC-CCEEEEECCEEeCCCCCccccchhhHHHHHHH
Q 007357 318 GFPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKE-GDVIVSFDDVCVGSEGTVPFRSNERIAFRYLI 395 (606)
Q Consensus 318 g~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~-GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l 395 (606)
+.+.||++++.-.... ....+.-|.+|.|+|||+. ||++ .|.|+.+|+..+.+..+ |..++
T Consensus 24 ~~g~LG~sv~~~~~~~-------~~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~----------l~~~v 86 (138)
T PF04495_consen 24 GQGLLGISVRFESFEG-------AEEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDD----------LFELV 86 (138)
T ss_dssp SSSSS-EEEEEEE-TT-------GCCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCH----------HHHHH
T ss_pred CCCCCcEEEEEecccc-------cccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHH----------HHHHH
Confidence 4577898887642110 1235888999999999999 9999 69999999988886554 66777
Q ss_pred hhcCCCCEEEEEEEE--CCEEEEEEEEec
Q 007357 396 SQKFAGDVAELGIIR--AGTFMKVKVVLN 422 (606)
Q Consensus 396 ~~~~~g~~v~l~V~R--~G~~~~v~v~l~ 422 (606)
... .++.+.|.|.. ....+++++.+.
T Consensus 87 ~~~-~~~~l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 87 EAN-ENKPLQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp HHT-TTS-EEEEEEETTTTCEEEEEE---
T ss_pred HHc-CCCcEEEEEEECCCCeEEEEEEEcC
Confidence 664 78999999986 344556666554
No 70
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.34 E-value=0.00035 Score=75.84 Aligned_cols=58 Identities=7% Similarity=0.071 Sum_probs=51.0
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
.+++|..|.+++||..+|+++||+|++|||++|.++ .++..+++...+..+.|++.|+
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~ 161 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRG 161 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEEC
Confidence 367899999999999999999999999999999864 7788888776788889988875
No 71
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.32 E-value=0.00048 Score=74.00 Aligned_cols=48 Identities=25% Similarity=0.380 Sum_probs=45.5
Q ss_pred cccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 492 NEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 492 ~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
++||..+|++.||+|++|||++|.+|++|.+++++.+++.+.+++.|+
T Consensus 123 ~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~ 170 (402)
T TIGR02860 123 HSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERG 170 (402)
T ss_pred CCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEEC
Confidence 589999999999999999999999999999999999899999999886
No 72
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.30 E-value=0.0005 Score=75.81 Aligned_cols=122 Identities=16% Similarity=0.221 Sum_probs=79.5
Q ss_pred eCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEEeccccccc
Q 007357 351 VEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVVLNPRVHLV 428 (606)
Q Consensus 351 V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~l~~~~~l~ 428 (606)
...++||+. .|-.||.|++|||..+-.. ++ -..+.++.....-..|+|+|.+=--..++.|. .
T Consensus 680 mm~~GpAarsgkLnIGDQiiaING~SLVGL---PL-----stcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~--R----- 744 (829)
T KOG3605|consen 680 MMHGGPAARSGKLNIGDQIMSINGTSLVGL---PL-----STCQSIIKGLKNQTAVKLNIVSCPPVTTVLIR--R----- 744 (829)
T ss_pred cccCChhhhcCCccccceeEeecCceeccc---cH-----HHHHHHHhcccccceEEEEEecCCCceEEEee--c-----
Confidence 346789998 7999999999999876543 21 11344566554556678877653333333222 1
Q ss_pred ccccCCCCCcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEe
Q 007357 429 PYHIDGGQPSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLK 508 (606)
Q Consensus 429 p~~~~~~~p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~ 508 (606)
|...-.+||.++. + ||-+.+-+++|++-|.+.|-+|++
T Consensus 745 --------Pd~kyQLGFSVQN---------------------------------G-iICSLlRGGIAERGGVRVGHRIIE 782 (829)
T KOG3605|consen 745 --------PDLRYQLGFSVQN---------------------------------G-IICSLLRGGIAERGGVRVGHRIIE 782 (829)
T ss_pred --------ccchhhccceeeC---------------------------------c-EeehhhcccchhccCceeeeeEEE
Confidence 1111135554432 1 677788999999999999999999
Q ss_pred eCCeecCCHHH--HHHHHHhcCC
Q 007357 509 FNGTRIKNIHH--LAHLVDSCKD 529 (606)
Q Consensus 509 VNG~~V~~l~~--l~~~v~~~~~ 529 (606)
|||+.|--..| ++++|...-+
T Consensus 783 INgQSVVA~pHekIV~lLs~aVG 805 (829)
T KOG3605|consen 783 INGQSVVATPHEKIVQLLSNAVG 805 (829)
T ss_pred ECCceEEeccHHHHHHHHHHhhh
Confidence 99999754433 5666655433
No 73
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.28 E-value=0.0013 Score=75.63 Aligned_cols=83 Identities=19% Similarity=0.148 Sum_probs=55.8
Q ss_pred ccccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc--CCCCCCEEEEEC--CEEeCCCCCccccchhhHHHHHH
Q 007357 319 FPCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN--ILKEGDVIVSFD--DVCVGSEGTVPFRSNERIAFRYL 394 (606)
Q Consensus 319 ~~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~--gLq~GDvIlaIn--G~~V~~~~~v~~~~~eri~~~~~ 394 (606)
+.-+|+.++.- ..+++|.+|.|+|||++ ||++||+|++|| |.++.+..... -. ....+
T Consensus 243 ~~GIGa~l~~~-------------~~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~--~~---~vv~l 304 (667)
T PRK11186 243 LEGIGAVLQMD-------------DDYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWR--LD---DVVAL 304 (667)
T ss_pred eeEEEEEEEEe-------------CCeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCC--HH---HHHHH
Confidence 34478887643 24688999999999997 899999999999 55544321110 00 12344
Q ss_pred HhhcCCCCEEEEEEEEC---CEEEEEEEE
Q 007357 395 ISQKFAGDVAELGIIRA---GTFMKVKVV 420 (606)
Q Consensus 395 l~~~~~g~~v~l~V~R~---G~~~~v~v~ 420 (606)
+.. ..|.+|.|+|.|+ |+..+++++
T Consensus 305 irG-~~Gt~V~LtV~r~~~~~~~~~vtl~ 332 (667)
T PRK11186 305 IKG-PKGSKVRLEILPAGKGTKTRIVTLT 332 (667)
T ss_pred hcC-CCCCEEEEEEEeCCCCCceEEEEEE
Confidence 433 4799999999993 455666554
No 74
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.20 E-value=0.0005 Score=74.37 Aligned_cols=57 Identities=18% Similarity=0.217 Sum_probs=47.5
Q ss_pred EEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEe-c-ceEEEEe
Q 007357 486 LSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFE-D-NYLAVLE 545 (606)
Q Consensus 486 Is~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~-r-~~~ivl~ 545 (606)
|..|.++|+|+.+|+++||+|++|||++|.+|.++..++. ++.+.+++. | ++...++
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~rdGe~~~l~ 60 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGESHQIE 60 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcCCCeEEEEE
Confidence 5678899999999999999999999999999999988874 467888885 3 4544544
No 75
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.99 E-value=0.0016 Score=70.96 Aligned_cols=84 Identities=19% Similarity=0.234 Sum_probs=64.7
Q ss_pred cceeeeeccchhhhccccCC---CCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhh
Q 007357 322 LGVLLQKLENPALRTCLKVP---SNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQ 397 (606)
Q Consensus 322 LGi~~q~~e~~~l~~~lgl~---~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~ 397 (606)
.|+++..+..+ .-+||+. ...+.+|..|.++|||.+ ||.+||.|++|||. +. .+..
T Consensus 439 ~gL~~~~~~~~--~~~LGl~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~---s~---------------~l~~ 498 (558)
T COG3975 439 FGLTFTPKPRE--AYYLGLKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI---SD---------------QLDR 498 (558)
T ss_pred cceEEEecCCC--CcccceEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc---cc---------------cccc
Confidence 57777765222 4466664 235678999999999999 99999999999999 21 2334
Q ss_pred cCCCCEEEEEEEECCEEEEEEEEecccc
Q 007357 398 KFAGDVAELGIIRAGTFMKVKVVLNPRV 425 (606)
Q Consensus 398 ~~~g~~v~l~V~R~G~~~~v~v~l~~~~ 425 (606)
...++.+++++.|.|..+++.+++....
T Consensus 499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~~ 526 (558)
T COG3975 499 YKVNDKIQVHVFREGRLREFLVKLGGDP 526 (558)
T ss_pred cccccceEEEEccCCceEEeecccCCCc
Confidence 5578999999999999999988876643
No 76
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=96.99 E-value=0.00096 Score=65.69 Aligned_cols=66 Identities=18% Similarity=0.164 Sum_probs=52.1
Q ss_pred cEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECCEEEEEEEE
Q 007357 345 GVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAGTFMKVKVV 420 (606)
Q Consensus 345 Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G~~~~v~v~ 420 (606)
|..+.-..+++..+. |||+||+.++||+..+++..+ +..++++...-+.+.++|+|+|+.+++.|.
T Consensus 208 Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~----------m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 208 GYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPED----------MFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHH----------HHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 444434445566777 999999999999999998766 446677766678899999999999988764
No 77
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=96.96 E-value=0.0058 Score=65.01 Aligned_cols=147 Identities=18% Similarity=0.212 Sum_probs=101.6
Q ss_pred CCCcEEEEEeCCCChhhc-CCCC-CCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECC--EEEEE
Q 007357 342 SNEGVLVRRVEPTSDANN-ILKE-GDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAG--TFMKV 417 (606)
Q Consensus 342 ~~~Gv~V~~V~p~spA~~-gLq~-GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G--~~~~v 417 (606)
...|.-|-+|..+|+|++ ||.+ -|.|++|||..+..+.+. |..++.... +.|+++|.--. ..+.+
T Consensus 13 gteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~---------Lk~llk~~s--ekVkltv~n~kt~~~R~v 81 (462)
T KOG3834|consen 13 GTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDT---------LKALLKANS--EKVKLTVYNSKTQEVRIV 81 (462)
T ss_pred CceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHH---------HHHHHHhcc--cceEEEEEecccceeEEE
Confidence 346888999999999999 9887 689999999999987663 566666543 33999987422 23344
Q ss_pred EEEecccccccccccCCCCCcceeecceE--EecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccc
Q 007357 418 KVVLNPRVHLVPYHIDGGQPSYLIIAGLV--FTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVS 495 (606)
Q Consensus 418 ~v~l~~~~~l~p~~~~~~~p~y~~~~Glv--f~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a 495 (606)
+|+..... .. . +.|.. |.....+ -+.+-=|-+|.+++||
T Consensus 82 ~I~ps~~w--------gg--q---llGvsvrFcsf~~A--------------------------~~~vwHvl~V~p~SPa 122 (462)
T KOG3834|consen 82 EIVPSNNW--------GG--Q---LLGVSVRFCSFDGA--------------------------VESVWHVLSVEPNSPA 122 (462)
T ss_pred Eecccccc--------cc--c---ccceEEEeccCccc--------------------------hhheeeeeecCCCCHH
Confidence 44433211 00 0 23333 2221111 1223346688899999
Q ss_pred cccCCC-CCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEec
Q 007357 496 IGYEDM-SNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 496 ~~agl~-~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r 538 (606)
..||+. .+|.|+-+-.......+||..+|+.+.++.+.+-+.+
T Consensus 123 alAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN 166 (462)
T KOG3834|consen 123 ALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYN 166 (462)
T ss_pred HhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEee
Confidence 999999 7899999966666788999999999999998886643
No 78
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.73 E-value=0.0021 Score=54.74 Aligned_cols=74 Identities=24% Similarity=0.373 Sum_probs=47.8
Q ss_pred cccccchhhhHHhhcCcccCccccceeeee-cc-chhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEe
Q 007357 299 PTTVVSHFLSDYERNGKYTGFPCLGVLLQK-LE-NPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCV 375 (606)
Q Consensus 299 P~~~i~~~L~~l~~~g~~~g~~~LGi~~q~-~e-~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V 375 (606)
.+.+-.|-++++.+.|+.. -.+|+..-- +. +|. +.-++. ...|++|.+|..+|||+. ||+.+|.|+.+||-..
T Consensus 16 si~velHK~~~~d~~Gre~--l~~GFkIGGGIDQDp~-k~Pf~y-tD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~Df 91 (124)
T KOG3553|consen 16 SIRVELHKLRDYDQQGREN--LILGFKIGGGIDQDPS-KNPFSY-TDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDF 91 (124)
T ss_pred EEEEEeeeehhhhcCCcEE--EEEEEEeccccCCCcc-cCCCCc-CCccEEEEEeccCChhhhhcceecceEEEecCcee
Confidence 3344456677777777553 234444321 10 111 112232 247999999999999999 9999999999999764
Q ss_pred C
Q 007357 376 G 376 (606)
Q Consensus 376 ~ 376 (606)
+
T Consensus 92 T 92 (124)
T KOG3553|consen 92 T 92 (124)
T ss_pred E
Confidence 3
No 79
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.72 E-value=0.003 Score=58.37 Aligned_cols=58 Identities=16% Similarity=0.122 Sum_probs=47.7
Q ss_pred CcEEEEEEecccccccccCCCC-CceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEec
Q 007357 481 EQMVILSQVLANEVSIGYEDMS-NQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~-gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r 538 (606)
..+.-|..|.++|||+.+||++ .|.|+.+|+..+.+.++|.++++++.++.+.|.+-+
T Consensus 42 ~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyn 100 (138)
T PF04495_consen 42 EEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYN 100 (138)
T ss_dssp CCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEE
T ss_pred cceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEE
Confidence 4567889999999999999998 699999999999999999999999999999998854
No 80
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.70 E-value=0.042 Score=52.80 Aligned_cols=138 Identities=21% Similarity=0.313 Sum_probs=79.3
Q ss_pred cceEEEEEEcCCEEEecccccCCCCeEEEEEecCCcEEEE--EEEEeecC---CCeEEEEecccccccCCcc-cccCCCC
Q 007357 140 TSTGSAFMIGDGKLLTNAHCVEHYTQVKVKRRGDDTKYVA--KVLARGVD---CDIALLSVESEEFWKDAEP-LCLGHLP 213 (606)
Q Consensus 140 ~~~GSGfvI~~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a--~vv~~d~~---~DlAlLkv~~~~~~~~v~p-l~l~~~~ 213 (606)
...++++.|-++++|.+.| -.....+.+ ++..++. .+...+.+ .||++++++..+-+.++.. +. ...+
T Consensus 24 ~~t~l~~gi~~~~~lvp~H-~~~~~~i~i----~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~~ 97 (172)
T PF00548_consen 24 EFTMLALGIYDRYFLVPTH-EEPEDTIYI----DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESIP 97 (172)
T ss_dssp EEEEEEEEEEBTEEEEEGG-GGGCSEEEE----TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSGG
T ss_pred eEEEecceEeeeEEEEECc-CCCcEEEEE----CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-cccc
Confidence 4557788999999999999 223333333 3444432 33334544 5999999986532334433 22 1222
Q ss_pred CCCCeEEEEeecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCCCCCCceEcC---CCeEEEEEEee
Q 007357 214 RLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFND---KGECIGVAFQV 285 (606)
Q Consensus 214 ~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~---~G~vVGI~~~~ 285 (606)
...+.+.++-.+ ......+..+-++..+... ..+......+..+++..+|.-||||+.. .++++||+.++
T Consensus 98 ~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i~-~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 98 EYPECVLLVNST-KFPRMIVEVGFVTNFGFIN-LSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TEEEEEEEEESS-SSTCEEEEEEEEEEEEEEE-ETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCCcEEEEECC-CCccEEEEEEEEeecCccc-cCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 333444444332 2232344555555544331 1222223467888888899999999862 57899999875
No 81
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=96.64 E-value=0.0035 Score=68.37 Aligned_cols=57 Identities=7% Similarity=0.099 Sum_probs=52.1
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEec
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~r 538 (606)
..+.+.++.+++||.++|+++||+|++|||+++.+. ++.++.++..++..++|++.|
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r 170 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILR 170 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEE
Confidence 567899999999999999999999999999999988 568888888888899999988
No 82
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=96.51 E-value=0.0056 Score=65.42 Aligned_cols=60 Identities=12% Similarity=0.120 Sum_probs=55.1
Q ss_pred CCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcC-CceEEEEEecc
Q 007357 480 GEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCK-DKYLVFEFEDN 539 (606)
Q Consensus 480 ~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~-~~~v~l~v~r~ 539 (606)
...++++..|.+++|+.++|++.||+|+++||+++.+..++.+.+..+. +..+.+++.|+
T Consensus 268 ~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~ 328 (347)
T COG0265 268 VAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRG 328 (347)
T ss_pred CCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEEC
Confidence 4567899999999999999999999999999999999999999998765 77999999886
No 83
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.28 E-value=0.016 Score=49.26 Aligned_cols=58 Identities=9% Similarity=0.083 Sum_probs=41.5
Q ss_pred cEEEEEEeccc--------ccccccC--CCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecc
Q 007357 482 QMVILSQVLAN--------EVSIGYE--DMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 482 ~~VvIs~Vl~~--------s~a~~ag--l~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
....|..+.++ ||-...| ++.||.|++|||+++..-.++.+++....++.+.|++.+.
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~ 79 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRK 79 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-S
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecC
Confidence 44567777765 3444444 4699999999999999999999999999999999999874
No 84
>PF12812 PDZ_1: PDZ-like domain
Probab=96.04 E-value=0.0096 Score=49.48 Aligned_cols=58 Identities=16% Similarity=0.081 Sum_probs=48.4
Q ss_pred cccceeeeeccchhhhccccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCC
Q 007357 320 PCLGVLLQKLENPALRTCLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGT 380 (606)
Q Consensus 320 ~~LGi~~q~~e~~~l~~~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~ 380 (606)
-++|-.++.+ +-+.++.++++-. +++.....++++.. ++..|-+|++|||+++.+..+
T Consensus 9 ~~~Ga~f~~L-s~q~aR~~~~~~~--gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~ 67 (78)
T PF12812_consen 9 EVCGAVFHDL-SYQQARQYGIPVG--GVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDD 67 (78)
T ss_pred EEcCeecccC-CHHHHHHhCCCCC--EEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHH
Confidence 4689999998 7888888888753 45556688999988 599999999999999999765
No 85
>PRK11186 carboxy-terminal protease; Provisional
Probab=95.84 E-value=0.014 Score=67.20 Aligned_cols=57 Identities=11% Similarity=0.205 Sum_probs=48.7
Q ss_pred cEEEEEEeccccccccc-CCCCCceEEeeC--CeecCC-----HHHHHHHHHhcCCceEEEEEec
Q 007357 482 QMVILSQVLANEVSIGY-EDMSNQQVLKFN--GTRIKN-----IHHLAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~a-gl~~gD~I~~VN--G~~V~~-----l~~l~~~v~~~~~~~v~l~v~r 538 (606)
..++|..|.+++||.++ |+++||+|++|| |+++.+ +++++++|+..++..++|++.|
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r 319 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILP 319 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEe
Confidence 45788999999999998 999999999999 555443 4588999998889999999976
No 86
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=95.59 E-value=0.15 Score=51.06 Aligned_cols=100 Identities=20% Similarity=0.293 Sum_probs=71.5
Q ss_pred CCCCCCcccccc--ccceEEEEEEcCCEEEecccccCCC----CeEEEEEecCCcEEEE------EEEEee-----cCCC
Q 007357 127 PDYSLPWQKQRQ--YTSTGSAFMIGDGKLLTNAHCVEHY----TQVKVKRRGDDTKYVA------KVLARG-----VDCD 189 (606)
Q Consensus 127 ~~~~~Pw~~~~~--~~~~GSGfvI~~g~ILTnaHvV~~~----~~v~V~~~~~~~~~~a------~vv~~d-----~~~D 189 (606)
..|.|||....- ..-.++|++|+..|||++..|+.+- ..+.+.+ +.++.+.- ++..+| ++.+
T Consensus 12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~Yvsall-G~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~ 90 (267)
T PF09342_consen 12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALL-GGGKTYLSVDGPHEQISRVDCFKDVPESN 90 (267)
T ss_pred ccccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEe-cCcceecccCCChheEEEeeeeeeccccc
Confidence 467899987643 3456999999999999999999863 4577766 55553321 232233 6789
Q ss_pred eEEEEeccc-ccccCCcccccCCC--C-CCCCeEEEEeecCC
Q 007357 190 IALLSVESE-EFWKDAEPLCLGHL--P-RLQDAVTVVGYPLG 227 (606)
Q Consensus 190 lAlLkv~~~-~~~~~v~pl~l~~~--~-~lG~~V~viG~p~g 227 (606)
++||.++.+ .|..-+.|+-+.+. + ...+.+++||....
T Consensus 91 v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~~ 132 (267)
T PF09342_consen 91 VLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDDT 132 (267)
T ss_pred eeeeeecCcccceeeecccccccccCCCCCCCceEEEEcccC
Confidence 999999987 56777888877652 1 23568999997663
No 87
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, displays extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.54 E-value=0.011 Score=58.17 Aligned_cols=137 Identities=19% Similarity=0.233 Sum_probs=48.8
Q ss_pred CCEEEecccccCCCCeEEEEEecCCcEEEE---EEEEeecCCCeEEEEecccccc--cCCcccccCCCCCC-CCeEEEEe
Q 007357 150 DGKLLTNAHCVEHYTQVKVKRRGDDTKYVA---KVLARGVDCDIALLSVESEEFW--KDAEPLCLGHLPRL-QDAVTVVG 223 (606)
Q Consensus 150 ~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a---~vv~~d~~~DlAlLkv~~~~~~--~~v~pl~l~~~~~l-G~~V~viG 223 (606)
...++|++||......+.... ++.+++- +.+..+...|++||++. +.++ -.+..+.|.....+ ...+.+.+
T Consensus 41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P-~n~~s~Lg~k~~~~~~~~~~~~g~~~~y~ 117 (203)
T PF02122_consen 41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGP-PNWESKLGVKAAQLSQNSQLAKGPVSFYG 117 (203)
T ss_dssp -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE---HHHHHHHT-----B----SEEEEESSTTS
T ss_pred ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecC-cCHHHHhCcccccccchhhhCCCCeeeee
Confidence 568999999999866665554 4455443 45567788999999998 4332 13444444322111 00011101
Q ss_pred ecCCCCcceEEeeEEeeeeeeeecCCCcceeEEEEccCcCCCCCCCceEcCCCeEEEEEEeeecccccceeeeeecccc
Q 007357 224 YPLGGDTISVTKGVVSRIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQVYRSEEVENIGYVIPTTV 302 (606)
Q Consensus 224 ~p~g~~~~svt~GvVs~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~ 302 (606)
+..+ +..+-... |. +.... +...-+...+|.||.|+|+.. ++||++.+..+....+|.++..|+.-
T Consensus 118 ~~~~-~~~~~sa~-i~---------g~~~~-~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~ 183 (203)
T PF02122_consen 118 FSSG-EWPCSSAK-IP---------GTEGK-FASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPP 183 (203)
T ss_dssp EEEE-EEEEEE-S--------------STT-EEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------
T ss_pred ecCC-CceeccCc-cc---------cccCc-CCceEcCCCCCCCCCCeEECC-CceEeecCcccccccccccccccccc
Confidence 0000 00011111 11 11111 345556778999999999976 89999998644455667777777644
No 88
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=95.34 E-value=0.042 Score=53.41 Aligned_cols=63 Identities=17% Similarity=0.250 Sum_probs=51.2
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCC---HHHHHHHHHhcCCceEEEEEecc-eEEEE
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKN---IHHLAHLVDSCKDKYLVFEFEDN-YLAVL 544 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~---l~~l~~~v~~~~~~~v~l~v~r~-~~ivl 544 (606)
..++|++|.++|||..+||..||.|+++....--| +..+...++...++.+.+++.|. +.++|
T Consensus 139 ~Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L 205 (231)
T KOG3129|consen 139 PFAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVL 205 (231)
T ss_pred ceEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEE
Confidence 46799999999999999999999999987666555 45566667888899999999874 55555
No 89
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.31 E-value=0.11 Score=58.57 Aligned_cols=117 Identities=21% Similarity=0.272 Sum_probs=72.5
Q ss_pred CCCeEEEEecccc-----cccCC------cccccCCC--------CCCCCeEEEEeecCCCCcceEEeeEEeeeeeeeec
Q 007357 187 DCDIALLSVESEE-----FWKDA------EPLCLGHL--------PRLQDAVTVVGYPLGGDTISVTKGVVSRIEVTSYA 247 (606)
Q Consensus 187 ~~DlAlLkv~~~~-----~~~~v------~pl~l~~~--------~~lG~~V~viG~p~g~~~~svt~GvVs~~~~~~~~ 247 (606)
-.|+||++++... +.+++ +.+.+.+. ...|.+|+=+|.-.| .|.|.|..+....+.
T Consensus 542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg-----yT~G~lNg~klvyw~ 616 (695)
T PF08192_consen 542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG-----YTTGILNGIKLVYWA 616 (695)
T ss_pred ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC-----ccceEecceEEEEec
Confidence 3599999998542 12222 22333221 123889999987544 567888877644333
Q ss_pred CCCcce-eEEEEc----cCcCCCCCCCceEcCCCe------EEEEEEeeecccccceeeeeecccccchhhhHH
Q 007357 248 HGSSEL-LGIQID----AAINPGNSGGPAFNDKGE------CIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDY 310 (606)
Q Consensus 248 ~~~~~~-~~iq~d----a~i~~G~SGGPl~n~~G~------vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l 310 (606)
.+.... .++... +=...|+||+-|++.-+. |+||.++. .+....+|.+.|+..|..-|+++
T Consensus 617 dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsy--dge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 617 DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSY--DGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeec--CCccceeeccCcHHHHHHHHHHh
Confidence 443221 222222 112579999999985333 99999863 35566799999998877666654
No 90
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.31 E-value=0.012 Score=67.89 Aligned_cols=18 Identities=44% Similarity=0.698 Sum_probs=14.2
Q ss_pred EEEEEEc-CCEEEeccccc
Q 007357 143 GSAFMIG-DGKLLTNAHCV 160 (606)
Q Consensus 143 GSGfvI~-~g~ILTnaHvV 160 (606)
|||.+|+ +|+||||+||+
T Consensus 49 CSgsfVS~~GLvlTNHHC~ 67 (698)
T PF10459_consen 49 CSGSFVSPDGLVLTNHHCG 67 (698)
T ss_pred eeEEEEcCCceEEecchhh
Confidence 7777886 78888888887
No 91
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.05 E-value=0.087 Score=61.08 Aligned_cols=43 Identities=26% Similarity=0.395 Sum_probs=25.8
Q ss_pred CCCeEEEEeccc------cccc---CCcc---cccC-CCCCCCCeEEEEeecCCCC
Q 007357 187 DCDIALLSVESE------EFWK---DAEP---LCLG-HLPRLQDAVTVVGYPLGGD 229 (606)
Q Consensus 187 ~~DlAlLkv~~~------~~~~---~v~p---l~l~-~~~~lG~~V~viG~p~g~~ 229 (606)
..|++++|+=.. ++.. ++.| +++. .-.+-|+-|+|+|||....
T Consensus 199 tgDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~T~ 254 (698)
T PF10459_consen 199 TGDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGRTN 254 (698)
T ss_pred CCceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCccc
Confidence 469999999432 2211 2222 2232 1235599999999996544
No 92
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=94.89 E-value=0.046 Score=50.01 Aligned_cols=31 Identities=35% Similarity=0.556 Sum_probs=22.5
Q ss_pred EccCcCCCCCCCceEcCCCeEEEEEEeeecc
Q 007357 258 IDAAINPGNSGGPAFNDKGECIGVAFQVYRS 288 (606)
Q Consensus 258 ~da~i~~G~SGGPl~n~~G~vVGI~~~~~~~ 288 (606)
++..+.+|.||+|+||.+|++|||.+.+...
T Consensus 90 ~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 90 IDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp E---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 4445779999999999999999999887643
No 93
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=94.89 E-value=0.14 Score=47.19 Aligned_cols=66 Identities=20% Similarity=0.241 Sum_probs=48.7
Q ss_pred cEEEEEEeccccccccc-CCCCCceEEeeCCeecCCHHH--HHHHHHhcCCceEEEEEecceEEEEehHHH
Q 007357 482 QMVILSQVLANEVSIGY-EDMSNQQVLKFNGTRIKNIHH--LAHLVDSCKDKYLVFEFEDNYLAVLEREAA 549 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~a-gl~~gD~I~~VNG~~V~~l~~--l~~~v~~~~~~~v~l~v~r~~~ivl~~~~~ 549 (606)
..++|+.+.|++.|.+- ||+.||++++|||..|..-.| .++++++..+ .+.|.+ |-.+-||+.-++
T Consensus 115 spiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvv-rytpkvleeme~ 183 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVV-RYTPKVLEEMEA 183 (207)
T ss_pred CceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEE-ecChHHHHHHHH
Confidence 45689999999999876 799999999999999987665 6677877655 556654 444444444433
No 94
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=94.72 E-value=0.05 Score=55.90 Aligned_cols=44 Identities=5% Similarity=0.002 Sum_probs=39.7
Q ss_pred cccCCCCCceEEeeCCeecCCHHHHHHHHHhcCC-ceEEEEEecc
Q 007357 496 IGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKD-KYLVFEFEDN 539 (606)
Q Consensus 496 ~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~-~~v~l~v~r~ 539 (606)
..+||++||++++|||.++.+.++..+++++..+ ..++|+++|+
T Consensus 221 ~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRd 265 (276)
T PRK09681 221 DASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRK 265 (276)
T ss_pred HHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEEC
Confidence 4779999999999999999999999999987654 5899999996
No 95
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=94.64 E-value=0.086 Score=48.48 Aligned_cols=37 Identities=32% Similarity=0.480 Sum_probs=33.3
Q ss_pred CCcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCC
Q 007357 343 NEGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEG 379 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~ 379 (606)
+.-++|.+|.|++.|+. ||+-||.+++|||..|....
T Consensus 114 nspiyisriipggvadrhgglkrgdqllsvngvsvege~ 152 (207)
T KOG3550|consen 114 NSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEH 152 (207)
T ss_pred CCceEEEeecCCccccccCcccccceeEeecceeecchh
Confidence 45799999999999998 99999999999999997653
No 96
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=94.46 E-value=0.096 Score=54.15 Aligned_cols=57 Identities=16% Similarity=0.194 Sum_probs=50.0
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhc-CCceEEEEEec
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSC-KDKYLVFEFED 538 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~-~~~~v~l~v~r 538 (606)
..+|++..|..++++.+- |+.||.|++|||+++.+.++|.+++++. .++.++++++|
T Consensus 129 y~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r 186 (342)
T COG3480 129 YAGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER 186 (342)
T ss_pred EeeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence 368888899888888764 7899999999999999999999999864 57799999986
No 97
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=94.16 E-value=0.038 Score=47.27 Aligned_cols=45 Identities=11% Similarity=0.138 Sum_probs=36.1
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHH--HHHHHH
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHH--LAHLVD 525 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~--l~~~v~ 525 (606)
+.+++++.|..+|||+.|||+.+|.|+.|||-...-..| .++.|+
T Consensus 58 D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~ 104 (124)
T KOG3553|consen 58 DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRIT 104 (124)
T ss_pred CccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhh
Confidence 568899999999999999999999999999976544333 344444
No 98
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.05 E-value=0.098 Score=58.58 Aligned_cols=57 Identities=12% Similarity=0.134 Sum_probs=50.6
Q ss_pred cCCcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCceEEEE
Q 007357 479 EGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFE 535 (606)
Q Consensus 479 ~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~ 535 (606)
.+.+.|-|-.|+++++|.++.+++||++++|||.||++.++..+.++...+....|.
T Consensus 395 ~~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l~ 451 (1051)
T KOG3532|consen 395 NTNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVLV 451 (1051)
T ss_pred CCceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEEE
Confidence 356788899999999999999999999999999999999999999998777654444
No 99
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=93.92 E-value=0.082 Score=61.07 Aligned_cols=57 Identities=23% Similarity=0.246 Sum_probs=45.1
Q ss_pred CcEEEEEeCCCChhhcCCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEE
Q 007357 344 EGVLVRRVEPTSDANNILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIR 410 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R 410 (606)
.-|+|..|.+|+|+...|++||.|++|||++|.+.- .|| ..+++... .+.|.|+|.+
T Consensus 75 rPviVr~VT~GGps~GKL~PGDQIl~vN~Epv~dap------rer--vIdlvRac--e~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAP------RER--VIDLVRAC--ESSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCccccccCCCeEEEecCccccccc------HHH--HHHHHHHH--hhhcceEEec
Confidence 458899999999999999999999999999998742 233 33566654 4678888877
No 100
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=93.32 E-value=0.082 Score=58.11 Aligned_cols=60 Identities=20% Similarity=0.230 Sum_probs=46.0
Q ss_pred CcceeecceEEecCChHHHhhhcccccCeeehhhhhchhhhhcCCcEEEEEEecccccccccCCCCCceEEeeCCe
Q 007357 437 PSYLIIAGLVFTPLSEPLIEEECDDSIGLKLLAKARYSLARFEGEQMVILSQVLANEVSIGYEDMSNQQVLKFNGT 512 (606)
Q Consensus 437 p~y~~~~Glvf~pl~~~~~~~~~~~~~g~~l~~~~~~~~~~~~~~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~ 512 (606)
+.++...||+|.+...+ .-+.|+++-. .....+|+.|.+++||.+||+.+||.|++|||.
T Consensus 433 ~~~l~~~gL~~~~~~~~------~~~LGl~v~~----------~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~ 492 (558)
T COG3975 433 NPLLERFGLTFTPKPRE------AYYLGLKVKS----------EGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI 492 (558)
T ss_pred hhhhhhcceEEEecCCC------CcccceEecc----------cCCeeEEEecCCCChhHhccCCCccEEEEEcCc
Confidence 55666789999887765 1234554432 233458999999999999999999999999999
No 101
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=92.38 E-value=0.18 Score=58.37 Aligned_cols=55 Identities=25% Similarity=0.360 Sum_probs=45.8
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeCCeecCC--HHHHHHHHHhcCCceEEEEEec
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKN--IHHLAHLVDSCKDKYLVFEFED 538 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~--l~~l~~~v~~~~~~~v~l~v~r 538 (606)
..|||..|.+++|+.|- |++||+|+.|||++|++ |++.++++++|+. .|.|+|-+
T Consensus 75 rPviVr~VT~GGps~GK-L~PGDQIl~vN~Epv~daprervIdlvRace~-sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGK-LQPGDQILAVNGEPVKDAPRERVIDLVRACES-SVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCcccc-ccCCCeEEEecCcccccccHHHHHHHHHHHhh-hcceEEec
Confidence 45689999999999985 88999999999999976 6899999999876 44555544
No 102
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=91.81 E-value=0.37 Score=54.24 Aligned_cols=38 Identities=39% Similarity=0.491 Sum_probs=34.3
Q ss_pred CCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCC
Q 007357 343 NEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGT 380 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~ 380 (606)
...|.|..|.|+++|.+ .|++||++++|||.||.+..+
T Consensus 397 ~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q 435 (1051)
T KOG3532|consen 397 NRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQ 435 (1051)
T ss_pred ceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHH
Confidence 35677889999999999 999999999999999998765
No 103
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=91.72 E-value=0.45 Score=43.15 Aligned_cols=28 Identities=25% Similarity=0.580 Sum_probs=23.9
Q ss_pred ccCcCCCCCCCceEcCCCeEEEEEEeee
Q 007357 259 DAAINPGNSGGPAFNDKGECIGVAFQVY 286 (606)
Q Consensus 259 da~i~~G~SGGPl~n~~G~vVGI~~~~~ 286 (606)
...-.+|+||-|++|..|+||||+..+.
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 4455899999999999999999998865
No 104
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=91.62 E-value=0.28 Score=52.00 Aligned_cols=51 Identities=10% Similarity=0.126 Sum_probs=45.7
Q ss_pred cCCcEEEEEEeccccccccc-CCCCCceEEeeCCeecCCHHHHHHHHHhcCC
Q 007357 479 EGEQMVILSQVLANEVSIGY-EDMSNQQVLKFNGTRIKNIHHLAHLVDSCKD 529 (606)
Q Consensus 479 ~~~~~VvIs~Vl~~s~a~~a-gl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~ 529 (606)
...++|.+.+|...||..++ ||..||+|+++||-+|.+.+++.+.++...+
T Consensus 217 a~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~tsl~ 268 (484)
T KOG2921|consen 217 AHGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLATSLD 268 (484)
T ss_pred hcCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHhhcc
Confidence 45788999999999999888 8999999999999999999999999987443
No 105
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=91.41 E-value=0.21 Score=55.96 Aligned_cols=36 Identities=31% Similarity=0.353 Sum_probs=32.5
Q ss_pred CCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCC
Q 007357 343 NEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSE 378 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~ 378 (606)
.-|++|.+|.|++.|+. ||+-||.|+.|||+...+.
T Consensus 561 GfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfeni 597 (1283)
T KOG3542|consen 561 GFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENI 597 (1283)
T ss_pred cceeEEeeecCCchHHHhhhhhhhhhhhccccchhhh
Confidence 35899999999999998 9999999999999977664
No 106
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=90.76 E-value=0.27 Score=55.07 Aligned_cols=57 Identities=16% Similarity=0.255 Sum_probs=45.8
Q ss_pred CcEEEEEEecccccccccCCCCCceEEeeCCeecCCHHHH--HHHHHhcCCceEEEEEecc
Q 007357 481 EQMVILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHL--AHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l--~~~v~~~~~~~v~l~v~r~ 539 (606)
..++++..|.|++-|...|++.||+|++|||+..+|+..- .+++.. +..++|++..+
T Consensus 561 GfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKtN 619 (1283)
T KOG3542|consen 561 GFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKTN 619 (1283)
T ss_pred cceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEecc
Confidence 4578999999999999999999999999999999998654 344443 44666766554
No 107
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=88.85 E-value=1.6 Score=39.45 Aligned_cols=45 Identities=22% Similarity=0.284 Sum_probs=28.2
Q ss_pred eeeeeeecCCCcceeEEEEccCcCCCCCCCceEcCCCeEEEEEEee
Q 007357 240 RIEVTSYAHGSSELLGIQIDAAINPGNSGGPAFNDKGECIGVAFQV 285 (606)
Q Consensus 240 ~~~~~~~~~~~~~~~~iq~da~i~~G~SGGPl~n~~G~vVGI~~~~ 285 (606)
.++.+.|.........+.......||+-||+|+...| |+||++++
T Consensus 65 ~i~~s~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Tag 109 (127)
T PF00947_consen 65 WIEESEYYPKHYQYNLLIGEGPAEPGDCGGILRCKHG-VIGIVTAG 109 (127)
T ss_dssp EE-SBTTB-SEEEECEEEEE-SSSTT-TCSEEEETTC-EEEEEEEE
T ss_pred EECCccCchhheecCceeecccCCCCCCCceeEeCCC-eEEEEEeC
Confidence 3444444333333334556678899999999998665 99999975
No 108
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=88.31 E-value=0.31 Score=51.67 Aligned_cols=38 Identities=37% Similarity=0.353 Sum_probs=35.1
Q ss_pred CCcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCC
Q 007357 343 NEGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGT 380 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~ 380 (606)
..|+.|.+|...||+.. ||++||+|+++||-+|.+.++
T Consensus 219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~d 258 (484)
T KOG2921|consen 219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSD 258 (484)
T ss_pred CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHH
Confidence 57999999999999988 999999999999999998765
No 109
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=87.38 E-value=1.4 Score=48.23 Aligned_cols=38 Identities=16% Similarity=0.328 Sum_probs=32.6
Q ss_pred CCcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCC
Q 007357 343 NEGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGT 380 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~ 380 (606)
..|++|..|.+++..+. -+.+||.||.||....++...
T Consensus 276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSN 315 (626)
T KOG3571|consen 276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSN 315 (626)
T ss_pred CCceEEeeeccCceeeccCccCccceEEEeeecchhhcCc
Confidence 46999999999998666 699999999999998877643
No 110
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=86.88 E-value=0.63 Score=52.29 Aligned_cols=104 Identities=16% Similarity=0.195 Sum_probs=70.8
Q ss_pred CCCCCCceE-----cCCCeEEEEEEeeecccccceeeeeecccccchhhhHHhhcCcc--cCccccceeeeeccchhhhc
Q 007357 264 PGNSGGPAF-----NDKGECIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDYERNGKY--TGFPCLGVLLQKLENPALRT 336 (606)
Q Consensus 264 ~G~SGGPl~-----n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l~~~g~~--~g~~~LGi~~q~~e~~~l~~ 336 (606)
.=++|||.- |...+++.|+--. -..+|.+.....++.++..-.+ +-.++--++--.+.-|+++-
T Consensus 679 nmm~~GpAarsgkLnIGDQiiaING~S---------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~ky 749 (829)
T KOG3605|consen 679 NMMHGGPAARSGKLNIGDQIMSINGTS---------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRY 749 (829)
T ss_pred hcccCChhhhcCCccccceeEeecCce---------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchh
Confidence 346677763 3334566655221 2337888888888776532222 11234445544555788888
Q ss_pred cccCCCCCcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCC
Q 007357 337 CLKVPSNEGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGS 377 (606)
Q Consensus 337 ~lgl~~~~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~ 377 (606)
.||..-+.|| |.....++.|+. |++.|-+|+.|||+.|--
T Consensus 750 QLGFSVQNGi-ICSLlRGGIAERGGVRVGHRIIEINgQSVVA 790 (829)
T KOG3605|consen 750 QLGFSVQNGI-ICSLLRGGIAERGGVRVGHRIIEINGQSVVA 790 (829)
T ss_pred hccceeeCcE-eehhhcccchhccCceeeeeEEEECCceEEe
Confidence 8988877887 568889999999 999999999999998754
No 111
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=86.15 E-value=1.6 Score=46.83 Aligned_cols=58 Identities=40% Similarity=0.428 Sum_probs=44.1
Q ss_pred EEEeCCCChhhc-CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCE---EEEEEEE-CCEEEE
Q 007357 348 VRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDV---AELGIIR-AGTFMK 416 (606)
Q Consensus 348 V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~---v~l~V~R-~G~~~~ 416 (606)
+..+..+|+|.. ++++||.|+++|++++.+|.++ ...+.. ..+.. +.+.+.| ++....
T Consensus 133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~----------~~~~~~-~~~~~~~~~~i~~~~~~~~~~~ 195 (375)
T COG0750 133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDV----------RRLLVA-AAGDVFNLLTILVIRLDGEAHA 195 (375)
T ss_pred eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHH----------HHHHHh-ccCCcccceEEEEEeccceeee
Confidence 337889999999 9999999999999999999764 223322 23444 7888899 777654
No 112
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=85.43 E-value=1.1 Score=40.81 Aligned_cols=38 Identities=24% Similarity=0.470 Sum_probs=24.5
Q ss_pred CCCCCCCceEcCCCeEEEEEEeeec-ccccceeeeeeccc
Q 007357 263 NPGNSGGPAFNDKGECIGVAFQVYR-SEEVENIGYVIPTT 301 (606)
Q Consensus 263 ~~G~SGGPl~n~~G~vVGI~~~~~~-~~~~~~~~~~IP~~ 301 (606)
-.|+||||++...|.+|||..+..- .+-...+-|. |..
T Consensus 106 lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e 144 (148)
T PF02907_consen 106 LKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVE 144 (148)
T ss_dssp HTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHH
T ss_pred EecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eee
Confidence 4799999999999999999766542 2233344443 653
No 113
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=85.40 E-value=1.8 Score=46.48 Aligned_cols=53 Identities=8% Similarity=0.135 Sum_probs=47.5
Q ss_pred EEEecccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCce---EEEEEec
Q 007357 486 LSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDKY---LVFEFED 538 (606)
Q Consensus 486 Is~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~---v~l~v~r 538 (606)
+..+..++++..++++.||+|+++|++++.++++....+....+.. +.+.+.|
T Consensus 133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~ 188 (375)
T COG0750 133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIR 188 (375)
T ss_pred eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEe
Confidence 3367899999999999999999999999999999999998877766 7888888
No 114
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=85.14 E-value=1.2 Score=52.24 Aligned_cols=59 Identities=27% Similarity=0.420 Sum_probs=45.2
Q ss_pred CcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEECC
Q 007357 344 EGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIRAG 412 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R~G 412 (606)
-|++|..|.+|++|+. -|+.||.+|+|||+.+-...+ ||- .+++. ..|..|.|.|...|
T Consensus 960 lGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQ------ErA--A~lmt--rtg~vV~leVaKqg 1020 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQ------ERA--ARLMT--RTGNVVHLEVAKQG 1020 (1629)
T ss_pred cceEEEEeccCCccccccccccCceeeeecCcccccccH------HHH--HHHHh--ccCCeEEEehhhhh
Confidence 5999999999999987 599999999999998765433 221 23333 46888999987544
No 115
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S6 (clan PA(S)). The type sample being the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=84.82 E-value=3 Score=49.21 Aligned_cols=160 Identities=22% Similarity=0.248 Sum_probs=73.9
Q ss_pred EEEEEEcCCEEEecccccCCCCeEEEEEecC-CcEEEEEEEEee--cCCCeEEEEecccccccCCcccccCCCC------
Q 007357 143 GSAFMIGDGKLLTNAHCVEHYTQVKVKRRGD-DTKYVAKVLARG--VDCDIALLSVESEEFWKDAEPLCLGHLP------ 213 (606)
Q Consensus 143 GSGfvI~~g~ILTnaHvV~~~~~v~V~~~~~-~~~~~a~vv~~d--~~~DlAlLkv~~~~~~~~v~pl~l~~~~------ 213 (606)
|...+|++.||+|.+|+..+...+ .+ +. +. ..-+++.+. +..|+.+.|++.- ..++.|++.....
T Consensus 67 G~aTLigpqYiVSV~HN~~gy~~v--~F-G~~g~-~~Y~iV~RNn~~~~Df~~pRLnK~--VTEvaP~~~t~~~~~~~~y 140 (769)
T PF02395_consen 67 GVATLIGPQYIVSVKHNGKGYNSV--SF-GNEGQ-NTYKIVDRNNYPSGDFHMPRLNKF--VTEVAPAEMTTAGSDSNTY 140 (769)
T ss_dssp SS-EEEETTEEEBETTG-TSCCEE--CE-SCSST-CEEEEEEEEBETTSTEBEEEESS-----SS----BBSSTTSTTGG
T ss_pred ceEEEecCCeEEEEEccCCCcCce--ee-cccCC-ceEEEEEccCCCCcccceeecCce--EEEEecccccccccccccc
Confidence 668899999999999998544443 33 22 22 333344333 3369999999853 2245555553221
Q ss_pred ----CCCCeEE------EEeecCCCC-------cceEEeeEEeeeeeeeecCCCccee-----EEEEc----cCcCCCCC
Q 007357 214 ----RLQDAVT------VVGYPLGGD-------TISVTKGVVSRIEVTSYAHGSSELL-----GIQID----AAINPGNS 267 (606)
Q Consensus 214 ----~lG~~V~------viG~p~g~~-------~~svt~GvVs~~~~~~~~~~~~~~~-----~iq~d----a~i~~G~S 267 (606)
+....+. .+++..+.. ....+.|.+.... .+..+..... ....+ ....+|+|
T Consensus 141 ~d~~rY~~f~R~GsG~Q~i~~~~g~~~~~~~~ay~yltgGt~~~~~--~~~n~~~~~~~~~~~~~~~~~pL~n~~~~GDS 218 (769)
T PF02395_consen 141 NDKERYPAFVRVGSGTQYIKDRNGNGTTILGGAYNYLTGGTVYNLP--GYGNGSMILSGDLKKFNSYNGPLPNYGSPGDS 218 (769)
T ss_dssp GHTTTC-EEEEEESSSEEEEECCEEEEEEEEETTSCEEEEEESSEE--EEECTCEEEEESTTTCCCCCSSSBEB--TT-T
T ss_pred ccchhchheeecCCceEEEEcCCCCeeEEEEeccceecCCcccccc--ccccceEEEecccccccccCCccccccccCcC
Confidence 1111121 122222110 0123344443211 0111110000 01111 23468999
Q ss_pred CCceEc---CC--CeEEEEEEeeecccccceeeeeecccccchhhhHH
Q 007357 268 GGPAFN---DK--GECIGVAFQVYRSEEVENIGYVIPTTVVSHFLSDY 310 (606)
Q Consensus 268 GGPl~n---~~--G~vVGI~~~~~~~~~~~~~~~~IP~~~i~~~L~~l 310 (606)
|+|||- .+ .-++|+.+......+..+...++|.+.+..++..-
T Consensus 219 GSPlF~YD~~~kKWvl~Gv~~~~~~~~g~~~~~~~~~~~f~~~~~~~d 266 (769)
T PF02395_consen 219 GSPLFAYDKEKKKWVLVGVLSGGNGYNGKGNWWNVIPPDFINQIKQND 266 (769)
T ss_dssp T-EEEEEETTTTEEEEEEEEEEECCCCHSEEEEEEECHHHHHHHHHHC
T ss_pred CCceEEEEccCCeEEEEEEEccccccCCccceeEEecHHHHHHHHhhh
Confidence 999983 22 45999998765433344566777887776666553
No 116
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=83.83 E-value=0.87 Score=47.53 Aligned_cols=87 Identities=16% Similarity=0.180 Sum_probs=60.7
Q ss_pred EEEEEEecccccccccC-CCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEEEecceEEEEehHHHHHhHHHHHHH
Q 007357 483 MVILSQVLANEVSIGYE-DMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFEFEDNYLAVLEREAAVAASSCILKD 559 (606)
Q Consensus 483 ~VvIs~Vl~~s~a~~ag-l~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~v~r~~~ivl~~~~~~~~~~~il~~ 559 (606)
.|+|+.+..+-.|..-| |+.||-|++|||.-|.+- ++.+.++++ .++.++|+++.- .+|.+..+.=|..
T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN-AGdeVtlTV~~l-------r~ApaFLklpL~~ 152 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN-AGDEVTLTVKHL-------RAAPAFLKLPLTK 152 (505)
T ss_pred cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh-cCCEEEEEeHhh-------hcCcHHhcCccCC
Confidence 46899999998888776 689999999999999876 457788876 566777776532 1233333333444
Q ss_pred cCCCCCCCcCCCCcCCCC
Q 007357 560 YGIPSERSSDLLEPYVDP 577 (606)
Q Consensus 560 ~~i~~~~s~dl~~~~~~~ 577 (606)
-+-+|+.|..-..|.-|+
T Consensus 153 ~~p~SD~ssg~sspl~Ds 170 (505)
T KOG3549|consen 153 DTPDSDNSSGCSSPLADS 170 (505)
T ss_pred CCCCcccccccccccccc
Confidence 455677777777777665
No 117
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belong to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=80.88 E-value=1.5 Score=43.22 Aligned_cols=40 Identities=30% Similarity=0.432 Sum_probs=31.1
Q ss_pred cCcCCCCCCCceEcCCCeEEEEEEeeecccccceeeeeecccc
Q 007357 260 AAINPGNSGGPAFNDKGECIGVAFQVYRSEEVENIGYVIPTTV 302 (606)
Q Consensus 260 a~i~~G~SGGPl~n~~G~vVGI~~~~~~~~~~~~~~~~IP~~~ 302 (606)
..|-+|+||+|++. +|++||-++-.+- .....||.++++.
T Consensus 175 GGIvqGMSGSPI~q-dGKLiGAVthvf~--~dp~~Gygi~ie~ 214 (218)
T PF05580_consen 175 GGIVQGMSGSPIIQ-DGKLIGAVTHVFV--NDPTKGYGIFIEW 214 (218)
T ss_pred CCEEecccCCCEEE-CCEEEEEEEEEEe--cCCCceeeecHHH
Confidence 34568999999997 8999999887764 3456788888654
No 118
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=80.88 E-value=3 Score=41.72 Aligned_cols=49 Identities=6% Similarity=0.041 Sum_probs=41.3
Q ss_pred ccccccccCCCCCceEEeeCCeecCCHHHHHHHHHhcCCc-eEEEEEecc
Q 007357 491 ANEVSIGYEDMSNQQVLKFNGTRIKNIHHLAHLVDSCKDK-YLVFEFEDN 539 (606)
Q Consensus 491 ~~s~a~~agl~~gD~I~~VNG~~V~~l~~l~~~v~~~~~~-~v~l~v~r~ 539 (606)
+.+.=...||+.||+.+++|+..+.+.++...+++...+. .+.|+++|+
T Consensus 216 d~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~ 265 (275)
T COG3031 216 DGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRR 265 (275)
T ss_pred CcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEec
Confidence 4444456799999999999999999999999999986554 789999875
No 119
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=76.39 E-value=2.9 Score=43.04 Aligned_cols=55 Identities=16% Similarity=0.202 Sum_probs=41.0
Q ss_pred cEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEE
Q 007357 345 GVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGII 409 (606)
Q Consensus 345 Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~ 409 (606)
-++|..|..++||+. .++.||.|++|||..|.....+ ....++... -+.|++.+-
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKv--------eVAkmIQ~~--~~eV~IhyN 87 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKV--------EVAKMIQVS--LNEVKIHYN 87 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHH--------HHHHHHHHh--ccceEEEeh
Confidence 478899999999998 5999999999999999876554 234455432 244566653
No 120
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=75.76 E-value=5.3 Score=44.42 Aligned_cols=56 Identities=21% Similarity=0.325 Sum_probs=43.7
Q ss_pred cEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEEE
Q 007357 345 GVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGIIR 410 (606)
Q Consensus 345 Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~R 410 (606)
-++|.+|..|+.+++ -|+.||.|+.|||..+.+..- ..+..++.... .+++++|.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~--------~e~q~~l~~~~--G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSP--------EELQELLRNSR--GSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCH--------HHHHHHHHhCC--CcEEEEEcc
Confidence 588999999999998 599999999999999988532 23556666653 578888764
No 121
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=75.03 E-value=3.9 Score=45.39 Aligned_cols=55 Identities=15% Similarity=0.306 Sum_probs=46.9
Q ss_pred EEEEEecccccccccC-CCCCceEEeeCCeecCC--HHHHHHHHHhcCCceEEEEEecc
Q 007357 484 VILSQVLANEVSIGYE-DMSNQQVLKFNGTRIKN--IHHLAHLVDSCKDKYLVFEFEDN 539 (606)
Q Consensus 484 VvIs~Vl~~s~a~~ag-l~~gD~I~~VNG~~V~~--l~~l~~~v~~~~~~~v~l~v~r~ 539 (606)
+++..++.|+.+.+-| |..||.|.+|||..|.+ ..++.+++..+. +.++|.+.-.
T Consensus 148 ~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkiiP~ 205 (542)
T KOG0609|consen 148 VVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKIIPS 205 (542)
T ss_pred cEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC-CcEEEEEccc
Confidence 5899999999998887 56799999999999976 589999999987 7888877544
No 122
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=75.00 E-value=11 Score=41.58 Aligned_cols=57 Identities=7% Similarity=0.160 Sum_probs=39.8
Q ss_pred CCcEEEEEEeccccccccc-CCCCCceEEeeCCeecCCH--HHHHHHHHhc--CCceEEEEE
Q 007357 480 GEQMVILSQVLANEVSIGY-EDMSNQQVLKFNGTRIKNI--HHLAHLVDSC--KDKYLVFEF 536 (606)
Q Consensus 480 ~~~~VvIs~Vl~~s~a~~a-gl~~gD~I~~VNG~~V~~l--~~l~~~v~~~--~~~~v~l~v 536 (606)
++.+++|..++++++-+.- -+.+||.|+.||.+...|+ ++.+++|++. +..+++|++
T Consensus 275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltv 336 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTV 336 (626)
T ss_pred CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEE
Confidence 4667899999988765444 4789999999999999887 4455555432 233455543
No 123
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=73.01 E-value=6.9 Score=39.78 Aligned_cols=49 Identities=14% Similarity=0.306 Sum_probs=42.2
Q ss_pred CcEEEEEEecccccccccCCC-CCceEEeeCCeec--CCHHHHHHHHHhcCC
Q 007357 481 EQMVILSQVLANEVSIGYEDM-SNQQVLKFNGTRI--KNIHHLAHLVDSCKD 529 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~agl~-~gD~I~~VNG~~V--~~l~~l~~~v~~~~~ 529 (606)
..+++|+...+++.|+.-||. .+|.|++|||..| +++++..+++-++..
T Consensus 193 vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANsh 244 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSH 244 (358)
T ss_pred cCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhccc
Confidence 457899999999999999865 5899999999998 689999999876543
No 124
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=72.41 E-value=3.6 Score=43.77 Aligned_cols=54 Identities=20% Similarity=0.199 Sum_probs=40.7
Q ss_pred CcEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHh-hcCCCCEEEEEE
Q 007357 344 EGVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLIS-QKFAGDVAELGI 408 (606)
Q Consensus 344 ~Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~-~~~~g~~v~l~V 408 (606)
.-++|.+|.++-.|++ .|..||.|++|||..+.+... .+.++ .+..|+.|.+.|
T Consensus 110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtH-----------deAVqaLKraGkeV~lev 166 (506)
T KOG3551|consen 110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATH-----------DEAVQALKRAGKEVLLEV 166 (506)
T ss_pred CceehhHhccccccccccceeeccEEEEecchhhhhcch-----------HHHHHHHHhhCceeeeee
Confidence 3588999999999988 799999999999999877543 12232 244677776655
No 125
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=71.64 E-value=4.2 Score=41.32 Aligned_cols=41 Identities=17% Similarity=0.290 Sum_probs=35.9
Q ss_pred ccCCCCCcEEEEEeCCCChhhc-C-CCCCCEEEEECCEEeCCC
Q 007357 338 LKVPSNEGVLVRRVEPTSDANN-I-LKEGDVIVSFDDVCVGSE 378 (606)
Q Consensus 338 lgl~~~~Gv~V~~V~p~spA~~-g-Lq~GDvIlaInG~~V~~~ 378 (606)
-||....|+.|.+..|++-|+. | |..+|.|++|||.+|...
T Consensus 188 ~GlekvpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGK 230 (358)
T KOG3606|consen 188 HGLEKVPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGK 230 (358)
T ss_pred ccccccCceEEEeecCCccccccceeeecceeEEEcCEEeccc
Confidence 4666678999999999999998 5 788999999999999764
No 126
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=69.95 E-value=17 Score=35.25 Aligned_cols=29 Identities=10% Similarity=-0.043 Sum_probs=25.2
Q ss_pred cEEEEEEecccccccccCCCCCceEEeeC
Q 007357 482 QMVILSQVLANEVSIGYEDMSNQQVLKFN 510 (606)
Q Consensus 482 ~~VvIs~Vl~~s~a~~agl~~gD~I~~VN 510 (606)
.-++|+.|..+|+|+++|+..++.|.+|-
T Consensus 122 ~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 34689999999999999999999888763
No 127
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=69.87 E-value=4.5 Score=48.64 Aligned_cols=47 Identities=26% Similarity=0.130 Sum_probs=35.9
Q ss_pred hhhccccCCCC--CcEEEEEeCCCChhhc-CCCCCCEEEEECCEEeCCCC
Q 007357 333 ALRTCLKVPSN--EGVLVRRVEPTSDANN-ILKEGDVIVSFDDVCVGSEG 379 (606)
Q Consensus 333 ~l~~~lgl~~~--~Gv~V~~V~p~spA~~-gLq~GDvIlaInG~~V~~~~ 379 (606)
++|-++|=..- .--+|..|.++|||.. ||++||.|+.+||+++....
T Consensus 645 airVy~Gd~d~ytvhh~v~sv~egsPA~~agls~~DlIthvnge~v~gl~ 694 (1205)
T KOG0606|consen 645 AIRVYMGDKDVYTVHHSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLV 694 (1205)
T ss_pred eEEEecCCcccceeeeeeeeecCCCCccccCCCccceeEeccCcccchhh
Confidence 34555553321 1346789999999988 99999999999999998753
No 128
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=69.68 E-value=9.2 Score=40.22 Aligned_cols=55 Identities=22% Similarity=0.301 Sum_probs=42.0
Q ss_pred cEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEE
Q 007357 345 GVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGII 409 (606)
Q Consensus 345 Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~ 409 (606)
-|+|.+|..+-.|+. .|-.||-|+.|||..|+.-.. .|- ..++.+ .|+.|+|+|.
T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~H-----eev---V~iLRN--AGdeVtlTV~ 137 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPH-----EEV---VNILRN--AGDEVTLTVK 137 (505)
T ss_pred cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCCh-----HHH---HHHHHh--cCCEEEEEeH
Confidence 578889998888887 588999999999999987543 222 234443 6899999885
No 129
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=69.19 E-value=4.2 Score=43.32 Aligned_cols=82 Identities=15% Similarity=0.207 Sum_probs=53.8
Q ss_pred EEEEEEeccccccccc-CCCCCceEEeeCCeecCCHHH--HHHHHHhcCCceEEEEEecceEEEEehHHHHHhHHHHHHH
Q 007357 483 MVILSQVLANEVSIGY-EDMSNQQVLKFNGTRIKNIHH--LAHLVDSCKDKYLVFEFEDNYLAVLEREAAVAASSCILKD 559 (606)
Q Consensus 483 ~VvIs~Vl~~s~a~~a-gl~~gD~I~~VNG~~V~~l~~--l~~~v~~~~~~~v~l~v~r~~~ivl~~~~~~~~~~~il~~ 559 (606)
.++|+.+.++-+|.+. .|+.||.|++|||....+-.| -++.++. .++.+.+++ +-.++.++-+.|.
T Consensus 111 PIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr-aGkeV~lev----------Ky~REvtPy~kk~ 179 (506)
T KOG3551|consen 111 PILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKR-AGKEVLLEV----------KYMREVTPYFKKE 179 (506)
T ss_pred ceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHh-hCceeeeee----------eeehhcchhhccC
Confidence 4588999999988877 488999999999999887655 5555654 345554444 3345667766555
Q ss_pred cCCCCCCCcCCCCcCCC
Q 007357 560 YGIPSERSSDLLEPYVD 576 (606)
Q Consensus 560 ~~i~~~~s~dl~~~~~~ 576 (606)
.|-++.-=|...|...
T Consensus 180 -sivs~vgWe~~~p~sp 195 (506)
T KOG3551|consen 180 -SIVSEVGWEDPAPQSP 195 (506)
T ss_pred -ccccccCcCCCCccCc
Confidence 3444444344444433
No 130
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=67.49 E-value=11 Score=38.88 Aligned_cols=52 Identities=12% Similarity=0.186 Sum_probs=41.6
Q ss_pred EEEEEEecccccccccC-CCCCceEEeeCCeecCCH--HHHHHHHHhcCCceEEEE
Q 007357 483 MVILSQVLANEVSIGYE-DMSNQQVLKFNGTRIKNI--HHLAHLVDSCKDKYLVFE 535 (606)
Q Consensus 483 ~VvIs~Vl~~s~a~~ag-l~~gD~I~~VNG~~V~~l--~~l~~~v~~~~~~~v~l~ 535 (606)
.++|.||..+.||++-| ++.||.|++|||..|+.- -+..++|+...+ .+++.
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~-eV~Ih 85 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLN-EVKIH 85 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhcc-ceEEE
Confidence 56899999999998876 788999999999999754 567888887665 34443
No 131
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=67.15 E-value=7.3 Score=46.95 Aligned_cols=45 Identities=11% Similarity=0.158 Sum_probs=38.9
Q ss_pred EEEEecccccccccCCCCCceEEeeCCeecCCHHH--HHHHHHhcCC
Q 007357 485 ILSQVLANEVSIGYEDMSNQQVLKFNGTRIKNIHH--LAHLVDSCKD 529 (606)
Q Consensus 485 vIs~Vl~~s~a~~agl~~gD~I~~VNG~~V~~l~~--l~~~v~~~~~ 529 (606)
++..|..++||..+|+.++|.|+.|||++|..+.| +.+++.+..+
T Consensus 661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~~gn 707 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLKSGN 707 (1205)
T ss_pred eeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHhcCC
Confidence 67889999999999999999999999999998866 7777765433
No 132
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=66.49 E-value=4.5 Score=43.76 Aligned_cols=23 Identities=30% Similarity=0.616 Sum_probs=20.8
Q ss_pred CcCCCCCCCceEcCCCeEEEEEE
Q 007357 261 AINPGNSGGPAFNDKGECIGVAF 283 (606)
Q Consensus 261 ~i~~G~SGGPl~n~~G~vVGI~~ 283 (606)
....|+||+.|+|.+|++|||.+
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 45689999999999999999986
No 133
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=64.69 E-value=36 Score=36.99 Aligned_cols=138 Identities=17% Similarity=0.241 Sum_probs=66.0
Q ss_pred ceEEEEEEcCCEEEecccccCCCCeEEEEEecCCcEEEEEEEEeecCCCeEEEEecccccccCCcccccCCCCCCCCeEE
Q 007357 141 STGSAFMIGDGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVLARGVDCDIALLSVESEEFWKDAEPLCLGHLPRLQDAVT 220 (606)
Q Consensus 141 ~~GSGfvI~~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~~~~lG~~V~ 220 (606)
++|=||-|++...+|+-||+.....-..- .+..-+.++..-++.-+++..+ +-.++.-+-|-+-..-|.-+.
T Consensus 379 GsGWGfWVS~~lfITttHViP~g~~E~FG-------v~i~~i~vh~sGeF~~~rFpk~-iRPDvtgmiLEeGapEGtV~s 450 (535)
T PF05416_consen 379 GSGWGFWVSPTLFITTTHVIPPGAKEAFG-------VPISQIQVHKSGEFCRFRFPKP-IRPDVTGMILEEGAPEGTVCS 450 (535)
T ss_dssp TTEEEEESSSSEEEEEGGGS-STTSEETT-------EECGGEEEEEETTEEEEEESS--SSTTS---EE-SS--TT-EEE
T ss_pred CCceeeeecceEEEEeeeecCCcchhhhC-------CChhHeEEeeccceEEEecCCC-CCCCccceeeccCCCCceEEE
Confidence 56779999999999999999753210000 0001122333446666776643 112455555543333455444
Q ss_pred EE-eecCCCC-cceEEeeEEeeeeeee-ecCCCcceeEEEE-------ccCcCCCCCCCceEcCCC---eEEEEEEeeec
Q 007357 221 VV-GYPLGGD-TISVTKGVVSRIEVTS-YAHGSSELLGIQI-------DAAINPGNSGGPAFNDKG---ECIGVAFQVYR 287 (606)
Q Consensus 221 vi-G~p~g~~-~~svt~GvVs~~~~~~-~~~~~~~~~~iq~-------da~i~~G~SGGPl~n~~G---~vVGI~~~~~~ 287 (606)
++ =.|.|.. .+.+..|....+.... ..++..- ++.+ |-...||+-|.|-|-..| -|+|++.+..+
T Consensus 451 iLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ~G--MLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr 528 (535)
T PF05416_consen 451 ILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQMG--MLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATR 528 (535)
T ss_dssp EEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEEEE--EETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-S
T ss_pred EEEEcCCccchhhhhhhccceeEEEcceeecceee--eeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhcc
Confidence 43 4454421 2345566665554221 1111111 1111 334569999999997555 49999988655
Q ss_pred c
Q 007357 288 S 288 (606)
Q Consensus 288 ~ 288 (606)
+
T Consensus 529 ~ 529 (535)
T PF05416_consen 529 S 529 (535)
T ss_dssp S
T ss_pred C
Confidence 3
No 134
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=61.43 E-value=14 Score=40.19 Aligned_cols=60 Identities=13% Similarity=0.157 Sum_probs=48.5
Q ss_pred CCcEEEEEEecccccccccCCCCC-ceEEeeCCeecC-CHHHHHHHHHhcCCceEEEEEecce
Q 007357 480 GEQMVILSQVLANEVSIGYEDMSN-QQVLKFNGTRIK-NIHHLAHLVDSCKDKYLVFEFEDNY 540 (606)
Q Consensus 480 ~~~~VvIs~Vl~~s~a~~agl~~g-D~I~~VNG~~V~-~l~~l~~~v~~~~~~~v~l~v~r~~ 540 (606)
+.++.-+-+|..++++.++|+.+- |.|++|||..+. +-+.|.++++.+.++ |++++-.-+
T Consensus 13 gteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~k 74 (462)
T KOG3834|consen 13 GTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSK 74 (462)
T ss_pred CceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecc
Confidence 345667889999999999999774 789999999987 556688888888777 888876643
No 135
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=60.52 E-value=20 Score=36.51 Aligned_cols=50 Identities=12% Similarity=0.183 Sum_probs=31.5
Q ss_pred EEEeccccccccc-CCCCCceEEeeCCeecCCHHHH--HHHHHhcC-CceEEEE
Q 007357 486 LSQVLANEVSIGY-EDMSNQQVLKFNGTRIKNIHHL--AHLVDSCK-DKYLVFE 535 (606)
Q Consensus 486 Is~Vl~~s~a~~a-gl~~gD~I~~VNG~~V~~l~~l--~~~v~~~~-~~~v~l~ 535 (606)
|..+.++|+=.+- ....||.|.+|||+.|-.+.|. .++++..+ ++..++.
T Consensus 153 IKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlr 206 (334)
T KOG3938|consen 153 IKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLR 206 (334)
T ss_pred eEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEE
Confidence 3444444433222 3468999999999999999884 56666543 3444443
No 136
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=58.97 E-value=74 Score=28.01 Aligned_cols=53 Identities=19% Similarity=0.285 Sum_probs=33.4
Q ss_pred EEEEcCCEEEecccccCCCCeEEEEEecCCcEEEEEEEEeecCCCeEEEEecccccccCCcccccCC
Q 007357 145 AFMIGDGKLLTNAHCVEHYTQVKVKRRGDDTKYVAKVLARGVDCDIALLSVESEEFWKDAEPLCLGH 211 (606)
Q Consensus 145 GfvI~~g~ILTnaHvV~~~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~v~pl~l~~ 211 (606)
++=|.+|..+|+.|+++.++.|.= .+ -+++ ....|+++++..... ++.+++++
T Consensus 3 avHIGnG~~vt~tHva~~~~~v~g------~~--f~~~--~~~ge~~~v~~~~~~----~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIGNGRYVTVTHVAKSSDSVDG------QP--FKIV--KTDGELCWVQSPLVH----LPAAQIGT 55 (105)
T ss_pred eEEeCCCEEEEEEEEeccCceEcC------cC--cEEE--EeccCEEEEECCCCC----CCeeEecc
Confidence 556789999999999887654421 11 1222 245699999987652 34455543
No 137
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=54.71 E-value=14 Score=37.58 Aligned_cols=57 Identities=14% Similarity=0.223 Sum_probs=45.0
Q ss_pred cEEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCEEEEEEE
Q 007357 345 GVLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDVAELGII 409 (606)
Q Consensus 345 Gv~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~v~l~V~ 409 (606)
-..|..+.++|.... -++.||.|-+|||+.|-.+... ....++.....|++.+|.+.
T Consensus 150 yAFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHY--------eVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 150 YAFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHY--------EVARMLKELPRGETFTLRLI 208 (334)
T ss_pred eeeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHH--------HHHHHHHhcccCCeeEEEee
Confidence 357888999999887 8999999999999999887552 23456667777888777665
No 138
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=34.88 E-value=62 Score=38.78 Aligned_cols=56 Identities=14% Similarity=0.180 Sum_probs=43.1
Q ss_pred CcEEEEEEecccccccccC-CCCCceEEeeCCeecCCHHH--HHHHHHhcCCceEEEEEe
Q 007357 481 EQMVILSQVLANEVSIGYE-DMSNQQVLKFNGTRIKNIHH--LAHLVDSCKDKYLVFEFE 537 (606)
Q Consensus 481 ~~~VvIs~Vl~~s~a~~ag-l~~gD~I~~VNG~~V~~l~~--l~~~v~~~~~~~v~l~v~ 537 (606)
+-|++|..|.++.+|..-| |..||++++|||...-.+.+ ..+++.. .+..+.|+|.
T Consensus 959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtr-tg~vV~leVa 1017 (1629)
T KOG1892|consen 959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTR-TGNVVHLEVA 1017 (1629)
T ss_pred ccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhc-cCCeEEEehh
Confidence 4578999999999988775 88999999999999876644 4444543 5667777764
No 139
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.43 E-value=76 Score=26.91 Aligned_cols=36 Identities=28% Similarity=0.423 Sum_probs=30.3
Q ss_pred ccCCCCeEEEEEecCCcEEEEEEEEeecCCCeEEEEe
Q 007357 159 CVEHYTQVKVKRRGDDTKYVAKVLARGVDCDIALLSV 195 (606)
Q Consensus 159 vV~~~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 195 (606)
++.....+.|++ .+++.+.+++.++|.+++|.|=.+
T Consensus 10 ~~~~~~~V~V~l-r~~r~~~G~L~~fD~hmNlvL~d~ 45 (87)
T cd01720 10 AVKNNTQVLINC-RNNKKLLGRVKAFDRHCNMVLENV 45 (87)
T ss_pred HHcCCCEEEEEE-cCCCEEEEEEEEecCccEEEEcce
Confidence 445567899999 788999999999999999987654
No 140
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=27.76 E-value=57 Score=37.23 Aligned_cols=34 Identities=15% Similarity=0.204 Sum_probs=30.8
Q ss_pred EEEEEeCCCChhhc--CCCCCCEEEEECCEEeCCCC
Q 007357 346 VLVRRVEPTSDANN--ILKEGDVIVSFDDVCVGSEG 379 (606)
Q Consensus 346 v~V~~V~p~spA~~--gLq~GDvIlaInG~~V~~~~ 379 (606)
.+|.++.++|||.. .|..||.|+.||++.|-.|.
T Consensus 227 h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 227 HVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred eeccccccCChHHHhhcccCccceeeecccccccch
Confidence 46788999999998 89999999999999988885
No 141
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=27.02 E-value=1.6e+02 Score=22.58 Aligned_cols=32 Identities=22% Similarity=0.183 Sum_probs=27.3
Q ss_pred CeEEEEEecCCcEEEEEEEEeecCCCeEEEEec
Q 007357 164 TQVKVKRRGDDTKYVAKVLARGVDCDIALLSVE 196 (606)
Q Consensus 164 ~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 196 (606)
..+.|.+ .+++.+.+.+.++|...++.|-...
T Consensus 7 ~~V~V~l-~~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 7 KTVRVEL-KDGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred CEEEEEE-CCCcEEEEEEEEECCCCCEEECCEE
Confidence 5788888 7899999999999999998876554
No 142
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=26.50 E-value=69 Score=31.11 Aligned_cols=28 Identities=25% Similarity=0.099 Sum_probs=25.0
Q ss_pred CCcEEEEEeCCCChhhc-CCCCCCEEEEE
Q 007357 343 NEGVLVRRVEPTSDANN-ILKEGDVIVSF 370 (606)
Q Consensus 343 ~~Gv~V~~V~p~spA~~-gLq~GDvIlaI 370 (606)
...++|..|..+|+|++ |+.-|+.|++|
T Consensus 121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v 149 (183)
T PF11874_consen 121 GGKVIVDEVEFGSPAEKAGIDFDWEITEV 149 (183)
T ss_pred CCEEEEEecCCCCHHHHcCCCCCcEEEEE
Confidence 35688999999999999 99999999887
No 143
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=25.53 E-value=1.5e+02 Score=24.51 Aligned_cols=47 Identities=19% Similarity=0.188 Sum_probs=31.4
Q ss_pred CCCEEEEECCEEeCCCCCccccchhhHHHHHHHhhcCCCCE----EEEEEEECCEEEEEE
Q 007357 363 EGDVIVSFDDVCVGSEGTVPFRSNERIAFRYLISQKFAGDV----AELGIIRAGTFMKVK 418 (606)
Q Consensus 363 ~GDvIlaInG~~V~~~~~v~~~~~eri~~~~~l~~~~~g~~----v~l~V~R~G~~~~v~ 418 (606)
|-|-.+.+||++..+.+.++- ..-..+..|.. +..++.|||+..+.+
T Consensus 10 PadAkl~v~G~~t~~~G~~R~---------F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~ 60 (75)
T TIGR03000 10 PADAKLKVDGKETNGTGTVRT---------FTTPPLEAGKEYEYTVTAEYDRDGRILTRT 60 (75)
T ss_pred CCCCEEEECCeEcccCccEEE---------EECCCCCCCCEEEEEEEEEEecCCcEEEEE
Confidence 468889999999999988631 11223445655 455566899876554
No 144
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=25.30 E-value=61 Score=23.97 Aligned_cols=21 Identities=24% Similarity=0.476 Sum_probs=17.0
Q ss_pred CCCCCCceEcCCCeEEEEEEe
Q 007357 264 PGNSGGPAFNDKGECIGVAFQ 284 (606)
Q Consensus 264 ~G~SGGPl~n~~G~vVGI~~~ 284 (606)
.+-+.-|++|.+|+++|+.+.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 345667999999999999874
No 145
>PF10055 DUF2292: Uncharacterized small protein (DUF2292); InterPro: IPR018743 Members of this family of hypothetical bacterial proteins have no known function.
Probab=25.22 E-value=2.2e+02 Score=20.31 Aligned_cols=32 Identities=9% Similarity=0.214 Sum_probs=27.7
Q ss_pred HHHHHHHHHhcCCceEEEEEecceEEEEehHH
Q 007357 517 IHHLAHLVDSCKDKYLVFEFEDNYLAVLEREA 548 (606)
Q Consensus 517 l~~l~~~v~~~~~~~v~l~v~r~~~ivl~~~~ 548 (606)
++.+.++++...=..+++.+.+++++-+|+.+
T Consensus 3 ~~~I~~~l~~i~yGsV~iiiqdG~vvQIe~~E 34 (38)
T PF10055_consen 3 LEKILEALKSIRYGSVTIIIQDGRVVQIEKTE 34 (38)
T ss_pred HHHHHHHHhcCCcceEEEEEECCEEEEEEhhh
Confidence 56778888888888999999999999999865
No 146
>PF12419 DUF3670: SNF2 Helicase protein ; InterPro: IPR022138 This domain family is found in bacteria, archaea and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with PF00271 from PFAM, PF00176 from PFAM. Most of the proteins in this family are annotated as SNF2 helicases but there is little accompanying literature to confirm this.
Probab=24.50 E-value=2.8e+02 Score=25.49 Aligned_cols=53 Identities=15% Similarity=0.149 Sum_probs=41.0
Q ss_pred CCceEEeeCCeecCCHHHHHHHHHhcCCceEEEEEecceEEEEehHHHHHhHHHHHHH
Q 007357 502 SNQQVLKFNGTRIKNIHHLAHLVDSCKDKYLVFEFEDNYLAVLEREAAVAASSCILKD 559 (606)
Q Consensus 502 ~gD~I~~VNG~~V~~l~~l~~~v~~~~~~~v~l~v~r~~~ivl~~~~~~~~~~~il~~ 559 (606)
.-|.=++|+|+.+ +.++|.+++++. ...|+| ||+-|.+|.++++++...+.+.
T Consensus 72 ~f~W~lalGd~~L-s~eEf~~L~~~~-~~LV~~---rg~WV~ld~~~l~~~~~~~~~~ 124 (141)
T PF12419_consen 72 DFDWELALGDEEL-SEEEFEQLVEQK-RPLVRF---RGRWVELDPEELRRALAFLEKA 124 (141)
T ss_pred cceEEEEECCEEC-CHHHHHHHHHcC-CCeEEE---CCEEEEECHHHHHHHHHHHHhc
Confidence 3455567888776 789999999864 345544 8999999999999998887764
No 147
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.44 E-value=2.6e+02 Score=22.87 Aligned_cols=32 Identities=16% Similarity=0.095 Sum_probs=27.1
Q ss_pred CeEEEEEecCCcEEEEEEEEeecCCCeEEEEec
Q 007357 164 TQVKVKRRGDDTKYVAKVLARGVDCDIALLSVE 196 (606)
Q Consensus 164 ~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 196 (606)
..+.|.+ .+++.+.+.+.++|+..+|.|=...
T Consensus 13 k~v~V~l-~~gr~~~G~L~~fD~~~NlvL~d~~ 44 (74)
T cd01728 13 KKVVVLL-RDGRKLIGILRSFDQFANLVLQDTV 44 (74)
T ss_pred CEEEEEE-cCCeEEEEEEEEECCcccEEecceE
Confidence 5688888 7899999999999999998875553
No 148
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=21.04 E-value=3.1e+02 Score=21.68 Aligned_cols=34 Identities=18% Similarity=0.183 Sum_probs=27.7
Q ss_pred CCeEEEEEecCCcEEEEEEEEeecCCCeEEEEecc
Q 007357 163 YTQVKVKRRGDDTKYVAKVLARGVDCDIALLSVES 197 (606)
Q Consensus 163 ~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~ 197 (606)
...+.++. -.+..+.++|+++|+...+.+|+-..
T Consensus 6 Gs~V~~kT-c~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRT-CFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEe-cCCceEEEEEEEecCCCcEEEEECcc
Confidence 34566666 56889999999999999999998554
No 149
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=20.65 E-value=4e+02 Score=20.70 Aligned_cols=37 Identities=24% Similarity=0.170 Sum_probs=29.4
Q ss_pred CCCeEEEEEecCCcEEEEEEEEeecCCCeEEEEeccc
Q 007357 162 HYTQVKVKRRGDDTKYVAKVLARGVDCDIALLSVESE 198 (606)
Q Consensus 162 ~~~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~ 198 (606)
....|.++-.++...|+|++..+|...++.-+++++-
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 4567888887777888999999999999999999864
No 150
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=20.47 E-value=2.3e+02 Score=22.46 Aligned_cols=33 Identities=15% Similarity=0.143 Sum_probs=28.6
Q ss_pred CeEEEEEecCCcEEEEEEEEeecCCCeEEEEecc
Q 007357 164 TQVKVKRRGDDTKYVAKVLARGVDCDIALLSVES 197 (606)
Q Consensus 164 ~~v~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~ 197 (606)
..+.|.+ .+++.+.+++.++|...+|.|-....
T Consensus 11 ~~V~V~l-~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKL-KGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEE-CCCCEEEEEEEEECCcceEEEeeEEE
Confidence 5788888 78999999999999999998877643
Done!