Query 007765
Match_columns 590
No_of_seqs 462 out of 3732
Neff 8.1
Searched_HMMs 46136
Date Thu Mar 28 15:02:23 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/007765.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/007765hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.9E-57 8.5E-62 486.4 46.1 390 104-545 41-453 (455)
2 TIGR02037 degP_htrA_DO peripla 100.0 1.7E-55 3.7E-60 474.7 47.1 394 104-543 2-425 (428)
3 PRK10942 serine endoprotease; 100.0 2.2E-54 4.7E-59 467.3 43.8 387 104-545 39-471 (473)
4 TIGR02038 protease_degS peripl 100.0 1.6E-46 3.6E-51 393.1 36.5 296 102-421 44-349 (351)
5 PRK10898 serine endoprotease; 100.0 6E-46 1.3E-50 388.5 36.6 296 103-422 45-351 (353)
6 KOG1320 Serine protease [Postt 100.0 2.5E-41 5.4E-46 353.3 16.8 414 108-564 55-472 (473)
7 COG0265 DegQ Trypsin-like seri 100.0 6E-35 1.3E-39 307.0 31.2 299 103-421 33-341 (347)
8 KOG1421 Predicted signaling-as 100.0 2.5E-31 5.4E-36 279.1 28.7 381 104-536 53-457 (955)
9 KOG1320 Serine protease [Postt 99.9 1.7E-23 3.7E-28 219.6 23.2 303 102-419 127-467 (473)
10 KOG1421 Predicted signaling-as 99.8 5.4E-18 1.2E-22 179.3 29.2 377 109-548 524-930 (955)
11 PRK10779 zinc metallopeptidase 99.8 4.5E-19 9.7E-24 192.2 11.2 200 343-589 128-336 (449)
12 TIGR00054 RIP metalloprotease 99.7 5.3E-17 1.1E-21 174.4 10.6 178 341-589 128-307 (420)
13 PF13365 Trypsin_2: Trypsin-li 99.5 1.3E-13 2.8E-18 121.7 12.2 107 142-279 1-120 (120)
14 PF13180 PDZ_2: PDZ domain; PD 99.4 7.8E-13 1.7E-17 109.1 8.9 81 316-418 1-82 (82)
15 PF00089 Trypsin: Trypsin; In 99.4 3.7E-11 7.9E-16 117.2 19.7 165 139-303 24-220 (220)
16 cd00190 Tryp_SPc Trypsin-like 99.3 9E-11 1.9E-15 115.3 19.8 167 138-304 23-230 (232)
17 cd00987 PDZ_serine_protease PD 99.2 4.1E-11 9E-16 100.4 9.6 88 316-415 1-89 (90)
18 smart00020 Tryp_SPc Trypsin-li 99.2 3.6E-10 7.7E-15 111.2 17.0 146 139-284 25-208 (229)
19 cd00990 PDZ_glycyl_aminopeptid 99.1 6.7E-10 1.5E-14 90.9 9.2 77 316-419 1-78 (80)
20 cd00986 PDZ_LON_protease PDZ d 99.0 1.3E-09 2.9E-14 89.0 9.1 72 340-421 7-78 (79)
21 cd00991 PDZ_archaeal_metallopr 99.0 1.6E-09 3.4E-14 88.6 8.9 68 340-417 9-77 (79)
22 TIGR01713 typeII_sec_gspC gene 99.0 2.5E-09 5.3E-14 107.2 10.1 101 296-418 158-259 (259)
23 PF13180 PDZ_2: PDZ domain; PD 98.8 1.5E-08 3.2E-13 83.5 7.6 66 481-546 13-80 (82)
24 TIGR02037 degP_htrA_DO peripla 98.8 1.8E-08 3.9E-13 109.3 9.6 90 315-415 337-427 (428)
25 cd00989 PDZ_metalloprotease PD 98.8 2.5E-08 5.4E-13 81.3 7.9 65 341-416 12-77 (79)
26 cd00988 PDZ_CTP_protease PDZ d 98.7 5.8E-08 1.3E-12 80.3 9.1 66 341-417 13-82 (85)
27 cd00987 PDZ_serine_protease PD 98.7 1.3E-07 2.8E-12 79.0 10.1 80 439-539 2-82 (90)
28 COG3591 V8-like Glu-specific e 98.6 4.1E-07 8.9E-12 89.4 13.1 156 141-306 65-249 (251)
29 cd00991 PDZ_archaeal_metallopr 98.6 1.8E-07 3.8E-12 76.5 7.9 67 480-546 8-76 (79)
30 cd00136 PDZ PDZ domain, also c 98.5 2.1E-07 4.5E-12 73.9 6.8 54 341-405 13-69 (70)
31 PRK10139 serine endoprotease; 98.5 5.8E-07 1.3E-11 97.7 10.2 89 438-547 267-357 (455)
32 KOG3627 Trypsin [Amino acid tr 98.4 1.4E-05 3E-10 80.3 18.3 144 141-285 39-229 (256)
33 cd00986 PDZ_LON_protease PDZ d 98.4 9E-07 2E-11 72.2 7.6 64 482-546 8-73 (79)
34 cd00988 PDZ_CTP_protease PDZ d 98.4 1.5E-06 3.3E-11 71.8 8.2 58 482-539 13-72 (85)
35 TIGR02038 protease_degS peripl 98.4 1.5E-06 3.2E-11 91.7 9.6 89 438-547 255-345 (351)
36 KOG3209 WW domain-containing p 98.3 9.6E-07 2.1E-11 95.4 7.9 153 340-538 673-836 (984)
37 cd00989 PDZ_metalloprotease PD 98.3 1.6E-06 3.4E-11 70.6 7.1 56 484-539 14-69 (79)
38 PRK10942 serine endoprotease; 98.3 2.5E-06 5.4E-11 93.3 10.2 90 437-547 287-378 (473)
39 smart00228 PDZ Domain present 98.3 2.4E-06 5.1E-11 70.2 7.8 59 341-409 26-85 (85)
40 cd00136 PDZ PDZ domain, also c 98.3 1.4E-06 3E-11 69.1 5.9 55 482-536 13-69 (70)
41 TIGR00054 RIP metalloprotease 98.3 1.7E-06 3.7E-11 93.4 8.3 68 341-419 203-271 (420)
42 PRK10898 serine endoprotease; 98.2 4.5E-06 9.9E-11 88.0 10.1 88 439-547 257-346 (353)
43 PRK10779 zinc metallopeptidase 98.2 3.3E-06 7.3E-11 92.0 8.5 67 342-419 222-289 (449)
44 PF00863 Peptidase_C4: Peptida 98.2 4.8E-05 1E-09 74.4 15.3 160 110-296 14-184 (235)
45 TIGR00225 prc C-terminal pepti 98.2 4.5E-06 9.8E-11 87.5 8.7 71 341-420 62-133 (334)
46 PF12812 PDZ_1: PDZ-like domai 98.1 1E-05 2.3E-10 65.5 7.7 73 434-529 5-77 (78)
47 TIGR03279 cyano_FeS_chp putati 98.1 4.3E-06 9.3E-11 88.4 6.8 61 345-419 2-64 (433)
48 PF00595 PDZ: PDZ domain (Also 98.1 8.3E-06 1.8E-10 66.8 6.7 72 315-406 9-81 (81)
49 PLN00049 carboxyl-terminal pro 98.1 2E-05 4.3E-10 84.3 11.4 69 341-418 102-171 (389)
50 cd00990 PDZ_glycyl_aminopeptid 98.0 1.4E-05 3E-10 65.1 6.3 61 482-544 12-73 (80)
51 COG0793 Prc Periplasmic protea 98.0 0.00028 6.1E-09 75.6 17.4 80 315-418 99-181 (406)
52 KOG3209 WW domain-containing p 98.0 1.6E-05 3.5E-10 86.2 7.7 159 345-522 782-964 (984)
53 TIGR02860 spore_IV_B stage IV 97.9 2.8E-05 6.1E-10 81.8 8.4 68 340-418 104-180 (402)
54 cd00992 PDZ_signaling PDZ doma 97.9 2.5E-05 5.4E-10 63.8 6.2 49 316-375 12-61 (82)
55 cd00992 PDZ_signaling PDZ doma 97.9 3.4E-05 7.3E-10 63.0 6.7 54 482-536 26-81 (82)
56 PF14685 Tricorn_PDZ: Tricorn 97.8 0.00015 3.3E-09 60.0 9.5 64 341-415 12-87 (88)
57 PF00595 PDZ: PDZ domain (Also 97.8 3E-05 6.4E-10 63.5 5.3 55 481-536 24-80 (81)
58 TIGR01713 typeII_sec_gspC gene 97.8 4.1E-05 9E-10 76.9 7.4 67 480-546 189-257 (259)
59 KOG3580 Tight junction protein 97.8 0.00021 4.6E-09 76.3 12.1 75 332-417 211-287 (1027)
60 PRK09681 putative type II secr 97.7 6.7E-05 1.5E-09 75.1 6.8 67 341-418 204-275 (276)
61 COG3480 SdrC Predicted secrete 97.7 0.00011 2.4E-09 73.5 7.4 72 340-421 129-201 (342)
62 PF05579 Peptidase_S32: Equine 97.6 0.00047 1E-08 67.4 11.1 116 139-285 111-230 (297)
63 KOG3580 Tight junction protein 97.6 0.0002 4.4E-09 76.4 8.4 61 340-410 39-99 (1027)
64 KOG3129 26S proteasome regulat 97.6 0.00023 4.9E-09 67.0 7.6 74 342-423 140-214 (231)
65 PF00548 Peptidase_C3: 3C cyst 97.5 0.002 4.2E-08 60.8 13.1 138 138-283 23-170 (172)
66 TIGR02860 spore_IV_B stage IV 97.5 0.00029 6.2E-09 74.3 8.0 56 492-547 123-179 (402)
67 COG5640 Secreted trypsin-like 97.5 0.002 4.4E-08 65.6 13.2 48 260-307 224-278 (413)
68 smart00228 PDZ Domain present 97.4 0.00028 6.1E-09 57.7 5.3 57 482-538 26-83 (85)
69 TIGR00225 prc C-terminal pepti 97.4 0.00031 6.8E-09 73.7 6.7 58 482-539 62-121 (334)
70 PLN00049 carboxyl-terminal pro 97.4 0.00035 7.5E-09 74.8 6.9 58 482-539 102-161 (389)
71 TIGR03279 cyano_FeS_chp putati 97.3 0.00025 5.4E-09 75.3 4.8 61 486-549 2-64 (433)
72 PF03761 DUF316: Domain of unk 97.3 0.015 3.3E-07 59.3 17.6 106 185-300 159-272 (282)
73 COG3975 Predicted protease wit 97.2 0.00072 1.6E-08 72.2 6.6 84 318-421 439-525 (558)
74 PF04495 GRASP55_65: GRASP55/6 97.1 0.0013 2.7E-08 59.6 7.1 87 314-419 24-114 (138)
75 PRK11186 carboxy-terminal prot 97.1 0.0019 4.1E-08 73.0 9.6 71 341-417 255-332 (667)
76 PF14685 Tricorn_PDZ: Tricorn 96.9 0.0034 7.3E-08 52.0 6.9 58 482-539 12-79 (88)
77 COG0265 DegQ Trypsin-like seri 96.8 0.0041 8.9E-08 65.6 8.2 69 480-548 268-338 (347)
78 KOG3834 Golgi reassembly stack 96.6 0.011 2.3E-07 61.7 9.8 156 340-549 14-182 (462)
79 COG3031 PulC Type II secretory 96.6 0.0022 4.9E-08 61.7 4.3 66 342-417 208-274 (275)
80 PF04495 GRASP55_65: GRASP55/6 96.6 0.0045 9.8E-08 56.0 5.9 69 480-548 41-115 (138)
81 KOG3553 Tax interaction protei 96.6 0.0028 6.1E-08 52.5 4.0 34 340-373 58-92 (124)
82 COG0793 Prc Periplasmic protea 96.5 0.0036 7.7E-08 67.2 5.4 57 482-538 112-170 (406)
83 KOG3605 Beta amyloid precursor 96.4 0.0057 1.2E-07 66.5 6.2 123 347-529 679-805 (829)
84 PRK11186 carboxy-terminal prot 95.9 0.014 3.1E-07 66.1 6.5 57 482-538 255-319 (667)
85 PF12812 PDZ_1: PDZ-like domai 95.5 0.024 5.3E-07 45.9 4.6 59 316-378 9-68 (78)
86 COG3480 SdrC Predicted secrete 95.4 0.031 6.8E-07 56.4 6.0 58 480-538 128-186 (342)
87 PF02122 Peptidase_S39: Peptid 95.2 0.016 3.4E-07 55.9 3.2 135 149-298 41-183 (203)
88 PRK09681 putative type II secr 95.1 0.044 9.6E-07 55.2 6.1 58 489-546 211-273 (276)
89 KOG3129 26S proteasome regulat 94.7 0.073 1.6E-06 50.5 6.2 68 483-550 140-211 (231)
90 PF05580 Peptidase_S55: SpoIVB 94.6 0.032 7E-07 53.5 3.6 42 258-299 174-215 (218)
91 PF08192 Peptidase_S64: Peptid 94.5 0.29 6.4E-06 54.3 10.9 117 185-306 541-688 (695)
92 KOG3553 Tax interaction protei 94.4 0.0083 1.8E-07 49.7 -0.7 40 480-519 57-96 (124)
93 KOG3552 FERM domain protein FR 94.4 0.048 1E-06 61.7 4.8 57 341-407 75-131 (1298)
94 PF10459 Peptidase_S46: Peptid 94.1 0.078 1.7E-06 60.5 6.0 21 141-161 48-69 (698)
95 PF09342 DUF1986: Domain of un 94.0 0.38 8.2E-06 47.1 9.4 98 127-225 13-131 (267)
96 KOG3532 Predicted protein kina 93.9 0.093 2E-06 57.6 5.7 57 478-534 394-450 (1051)
97 COG3975 Predicted protease wit 93.1 0.085 1.8E-06 56.9 3.7 91 433-549 432-523 (558)
98 KOG3532 Predicted protein kina 92.9 0.13 2.9E-06 56.5 4.8 38 341-378 398-436 (1051)
99 KOG3550 Receptor targeting pro 92.8 0.27 5.9E-06 44.1 5.9 51 479-529 112-165 (207)
100 KOG3552 FERM domain protein FR 92.3 0.2 4.3E-06 57.1 5.3 48 482-530 75-124 (1298)
101 KOG3550 Receptor targeting pro 91.8 0.34 7.3E-06 43.5 5.2 37 340-376 114-152 (207)
102 KOG3542 cAMP-regulated guanine 90.2 0.27 5.8E-06 54.1 3.6 42 335-376 556-598 (1283)
103 KOG2921 Intramembrane metallop 89.6 0.49 1.1E-05 49.2 4.9 50 480-529 218-268 (484)
104 PF00947 Pico_P2A: Picornaviru 89.6 1.2 2.7E-05 39.1 6.7 34 250-283 76-109 (127)
105 PF00944 Peptidase_S3: Alphavi 88.9 0.69 1.5E-05 40.9 4.5 29 258-286 100-129 (158)
106 COG0750 Predicted membrane-ass 88.6 1 2.2E-05 47.8 6.9 58 345-413 133-195 (375)
107 KOG3571 Dishevelled 3 and rela 88.3 0.72 1.6E-05 49.4 5.2 37 340-376 276-314 (626)
108 PF00949 Peptidase_S7: Peptida 87.6 0.95 2.1E-05 40.5 4.8 30 256-285 89-119 (132)
109 KOG3549 Syntrophins (type gamm 87.5 1.5 3.2E-05 44.9 6.5 55 482-537 80-137 (505)
110 KOG1892 Actin filament-binding 87.4 0.68 1.5E-05 53.1 4.5 62 338-409 957-1020(1629)
111 KOG3606 Cell polarity protein 86.5 1.5 3.4E-05 43.3 5.9 50 480-529 192-244 (358)
112 KOG3571 Dishevelled 3 and rela 85.2 3 6.4E-05 44.9 7.6 58 479-536 274-336 (626)
113 PF10459 Peptidase_S46: Peptid 84.8 0.49 1.1E-05 54.1 1.9 43 186-228 199-254 (698)
114 COG3031 PulC Type II secretory 84.6 1.9 4E-05 42.1 5.4 46 493-538 218-264 (275)
115 PF03510 Peptidase_C24: 2C end 84.6 6.9 0.00015 33.5 8.2 55 143-211 2-56 (105)
116 KOG3542 cAMP-regulated guanine 84.1 0.99 2.2E-05 49.9 3.7 57 481-539 561-619 (1283)
117 PF02395 Peptidase_S6: Immunog 83.6 4.8 0.0001 46.8 9.2 64 142-209 67-131 (769)
118 KOG3551 Syntrophins (type beta 82.9 0.85 1.8E-05 47.3 2.5 54 480-534 108-164 (506)
119 KOG3605 Beta amyloid precursor 82.9 1.5 3.3E-05 48.4 4.5 105 262-375 678-791 (829)
120 KOG3651 Protein kinase C, alph 82.9 1.8 3.9E-05 43.4 4.6 38 341-378 30-69 (429)
121 KOG0609 Calcium/calmodulin-dep 82.5 4.9 0.00011 43.8 8.0 56 342-407 147-204 (542)
122 PF05416 Peptidase_C37: Southa 82.1 8.5 0.00018 40.7 9.3 136 140-286 379-529 (535)
123 PF02907 Peptidase_S29: Hepati 80.7 2.2 4.7E-05 37.9 3.9 23 262-284 106-129 (148)
124 KOG3606 Cell polarity protein 80.6 3.4 7.4E-05 40.9 5.6 80 291-375 147-230 (358)
125 COG0750 Predicted membrane-ass 80.2 2.7 5.8E-05 44.7 5.3 52 487-538 134-188 (375)
126 KOG3651 Protein kinase C, alph 76.6 4.2 9.1E-05 40.9 4.9 49 482-530 30-81 (429)
127 KOG2921 Intramembrane metallop 76.3 1.8 3.9E-05 45.2 2.4 38 340-377 219-258 (484)
128 KOG3549 Syntrophins (type gamm 75.9 3.2 6.9E-05 42.6 3.9 55 342-406 81-137 (505)
129 KOG3551 Syntrophins (type beta 73.7 1.9 4.1E-05 44.8 1.8 36 342-377 111-148 (506)
130 KOG0606 Microtubule-associated 73.4 2.8 6.1E-05 49.4 3.2 35 343-377 660-695 (1205)
131 KOG0609 Calcium/calmodulin-dep 71.3 6.2 0.00014 43.0 5.0 55 483-538 147-204 (542)
132 KOG3938 RGS-GAIP interacting p 68.3 4.4 9.5E-05 40.1 2.9 56 343-406 151-208 (334)
133 KOG3938 RGS-GAIP interacting p 64.3 15 0.00032 36.5 5.6 52 485-536 152-207 (334)
134 PF11874 DUF3394: Domain of un 61.5 32 0.00068 32.7 7.1 29 482-510 122-150 (183)
135 KOG3834 Golgi reassembly stack 57.7 18 0.00039 38.5 5.2 58 480-538 13-72 (462)
136 KOG1738 Membrane-associated gu 45.7 29 0.00063 38.8 4.7 35 342-376 226-262 (638)
137 KOG0606 Microtubule-associated 44.0 30 0.00065 41.3 4.7 50 485-535 661-712 (1205)
138 PF01732 DUF31: Putative pepti 40.2 20 0.00043 38.2 2.5 22 260-281 351-373 (374)
139 cd01720 Sm_D2 The eukaryotic S 39.4 57 0.0012 27.0 4.5 36 158-194 10-45 (87)
140 PF12381 Peptidase_C3G: Tungro 38.5 32 0.00068 33.4 3.2 54 253-307 169-229 (231)
141 PF09465 LBR_tudor: Lamin-B re 37.0 1.6E+02 0.0035 22.1 6.0 38 160-197 7-44 (55)
142 PF11874 DUF3394: Domain of un 35.7 57 0.0012 31.0 4.4 29 340-368 121-150 (183)
143 TIGR03000 plancto_dom_1 Planct 33.7 1.5E+02 0.0032 23.9 5.7 47 360-415 10-60 (75)
144 KOG1892 Actin filament-binding 33.0 64 0.0014 38.0 5.0 54 482-536 960-1016(1629)
145 cd00600 Sm_like The eukaryotic 31.8 1.3E+02 0.0027 22.6 5.2 32 163-195 7-38 (63)
146 COG0260 PepB Leucyl aminopepti 31.4 43 0.00093 36.9 3.3 56 332-392 291-346 (485)
147 cd01735 LSm12_N LSm12 belongs 30.7 1.8E+02 0.0039 22.3 5.6 33 163-196 7-39 (61)
148 PF09122 DUF1930: Domain of un 28.2 2.1E+02 0.0046 22.0 5.4 44 503-546 19-64 (68)
149 cd01731 archaeal_Sm1 The archa 25.8 1.7E+02 0.0036 22.7 5.0 32 163-195 11-42 (68)
150 cd01726 LSm6 The eukaryotic Sm 25.8 1.5E+02 0.0033 22.9 4.7 32 163-195 11-42 (67)
151 PRK00737 small nuclear ribonuc 25.6 1.7E+02 0.0036 23.1 5.0 32 163-195 15-46 (72)
152 cd06168 LSm9 The eukaryotic Sm 24.9 1.8E+02 0.0038 23.3 5.0 31 163-194 11-41 (75)
153 cd01717 Sm_B The eukaryotic Sm 24.8 1.6E+02 0.0034 23.7 4.8 31 163-194 11-41 (79)
154 cd01722 Sm_F The eukaryotic Sm 24.7 1.5E+02 0.0033 23.0 4.5 31 163-194 12-42 (68)
155 cd01730 LSm3 The eukaryotic Sm 24.4 1.4E+02 0.003 24.2 4.4 31 163-194 12-42 (82)
156 PRK05015 aminopeptidase B; Pro 23.8 86 0.0019 33.8 3.8 45 345-392 240-284 (424)
157 cd01732 LSm5 The eukaryotic Sm 22.8 1.7E+02 0.0037 23.4 4.5 31 163-194 14-44 (76)
158 cd01728 LSm1 The eukaryotic Sm 22.3 2.1E+02 0.0045 22.8 4.9 32 163-195 13-44 (74)
159 cd01727 LSm8 The eukaryotic Sm 22.2 3.6E+02 0.0077 21.3 6.3 32 163-195 10-41 (74)
160 COG2524 Predicted transcriptio 22.2 6.7E+02 0.014 25.4 9.2 21 262-282 200-220 (294)
161 cd01729 LSm7 The eukaryotic Sm 21.8 2.1E+02 0.0045 23.2 4.9 31 163-194 13-43 (81)
162 cd01719 Sm_G The eukaryotic Sm 21.0 2.3E+02 0.0051 22.3 4.9 31 163-194 11-41 (72)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.9e-57 Score=486.39 Aligned_cols=390 Identities=21% Similarity=0.310 Sum_probs=317.9
Q ss_pred cHHHHHHHhCCCcEEEEeeecCC----------CCC---CCccCCCCCCceEEEEEEe--CCeEEEcCcccCCCcEEEEE
Q 007765 104 NAYAAIELALDSVVKIFTVSSSP----------NYG---LPWQNKSQRETTGSGFVIP--GKKILTNAHVVADSTFVLVR 168 (590)
Q Consensus 104 ~~~~~~~~~~~sVV~I~~~~~~~----------~~~---~p~~~~~~~~~~GsGfiI~--~g~IlT~aHvv~~~~~i~V~ 168 (590)
++.++++++.||||.|.+..... .++ .||+......+.||||||+ +||||||+|||++++.+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 68999999999999999865321 011 2444444456899999997 58999999999999999999
Q ss_pred EcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecccccccc
Q 007765 169 KHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV 246 (590)
Q Consensus 169 ~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~ 246 (590)
+. |+++++|++++.|+.+||||||++.+. .+++++|+++. .+|++|+++|||++.. .+++.|+|+.+.+....
T Consensus 121 ~~-dg~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~-~tvt~GivS~~~r~~~~ 195 (455)
T PRK10139 121 LN-DGREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLG-QTATSGIISALGRSGLN 195 (455)
T ss_pred EC-CCCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCC-CceEEEEEccccccccC
Confidence 97 899999999999999999999998653 68999999876 5699999999999976 58999999998775332
Q ss_pred cCcceeeEEEEcccccCCCCCceEE-eCCEEEEEEeeecC---CCCceeEEeehHHHHHHHHHHHHcCccccccccccce
Q 007765 247 HGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSC 322 (590)
Q Consensus 247 ~~~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~ 322 (590)
. ..+..++++|+++++|||||||+ .+|+||||+++... +..+++|+||++.+++++++|.++|++. ++|||+.+
T Consensus 196 ~-~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~ 273 (455)
T PRK10139 196 L-EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKG 273 (455)
T ss_pred C-CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEE
Confidence 2 22346899999999999999999 99999999998765 3468999999999999999999999998 99999999
Q ss_pred eeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEE
Q 007765 323 QTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKS 401 (590)
Q Consensus 323 ~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v 401 (590)
+++ ++++++.+|++. ..|++|..|.++|||+++ ||+||+|++|||++|.+|.++. ..+.....|+++
T Consensus 274 ~~l-~~~~~~~lgl~~-~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~----------~~l~~~~~g~~v 341 (455)
T PRK10139 274 TEM-SADIAKAFNLDV-QRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELR----------SRIATTEPGTKV 341 (455)
T ss_pred EEC-CHHHHHhcCCCC-CCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHH----------HHHHhcCCCCEE
Confidence 999 899999999975 579999999999999999 9999999999999999998764 556555788999
Q ss_pred EEEEEeCCEEEEEEEEEecCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCc
Q 007765 402 LVRVLRDGKEHEFSITLRLLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGE 481 (590)
Q Consensus 402 ~l~V~R~g~~~~~~v~l~~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~ 481 (590)
.++|+|+|+.+++++++.......... ....| .+.|+.+.+. ..+...
T Consensus 342 ~l~V~R~G~~~~l~v~~~~~~~~~~~~-~~~~~---~~~g~~l~~~----------------------------~~~~~~ 389 (455)
T PRK10139 342 KLGLLRNGKPLEVEVTLDTSTSSSASA-EMITP---ALQGATLSDG----------------------------QLKDGT 389 (455)
T ss_pred EEEEEECCEEEEEEEEECCCCCccccc-ccccc---cccccEeccc----------------------------ccccCC
Confidence 999999999999988875432210000 00000 1123322210 001123
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC-eEEEEE
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD-RVVVLN 545 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~-~~i~l~ 545 (590)
.+++|..|.++++|+.+|+++||+|++|||++|.+|++|.+++++++ +.+.+.+.|+ +.+.+.
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R~g~~~~~~ 453 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVRGNESIYLL 453 (455)
T ss_pred CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEECCEEEEEE
Confidence 57899999999999999999999999999999999999999999865 7888888885 444443
No 2
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.7e-55 Score=474.71 Aligned_cols=394 Identities=25% Similarity=0.353 Sum_probs=331.4
Q ss_pred cHHHHHHHhCCCcEEEEeeecCCC-------------CCC---Ccc----CCCCCCceEEEEEEe-CCeEEEcCcccCCC
Q 007765 104 NAYAAIELALDSVVKIFTVSSSPN-------------YGL---PWQ----NKSQRETTGSGFVIP-GKKILTNAHVVADS 162 (590)
Q Consensus 104 ~~~~~~~~~~~sVV~I~~~~~~~~-------------~~~---p~~----~~~~~~~~GsGfiI~-~g~IlT~aHvv~~~ 162 (590)
++.++++++.||||.|.+...... ++. |.+ ......+.||||+|+ +||||||+||+.++
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~ 81 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA 81 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence 478999999999999998652211 110 000 012345789999999 78999999999999
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecc
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRV 240 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~ 240 (590)
..+.|++. +++.++|++++.|+.+||||||++... .+++++|+++. .+|++|+++|||.+.. .+++.|+|+..
T Consensus 82 ~~i~V~~~-~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~-~~~t~G~vs~~ 156 (428)
T TIGR02037 82 DEITVTLS-DGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLG-QTVTSGIVSAL 156 (428)
T ss_pred CeEEEEeC-CCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCC-CcEEEEEEEec
Confidence 99999997 899999999999999999999998753 68999998765 6799999999999976 58999999988
Q ss_pred cccccccCcceeeEEEEcccccCCCCCceEE-eCCEEEEEEeeecC---CCCceeEEeehHHHHHHHHHHHHcCcccccc
Q 007765 241 EPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHGKYVGFC 316 (590)
Q Consensus 241 ~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~---~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~ 316 (590)
.+... ....+..++++|+++++|+|||||+ .+|+||||+++... +..+++|+||++.+++++++++++|++. +|
T Consensus 157 ~~~~~-~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~ 234 (428)
T TIGR02037 157 GRSGL-GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RG 234 (428)
T ss_pred ccCcc-CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CC
Confidence 76532 1223346899999999999999999 99999999988655 3467899999999999999999999987 99
Q ss_pred ccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhcc
Q 007765 317 SLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMK 395 (590)
Q Consensus 317 ~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~ 395 (590)
|||+.++.+ ++.+++.+|++. ..|++|..|.++|||+++ |++||+|++|||++|.+|.++. ..+...
T Consensus 235 ~lGi~~~~~-~~~~~~~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~----------~~l~~~ 302 (428)
T TIGR02037 235 WLGVTIQEV-TSDLAKSLGLEK-QRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLR----------RAIGTL 302 (428)
T ss_pred cCceEeecC-CHHHHHHcCCCC-CCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHHhc
Confidence 999999999 899999999986 589999999999999999 9999999999999999998654 566666
Q ss_pred CCCCEEEEEEEeCCEEEEEEEEEecCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcC
Q 007765 396 KPNEKSLVRVLRDGKEHEFSITLRLLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALREL 475 (590)
Q Consensus 396 ~~g~~v~l~V~R~g~~~~~~v~l~~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~ 475 (590)
..|++++++|.|+|+.+++++++...+.. ..+....++|+.+.++++...+.++..
T Consensus 303 ~~g~~v~l~v~R~g~~~~~~v~l~~~~~~-------~~~~~~~~lGi~~~~l~~~~~~~~~l~----------------- 358 (428)
T TIGR02037 303 KPGKKVTLGILRKGKEKTITVTLGASPEE-------QASSSNPFLGLTVANLSPEIRKELRLK----------------- 358 (428)
T ss_pred CCCCEEEEEEEECCEEEEEEEEECcCCCc-------cccccccccceEEecCCHHHHHHcCCC-----------------
Confidence 78999999999999999999988755421 122345678999999998777766553
Q ss_pred CccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhc-CCCceEEEeCCC-eEEE
Q 007765 476 PKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENC-SSENLRFDLDDD-RVVV 543 (590)
Q Consensus 476 ~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~-~~~~v~l~~~r~-~~i~ 543 (590)
....+++|..|.++++++.+++++||+|++|||++|.++++|.++++++ +++.+.+++.|+ +.+.
T Consensus 359 ---~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g~~~~ 425 (428)
T TIGR02037 359 ---GDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGGATIF 425 (428)
T ss_pred ---cCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 1236899999999999999999999999999999999999999999986 578899999885 4433
No 3
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=2.2e-54 Score=467.26 Aligned_cols=387 Identities=24% Similarity=0.334 Sum_probs=316.3
Q ss_pred cHHHHHHHhCCCcEEEEeeecCCC-------C----C---CCcc----------------------CCCCCCceEEEEEE
Q 007765 104 NAYAAIELALDSVVKIFTVSSSPN-------Y----G---LPWQ----------------------NKSQRETTGSGFVI 147 (590)
Q Consensus 104 ~~~~~~~~~~~sVV~I~~~~~~~~-------~----~---~p~~----------------------~~~~~~~~GsGfiI 147 (590)
++.++++++.|+||.|.+...... . | .|+. ......+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 599999999999999998663211 0 0 0110 00122468999999
Q ss_pred e--CCeEEEcCcccCCCcEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEec
Q 007765 148 P--GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGY 223 (590)
Q Consensus 148 ~--~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~ 223 (590)
+ +||||||+|||.+++.+.|++. |+++|+|++++.|+.+||||||++... .+++++|+++. ++|++|+++|+
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~G~~V~aiG~ 194 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLS-DGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRVGDYTVAIGN 194 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEEC-CCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEcC
Confidence 8 4899999999999999999997 899999999999999999999998643 68999999876 56999999999
Q ss_pred CCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEE-eCCEEEEEEeeecC---CCCceeEEeehHHH
Q 007765 224 PQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVI 299 (590)
Q Consensus 224 p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~---~~~~~~~aip~~~i 299 (590)
|.+.. .+++.|+|+++.+..... ..+..++++|+++++|||||||+ .+|+||||+++.+. +..+++|+||++.+
T Consensus 195 P~g~~-~tvt~GiVs~~~r~~~~~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~ 272 (473)
T PRK10942 195 PYGLG-ETVTSGIVSALGRSGLNV-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMV 272 (473)
T ss_pred CCCCC-cceeEEEEEEeecccCCc-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHH
Confidence 99876 489999999887653221 12346899999999999999999 99999999998765 33578999999999
Q ss_pred HHHHHHHHHcCccccccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcc
Q 007765 300 KHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTV 378 (590)
Q Consensus 300 ~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v 378 (590)
++++++|.++|++. ++|||+.++++ ++++++.++++. ..|++|..|.++|||+++ |++||+|++|||++|.+|.++
T Consensus 273 ~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~-~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl 349 (473)
T PRK10942 273 KNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDA-QRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAAL 349 (473)
T ss_pred HHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCC-CCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHH
Confidence 99999999999988 89999999999 888999999986 589999999999999999 999999999999999999876
Q ss_pred cccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEecCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCC
Q 007765 379 AFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLRLLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGED 458 (590)
Q Consensus 379 ~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~ 458 (590)
. ..+.....|+++.++|+|+|+.+++.+++...... ..++...++|+....+++.
T Consensus 350 ~----------~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~-------~~~~~~~~lGl~g~~l~~~-------- 404 (473)
T PRK10942 350 R----------AQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQN-------QVDSSNIFNGIEGAELSNK-------- 404 (473)
T ss_pred H----------HHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCccc-------ccccccccccceeeecccc--------
Confidence 4 56666678999999999999999998887553210 0011122345544433320
Q ss_pred ccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCC
Q 007765 459 WYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 459 ~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r 538 (590)
....+++|..|.++++++.+|+++||+|++|||++|.+|++|.+++++.+ +.+.|++.|
T Consensus 405 --------------------~~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~~v~l~V~R 463 (473)
T PRK10942 405 --------------------GGDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-SVLALNIQR 463 (473)
T ss_pred --------------------cCCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEE
Confidence 01247899999999999999999999999999999999999999999854 788888888
Q ss_pred C-eEEEEE
Q 007765 539 D-RVVVLN 545 (590)
Q Consensus 539 ~-~~i~l~ 545 (590)
+ ..+.+.
T Consensus 464 ~g~~~~v~ 471 (473)
T PRK10942 464 GDSSIYLL 471 (473)
T ss_pred CCEEEEEE
Confidence 5 544443
No 4
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=1.6e-46 Score=393.08 Aligned_cols=296 Identities=25% Similarity=0.423 Sum_probs=254.3
Q ss_pred CCcHHHHHHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEe-CCeEEEcCcccCCCcEEEEEEcCCCcEEEEEE
Q 007765 102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQV 180 (590)
Q Consensus 102 ~~~~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~-~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~v 180 (590)
..++.++++++.||||.|++.....+. + ......+.||||+|+ +||||||+|||.+++.+.|++. |++.++|++
T Consensus 44 ~~~~~~~~~~~~psVV~I~~~~~~~~~---~-~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~-dg~~~~a~v 118 (351)
T TIGR02038 44 EISFNKAVRRAAPAVVNIYNRSISQNS---L-NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQ-DGRKFEAEL 118 (351)
T ss_pred chhHHHHHHhcCCcEEEEEeEeccccc---c-ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEEC-CCCEEEEEE
Confidence 346889999999999999986543321 1 122345789999999 8899999999999999999997 899999999
Q ss_pred EEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEc
Q 007765 181 EAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQID 258 (590)
Q Consensus 181 v~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~ 258 (590)
++.|+.+||||||++.. .+++++++++. ++|++|+++|||.+.. .+++.|+|+.+.+..... .....++++|
T Consensus 119 v~~d~~~DlAvlkv~~~----~~~~~~l~~s~~~~~G~~V~aiG~P~~~~-~s~t~GiIs~~~r~~~~~-~~~~~~iqtd 192 (351)
T TIGR02038 119 VGSDPLTDLAVLKIEGD----NLPTIPVNLDRPPHVGDVVLAIGNPYNLG-QTITQGIISATGRNGLSS-VGRQNFIQTD 192 (351)
T ss_pred EEecCCCCEEEEEecCC----CCceEeccCcCccCCCCEEEEEeCCCCCC-CcEEEEEEEeccCcccCC-CCcceEEEEC
Confidence 99999999999999976 47888888654 6799999999999876 589999999887754322 2234689999
Q ss_pred ccccCCCCCceEE-eCCEEEEEEeeecC-----CCCceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhh
Q 007765 259 AAINPGNSGGPAI-MGNKVAGVAFQNLS-----GAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (590)
Q Consensus 259 ~~i~~G~SGGPl~-~~G~vVGI~~~~~~-----~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (590)
+++++|||||||+ .+|+||||+++.+. ...+++|+||++.+++++++++++|++. +||||+.++++ ++..++
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~ 270 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQ 270 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHH
Confidence 9999999999999 99999999987653 2257899999999999999999999987 89999999998 788889
Q ss_pred hcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEE
Q 007765 333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKE 411 (590)
Q Consensus 333 ~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~ 411 (590)
.+|++. ..|++|..|.++|||+++ |++||+|++|||++|.+|.++. ..+.....|+++.++|+|+|+.
T Consensus 271 ~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~----------~~l~~~~~g~~v~l~v~R~g~~ 339 (351)
T TIGR02038 271 GLGLPD-LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELM----------DRIAETRPGSKVMVTVLRQGKQ 339 (351)
T ss_pred hcCCCc-cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHH----------HHHHhcCCCCEEEEEEEECCEE
Confidence 999975 579999999999999999 9999999999999999998754 5665667899999999999999
Q ss_pred EEEEEEEecC
Q 007765 412 HEFSITLRLL 421 (590)
Q Consensus 412 ~~~~v~l~~~ 421 (590)
+++.+++..+
T Consensus 340 ~~~~v~l~~~ 349 (351)
T TIGR02038 340 LELPVTIDEK 349 (351)
T ss_pred EEEEEEecCC
Confidence 9998887643
No 5
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=6e-46 Score=388.49 Aligned_cols=296 Identities=22% Similarity=0.360 Sum_probs=251.9
Q ss_pred CcHHHHHHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEe-CCeEEEcCcccCCCcEEEEEEcCCCcEEEEEEE
Q 007765 103 TNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQVE 181 (590)
Q Consensus 103 ~~~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~-~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~vv 181 (590)
..+.++++++.+|||.|.+....... .......+.||||+|+ +||||||+|||.+++.+.|++. |++.++|+++
T Consensus 45 ~~~~~~~~~~~psvV~v~~~~~~~~~----~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~-dg~~~~a~vv 119 (353)
T PRK10898 45 ASYNQAVRRAAPAVVNVYNRSLNSTS----HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQ-DGRVFEALLV 119 (353)
T ss_pred chHHHHHHHhCCcEEEEEeEeccccC----cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeC-CCCEEEEEEE
Confidence 46889999999999999986543211 1223345789999999 7899999999999999999997 8999999999
Q ss_pred EecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcc
Q 007765 182 AVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDA 259 (590)
Q Consensus 182 ~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~ 259 (590)
+.|+.+||||||++.. .+++++|+++. .+|++|+++|||.+.. .+++.|+|+...+...... ....++++|+
T Consensus 120 ~~d~~~DlAvl~v~~~----~l~~~~l~~~~~~~~G~~V~aiG~P~g~~-~~~t~Giis~~~r~~~~~~-~~~~~iqtda 193 (353)
T PRK10898 120 GSDSLTDLAVLKINAT----NLPVIPINPKRVPHIGDVVLAIGNPYNLG-QTITQGIISATGRIGLSPT-GRQNFLQTDA 193 (353)
T ss_pred EEcCCCCEEEEEEcCC----CCCeeeccCcCcCCCCCEEEEEeCCCCcC-CCcceeEEEeccccccCCc-cccceEEecc
Confidence 9999999999999875 57888898764 6799999999999866 4899999998776533221 2236799999
Q ss_pred cccCCCCCceEE-eCCEEEEEEeeecCC------CCceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhh
Q 007765 260 AINPGNSGGPAI-MGNKVAGVAFQNLSG------AENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (590)
Q Consensus 260 ~i~~G~SGGPl~-~~G~vVGI~~~~~~~------~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (590)
++++|||||||+ .+|+||||+++.+.. ..+++|+||++.+++++++|+++|++. +||||+.++++ ++..+.
T Consensus 194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~ 271 (353)
T PRK10898 194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQ 271 (353)
T ss_pred ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHH
Confidence 999999999999 999999999976542 257899999999999999999999988 89999999988 666667
Q ss_pred hcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEE
Q 007765 333 NFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKE 411 (590)
Q Consensus 333 ~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~ 411 (590)
.++++. ..|++|..|.++|||+++ |++||+|++|||++|.++.++. +.+.....|+++.++|+|+|+.
T Consensus 272 ~~~~~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~----------~~l~~~~~g~~v~l~v~R~g~~ 340 (353)
T PRK10898 272 GGGIDQ-LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETM----------DQVAEIRPGSVIPVVVMRDDKQ 340 (353)
T ss_pred hcCCCC-CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHHhcCCCCEEEEEEEECCEE
Confidence 778765 589999999999999999 9999999999999999998653 5565667899999999999999
Q ss_pred EEEEEEEecCC
Q 007765 412 HEFSITLRLLQ 422 (590)
Q Consensus 412 ~~~~v~l~~~~ 422 (590)
.++.+++..++
T Consensus 341 ~~~~v~l~~~p 351 (353)
T PRK10898 341 LTLQVTIQEYP 351 (353)
T ss_pred EEEEEEeccCC
Confidence 99988886553
No 6
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=2.5e-41 Score=353.34 Aligned_cols=414 Identities=53% Similarity=0.784 Sum_probs=370.0
Q ss_pred HHHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEeCCeEEEcCcccC---CCcEEEEEEcCCCcEEEEEEEEec
Q 007765 108 AIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIPGKKILTNAHVVA---DSTFVLVRKHGSPTKYRAQVEAVG 184 (590)
Q Consensus 108 ~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~~g~IlT~aHvv~---~~~~i~V~~~~~~~~~~a~vv~~d 184 (590)
..+...+|++.+.+....+.+.+||+...+....|+||.+....++||+|++. +...+.|...+.-++|.+++...-
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 45677789999999999999999999999999999999999999999999999 667777777667788999999999
Q ss_pred CCCCeEEEEecCCcccCcceeEEcCCCCCCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCC
Q 007765 185 HECDLAILIVESDEFWEGMHFLELGDIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPG 264 (590)
Q Consensus 185 ~~~DlAlLkv~~~~~~~~~~~~~l~~~~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G 264 (590)
.++|+|++.++..+||+.+.|+++++.+.+.+.++++| ++...+|.|.|++.....|.++......+++++++++|
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~----gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~ 210 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG----GDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG 210 (473)
T ss_pred hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc----CCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence 99999999999999999999999999999999999998 45579999999999988888877777889999999999
Q ss_pred CCCceEE-eCCEEEEEEeeecCCCCceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhhhcCCCCCcCce
Q 007765 265 NSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGV 343 (590)
Q Consensus 265 ~SGGPl~-~~G~vVGI~~~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~lgl~~~~~gv 343 (590)
+||+|.+ -.+++.|+++....-.++.++.||...+.+|.......+.+.+|++++..++.+++...|+.+.|..+ .|+
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~-~g~ 289 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE-TGV 289 (473)
T ss_pred ccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc-cce
Confidence 9999999 45999999999886445889999999999999998889999899999999999999999999999987 999
Q ss_pred EEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEecCCC
Q 007765 344 LVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLRLLQP 423 (590)
Q Consensus 344 ~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~~ 423 (590)
.+.++.+-+.|.+.++.||.|+++||+.|. +.++..+|+.|.+++..+.+++++.+.|+|.+ ++.+.+...+.
T Consensus 290 ~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig----Vn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~---e~~~~lr~~~~ 362 (473)
T KOG1320|consen 290 LISKINQTDAAINPGNSGGPLLNLDGEVIG----VNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG---EFQISLRPVKP 362 (473)
T ss_pred eeeeecccchhhhcccCCCcEEEecCcEee----eeeeeeEEeeccccceeccCchHhhhhhhhhh---hhceeeccccC
Confidence 999999999888889999999999999998 77888999999999999999999999999988 56677777788
Q ss_pred CCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCC
Q 007765 424 LVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFAD 503 (590)
Q Consensus 424 l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~g 503 (590)
+.+.+++...|.|+++.||+|++++.+++...+. .++|++++|+++++|.+|++++|
T Consensus 363 ~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~-----------------------~q~v~is~Vlp~~~~~~~~~~~g 419 (473)
T KOG1320|consen 363 LVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGV-----------------------VQLVLVSQVLPGSINGGYGLKPG 419 (473)
T ss_pred cccccccCCceeEEEecceEEeecCCCccccccc-----------------------eeEEEEEEeccCCCcccccccCC
Confidence 8888999999999999999999988766544222 27899999999999999999999
Q ss_pred ceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCCeEEEEEechhhhhhHHHhhhcCCCc
Q 007765 504 LQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDDRVVVLNYDVAKIATSKILKRHRIPS 564 (590)
Q Consensus 504 d~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~~~i~l~~~~~~~~~~~i~~~~~i~~ 564 (590)
|+|.+|||++|.+..|+.++++.+..+ ++..+|+.+..++.+..|+.++.++.
T Consensus 420 ~~V~~vng~~V~n~~~l~~~i~~~~~~--------~~v~vl~~~~~e~~tl~Il~~~~~p~ 472 (473)
T KOG1320|consen 420 DQVVKVNGKPVKNLKHLYELIEECSTE--------DKVAVLDRRSAEDATLEILPEHKIPS 472 (473)
T ss_pred CEEEEECCEEeechHHHHHHHHhcCcC--------ceEEEEEecCccceeEEecccccCCC
Confidence 999999999999999999999998654 56667777777777777777776653
No 7
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6e-35 Score=307.00 Aligned_cols=299 Identities=30% Similarity=0.433 Sum_probs=252.2
Q ss_pred CcHHHHHHHhCCCcEEEEeeecCCC-CCCCccCCCC-CCceEEEEEEe-CCeEEEcCcccCCCcEEEEEEcCCCcEEEEE
Q 007765 103 TNAYAAIELALDSVVKIFTVSSSPN-YGLPWQNKSQ-RETTGSGFVIP-GKKILTNAHVVADSTFVLVRKHGSPTKYRAQ 179 (590)
Q Consensus 103 ~~~~~~~~~~~~sVV~I~~~~~~~~-~~~p~~~~~~-~~~~GsGfiI~-~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~ 179 (590)
..+..+++++.++||.|........ .+.|-..... ..+.||||+++ ++||+||.||+.++..+.|.+. ++++++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~-dg~~~~a~ 111 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLA-DGREVPAK 111 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeC-CCCEEEEE
Confidence 6789999999999999998664332 0001000000 15889999999 9999999999999999999995 99999999
Q ss_pred EEEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEE
Q 007765 180 VEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQI 257 (590)
Q Consensus 180 vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~ 257 (590)
+++.|+..|+|+||++... .++.+.++++. .+|+++.++|+|.+.. .+++.|+|+.+.+........+..+||+
T Consensus 112 ~vg~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~-~tvt~Givs~~~r~~v~~~~~~~~~Iqt 187 (347)
T COG0265 112 LVGKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLG-QTVTSGIVSALGRTGVGSAGGYVNFIQT 187 (347)
T ss_pred EEecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCcc-cceeccEEeccccccccCcccccchhhc
Confidence 9999999999999999874 37788898876 4599999999999965 5999999999988622221225578999
Q ss_pred cccccCCCCCceEE-eCCEEEEEEeeecCCC---CceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhhh
Q 007765 258 DAAINPGNSGGPAI-MGNKVAGVAFQNLSGA---ENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN 333 (590)
Q Consensus 258 ~~~i~~G~SGGPl~-~~G~vVGI~~~~~~~~---~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~ 333 (590)
|+++++|+||||++ .+|++|||++...... .+++|+||++.++.++.++.+.|++. ++++|+.++++ ++...
T Consensus 188 dAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~-- 263 (347)
T COG0265 188 DAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA-- 263 (347)
T ss_pred ccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--
Confidence 99999999999999 9999999999987743 35899999999999999999988777 99999999988 55555
Q ss_pred cCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEE
Q 007765 334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEH 412 (590)
Q Consensus 334 lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~ 412 (590)
+|++. ..|++|..|.+++||+++ ++.||+|+++||+++.+..++. ..+.....|+.+.+++.|+|++.
T Consensus 264 ~g~~~-~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~----------~~v~~~~~g~~v~~~~~r~g~~~ 332 (347)
T COG0265 264 LGLPV-AAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLV----------AAVASNRPGDEVALKLLRGGKER 332 (347)
T ss_pred cCCCC-CCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHH----------HHHhccCCCCEEEEEEEECCEEE
Confidence 78774 688999999999999999 9999999999999999987653 56666668999999999999999
Q ss_pred EEEEEEecC
Q 007765 413 EFSITLRLL 421 (590)
Q Consensus 413 ~~~v~l~~~ 421 (590)
++.+++...
T Consensus 333 ~~~v~l~~~ 341 (347)
T COG0265 333 ELAVTLGDR 341 (347)
T ss_pred EEEEEecCc
Confidence 999998763
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=2.5e-31 Score=279.09 Aligned_cols=381 Identities=18% Similarity=0.242 Sum_probs=304.5
Q ss_pred cHHHHHHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEe--CCeEEEcCcccCCCcEE-EEEEcCCCcEEEEEE
Q 007765 104 NAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVADSTFV-LVRKHGSPTKYRAQV 180 (590)
Q Consensus 104 ~~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~--~g~IlT~aHvv~~~~~i-~V~~~~~~~~~~a~v 180 (590)
+|...+..+.+|||.|+..... +|+......+-||||+++ .||||||+||+.....+ .+.+. +..+.+.-.
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~-----~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~-n~ee~ei~p 126 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVR-----AFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFD-NHEEIEIYP 126 (955)
T ss_pred hhhhhhhhhcccEEEEEehhee-----ecccccccccceeEEEEecccceEEEeccccCCCCceeEEEec-ccccCCccc
Confidence 7889999999999999986643 556667788889999999 79999999999966543 45443 667778888
Q ss_pred EEecCCCCeEEEEecCCcc-cCcceeEEcCC-CCCCCCeEEEEecCCCCCCceEEEeEEecccccccccCc-----ceee
Q 007765 181 EAVGHECDLAILIVESDEF-WEGMHFLELGD-IPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGA-----TQLM 253 (590)
Q Consensus 181 v~~d~~~DlAlLkv~~~~~-~~~~~~~~l~~-~~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~-----~~~~ 253 (590)
++.|+.+|+.++|.+.... ...+..+.++. ..++|.++.++|+..+.. +++..|.++++++.....+. ....
T Consensus 127 vyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagEk-lsIlagflSrldr~apdyg~~~yndfnTf 205 (955)
T KOG1421|consen 127 VYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGEK-LSILAGFLSRLDRNAPDYGEDTYNDFNTF 205 (955)
T ss_pred ccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccce-EEeehhhhhhccCCCccccccccccccce
Confidence 9999999999999997632 12444455543 347799999999987754 78999999999886554332 2334
Q ss_pred EEEEcccccCCCCCceEE-eCCEEEEEEeeecCCCCceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhh
Q 007765 254 AIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRN 332 (590)
Q Consensus 254 ~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~ 332 (590)
++|..+....|.||+||+ .+|..|.++..+.. ....+|++|++.+.+.|..++++..++ ++.|-+++-.- .-+.++
T Consensus 206 y~QaasstsggssgspVv~i~gyAVAl~agg~~-ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~r 282 (955)
T KOG1421|consen 206 YIQAASSTSGGSSGSPVVDIPGYAVALNAGGSI-SSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDECR 282 (955)
T ss_pred eeeehhcCCCCCCCCceecccceEEeeecCCcc-cccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHHH
Confidence 688888889999999999 99999999987654 345689999999999999999777766 77766665443 347788
Q ss_pred hcCCCC-----------CcCceEE-EEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCE
Q 007765 333 NFGMRS-----------EVTGVLV-NKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEK 400 (590)
Q Consensus 333 ~lgl~~-----------~~~gv~V-~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~ 400 (590)
.+||+. +..|++| ..|.+++||++.|++||++++||+.-+.++..+ ..++ ....|+.
T Consensus 283 rlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l----------~~iL-Degvgk~ 351 (955)
T KOG1421|consen 283 RLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEAL----------EQIL-DEGVGKN 351 (955)
T ss_pred hcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHH----------HHHH-hhccCce
Confidence 899865 3577765 789999999999999999999999888877654 2454 4568999
Q ss_pred EEEEEEeCCEEEEEEEEEecCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCC
Q 007765 401 SLVRVLRDGKEHEFSITLRLLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAG 480 (590)
Q Consensus 401 v~l~V~R~g~~~~~~v~l~~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~ 480 (590)
+.|+|+|.|++.++++..++++... ...|+.++|.+|+++++++.+.+.+.
T Consensus 352 l~LtI~Rggqelel~vtvqdlh~it-------p~R~levcGav~hdlsyq~ar~y~lP---------------------- 402 (955)
T KOG1421|consen 352 LELTIQRGGQELELTVTVQDLHGIT-------PDRFLEVCGAVFHDLSYQLARLYALP---------------------- 402 (955)
T ss_pred EEEEEEeCCEEEEEEEEeccccCCC-------CceEEEEcceEecCCCHHHHhhcccc----------------------
Confidence 9999999999999999998876533 33678899999999999988876664
Q ss_pred cceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCC-CceEEEe
Q 007765 481 EQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSS-ENLRFDL 536 (590)
Q Consensus 481 ~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~-~~v~l~~ 536 (590)
.+|++++.-- ++++.+++.. +.+|.+||++++.++++|.+++++.+. +.+.+.+
T Consensus 403 ~~GvyVa~~~-gsf~~~~~~y-~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~vry 457 (955)
T KOG1421|consen 403 VEGVYVASPG-GSFRHRGPRY-GQIIDSVANKPTPDLDAFIDVLKELPDGARVPVRY 457 (955)
T ss_pred cCcEEEccCC-CCccccCCcc-eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeEEE
Confidence 4588888766 7788888866 999999999999999999999999655 4566543
No 9
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.92 E-value=1.7e-23 Score=219.58 Aligned_cols=303 Identities=25% Similarity=0.257 Sum_probs=232.9
Q ss_pred CCcHHHHHHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEe-CCeEEEcCcccCCCc-----------EEEEEE
Q 007765 102 TTNAYAAIELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP-GKKILTNAHVVADST-----------FVLVRK 169 (590)
Q Consensus 102 ~~~~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~-~g~IlT~aHvv~~~~-----------~i~V~~ 169 (590)
.....++.++-..++|.|....- +....|+....-....|||||++ +++++||+||+.... .+.|..
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f-~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~a 205 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEF-WKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDA 205 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccc-cCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEE
Confidence 34567788888999999987442 22223677777788999999999 999999999998543 266666
Q ss_pred cC-CCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCC--CCCCeEEEEecCCCCCCceEEEeEEecccccccc
Q 007765 170 HG-SPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIP--FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYV 246 (590)
Q Consensus 170 ~~-~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~--~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~ 246 (590)
.. .+..+.+.+.+.|+..|+|+++++.++ .-.++++++-+. ..|+++.++|.|++..+ +.+.|+++...|..+.
T Consensus 206 a~~~~~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~ 282 (473)
T KOG1320|consen 206 AIGPGNSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFK 282 (473)
T ss_pred eecCCccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccc
Confidence 52 247889999999999999999997664 237788887665 45999999999999886 8999999988776554
Q ss_pred cC----cceeeEEEEcccccCCCCCceEE-eCCEEEEEEeeecC---CCCceeEEeehHHHHHHHHHHHHcC---ccc--
Q 007765 247 HG----ATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS---GAENIGYIIPVPVIKHFITGVVEHG---KYV-- 313 (590)
Q Consensus 247 ~~----~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~---~~~~~~~aip~~~i~~~l~~l~~~g---~~~-- 313 (590)
-+ .....++++|++++.|+||||++ .+|++||++++... -..+++|++|.+.+..++.+..+.. +..
T Consensus 283 lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~ 362 (473)
T KOG1320|consen 283 LGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKP 362 (473)
T ss_pred cCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccC
Confidence 22 23346899999999999999999 99999999988755 2357899999999999888763322 111
Q ss_pred ---cccccccceeeeccHhh-----hhhcCCCC-CcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCccccccc
Q 007765 314 ---GFCSLGLSCQTTENVQL-----RNNFGMRS-EVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNR 383 (590)
Q Consensus 314 ---~~~~lGi~~~~~~~~~~-----~~~lgl~~-~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~ 383 (590)
...|+|+....+ ++.+ .+.+-++. ...++++..|.|++++... +++||+|++|||++|.+..++.
T Consensus 363 ~~p~~~~~g~~s~~i-~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~---- 437 (473)
T KOG1320|consen 363 LVPVHQYIGLPSYYI-FAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLY---- 437 (473)
T ss_pred cccccccCCceeEEE-ecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHH----
Confidence 134777766544 2222 12222221 2358899999999999999 9999999999999999999875
Q ss_pred ccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 384 ERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 384 ~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
+++.....++++.+...|..|..++.+...
T Consensus 438 ------~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~ 467 (473)
T KOG1320|consen 438 ------ELIEECSTEDKVAVLDRRSAEDATLEILPE 467 (473)
T ss_pred ------HHHHhcCcCceEEEEEecCccceeEEeccc
Confidence 788887788888888888888888876544
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.83 E-value=5.4e-18 Score=179.35 Aligned_cols=377 Identities=16% Similarity=0.153 Sum_probs=269.7
Q ss_pred HHHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEe--CCeEEEcCcccC-CCcEEEEEEcCCCcEEEEEEEEecC
Q 007765 109 IELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIP--GKKILTNAHVVA-DSTFVLVRKHGSPTKYRAQVEAVGH 185 (590)
Q Consensus 109 ~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~--~g~IlT~aHvv~-~~~~i~V~~~~~~~~~~a~vv~~d~ 185 (590)
.+.+..+.|.+.+....+-.. -......|||.|++ .|++++...++. ++...+|+.. |...++|.+...++
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g-----~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~-dS~~i~a~~~fL~~ 597 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDG-----VSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEA-DSDGIPANVSFLHP 597 (955)
T ss_pred hhHHhhhhhhheeceeecccc-----chhhhhcCceEEEEccCCceeEecccCCchhhceEEeec-ccccccceeeEecC
Confidence 577788888888765432211 11234569999999 899999999997 6678888886 77889999999999
Q ss_pred CCCeEEEEecCCcccCcceeEEcCCCC-CCCCeEEEEecCCCCCC----ceEEEeEEecccccccc-cCcceeeEEEEcc
Q 007765 186 ECDLAILIVESDEFWEGMHFLELGDIP-FLQQAVAVVGYPQGGDN----ISVTKGVVSRVEPTQYV-HGATQLMAIQIDA 259 (590)
Q Consensus 186 ~~DlAlLkv~~~~~~~~~~~~~l~~~~-~~G~~V~~iG~p~g~~~----~~v~~G~Vs~~~~~~~~-~~~~~~~~i~~~~ 259 (590)
..++|.+|.++.. ...++|.+.. .-|++|...|+....+- -+++.-.+..+...... .....+..|.++.
T Consensus 598 t~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~ 673 (955)
T KOG1421|consen 598 TENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMD 673 (955)
T ss_pred ccceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEec
Confidence 9999999998763 3445555443 45999999999765442 12222211111111111 1123456777776
Q ss_pred cccCCCCCceEE-eCCEEEEEEeeecC---CCC--ceeEEeehHHHHHHHHHHHHcCccccccccccceeeeccHhhhhh
Q 007765 260 AINPGNSGGPAI-MGNKVAGVAFQNLS---GAE--NIGYIIPVPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNN 333 (590)
Q Consensus 260 ~i~~G~SGGPl~-~~G~vVGI~~~~~~---~~~--~~~~aip~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~ 333 (590)
....++--|-+. .+|+|+|++...+. +.. .+-|.+.+..++..|+.|+.++... .-.+|+++..+ +..-++.
T Consensus 674 nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~ 751 (955)
T KOG1421|consen 674 NLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQART 751 (955)
T ss_pred cccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhc
Confidence 666555556677 99999999976655 111 2456677889999999999776655 44688888887 6667788
Q ss_pred cCCCCC------------cCceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEE
Q 007765 334 FGMRSE------------VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKS 401 (590)
Q Consensus 334 lgl~~~------------~~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v 401 (590)
+||+.+ .+=.+|..|.+..+ +.|..||+|+++||+-|....++. + +. .+
T Consensus 752 lglp~e~imk~e~es~~~~ql~~ishv~~~~~--kil~~gdiilsvngk~itr~~dl~----------d-~~------ei 812 (955)
T KOG1421|consen 752 LGLPSEFIMKSEEESTIPRQLYVISHVRPLLH--KILGVGDIILSVNGKMITRLSDLH----------D-FE------EI 812 (955)
T ss_pred cCCCHHHHhhhhhcCCCcceEEEEEeeccCcc--cccccccEEEEecCeEEeeehhhh----------h-hh------hh
Confidence 888752 12245677766533 339999999999999999887653 2 11 56
Q ss_pred EEEEEeCCEEEEEEEEEecCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCc
Q 007765 402 LVRVLRDGKEHEFSITLRLLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGE 481 (590)
Q Consensus 402 ~l~V~R~g~~~~~~v~l~~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~ 481 (590)
...|+|||++++++++.-+.. +..+..+|.|-.+++.+..+.++ +.++ .
T Consensus 813 d~~ilrdg~~~~ikipt~p~~---------et~r~vi~~gailq~ph~av~~q-----------------~edl-----p 861 (955)
T KOG1421|consen 813 DAVILRDGIEMEIKIPTYPEY---------ETSRAVIWMGAILQPPHSAVFEQ-----------------VEDL-----P 861 (955)
T ss_pred heeeeecCcEEEEEecccccc---------ccceEEEEEeccccCchHHHHHH-----------------Hhcc-----C
Confidence 789999999999998875432 34467899999999988776665 2222 2
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCC-ceEEEe--CCCeEEEEEech
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSE-NLRFDL--DDDRVVVLNYDV 548 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~-~v~l~~--~r~~~i~l~~~~ 548 (590)
++|+++....+|||.. ++.+--.|.+|||..+.++++|..++.+.++. ++++.. -|+....+.++.
T Consensus 862 ~gvyvt~rg~gspalq-~l~aa~fitavng~~t~~lddf~~~~~~ipdnsyv~v~~mtfd~vp~~~s~k~ 930 (955)
T KOG1421|consen 862 EGVYVTSRGYGSPALQ-MLRAAHFITAVNGHDTNTLDDFYHMLLEIPDNSYVQVKQMTFDGVPSIVSVKP 930 (955)
T ss_pred CceEEeecccCChhHh-hcchheeEEEecccccCcHHHHHHHHhhCCCCceEEEEEeccCCCceEEEecc
Confidence 7899999999999887 77788899999999999999999999997665 566653 345555555554
No 11
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.78 E-value=4.5e-19 Score=192.22 Aligned_cols=200 Identities=16% Similarity=0.139 Sum_probs=142.1
Q ss_pred eEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEecC
Q 007765 343 VLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLRLL 421 (590)
Q Consensus 343 v~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~ 421 (590)
.+|.+|.++|||++| ||+||+|++|||++|.+|++++ ..+.....|++++++|.|+|+++++++++...
T Consensus 128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~----------~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~ 197 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVR----------LALVSKIGDESTTITVAPFGSDQRRDKTLDLR 197 (449)
T ss_pred ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHHhhccCCceEEEEEeCCccceEEEEeccc
Confidence 378999999999999 9999999999999999998765 45556678889999999999999888887543
Q ss_pred CCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeecccccccccc
Q 007765 422 QPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERF 501 (590)
Q Consensus 422 ~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~ 501 (590)
+.... +........+|+ .++.+ ...+++.+|.++|||+.+|++
T Consensus 198 ~~~~~----~~~~~~~~~lGl--~~~~~-------------------------------~~~~vV~~V~~~SpA~~AGL~ 240 (449)
T PRK10779 198 HWAFE----PDKQDPVSSLGI--RPRGP-------------------------------QIEPVLAEVQPNSAASKAGLQ 240 (449)
T ss_pred ccccC----ccccchhhcccc--cccCC-------------------------------CcCcEEEeeCCCCHHHHcCCC
Confidence 22100 000111112333 22221 123588999999999999999
Q ss_pred CCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC-eEEEEEechhhhh-hHHHhhhcCCCcCCCC------CCCCc
Q 007765 502 ADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD-RVVVLNYDVAKIA-TSKILKRHRIPSAMSG------DLNGE 573 (590)
Q Consensus 502 ~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~-~~i~l~~~~~~~~-~~~i~~~~~i~~~~s~------~l~~~ 573 (590)
+||+|++|||++|++|+++.+.++.++++.+.+++.|+ +.+.+..+..... .....++.|+...... ....+
T Consensus 241 ~GDvIl~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~~~~~~~g~~~~~iGi~~~~~~~~~~~~~~~~~ 320 (449)
T PRK10779 241 AGDRIVKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQGSPLSLTLTPDSKPGNGKAEGFAGVVPKVIPLPDEYKTVRQY 320 (449)
T ss_pred CCCEEEEECCEEcCCHHHHHHHHHhCCCCEEEEEEEECCEEEEEEEEeeeecCCCceeeEEEEeccccCCcccceeEEec
Confidence 99999999999999999999999998888999999884 5444444432111 0011223444332110 01235
Q ss_pred cchHHHHhhccccccc
Q 007765 574 QISEIELASRHKEWSQ 589 (590)
Q Consensus 574 ~~~ea~~~~~~~~~~~ 589 (590)
.+.+|+.+++.+||++
T Consensus 321 ~~~~ai~~a~~~~~~~ 336 (449)
T PRK10779 321 GPFSAIYEATDKTWQL 336 (449)
T ss_pred CHHHHHHHHHHHHHHH
Confidence 6679999999999986
No 12
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.69 E-value=5.3e-17 Score=174.39 Aligned_cols=178 Identities=13% Similarity=0.162 Sum_probs=132.8
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
.|++|.+|.++|||+++ ||+||+|++|||+++.++.++. ..+.... +++.+++.|+|+..++.+++.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~----------~~ia~~~--~~v~~~I~r~g~~~~l~v~l~ 195 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVR----------QQIADIA--GEPMVEILAERENWTFEVMKE 195 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHHhhc--ccceEEEEEecCceEeccccc
Confidence 68899999999999999 9999999999999999999764 3343333 678999999988766433321
Q ss_pred cCCCCCCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeecccccccc
Q 007765 420 LLQPLVPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYE 499 (590)
Q Consensus 420 ~~~~l~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~ 499 (590)
+ .++.+ +.++++..|.+++||+.+|
T Consensus 196 ----------------------~--~~~~~-------------------------------~~g~vV~~V~~~SpA~~aG 220 (420)
T TIGR00054 196 ----------------------L--IPRGP-------------------------------KIEPVLSDVTPNSPAEKAG 220 (420)
T ss_pred ----------------------c--eecCC-------------------------------CcCcEEEEECCCCHHHHcC
Confidence 1 11111 1245889999999999999
Q ss_pred ccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC-eEEEEEechhhhhhHHHhhhcCCCcCCCCCCCCccchHH
Q 007765 500 RFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD-RVVVLNYDVAKIATSKILKRHRIPSAMSGDLNGEQISEI 578 (590)
Q Consensus 500 ~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~-~~i~l~~~~~~~~~~~i~~~~~i~~~~s~~l~~~~~~ea 578 (590)
+++||+|++|||++|++|+++.+.+++++++.+.+++.|+ +...+..+...... . ..|+..........+.+.+|
T Consensus 221 L~~GD~Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~~~~v~~~~~~~---~-~iGi~~~~~~~~~~~~~~~a 296 (420)
T TIGR00054 221 LKEGDYIQSINGEKLRSWTDFVSAVKENPGKSMDIKVERNGETLSISLTPEAKGK---I-GIGISPSLAPLEVSYGILNA 296 (420)
T ss_pred CCCCCEEEEECCEECCCHHHHHHHHHhCCCCceEEEEEECCEEEEEEEEEcCCCc---e-EEEEeccccceeeecCHHHH
Confidence 9999999999999999999999999998899999999884 54444444422210 0 13443222111134577799
Q ss_pred HHhhccccccc
Q 007765 579 ELASRHKEWSQ 589 (590)
Q Consensus 579 ~~~~~~~~~~~ 589 (590)
+.+|+.+||++
T Consensus 297 ~~~~~~~t~~~ 307 (420)
T TIGR00054 297 FAKGASATVDI 307 (420)
T ss_pred HHHHHHHHHHH
Confidence 99999999985
No 13
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.52 E-value=1.3e-13 Score=121.66 Aligned_cols=107 Identities=36% Similarity=0.475 Sum_probs=70.7
Q ss_pred EEEEEEeC-CeEEEcCcccC--------CCcEEEEEEcCCCcEEE--EEEEEecCC-CCeEEEEecCCcccCcceeEEcC
Q 007765 142 GSGFVIPG-KKILTNAHVVA--------DSTFVLVRKHGSPTKYR--AQVEAVGHE-CDLAILIVESDEFWEGMHFLELG 209 (590)
Q Consensus 142 GsGfiI~~-g~IlT~aHvv~--------~~~~i~V~~~~~~~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~~~l~ 209 (590)
||||+|++ |+||||+||+. ....+.+... ++..+. ++++..++. .|+|||+++... .
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~D~All~v~~~~------~---- 69 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFP-DGRRVPPVAEVVYFDPDDYDLALLKVDPWT------G---- 69 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEET-TSCEEETEEEEEEEETT-TTEEEEEESCEE------E----
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEec-CCCEEeeeEEEEEECCccccEEEEEEeccc------c----
Confidence 89999994 59999999999 4567888877 666677 999999999 999999999100 0
Q ss_pred CCCCCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEE-eCCEEEEE
Q 007765 210 DIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGV 279 (590)
Q Consensus 210 ~~~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI 279 (590)
.+......+. ..+..... . .......+ +++.+.+|+|||||| .+|+||||
T Consensus 70 ----~~~~~~~~~~---------~~~~~~~~----~--~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 70 ----VGGGVRVPGS---------TSGVSPTS----T--NDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ----EEEEEEEEEE---------EEEEEEEE----E--EETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ----eeeeeEeeee---------cccccccc----C--cccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000000 00000000 0 00111124 799999999999999 99999997
No 14
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.41 E-value=7.8e-13 Score=109.13 Aligned_cols=81 Identities=35% Similarity=0.559 Sum_probs=68.4
Q ss_pred cccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhc
Q 007765 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSM 394 (590)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~ 394 (590)
||||+.++...+ ..|++|..|.++|||+++ |++||+|++|||++|.++.++. ..+..
T Consensus 1 ~~lGv~~~~~~~------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~----------~~l~~ 58 (82)
T PF13180_consen 1 GGLGVTVQNLSD------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLV----------NILSK 58 (82)
T ss_dssp -E-SEEEEECSC------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHH----------HHHHC
T ss_pred CEECeEEEEccC------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHH----------HHHHh
Confidence 689999988731 369999999999999999 9999999999999999887653 66767
Q ss_pred cCCCCEEEEEEEeCCEEEEEEEEE
Q 007765 395 KKPNEKSLVRVLRDGKEHEFSITL 418 (590)
Q Consensus 395 ~~~g~~v~l~V~R~g~~~~~~v~l 418 (590)
..+|++++|+|+|+|+.++++++|
T Consensus 59 ~~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 59 GKPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp SSTTSEEEEEEEETTEEEEEEEE-
T ss_pred CCCCCEEEEEEEECCEEEEEEEEC
Confidence 789999999999999999998875
No 15
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.38 E-value=3.7e-11 Score=117.20 Aligned_cols=165 Identities=21% Similarity=0.224 Sum_probs=108.5
Q ss_pred CceEEEEEEeCCeEEEcCcccCCCcEEEEEEcC------CC--cEEEEEEEEec----C---CCCeEEEEecCC-cccCc
Q 007765 139 ETTGSGFVIPGKKILTNAHVVADSTFVLVRKHG------SP--TKYRAQVEAVG----H---ECDLAILIVESD-EFWEG 202 (590)
Q Consensus 139 ~~~GsGfiI~~g~IlT~aHvv~~~~~i~V~~~~------~~--~~~~a~vv~~d----~---~~DlAlLkv~~~-~~~~~ 202 (590)
...|+|++|++.+|||+|||+.+...+.+.+.. ++ ..+..+-+..+ . .+|||||+++.+ .+.+.
T Consensus 24 ~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~~ 103 (220)
T PF00089_consen 24 RFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGDN 103 (220)
T ss_dssp EEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBSS
T ss_pred CeeEeEEecccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 567999999999999999999996666665431 22 23444433332 2 579999999988 45668
Q ss_pred ceeEEcCCCC---CCCCeEEEEecCCCCCCc---eEEEeEE---ecccccccccCcceeeEEEEcc----cccCCCCCce
Q 007765 203 MHFLELGDIP---FLQQAVAVVGYPQGGDNI---SVTKGVV---SRVEPTQYVHGATQLMAIQIDA----AINPGNSGGP 269 (590)
Q Consensus 203 ~~~~~l~~~~---~~G~~V~~iG~p~g~~~~---~v~~G~V---s~~~~~~~~~~~~~~~~i~~~~----~i~~G~SGGP 269 (590)
+.++.+.... ..|+.+.++||+...... .+....+ +.-.+............++... ..+.|+||||
T Consensus 104 ~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~p 183 (220)
T PF00089_consen 104 IQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGGP 183 (220)
T ss_dssp BEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTSE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 8899998733 568999999998753321 3333333 2221211111111123455554 7899999999
Q ss_pred EEeCC-EEEEEEeeecCCCC--ceeEEeehHHHHHHH
Q 007765 270 AIMGN-KVAGVAFQNLSGAE--NIGYIIPVPVIKHFI 303 (590)
Q Consensus 270 l~~~G-~vVGI~~~~~~~~~--~~~~aip~~~i~~~l 303 (590)
|+.++ .|+||++....... ..+++.++....+++
T Consensus 184 l~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 184 LICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 99444 59999998744222 258888888777664
No 16
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.34 E-value=9e-11 Score=115.32 Aligned_cols=167 Identities=19% Similarity=0.175 Sum_probs=100.0
Q ss_pred CCceEEEEEEeCCeEEEcCcccCCC--cEEEEEEcC--------CCcEEEEEEEEec-------CCCCeEEEEecCC-cc
Q 007765 138 RETTGSGFVIPGKKILTNAHVVADS--TFVLVRKHG--------SPTKYRAQVEAVG-------HECDLAILIVESD-EF 199 (590)
Q Consensus 138 ~~~~GsGfiI~~g~IlT~aHvv~~~--~~i~V~~~~--------~~~~~~a~vv~~d-------~~~DlAlLkv~~~-~~ 199 (590)
....|+|++|++.+|||+|||+.+. ..+.|.+.. ....+..+-+..+ ..+|||||+++.+ .+
T Consensus 23 ~~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~ 102 (232)
T cd00190 23 GRHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL 102 (232)
T ss_pred CcEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence 3468999999999999999999875 456665421 1222334334444 3579999999976 44
Q ss_pred cCcceeEEcCCC--C-CCCCeEEEEecCCCCCC----ceEEEeEE---eccccccccc--CcceeeEEEE-----ccccc
Q 007765 200 WEGMHFLELGDI--P-FLQQAVAVVGYPQGGDN----ISVTKGVV---SRVEPTQYVH--GATQLMAIQI-----DAAIN 262 (590)
Q Consensus 200 ~~~~~~~~l~~~--~-~~G~~V~~iG~p~g~~~----~~v~~G~V---s~~~~~~~~~--~~~~~~~i~~-----~~~i~ 262 (590)
...+.|+.|... . ..++.+.+.||...... .......+ ....+..... .......++. ....|
T Consensus 103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 182 (232)
T cd00190 103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC 182 (232)
T ss_pred CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence 456889998866 2 44899999998664321 11222222 2211111111 0001112222 33468
Q ss_pred CCCCCceEE-eC---CEEEEEEeeecCCC--CceeEEeehHHHHHHHH
Q 007765 263 PGNSGGPAI-MG---NKVAGVAFQNLSGA--ENIGYIIPVPVIKHFIT 304 (590)
Q Consensus 263 ~G~SGGPl~-~~---G~vVGI~~~~~~~~--~~~~~aip~~~i~~~l~ 304 (590)
.|+|||||+ .. +.|+||++...... ...+....+...+++++
T Consensus 183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 183 QGDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred cCCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHhh
Confidence 899999999 43 88999999865422 23344454455444443
No 17
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.24 E-value=4.1e-11 Score=100.35 Aligned_cols=88 Identities=33% Similarity=0.529 Sum_probs=73.6
Q ss_pred cccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhc
Q 007765 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSM 394 (590)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~ 394 (590)
||+|+.++.+ ++..++.++++. ..|++|..|.++|||+++ |++||+|++|||+++.++.++. .++..
T Consensus 1 ~~~G~~~~~~-~~~~~~~~~~~~-~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~----------~~l~~ 68 (90)
T cd00987 1 PWLGVTVQDL-TPDLAEELGLKD-TKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLR----------RALAE 68 (90)
T ss_pred CccceEEeEC-CHHHHHHcCCCC-CCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHH----------HHHHh
Confidence 5899999998 677777677654 579999999999999999 9999999999999999998653 55655
Q ss_pred cCCCCEEEEEEEeCCEEEEEE
Q 007765 395 KKPNEKSLVRVLRDGKEHEFS 415 (590)
Q Consensus 395 ~~~g~~v~l~V~R~g~~~~~~ 415 (590)
...++.+.+++.|+|+...+.
T Consensus 69 ~~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 69 LKPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred cCCCCEEEEEEEECCEEEEee
Confidence 556889999999999876553
No 18
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.23 E-value=3.6e-10 Score=111.23 Aligned_cols=146 Identities=19% Similarity=0.218 Sum_probs=91.5
Q ss_pred CceEEEEEEeCCeEEEcCcccCCC--cEEEEEEcCCC-------cEEEEEEEEec-------CCCCeEEEEecCC-cccC
Q 007765 139 ETTGSGFVIPGKKILTNAHVVADS--TFVLVRKHGSP-------TKYRAQVEAVG-------HECDLAILIVESD-EFWE 201 (590)
Q Consensus 139 ~~~GsGfiI~~g~IlT~aHvv~~~--~~i~V~~~~~~-------~~~~a~vv~~d-------~~~DlAlLkv~~~-~~~~ 201 (590)
...|+|++|++.+|||+|||+.+. ..+.|.+.... ..+.+.-+..+ ...|||||+++.+ .+..
T Consensus 25 ~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~ 104 (229)
T smart00020 25 RHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSD 104 (229)
T ss_pred CcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCC
Confidence 467999999999999999999875 36777764211 23344434332 4679999999987 3445
Q ss_pred cceeEEcCCC---CCCCCeEEEEecCCCCC-----CceEEEeEEecccc---cc-cccC-cceeeEEEE-----cccccC
Q 007765 202 GMHFLELGDI---PFLQQAVAVVGYPQGGD-----NISVTKGVVSRVEP---TQ-YVHG-ATQLMAIQI-----DAAINP 263 (590)
Q Consensus 202 ~~~~~~l~~~---~~~G~~V~~iG~p~g~~-----~~~v~~G~Vs~~~~---~~-~~~~-~~~~~~i~~-----~~~i~~ 263 (590)
.+.|+.|... ...+..+.+.||+.... ........+..+.. .. +... ......++. ....|+
T Consensus 105 ~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~ 184 (229)
T smart00020 105 NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQ 184 (229)
T ss_pred ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccC
Confidence 6889888764 34588999999876542 01122222222111 10 0000 000011222 345788
Q ss_pred CCCCceEEeCC---EEEEEEeeec
Q 007765 264 GNSGGPAIMGN---KVAGVAFQNL 284 (590)
Q Consensus 264 G~SGGPl~~~G---~vVGI~~~~~ 284 (590)
|+|||||+.++ .++||++...
T Consensus 185 gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 185 GDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCCCeeEEECCCEEEEEEEEECC
Confidence 99999999545 9999999865
No 19
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.07 E-value=6.7e-10 Score=90.94 Aligned_cols=77 Identities=21% Similarity=0.171 Sum_probs=63.5
Q ss_pred cccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhc
Q 007765 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSM 394 (590)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~ 394 (590)
||+|+.+..- ..|++|..|.++|+|+++ |++||+|++|||+++.+|. +++..
T Consensus 1 ~~~G~~~~~~--------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~-------------~~l~~ 53 (80)
T cd00990 1 PYLGLTLDKE--------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQ-------------DRLKE 53 (80)
T ss_pred CcccEEEEcc--------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHH-------------HHHHh
Confidence 5788877542 367999999999999999 9999999999999999865 33444
Q ss_pred cCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 395 KKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 395 ~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
...++.+.+++.|+|+..++.+++.
T Consensus 54 ~~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 54 YQAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred cCCCCEEEEEEEECCEEEEEEEEec
Confidence 4578899999999999888877654
No 20
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.03 E-value=1.3e-09 Score=89.03 Aligned_cols=72 Identities=26% Similarity=0.343 Sum_probs=62.1
Q ss_pred cCceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 340 VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
..|++|..|.++|||+..|++||+|++|||+++.+|.++. .++.....|+.+.+++.|+|+..++.+++.
T Consensus 7 ~~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~~~----------~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~ 76 (79)
T cd00986 7 YHGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEELI----------DYIQSKKEGDTVKLKVKREEKELPEDLILK 76 (79)
T ss_pred ecCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHHHH----------HHHHhCCCCCEEEEEEEECCEEEEEEEEEe
Confidence 3689999999999998669999999999999999998654 566555678899999999999999999887
Q ss_pred cC
Q 007765 420 LL 421 (590)
Q Consensus 420 ~~ 421 (590)
.+
T Consensus 77 ~~ 78 (79)
T cd00986 77 TF 78 (79)
T ss_pred cc
Confidence 54
No 21
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.01 E-value=1.6e-09 Score=88.65 Aligned_cols=68 Identities=26% Similarity=0.309 Sum_probs=59.0
Q ss_pred cCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEE
Q 007765 340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSIT 417 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 417 (590)
..|++|..|.++|||+++ |++||+|++|||+++.+|.++. ..+.....|+.+.+++.|+|+..++.++
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~----------~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFM----------EALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHH----------HHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 479999999999999999 9999999999999999998654 5565555688999999999998877654
No 22
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.97 E-value=2.5e-09 Score=107.19 Aligned_cols=101 Identities=18% Similarity=0.264 Sum_probs=85.1
Q ss_pred hHHHHHHHHHHHHcCccccccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecC
Q 007765 296 VPVIKHFITGVVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAN 374 (590)
Q Consensus 296 ~~~i~~~l~~l~~~g~~~~~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~ 374 (590)
...++++++++.+++.+. +.|+|+...... ....|++|..+.++++|+++ ||+||+|++|||+++.+
T Consensus 158 ~~~~~~v~~~l~~~g~~~-~~~lgi~p~~~~-----------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~ 225 (259)
T TIGR01713 158 IVVSRRIIEELTKDPQKM-FDYIRLSPVMKN-----------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRD 225 (259)
T ss_pred hhhHHHHHHHHHHCHHhh-hheEeEEEEEeC-----------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCC
Confidence 356788999999999887 889999875441 12479999999999999999 99999999999999999
Q ss_pred CCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEE
Q 007765 375 DGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITL 418 (590)
Q Consensus 375 ~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l 418 (590)
+.++. .++.....++++.|+|+|+|+.+++.+.+
T Consensus 226 ~~~~~----------~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 226 PEQAF----------QALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred HHHHH----------HHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 98653 66766678899999999999998887653
No 23
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.81 E-value=1.5e-08 Score=83.54 Aligned_cols=66 Identities=11% Similarity=0.136 Sum_probs=56.7
Q ss_pred cceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHH-hcCCCceEEEeCCC-eEEEEEe
Q 007765 481 EQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVE-NCSSENLRFDLDDD-RVVVLNY 546 (590)
Q Consensus 481 ~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~-~~~~~~v~l~~~r~-~~i~l~~ 546 (590)
.++++|..|.+++||+.+|+++||+|++|||++|.++.+|.+++. ..+++.++|++.|+ +.+.+++
T Consensus 13 ~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 80 (82)
T PF13180_consen 13 TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEELTVEV 80 (82)
T ss_dssp SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEEEEEE
T ss_pred CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEE
Confidence 457889999999999999999999999999999999999999995 56788999999884 5555554
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.78 E-value=1.8e-08 Score=109.25 Aligned_cols=90 Identities=22% Similarity=0.369 Sum_probs=78.2
Q ss_pred ccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhh
Q 007765 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVS 393 (590)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~ 393 (590)
..++|+.++.+ ++..++.++++....|++|..|.++|||+++ |++||+|++|||++|.++.++. +++.
T Consensus 337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~----------~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELR----------KVLD 405 (428)
T ss_pred ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHH
Confidence 56899999998 7888888999865579999999999999999 9999999999999999988653 6666
Q ss_pred ccCCCCEEEEEEEeCCEEEEEE
Q 007765 394 MKKPNEKSLVRVLRDGKEHEFS 415 (590)
Q Consensus 394 ~~~~g~~v~l~V~R~g~~~~~~ 415 (590)
....|+.+.++|+|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 6567899999999999877653
No 25
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.77 E-value=2.5e-08 Score=81.27 Aligned_cols=65 Identities=25% Similarity=0.341 Sum_probs=54.1
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEE
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSI 416 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v 416 (590)
..++|..|.++|+|+++ |++||+|++|||+++.+|.++. ..+... .++.+.+++.|+|+..++.+
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~----------~~l~~~-~~~~~~l~v~r~~~~~~~~l 77 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLV----------DAVQEN-PGKPLTLTVERNGETITLTL 77 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHH----------HHHHHC-CCceEEEEEEECCEEEEEEe
Confidence 45789999999999999 9999999999999999998653 444433 47789999999998776654
No 26
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.74 E-value=5.8e-08 Score=80.32 Aligned_cols=66 Identities=30% Similarity=0.471 Sum_probs=55.1
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCC--CcccccccccchHHHHhhccCCCCEEEEEEEeC-CEEEEEEE
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTVAFRNRERITFDHLVSMKKPNEKSLVRVLRD-GKEHEFSI 416 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~-g~~~~~~v 416 (590)
.+++|..|.++|||+++ |++||+|++|||+++.+| .++. ..+.. ..|+.+.+++.|+ |+..++++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~----------~~l~~-~~~~~i~l~v~r~~~~~~~~~~ 81 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVV----------KLLRG-KAGTKVRLTLKRGDGEPREVTL 81 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHH----------HHhcC-CCCCEEEEEEEcCCCCEEEEEE
Confidence 68999999999999999 999999999999999998 5442 44433 4688999999998 88777766
Q ss_pred E
Q 007765 417 T 417 (590)
Q Consensus 417 ~ 417 (590)
.
T Consensus 82 ~ 82 (85)
T cd00988 82 T 82 (85)
T ss_pred E
Confidence 4
No 27
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.69 E-value=1.3e-07 Score=79.01 Aligned_cols=80 Identities=15% Similarity=0.158 Sum_probs=65.1
Q ss_pred eccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCHH
Q 007765 439 FAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLK 518 (590)
Q Consensus 439 ~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~ 518 (590)
++|+.++++++.....++.. ...+++|..|.+++++..+++++||+|++|||+++.++.
T Consensus 2 ~~G~~~~~~~~~~~~~~~~~---------------------~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~ 60 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGLK---------------------DTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVA 60 (90)
T ss_pred ccceEEeECCHHHHHHcCCC---------------------CCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHH
Confidence 56899998887544332221 345799999999999998999999999999999999999
Q ss_pred HHHHHHHhcC-CCceEEEeCCC
Q 007765 519 HLCQLVENCS-SENLRFDLDDD 539 (590)
Q Consensus 519 ~l~~~v~~~~-~~~v~l~~~r~ 539 (590)
++.+++.... ++.+.+++.|+
T Consensus 61 ~~~~~l~~~~~~~~i~l~v~r~ 82 (90)
T cd00987 61 DLRRALAELKPGDKVTLTVLRG 82 (90)
T ss_pred HHHHHHHhcCCCCEEEEEEEEC
Confidence 9999998764 67788888764
No 28
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.63 E-value=4.1e-07 Score=89.41 Aligned_cols=156 Identities=19% Similarity=0.212 Sum_probs=90.3
Q ss_pred eEEEEEEeCCeEEEcCcccCCCc----EEEEEEc---CC-CcEEEEE--EEEec-C---CCCeEEEEecCCccc------
Q 007765 141 TGSGFVIPGKKILTNAHVVADST----FVLVRKH---GS-PTKYRAQ--VEAVG-H---ECDLAILIVESDEFW------ 200 (590)
Q Consensus 141 ~GsGfiI~~g~IlT~aHvv~~~~----~i~V~~~---~~-~~~~~a~--vv~~d-~---~~DlAlLkv~~~~~~------ 200 (590)
-+++|+|+++.+|||+||+.... .+.+... ++ +..+..+ ..... . +.|.+...+....+.
T Consensus 65 ~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~ 144 (251)
T COG3591 65 CTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIG 144 (251)
T ss_pred eeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCcc
Confidence 34569999999999999997543 2222211 11 2122211 11112 2 446666655433211
Q ss_pred Ccce--eEEcCCCCCCCCeEEEEecCCCCCC---ceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEE-eCC
Q 007765 201 EGMH--FLELGDIPFLQQAVAVVGYPQGGDN---ISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGN 274 (590)
Q Consensus 201 ~~~~--~~~l~~~~~~G~~V~~iG~p~g~~~---~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~-~~G 274 (590)
.... ..++....++++.+.++|||....+ .....+.|..+.. ..+..++.+.+|+||+||+ .+.
T Consensus 145 ~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~~~ 214 (251)
T COG3591 145 DVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLISKD 214 (251)
T ss_pred ccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEecCc
Confidence 1222 2233334466888999999987652 2233344432221 2577888999999999999 888
Q ss_pred EEEEEEeeecCCC--CceeEE-eehHHHHHHHHHH
Q 007765 275 KVAGVAFQNLSGA--ENIGYI-IPVPVIKHFITGV 306 (590)
Q Consensus 275 ~vVGI~~~~~~~~--~~~~~a-ip~~~i~~~l~~l 306 (590)
+|||+.+...... ...+++ .-...++++++++
T Consensus 215 ~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~ 249 (251)
T COG3591 215 EVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQN 249 (251)
T ss_pred eEEEEEecCCCcccccccCcceEecHHHHHHHHHh
Confidence 9999998876522 223333 3335677777665
No 29
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.59 E-value=1.8e-07 Score=76.48 Aligned_cols=67 Identities=10% Similarity=0.095 Sum_probs=57.3
Q ss_pred CcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhc-CCCceEEEeCCC-eEEEEEe
Q 007765 480 GEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENC-SSENLRFDLDDD-RVVVLNY 546 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~-~~~~v~l~~~r~-~~i~l~~ 546 (590)
..+++++..|.++++++.+++++||+|++|||+++.+|++|.+.+... +++.+.+++.|+ +...+..
T Consensus 8 ~~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~ 76 (79)
T cd00991 8 AVAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVS 76 (79)
T ss_pred cCCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Confidence 346789999999999999999999999999999999999999999986 477888888774 4555443
No 30
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.54 E-value=2.1e-07 Score=73.93 Aligned_cols=54 Identities=33% Similarity=0.451 Sum_probs=44.8
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCC--CcccccccccchHHHHhhccCCCCEEEEEE
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND--GTVAFRNRERITFDHLVSMKKPNEKSLVRV 405 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V 405 (590)
.|++|..|.+++||+.+ |++||+|++|||+++.+| .++ ..++... .|+.++|+|
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~----------~~~l~~~-~g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDV----------AELLKKE-VGEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHH----------HHHHhhC-CCCeEEEEE
Confidence 48999999999999999 999999999999999999 443 2555443 478888876
No 31
>PRK10139 serine endoprotease; Provisional
Probab=98.47 E-value=5.8e-07 Score=97.72 Aligned_cols=89 Identities=12% Similarity=0.137 Sum_probs=75.2
Q ss_pred eeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCH
Q 007765 438 IFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL 517 (590)
Q Consensus 438 ~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~ 517 (590)
.|+|+.++++++...+.++.+ ...+++|..|.+++|++.+|+++||+|++|||++|.+|
T Consensus 267 ~~LGv~~~~l~~~~~~~lgl~---------------------~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~ 325 (455)
T PRK10139 267 GLLGIKGTEMSADIAKAFNLD---------------------VQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSF 325 (455)
T ss_pred cceeEEEEECCHHHHHhcCCC---------------------CCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCH
Confidence 467999999998887777764 35689999999999999999999999999999999999
Q ss_pred HHHHHHHHh-cCCCceEEEeCCC-eEEEEEec
Q 007765 518 KHLCQLVEN-CSSENLRFDLDDD-RVVVLNYD 547 (590)
Q Consensus 518 ~~l~~~v~~-~~~~~v~l~~~r~-~~i~l~~~ 547 (590)
.+|.+.+.. .+++.+.+++.|+ +.+.+.+.
T Consensus 326 ~dl~~~l~~~~~g~~v~l~V~R~G~~~~l~v~ 357 (455)
T PRK10139 326 AELRSRIATTEPGTKVKLGLLRNGKPLEVEVT 357 (455)
T ss_pred HHHHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 999999987 6788899998874 55444443
No 32
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.42 E-value=1.4e-05 Score=80.31 Aligned_cols=144 Identities=18% Similarity=0.156 Sum_probs=85.2
Q ss_pred eEEEEEEeCCeEEEcCcccCCCc--EEEEEEcC--------CC---cEE-EEEEEEecC-------C-CCeEEEEecCC-
Q 007765 141 TGSGFVIPGKKILTNAHVVADST--FVLVRKHG--------SP---TKY-RAQVEAVGH-------E-CDLAILIVESD- 197 (590)
Q Consensus 141 ~GsGfiI~~g~IlT~aHvv~~~~--~i~V~~~~--------~~---~~~-~a~vv~~d~-------~-~DlAlLkv~~~- 197 (590)
.+.|.+|++.||||+|||+.+.. .+.|.+.. .+ ... ..+++ .++ . +|||||+++.+
T Consensus 39 ~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~v 117 (256)
T KOG3627|consen 39 LCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEPV 117 (256)
T ss_pred eeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCCc
Confidence 67787889889999999999865 66666521 11 111 11222 221 3 79999999986
Q ss_pred cccCcceeEEcCCCC----CC-CCeEEEEecCCCC----C-CceEEEeEEeccc---ccccccCc--ceeeEEEEc----
Q 007765 198 EFWEGMHFLELGDIP----FL-QQAVAVVGYPQGG----D-NISVTKGVVSRVE---PTQYVHGA--TQLMAIQID---- 258 (590)
Q Consensus 198 ~~~~~~~~~~l~~~~----~~-G~~V~~iG~p~g~----~-~~~v~~G~Vs~~~---~~~~~~~~--~~~~~i~~~---- 258 (590)
.|.+.+.|+.|.... .. +..+.+.||+... . ........+.-+. +....... .....++..
T Consensus 118 ~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~~ 197 (256)
T KOG3627|consen 118 TFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPEG 197 (256)
T ss_pred ccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccCC
Confidence 577788888886332 22 4788888975431 1 1122222222211 11111110 001123333
Q ss_pred -ccccCCCCCceEE-eC---CEEEEEEeeecC
Q 007765 259 -AAINPGNSGGPAI-MG---NKVAGVAFQNLS 285 (590)
Q Consensus 259 -~~i~~G~SGGPl~-~~---G~vVGI~~~~~~ 285 (590)
...|.|||||||+ .+ ..++||++++..
T Consensus 198 ~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 198 GKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred CCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 2358999999999 54 699999999865
No 33
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.41 E-value=9e-07 Score=72.20 Aligned_cols=64 Identities=13% Similarity=0.217 Sum_probs=53.8
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHh-cCCCceEEEeCCC-eEEEEEe
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVEN-CSSENLRFDLDDD-RVVVLNY 546 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~-~~~~~v~l~~~r~-~~i~l~~ 546 (590)
++++|..|.++++++. ++++||+|++|||+++.+|++|.+++.. .+++.+.+++.|+ +...+..
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v 73 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDL 73 (79)
T ss_pred cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEE
Confidence 5789999999999876 7999999999999999999999999986 5677888888774 4444433
No 34
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.38 E-value=1.5e-06 Score=71.76 Aligned_cols=58 Identities=16% Similarity=0.233 Sum_probs=52.7
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEeCCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDLDDD 539 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~~r~ 539 (590)
.+++|..|.+++++..+++++||+|++|||+++.+| .++.++++..+++.+.+++.|+
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~ 72 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRG 72 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcC
Confidence 567899999999999999999999999999999999 9999999887888888888764
No 35
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=98.36 E-value=1.5e-06 Score=91.67 Aligned_cols=89 Identities=9% Similarity=0.060 Sum_probs=72.9
Q ss_pred eeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCH
Q 007765 438 IFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL 517 (590)
Q Consensus 438 ~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~ 517 (590)
.|+|+.+.++++...+.+|.+ ..++++|..|.+++|++.+++++||+|++|||++|.+|
T Consensus 255 ~~lGv~~~~~~~~~~~~lgl~---------------------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~ 313 (351)
T TIGR02038 255 GYIGVSGEDINSVVAQGLGLP---------------------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGA 313 (351)
T ss_pred eEeeeEEEECCHHHHHhcCCC---------------------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCH
Confidence 367888888887666665553 34689999999999999999999999999999999999
Q ss_pred HHHHHHHHh-cCCCceEEEeCCC-eEEEEEec
Q 007765 518 KHLCQLVEN-CSSENLRFDLDDD-RVVVLNYD 547 (590)
Q Consensus 518 ~~l~~~v~~-~~~~~v~l~~~r~-~~i~l~~~ 547 (590)
++|.+.+++ .+++.+.+++.|+ +.+.+...
T Consensus 314 ~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 345 (351)
T TIGR02038 314 EELMDRIAETRPGSKVMVTVLRQGKQLELPVT 345 (351)
T ss_pred HHHHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 999999987 5777899998774 55444443
No 36
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.35 E-value=9.6e-07 Score=95.37 Aligned_cols=153 Identities=18% Similarity=0.262 Sum_probs=95.0
Q ss_pred cCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEE
Q 007765 340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSIT 417 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 417 (590)
.+-++|..|.+.+.|++- |++||.|+.|||.+|....+-. ...++........|.|+|.|.-..-.
T Consensus 673 ~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~--------vv~Lm~~AArnghV~LtVRRkv~~~~---- 740 (984)
T KOG3209|consen 673 GQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSE--------VVDLMEAAARNGHVNLTVRRKVRTGP---- 740 (984)
T ss_pred CCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHHH--------HHHHHHHHHhcCceEEEEeeeeeecc----
Confidence 456889999999999876 9999999999999999876531 12444444455678999987311000
Q ss_pred EecCCCCCCCccCCCCCccee------eccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEee
Q 007765 418 LRLLQPLVPVHQFDKLPSYYI------FAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLM 491 (590)
Q Consensus 418 l~~~~~l~~~~~~~~~p~~~~------~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~ 491 (590)
..... .......++|-+ .-||.|.-++. ..+ .+.+ |.+|.+
T Consensus 741 -~~rsp---~~s~~~~~~yDV~lhR~ENeGFGFVi~sS--------------------------~~k-p~sg--iGrIie 787 (984)
T KOG3209|consen 741 -ARRSP---RNSAAPSGPYDVVLHRKENEGFGFVIMSS--------------------------QNK-PESG--IGRIIE 787 (984)
T ss_pred -ccCCc---ccccCCCCCeeeEEecccCCceeEEEEec--------------------------ccC-CCCC--cccccc
Confidence 00000 000000001100 01444444332 001 1222 788999
Q ss_pred ccccccc-cccCCceEEeeCCeecCCHH--HHHHHHHhcCCCceEEEeCC
Q 007765 492 DDINAGY-ERFADLQVKKVNGVEIENLK--HLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 492 ~s~a~g~-~~~~gd~I~~VNG~~v~~~~--~l~~~v~~~~~~~v~l~~~r 538 (590)
+|||+.- .|+.||+|++|||+.|-++. +.+++|+. .|-.|+|++.-
T Consensus 788 GSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd-aGlsVtLtIip 836 (984)
T KOG3209|consen 788 GSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD-AGLSVTLTIIP 836 (984)
T ss_pred CChhHhhccccccceEEEecCeeeeccCchhHHHHHHh-cCceEEEEEcC
Confidence 9998886 48899999999999998886 55666665 35567776643
No 37
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.33 E-value=1.6e-06 Score=70.57 Aligned_cols=56 Identities=16% Similarity=0.249 Sum_probs=50.9
Q ss_pred EEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC
Q 007765 484 VILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD 539 (590)
Q Consensus 484 vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~ 539 (590)
++++.|.+++++...++++||+|++|||+++.+|+++.+.+....++.+.+++.|+
T Consensus 14 ~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~ 69 (79)
T cd00989 14 PVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERN 69 (79)
T ss_pred cEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEEC
Confidence 58899999999998999999999999999999999999999987777888888764
No 38
>PRK10942 serine endoprotease; Provisional
Probab=98.31 E-value=2.5e-06 Score=93.27 Aligned_cols=90 Identities=20% Similarity=0.141 Sum_probs=74.1
Q ss_pred eeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCC
Q 007765 437 YIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIEN 516 (590)
Q Consensus 437 ~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~ 516 (590)
..|+|+.++++++...+.++.+ ..++++|..|.++++++.++++.||+|++|||++|.+
T Consensus 287 rg~lGv~~~~l~~~~a~~~~l~---------------------~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s 345 (473)
T PRK10942 287 RGELGIMGTELNSELAKAMKVD---------------------AQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISS 345 (473)
T ss_pred cceeeeEeeecCHHHHHhcCCC---------------------CCCceEEEEECCCChHHHcCCCCCCEEEEECCEECCC
Confidence 3578999999998877776664 4578999999999999999999999999999999999
Q ss_pred HHHHHHHHHhc-CCCceEEEeCC-CeEEEEEec
Q 007765 517 LKHLCQLVENC-SSENLRFDLDD-DRVVVLNYD 547 (590)
Q Consensus 517 ~~~l~~~v~~~-~~~~v~l~~~r-~~~i~l~~~ 547 (590)
|.+|.+.+... +++.+.+++.| ++.+.+.+.
T Consensus 346 ~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~ 378 (473)
T PRK10942 346 FAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVE 378 (473)
T ss_pred HHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEE
Confidence 99999999774 56778888877 444444433
No 39
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.30 E-value=2.4e-06 Score=70.23 Aligned_cols=59 Identities=32% Similarity=0.451 Sum_probs=47.2
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCC
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDG 409 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g 409 (590)
.|++|..|.++|+|+++ |++||+|++|||+++.++.+.. ........++.+.+++.|++
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~----------~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLE----------AVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHH----------HHHHHHhCCCeEEEEEEeCC
Confidence 68999999999999999 9999999999999999886542 22222234568899998864
No 40
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.29 E-value=1.4e-06 Score=69.09 Aligned_cols=55 Identities=18% Similarity=0.207 Sum_probs=50.5
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEe
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDL 536 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~ 536 (590)
.+++|..|.+++|++.+++++||+|++|||+++.+| +++.++++.+.++.++|++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 368999999999999999999999999999999999 9999999998878888765
No 41
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.29 E-value=1.7e-06 Score=93.39 Aligned_cols=68 Identities=22% Similarity=0.324 Sum_probs=59.2
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
.|++|.+|.++|||+++ ||+||+|++|||++|.+|+++. ..+.. ..++++.++|.|+|+..++++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~----------~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 271 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFV----------SAVKE-NPGKSMDIKVERNGETLSISLTPE 271 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHH----------HHHHh-CCCCceEEEEEECCEEEEEEEEEc
Confidence 57899999999999999 9999999999999999998764 45543 467889999999999988888764
No 42
>PRK10898 serine endoprotease; Provisional
Probab=98.24 E-value=4.5e-06 Score=87.95 Aligned_cols=88 Identities=14% Similarity=0.035 Sum_probs=69.3
Q ss_pred eccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCHH
Q 007765 439 FAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLK 518 (590)
Q Consensus 439 ~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~ 518 (590)
|+|+....+++......+.+ ..++++|.+|.+++|++.+++++||+|++|||++|.+|.
T Consensus 257 ~lGi~~~~~~~~~~~~~~~~---------------------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~ 315 (353)
T PRK10898 257 YIGIGGREIAPLHAQGGGID---------------------QLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISAL 315 (353)
T ss_pred ccceEEEECCHHHHHhcCCC---------------------CCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHH
Confidence 56888777765433332222 347899999999999999999999999999999999999
Q ss_pred HHHHHHHh-cCCCceEEEeCCC-eEEEEEec
Q 007765 519 HLCQLVEN-CSSENLRFDLDDD-RVVVLNYD 547 (590)
Q Consensus 519 ~l~~~v~~-~~~~~v~l~~~r~-~~i~l~~~ 547 (590)
+|.+.+.. .+++.+.+++.|+ +.+.+...
T Consensus 316 ~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 346 (353)
T PRK10898 316 ETMDQVAEIRPGSVIPVVVMRDDKQLTLQVT 346 (353)
T ss_pred HHHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 99999877 6777899988874 55454443
No 43
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.21 E-value=3.3e-06 Score=91.99 Aligned_cols=67 Identities=24% Similarity=0.372 Sum_probs=58.4
Q ss_pred ceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 342 GVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
+++|..|.++|||+++ |++||+|++|||++|.+|.++. ..+.. ..++.+.++|.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~----------~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFV----------TLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHH----------HHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 5789999999999999 9999999999999999998764 45543 577899999999999988888765
No 44
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.20 E-value=4.8e-05 Score=74.38 Aligned_cols=160 Identities=17% Similarity=0.197 Sum_probs=81.8
Q ss_pred HHhCCCcEEEEeeecCCCCCCCccCCCCCCceEEEEEEeCCeEEEcCcccCC-CcEEEEEEcCCCcEEEEE-----EEEe
Q 007765 110 ELALDSVVKIFTVSSSPNYGLPWQNKSQRETTGSGFVIPGKKILTNAHVVAD-STFVLVRKHGSPTKYRAQ-----VEAV 183 (590)
Q Consensus 110 ~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfiI~~g~IlT~aHvv~~-~~~i~V~~~~~~~~~~a~-----vv~~ 183 (590)
.-+...|++|...... ....=-|+.. ..+|+|++|.... ...+.|... - -.|... -+..
T Consensus 14 n~Ia~~ic~l~n~s~~------------~~~~l~gigy-G~~iItn~HLf~~nng~L~i~s~-h-G~f~v~nt~~lkv~~ 78 (235)
T PF00863_consen 14 NPIASNICRLTNESDG------------GTRSLYGIGY-GSYIITNAHLFKRNNGELTIKSQ-H-GEFTVPNTTQLKVHP 78 (235)
T ss_dssp HHHHTTEEEEEEEETT------------EEEEEEEEEE-TTEEEEEGGGGSSTTCEEEEEET-T-EEEEECEGGGSEEEE
T ss_pred chhhheEEEEEEEeCC------------CeEEEEEEeE-CCEEEEChhhhccCCCeEEEEeC-c-eEEEcCCccccceEE
Confidence 3455788888864321 1111123322 8899999999963 455777764 2 233332 1334
Q ss_pred cCCCCeEEEEecCCcccCcceeEEcC---CCCCCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEccc
Q 007765 184 GHECDLAILIVESDEFWEGMHFLELG---DIPFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAA 260 (590)
Q Consensus 184 d~~~DlAlLkv~~~~~~~~~~~~~l~---~~~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~ 260 (590)
-+..||.++|++.. ++|.+-. ..+..++.|.++|.-+.....+.+...-+.+.+ ..+. .++..-..
T Consensus 79 i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~~s~vSesS~i~p---~~~~---~fWkHwIs 147 (235)
T PF00863_consen 79 IEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVCMVGSNFQEKSISSTVSESSWIYP---EENS---HFWKHWIS 147 (235)
T ss_dssp -TCSSEEEEE--TT-----S----S---B----TT-EEEEEEEECSSCCCEEEEEEEEEEEE---ETTT---TEEEE-C-
T ss_pred eCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEEEEEEEEEcCCeeEEECCceEEee---cCCC---CeeEEEec
Confidence 46899999999864 4554332 345669999999975443322222222222111 1111 24555556
Q ss_pred ccCCCCCceEE--eCCEEEEEEeeecCCCCceeEEeeh
Q 007765 261 INPGNSGGPAI--MGNKVAGVAFQNLSGAENIGYIIPV 296 (590)
Q Consensus 261 i~~G~SGGPl~--~~G~vVGI~~~~~~~~~~~~~aip~ 296 (590)
...|+-|+|++ .||.+|||++..... ...+|+.|+
T Consensus 148 Tk~G~CG~PlVs~~Dg~IVGiHsl~~~~-~~~N~F~~f 184 (235)
T PF00863_consen 148 TKDGDCGLPLVSTKDGKIVGIHSLTSNT-SSRNYFTPF 184 (235)
T ss_dssp --TT-TT-EEEETTT--EEEEEEEEETT-TSSEEEEE-
T ss_pred CCCCccCCcEEEcCCCcEEEEEcCccCC-CCeEEEEcC
Confidence 67899999999 899999999987643 345566654
No 45
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.19 E-value=4.5e-06 Score=87.51 Aligned_cols=71 Identities=23% Similarity=0.268 Sum_probs=55.8
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEe
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLR 419 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 419 (590)
.+++|..|.++|||+++ |++||+|++|||++|.+|..- ++..++ ....|+++.++|.|+|+...+++++.
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~--------~~~~~l-~~~~g~~v~l~v~R~g~~~~~~v~l~ 132 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLD--------DAVALI-RGKKGTKVSLEILRAGKSKPLTFTLK 132 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHH--------HHHHhc-cCCCCCEEEEEEEeCCCCceEEEEEE
Confidence 57899999999999999 999999999999999988411 111233 23468899999999987776666665
Q ss_pred c
Q 007765 420 L 420 (590)
Q Consensus 420 ~ 420 (590)
.
T Consensus 133 ~ 133 (334)
T TIGR00225 133 R 133 (334)
T ss_pred E
Confidence 4
No 46
>PF12812 PDZ_1: PDZ-like domain
Probab=98.13 E-value=1e-05 Score=65.51 Aligned_cols=73 Identities=18% Similarity=0.195 Sum_probs=60.6
Q ss_pred CcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCee
Q 007765 434 PSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVE 513 (590)
Q Consensus 434 p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~ 513 (590)
.+++.++|..|++|+.+..+.++.. -.+++ ...-.++++..+++..|.+|.+|||++
T Consensus 5 ~r~v~~~Ga~f~~Ls~q~aR~~~~~----------------------~~gv~-v~~~~g~~~~~~~i~~g~iI~~Vn~kp 61 (78)
T PF12812_consen 5 SRFVEVCGAVFHDLSYQQARQYGIP----------------------VGGVY-VAVSGGSLAFAGGISKGFIITSVNGKP 61 (78)
T ss_pred CEEEEEcCeecccCCHHHHHHhCCC----------------------CCEEE-EEecCCChhhhCCCCCCeEEEeECCcC
Confidence 3578899999999999998887775 22444 455677787777788999999999999
Q ss_pred cCCHHHHHHHHHhcCC
Q 007765 514 IENLKHLCQLVENCSS 529 (590)
Q Consensus 514 v~~~~~l~~~v~~~~~ 529 (590)
+.|+++|.+++++.++
T Consensus 62 t~~Ld~f~~vvk~ipd 77 (78)
T PF12812_consen 62 TPDLDDFIKVVKKIPD 77 (78)
T ss_pred CcCHHHHHHHHHhCCC
Confidence 9999999999998765
No 47
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.12 E-value=4.3e-06 Score=88.43 Aligned_cols=61 Identities=25% Similarity=0.397 Sum_probs=50.1
Q ss_pred EEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEE-eCCEEEEEEEEEe
Q 007765 345 VNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVL-RDGKEHEFSITLR 419 (590)
Q Consensus 345 V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~-R~g~~~~~~v~l~ 419 (590)
|..|.|+|+|+++ |++||+|++|||++|.+|.++++ .+ .++.+.++|. |+|+..++++...
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~----------~l----~~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQF----------LC----ADEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHH----------Hh----cCCcEEEEEEcCCCeEEEEEEecC
Confidence 5679999999999 99999999999999999987642 22 2467899997 7898877776654
No 48
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates [, ]. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences []. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 microns. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.10 E-value=8.3e-06 Score=66.80 Aligned_cols=72 Identities=22% Similarity=0.271 Sum_probs=52.4
Q ss_pred ccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhh
Q 007765 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVS 393 (590)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~ 393 (590)
...+||.+....... ..+++|..|.++++|+++ |++||+|++|||+++.++.... ...++.
T Consensus 9 ~~~lG~~l~~~~~~~----------~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~--------~~~~l~ 70 (81)
T PF00595_consen 9 NGPLGFTLRGGSDND----------EKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDE--------VVQLLK 70 (81)
T ss_dssp TSBSSEEEEEESTSS----------SEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHH--------HHHHHH
T ss_pred CCCcCEEEEecCCCC----------cCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHH--------HHHHHH
Confidence 345888887652100 258999999999999999 9999999999999999986532 223333
Q ss_pred ccCCCCEEEEEEE
Q 007765 394 MKKPNEKSLVRVL 406 (590)
Q Consensus 394 ~~~~g~~v~l~V~ 406 (590)
. .++.++|+|+
T Consensus 71 ~--~~~~v~L~V~ 81 (81)
T PF00595_consen 71 S--ASNPVTLTVQ 81 (81)
T ss_dssp H--STSEEEEEEE
T ss_pred C--CCCcEEEEEC
Confidence 2 3447888774
No 49
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.09 E-value=2e-05 Score=84.28 Aligned_cols=69 Identities=17% Similarity=0.202 Sum_probs=54.3
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEE
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITL 418 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l 418 (590)
.|++|..|.++|||+++ |++||+|++|||++|.++... ++..++ ....|+.+.|+|.|+|+..++++.-
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~--------~~~~~l-~g~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLY--------EAADRL-QGPEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHH--------HHHHHH-hcCCCCEEEEEEEECCEEEEEEEEe
Confidence 38999999999999999 999999999999999876321 122334 2346889999999999877666543
No 50
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.00 E-value=1.4e-05 Score=65.13 Aligned_cols=61 Identities=16% Similarity=0.141 Sum_probs=47.8
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC-eEEEE
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD-RVVVL 544 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~-~~i~l 544 (590)
.+++|..|.++++++.+++++||+|++|||+++.+|.++.+.+ ..++.+.+++.|+ ....+
T Consensus 12 ~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r~g~~~~~ 73 (80)
T cd00990 12 GLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFRDDRLIEV 73 (80)
T ss_pred CcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEECCEEEEE
Confidence 4578999999999999999999999999999999976654333 3566788888764 44333
No 51
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.98 E-value=0.00028 Score=75.57 Aligned_cols=80 Identities=30% Similarity=0.446 Sum_probs=59.4
Q ss_pred ccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHh-
Q 007765 315 FCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLV- 392 (590)
Q Consensus 315 ~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~- 392 (590)
+..+|++++.-. ..++.|.++.+++||+++ +++||+|++|||+++....- ++.+
T Consensus 99 ~~GiG~~i~~~~-------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~-----------~~av~ 154 (406)
T COG0793 99 FGGIGIELQMED-------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSL-----------DEAVK 154 (406)
T ss_pred ccceeEEEEEec-------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCH-----------HHHHH
Confidence 666888877541 267889999999999999 99999999999999987641 1222
Q ss_pred -hccCCCCEEEEEEEeCCEEEEEEEEE
Q 007765 393 -SMKKPNEKSLVRVLRDGKEHEFSITL 418 (590)
Q Consensus 393 -~~~~~g~~v~l~V~R~g~~~~~~v~l 418 (590)
..-..|..++|++.|.+....+.+++
T Consensus 155 ~irG~~Gt~V~L~i~r~~~~k~~~v~l 181 (406)
T COG0793 155 LIRGKPGTKVTLTILRAGGGKPFTVTL 181 (406)
T ss_pred HhCCCCCCeEEEEEEEcCCCceeEEEE
Confidence 23457899999999974333333333
No 52
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=97.97 E-value=1.6e-05 Score=86.17 Aligned_cols=159 Identities=18% Similarity=0.165 Sum_probs=86.3
Q ss_pred EEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEE---------
Q 007765 345 VNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHE--------- 413 (590)
Q Consensus 345 V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~--------- 413 (590)
|..|.++|||++- |+.||.|++|||+.|-+..+- ++..++. ..|-+|+|+|.-..+.-.
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHa--------div~LIK--daGlsVtLtIip~ee~~~~~~~~sa~~ 851 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHA--------DIVSLIK--DAGLSVTLTIIPPEEAGPPTSMTSAEK 851 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCch--------hHHHHHH--hcCceEEEEEcChhccCCCCCCcchhh
Confidence 6788999999987 999999999999999988764 2223433 468899999975322110
Q ss_pred -EEEEEec-C--C-CCCCC--ccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcC-----CccCCc
Q 007765 414 -FSITLRL-L--Q-PLVPV--HQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALREL-----PKKAGE 481 (590)
Q Consensus 414 -~~v~l~~-~--~-~l~~~--~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~-----~~~~~~ 481 (590)
-.++... . + .+... ..+.+.|....+-|+..-....+ ..++|.- -+++.-.++ ..++..
T Consensus 852 ~s~~t~~~~~~q~~glp~~~~s~~~~~pqpdt~~~~~~~~r~~q-----n~~~~~V----elErG~kGFGFSiRGGreyn 922 (984)
T KOG3209|consen 852 QSPFTQNGPYEQQYGLPGPRPSVYEEHPQPDTFQGLSINDRMSQ-----NGDLYTV----ELERGAKGFGFSIRGGREYN 922 (984)
T ss_pred cCcccccCCHhHccCCCCCCccccccCCCCccccceeccccccc-----cCCeeEE----EeeccccccceEeecccccc
Confidence 0000000 0 0 01000 00112222211222221111100 0000000 000000000 012333
Q ss_pred ceEEEEEEeecccccccc-ccCCceEEeeCCeecCCHHHHHH
Q 007765 482 QLVILSQVLMDDINAGYE-RFADLQVKKVNGVEIENLKHLCQ 522 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~~~~l~~ 522 (590)
-.++|-.+..|+||..-| ++.||.|++|||++.+++.|-.+
T Consensus 923 M~LfVLRlAeDGPA~rdGrm~VGDqi~eINGesTkgmtH~rA 964 (984)
T KOG3209|consen 923 MDLFVLRLAEDGPAIRDGRMRVGDQITEINGESTKGMTHDRA 964 (984)
T ss_pred cceEEEEeccCCCccccCceeecceEEEecCcccCCCcHHHH
Confidence 467888888999988875 67899999999999999988543
No 53
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.93 E-value=2.8e-05 Score=81.83 Aligned_cols=68 Identities=31% Similarity=0.337 Sum_probs=54.8
Q ss_pred cCceEEEEe---C-----CCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCE
Q 007765 340 VTGVLVNKI---N-----PLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGK 410 (590)
Q Consensus 340 ~~gv~V~~V---~-----~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~ 410 (590)
..||+|.+. . .+|||+++ ||+||+|++|||++|.+|.++. +++... .++.+.++|.|+|+
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~----------~iL~~~-~g~~V~LtV~R~Ge 172 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLA----------NLINKA-GGEKLTLTIERGGK 172 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHH----------HHHHhC-CCCeEEEEEEECCE
Confidence 368888543 2 25899999 9999999999999999998764 555443 48899999999999
Q ss_pred EEEEEEEE
Q 007765 411 EHEFSITL 418 (590)
Q Consensus 411 ~~~~~v~l 418 (590)
..++.+..
T Consensus 173 ~~tv~V~P 180 (402)
T TIGR02860 173 IIETVIKP 180 (402)
T ss_pred EEEEEEEE
Confidence 88887763
No 54
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.91 E-value=2.5e-05 Score=63.78 Aligned_cols=49 Identities=27% Similarity=0.424 Sum_probs=39.7
Q ss_pred cccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCC
Q 007765 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND 375 (590)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~ 375 (590)
..+|+.++...+ ...|++|..|.++|||+++ |++||+|++|||+++.++
T Consensus 12 ~~~G~~~~~~~~-----------~~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~ 61 (82)
T cd00992 12 GGLGFSLRGGKD-----------SGGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGL 61 (82)
T ss_pred CCcCEEEeCccc-----------CCCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCcc
Confidence 447777765411 0258999999999999999 999999999999999943
No 55
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.89 E-value=3.4e-05 Score=63.01 Aligned_cols=54 Identities=17% Similarity=0.195 Sum_probs=47.1
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecC--CHHHHHHHHHhcCCCceEEEe
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIE--NLKHLCQLVENCSSENLRFDL 536 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~--~~~~l~~~v~~~~~~~v~l~~ 536 (590)
.+++|..|.+++++..+++++||+|++|||+++. +++++.+.++...+ .+++.+
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 4689999999999999999999999999999999 99999999997554 555543
No 56
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.83 E-value=0.00015 Score=60.02 Aligned_cols=64 Identities=25% Similarity=0.367 Sum_probs=41.8
Q ss_pred CceEEEEeCCC--------Chhhhc---cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCC
Q 007765 341 TGVLVNKINPL--------SDAHEI---LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDG 409 (590)
Q Consensus 341 ~gv~V~~V~~~--------s~A~~a---L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g 409 (590)
.+..|..|.++ ||..+. +++||.|++|||+++....++. .++ ....|+.+.|+|.+.+
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~----------~lL-~~~agk~V~Ltv~~~~ 80 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPY----------RLL-EGKAGKQVLLTVNRKP 80 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HH----------HHH-HTTTTSEEEEEEE-ST
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHH----------HHh-cccCCCEEEEEEecCC
Confidence 67778888775 677765 6799999999999999876542 455 4468999999999955
Q ss_pred -EEEEEE
Q 007765 410 -KEHEFS 415 (590)
Q Consensus 410 -~~~~~~ 415 (590)
+.+++.
T Consensus 81 ~~~R~v~ 87 (88)
T PF14685_consen 81 GGARTVV 87 (88)
T ss_dssp T-EEEEE
T ss_pred CCceEEE
Confidence 455443
No 57
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.83 E-value=3e-05 Score=63.49 Aligned_cols=55 Identities=16% Similarity=0.247 Sum_probs=45.9
Q ss_pred cceEEEEEEeeccccccccccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEe
Q 007765 481 EQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDL 536 (590)
Q Consensus 481 ~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~ 536 (590)
..+++++.|.++++++..+++.||+|++|||+++.++ .+..++++.+.+ .++|++
T Consensus 24 ~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~-~v~L~V 80 (81)
T PF00595_consen 24 EKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN-PVTLTV 80 (81)
T ss_dssp SEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-EEEEEE
T ss_pred cCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-cEEEEE
Confidence 3688999999999999999999999999999999977 566677777655 666654
No 58
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=97.83 E-value=4.1e-05 Score=76.89 Aligned_cols=67 Identities=10% Similarity=0.011 Sum_probs=57.9
Q ss_pred CcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCC-CceEEEeCCC-eEEEEEe
Q 007765 480 GEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSS-ENLRFDLDDD-RVVVLNY 546 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~-~~v~l~~~r~-~~i~l~~ 546 (590)
..+|+.+..+.++++++.+|+++||+|++|||+++.+++++.+++.+.+. +.+++++.|+ +.+.+.+
T Consensus 189 ~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v 257 (259)
T TIGR01713 189 KLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYV 257 (259)
T ss_pred ceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEE
Confidence 34789999999999999999999999999999999999999999998644 5799999884 5555543
No 59
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.79 E-value=0.00021 Score=76.25 Aligned_cols=75 Identities=27% Similarity=0.394 Sum_probs=52.2
Q ss_pred hhcCCCCCcCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCC
Q 007765 332 NNFGMRSEVTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDG 409 (590)
Q Consensus 332 ~~lgl~~~~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g 409 (590)
+.|||.- ..-++|..+...+.|++- ||.||+||+|||.-..+.. +. +...++. +..| ++.+.|+||.
T Consensus 211 EEyGlrL-gSqIFvKeit~~gLAardgnlqEGDiiLkINGtvteNmS-Lt-------Dar~LIE-kS~G-KL~lvVlRD~ 279 (1027)
T KOG3580|consen 211 EEYGLRL-GSQIFVKEITRTGLAARDGNLQEGDIILKINGTVTENMS-LT-------DARKLIE-KSRG-KLQLVVLRDS 279 (1027)
T ss_pred hhhcccc-cchhhhhhhcccchhhccCCcccccEEEEECcEeecccc-ch-------hHHHHHH-hccC-ceEEEEEecC
Confidence 4456643 245677888888888876 9999999999999888763 22 2334553 3334 6799999987
Q ss_pred EEEEEEEE
Q 007765 410 KEHEFSIT 417 (590)
Q Consensus 410 ~~~~~~v~ 417 (590)
...-+.|+
T Consensus 280 ~qtLiNiP 287 (1027)
T KOG3580|consen 280 QQTLINIP 287 (1027)
T ss_pred CceeeecC
Confidence 66555554
No 60
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.72 E-value=6.7e-05 Score=75.10 Aligned_cols=67 Identities=25% Similarity=0.419 Sum_probs=54.0
Q ss_pred CceEE-EEeCCCCh---hhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEE
Q 007765 341 TGVLV-NKINPLSD---AHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFS 415 (590)
Q Consensus 341 ~gv~V-~~V~~~s~---A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~ 415 (590)
.| ++ ..+.|+.. ..++ ||+||++++|||.++.+.++.. +++........++|+|+|||+..++.
T Consensus 204 ~G-l~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~----------~l~~~L~~~tei~ltVeRdGq~~~i~ 272 (276)
T PRK09681 204 EG-IVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMI----------ALMRQLPSMDSIQLTVLRKGARHDIS 272 (276)
T ss_pred CC-ceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHH----------HHHHHhccCCeEEEEEEECCEEEEEE
Confidence 56 55 45778743 3567 9999999999999999876532 67777778889999999999999988
Q ss_pred EEE
Q 007765 416 ITL 418 (590)
Q Consensus 416 v~l 418 (590)
+.+
T Consensus 273 i~l 275 (276)
T PRK09681 273 IAL 275 (276)
T ss_pred EEc
Confidence 765
No 61
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.66 E-value=0.00011 Score=73.48 Aligned_cols=72 Identities=28% Similarity=0.327 Sum_probs=63.1
Q ss_pred cCceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEe-CCEEEEEEEEE
Q 007765 340 VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLR-DGKEHEFSITL 418 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R-~g~~~~~~v~l 418 (590)
-.||++..|..++++...|+.||.|++|||+++.+.+++. +.+....+|+++++++.| +++....++++
T Consensus 129 y~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i----------~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl 198 (342)
T COG3480 129 YAGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELI----------DYVSSKKPGDEVTIDYERHNETPEIVTITL 198 (342)
T ss_pred EeeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHH----------HHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence 3799999999999998889999999999999999988643 778888999999999997 88888777777
Q ss_pred ecC
Q 007765 419 RLL 421 (590)
Q Consensus 419 ~~~ 421 (590)
...
T Consensus 199 ~~~ 201 (342)
T COG3480 199 IKN 201 (342)
T ss_pred Eee
Confidence 654
No 62
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.64 E-value=0.00047 Score=67.41 Aligned_cols=116 Identities=20% Similarity=0.256 Sum_probs=62.6
Q ss_pred CceEEEEEEe-C--CeEEEcCcccCCCcEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCCCCC
Q 007765 139 ETTGSGFVIP-G--KKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQ 215 (590)
Q Consensus 139 ~~~GsGfiI~-~--g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~~~G 215 (590)
.+.|||-+.. + -.|||+.||+. .+..+|.. .+.... ..++..-|+|.-.++. +....|.+++++. ..|
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~~~---~tF~~~GDfA~~~~~~--~~G~~P~~k~a~~-~~G 181 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTRRM---LTFKKNGDFAEADITN--WPGAAPKYKFAQN-YTG 181 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEEEE---EEEEEETTEEEEEETT--S-S---B--B-TT--SE
T ss_pred ecccccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceEEE---EEEeccCcEEEEECCC--CCCCCCceeecCC-ccc
Confidence 3567777766 3 38999999998 55556665 333332 3355678999988833 2236777777622 122
Q ss_pred CeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEE-eCCEEEEEEeeecC
Q 007765 216 QAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAI-MGNKVAGVAFQNLS 285 (590)
Q Consensus 216 ~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~ 285 (590)
..-.. + .--+..|.|..-. .+. -..+||||+|++ .+|.+|||++....
T Consensus 182 rAyW~---t----~tGvE~G~ig~~~------------~~~---fT~~GDSGSPVVt~dg~liGVHTGSn~ 230 (297)
T PF05579_consen 182 RAYWL---T----STGVEPGFIGGGG------------AVC---FTGPGDSGSPVVTEDGDLIGVHTGSNK 230 (297)
T ss_dssp EEEEE---E----TTEEEEEEEETTE------------EEE---SS-GGCTT-EEEETTC-EEEEEEEEET
T ss_pred ceEEE---c----ccCcccceecCce------------EEE---EcCCCCCCCccCcCCCCEEEEEecCCC
Confidence 11000 0 1133445544211 122 235799999999 99999999998643
No 63
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.59 E-value=0.0002 Score=76.40 Aligned_cols=61 Identities=21% Similarity=0.390 Sum_probs=48.1
Q ss_pred cCceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCE
Q 007765 340 VTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGK 410 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~ 410 (590)
.+.++|..|.||+||+..||.||.|+.|||...++..+. | .+-.....|+...|+|.|..+
T Consensus 39 etSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~ha---------F-AvQqLrksgK~A~ItvkRprk 99 (1027)
T KOG3580|consen 39 ETSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLHA---------F-AVQQLRKSGKVAAITVKRPRK 99 (1027)
T ss_pred ceeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHHH---------H-HHHHHHhhccceeEEecccce
Confidence 567899999999999988999999999999998876542 1 222334568888999988544
No 64
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.57 E-value=0.00023 Score=67.03 Aligned_cols=74 Identities=27% Similarity=0.264 Sum_probs=60.3
Q ss_pred ceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEec
Q 007765 342 GVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLRL 420 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~ 420 (590)
-++|..|.|+|||+.+ |+.||.|+++....-.++..+. .+ ..+.....+..+.++|.|.|+...+.++...
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq-------~i-~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~ 211 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQ-------NI-AAVVQSNEDQIVSVTVIREGQKVVLSLTPKK 211 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHH-------HH-HHHHHhccCcceeEEEecCCCEEEEEeCccc
Confidence 4578999999999999 9999999999988777765432 12 2333456788999999999999999999998
Q ss_pred CCC
Q 007765 421 LQP 423 (590)
Q Consensus 421 ~~~ 423 (590)
|+.
T Consensus 212 W~G 214 (231)
T KOG3129|consen 212 WQG 214 (231)
T ss_pred ccC
Confidence 875
No 65
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.50 E-value=0.002 Score=60.83 Aligned_cols=138 Identities=20% Similarity=0.301 Sum_probs=81.6
Q ss_pred CCceEEEEEEeCCeEEEcCcccCCCcEEEEEEcCCCcEEEE--EEEEecC---CCCeEEEEecCC-cccCcceeEEcCCC
Q 007765 138 RETTGSGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRA--QVEAVGH---ECDLAILIVESD-EFWEGMHFLELGDI 211 (590)
Q Consensus 138 ~~~~GsGfiI~~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a--~vv~~d~---~~DlAlLkv~~~-~~~~~~~~~~l~~~ 211 (590)
....++|+.|.+.++|.+.| -.....+ .+ ++..++. .+...+. ..||++++++.. .|.+-.+.+. ...
T Consensus 23 g~~t~l~~gi~~~~~lvp~H-~~~~~~i--~i--~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~ 96 (172)
T PF00548_consen 23 GEFTMLALGIYDRYFLVPTH-EEPEDTI--YI--DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESI 96 (172)
T ss_dssp EEEEEEEEEEEBTEEEEEGG-GGGCSEE--EE--TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSG
T ss_pred ceEEEecceEeeeEEEEECc-CCCcEEE--EE--CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-ccc
Confidence 55778899999999999999 2223333 33 4455443 2223443 469999999875 3433233333 111
Q ss_pred CCCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEEe----CCEEEEEEeee
Q 007765 212 PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIM----GNKVAGVAFQN 283 (590)
Q Consensus 212 ~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~~----~G~vVGI~~~~ 283 (590)
....+...++-.+.. ....+..+.+...... ...+......+.++++..+|+-||||+. .++++||+.++
T Consensus 97 ~~~~~~~l~v~~~~~-~~~~~~v~~v~~~~~i-~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 97 PEYPECVLLVNSTKF-PRMIVEVGFVTNFGFI-NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp GTEEEEEEEEESSSS-TCEEEEEEEEEEEEEE-EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred ccCCCcEEEEECCCC-ccEEEEEEEEeecCcc-ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 233444444433322 2223344444433322 2233344467888999999999999996 78999999875
No 66
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.48 E-value=0.00029 Score=74.32 Aligned_cols=56 Identities=16% Similarity=0.302 Sum_probs=48.4
Q ss_pred ccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC-eEEEEEec
Q 007765 492 DDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD-RVVVLNYD 547 (590)
Q Consensus 492 ~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~-~~i~l~~~ 547 (590)
++||+.+++++||+|++|||++|.+|++|.+++++++++.+.+++.|+ +.+.+...
T Consensus 123 ~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~tv~V~ 179 (402)
T TIGR02860 123 HSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKIIETVIK 179 (402)
T ss_pred CCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEEEEEEE
Confidence 467888899999999999999999999999999999999999999885 55555444
No 67
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.46 E-value=0.002 Score=65.64 Aligned_cols=48 Identities=25% Similarity=0.367 Sum_probs=32.2
Q ss_pred cccCCCCCceEE-eC--C-EEEEEEeeecCCCCc---eeEEeehHHHHHHHHHHH
Q 007765 260 AINPGNSGGPAI-MG--N-KVAGVAFQNLSGAEN---IGYIIPVPVIKHFITGVV 307 (590)
Q Consensus 260 ~i~~G~SGGPl~-~~--G-~vVGI~~~~~~~~~~---~~~aip~~~i~~~l~~l~ 307 (590)
..|+|+||||+| .. | .-+||++++.....+ .+..--++....++++..
T Consensus 224 daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~ 278 (413)
T COG5640 224 DACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMT 278 (413)
T ss_pred ccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHh
Confidence 468999999999 33 4 479999998762221 233344567777777654
No 68
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.39 E-value=0.00028 Score=57.71 Aligned_cols=57 Identities=19% Similarity=0.121 Sum_probs=44.6
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHh-cCCCceEEEeCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVEN-CSSENLRFDLDD 538 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~-~~~~~v~l~~~r 538 (590)
.++++..|.++++++..++++||+|++|||+++.++.+....... ..++.+++.+.|
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r 83 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLR 83 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEe
Confidence 578999999999999999999999999999999987664444332 223467776655
No 69
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.37 E-value=0.00031 Score=73.68 Aligned_cols=58 Identities=7% Similarity=0.124 Sum_probs=51.6
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEeCCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDLDDD 539 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~~r~ 539 (590)
..++|..|.+++||+.+|+++||+|++|||++|.+| .++.+.+....++.+.+++.|+
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~ 121 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRA 121 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeC
Confidence 467889999999999999999999999999999987 5788888877888999998874
No 70
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.36 E-value=0.00035 Score=74.80 Aligned_cols=58 Identities=9% Similarity=0.043 Sum_probs=51.2
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCC--HHHHHHHHHhcCCCceEEEeCCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIEN--LKHLCQLVENCSSENLRFDLDDD 539 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~--~~~l~~~v~~~~~~~v~l~~~r~ 539 (590)
++++|..|.+++||+.+|+++||+|++|||++|.+ +.++.+.++...+..+.+++.|+
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~ 161 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRG 161 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEEC
Confidence 46789999999999999999999999999999985 47888888877888899998874
No 71
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.29 E-value=0.00025 Score=75.29 Aligned_cols=61 Identities=18% Similarity=0.173 Sum_probs=49.8
Q ss_pred EEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeC-C-CeEEEEEechh
Q 007765 486 LSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLD-D-DRVVVLNYDVA 549 (590)
Q Consensus 486 l~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~-r-~~~i~l~~~~~ 549 (590)
|..|.|+|+|+.+|+++||+|++|||++|.+|.++...+. ++.+.+++. | |+...+.+...
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcCCCeEEEEEEecC
Confidence 4568999999999999999999999999999999888774 467888884 4 56556655543
No 72
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.27 E-value=0.015 Score=59.31 Aligned_cols=106 Identities=21% Similarity=0.259 Sum_probs=65.1
Q ss_pred CCCCeEEEEecCCcccCcceeEEcCCCC---CCCCeEEEEecCCCCCCceEEEeEEecccccccccCcceeeEEEEcccc
Q 007765 185 HECDLAILIVESDEFWEGMHFLELGDIP---FLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAI 261 (590)
Q Consensus 185 ~~~DlAlLkv~~~~~~~~~~~~~l~~~~---~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i 261 (590)
...++.||+++.+ +.....|+-|+++. ..++.+.+.|+.... .+....+.-..... ....+......
T Consensus 159 ~~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~~---~~~~~~~~i~~~~~------~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNSTG---KLKHRKLKITNCTK------CAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCCC---eEEEEEEEEEEeec------cceeEeccccc
Confidence 3579999999987 44578888898765 247889999982221 22222222111100 11235556677
Q ss_pred cCCCCCceEE--eCCE--EEEEEeeecCC-CCceeEEeehHHHH
Q 007765 262 NPGNSGGPAI--MGNK--VAGVAFQNLSG-AENIGYIIPVPVIK 300 (590)
Q Consensus 262 ~~G~SGGPl~--~~G~--vVGI~~~~~~~-~~~~~~aip~~~i~ 300 (590)
+.|++|||++ .+|+ ||||.+..... ..+..+++.+...+
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~ 272 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQ 272 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhh
Confidence 8999999999 5664 99998765432 11245566555543
No 73
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.17 E-value=0.00072 Score=72.20 Aligned_cols=84 Identities=25% Similarity=0.323 Sum_probs=62.5
Q ss_pred cccceeeeccHhhhhhcCCCC--CcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhc
Q 007765 318 LGLSCQTTENVQLRNNFGMRS--EVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSM 394 (590)
Q Consensus 318 lGi~~~~~~~~~~~~~lgl~~--~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~ 394 (590)
.|+.+.+. +...-+||+.- +.++.+|..|.++|||.+| |.+||.|++|||. + ..+.+
T Consensus 439 ~gL~~~~~--~~~~~~LGl~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~---s---------------~~l~~ 498 (558)
T COG3975 439 FGLTFTPK--PREAYYLGLKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI---S---------------DQLDR 498 (558)
T ss_pred cceEEEec--CCCCcccceEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc---c---------------ccccc
Confidence 45555443 22244666633 2355788999999999999 9999999999999 2 12345
Q ss_pred cCCCCEEEEEEEeCCEEEEEEEEEecC
Q 007765 395 KKPNEKSLVRVLRDGKEHEFSITLRLL 421 (590)
Q Consensus 395 ~~~g~~v~l~V~R~g~~~~~~v~l~~~ 421 (590)
...++.+++++.|.|..+++.+++...
T Consensus 499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~ 525 (558)
T COG3975 499 YKVNDKIQVHVFREGRLREFLVKLGGD 525 (558)
T ss_pred cccccceEEEEccCCceEEeecccCCC
Confidence 678999999999999999887776543
No 74
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=97.15 E-value=0.0013 Score=59.58 Aligned_cols=87 Identities=26% Similarity=0.324 Sum_probs=54.8
Q ss_pred cccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCC-CCEEEEECCEEecCCCcccccccccchHHHH
Q 007765 314 GFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKK-DDIILAFDGVPIANDGTVAFRNRERITFDHL 391 (590)
Q Consensus 314 ~~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~-GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~ 391 (590)
+.+-||++++.-. .. +- ...+..|.+|.|+|||++| |++ .|.|+.+|+..+.+.++ |.++
T Consensus 24 ~~g~LG~sv~~~~-~~-----~~--~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~----------l~~~ 85 (138)
T PF04495_consen 24 GQGLLGISVRFES-FE-----GA--EEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDD----------LFEL 85 (138)
T ss_dssp SSSSS-EEEEEEE--T-----TG--CCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCH----------HHHH
T ss_pred CCCCCcEEEEEec-cc-----cc--ccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHH----------HHHH
Confidence 4567999887652 11 11 1367889999999999999 998 69999999988887654 3466
Q ss_pred hhccCCCCEEEEEEEeC--CEEEEEEEEEe
Q 007765 392 VSMKKPNEKSLVRVLRD--GKEHEFSITLR 419 (590)
Q Consensus 392 ~~~~~~g~~v~l~V~R~--g~~~~~~v~l~ 419 (590)
+. ...+..+.|.|... ...+++.+...
T Consensus 86 v~-~~~~~~l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 86 VE-ANENKPLQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp HH-HTTTS-EEEEEEETTTTCEEEEEE---
T ss_pred HH-HcCCCcEEEEEEECCCCeEEEEEEEcC
Confidence 64 46788999999873 34455555544
No 75
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.11 E-value=0.0019 Score=73.04 Aligned_cols=71 Identities=18% Similarity=0.170 Sum_probs=47.6
Q ss_pred CceEEEEeCCCChhhhc--cCCCCEEEEEC--CEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeC---CEEEE
Q 007765 341 TGVLVNKINPLSDAHEI--LKKDDIILAFD--GVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRD---GKEHE 413 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a--L~~GD~Il~Vn--G~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~---g~~~~ 413 (590)
.+++|..|.|||||+++ |++||+|++|| |+++.+...... -+...++. -..|.+|.|+|.|+ ++..+
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~-----~~vv~lir-G~~Gt~V~LtV~r~~~~~~~~~ 328 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRL-----DDVVALIK-GPKGSKVRLEILPAGKGTKTRI 328 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCH-----HHHHHHhc-CCCCCEEEEEEEeCCCCCceEE
Confidence 56889999999999986 99999999999 555433211110 01223443 35789999999983 44555
Q ss_pred EEEE
Q 007765 414 FSIT 417 (590)
Q Consensus 414 ~~v~ 417 (590)
+++.
T Consensus 329 vtl~ 332 (667)
T PRK11186 329 VTLT 332 (667)
T ss_pred EEEE
Confidence 5544
No 76
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.90 E-value=0.0034 Score=52.04 Aligned_cols=58 Identities=10% Similarity=0.142 Sum_probs=40.9
Q ss_pred ceEEEEEEeeccc----------cccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCCC
Q 007765 482 QLVILSQVLMDDI----------NAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDDD 539 (590)
Q Consensus 482 ~~vvl~~V~~~s~----------a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r~ 539 (590)
....|..++++++ ..|..++.||.|++|||+++..-.++.+++....++.+.|++.++
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~ 79 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRK 79 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-S
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecC
Confidence 5567888887643 334456799999999999999999999999999999999998763
No 77
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=96.77 E-value=0.0041 Score=65.61 Aligned_cols=69 Identities=17% Similarity=0.111 Sum_probs=59.3
Q ss_pred CcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcC-CCceEEEeCCC-eEEEEEech
Q 007765 480 GEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCS-SENLRFDLDDD-RVVVLNYDV 548 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~-~~~v~l~~~r~-~~i~l~~~~ 548 (590)
...|+++..|.+++|++.++++.||+|+++||+++.+..++.+.+..+. ++.+.+++.|+ +...+....
T Consensus 268 ~~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l 338 (347)
T COG0265 268 VAAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTL 338 (347)
T ss_pred CCCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Confidence 4567899999999999999999999999999999999999999998866 77899988875 555555543
No 78
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=96.64 E-value=0.011 Score=61.74 Aligned_cols=156 Identities=19% Similarity=0.196 Sum_probs=103.3
Q ss_pred cCceEEEEeCCCChhhhc-cCC-CCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCC--EEEEEE
Q 007765 340 VTGVLVNKINPLSDAHEI-LKK-DDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDG--KEHEFS 415 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a-L~~-GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g--~~~~~~ 415 (590)
..|.-|-+|..+|+|.++ |.+ -|-|++|||..+....+. +..++.... ++++++|.-.. ..+.++
T Consensus 14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~---------Lk~llk~~s--ekVkltv~n~kt~~~R~v~ 82 (462)
T KOG3834|consen 14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDT---------LKALLKANS--EKVKLTVYNSKTQEVRIVE 82 (462)
T ss_pred ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHH---------HHHHHHhcc--cceEEEEEecccceeEEEE
Confidence 356778899999999999 776 689999999999977652 445554332 33899997532 222333
Q ss_pred EEEec-CCCCCCCccCCCCCcceeeccEE--EeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeec
Q 007765 416 ITLRL-LQPLVPVHQFDKLPSYYIFAGLV--FIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMD 492 (590)
Q Consensus 416 v~l~~-~~~l~~~~~~~~~p~~~~~~Gl~--~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~ 492 (590)
|+... |.. . ++|+. |..... ..+...-+-.|.++
T Consensus 83 I~ps~~wgg-----------q---llGvsvrFcsf~~-----------------------------A~~~vwHvl~V~p~ 119 (462)
T KOG3834|consen 83 IVPSNNWGG-----------Q---LLGVSVRFCSFDG-----------------------------AVESVWHVLSVEPN 119 (462)
T ss_pred ecccccccc-----------c---ccceEEEeccCcc-----------------------------chhheeeeeecCCC
Confidence 33221 110 0 23443 222111 12223346678999
Q ss_pred ccccccccc-CCceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCC---C--eEEEEEechh
Q 007765 493 DINAGYERF-ADLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDD---D--RVVVLNYDVA 549 (590)
Q Consensus 493 s~a~g~~~~-~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r---~--~~i~l~~~~~ 549 (590)
+|++.+++. .+|-|+-+-.......+||..+|+.|.++.+.+-+-+ + +.+++.+...
T Consensus 120 SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyVYN~D~d~~ReVti~pn~a 182 (462)
T KOG3834|consen 120 SPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYVYNHDTDSCREVTITPNSA 182 (462)
T ss_pred CHHHhcccccccceEecchhhhccchHHHHHHHHhccCCCcceeEeecCCCccceEEeecccc
Confidence 999999998 6899999966666789999999999999998886532 3 5556655443
No 79
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=96.61 E-value=0.0022 Score=61.67 Aligned_cols=66 Identities=20% Similarity=0.288 Sum_probs=52.3
Q ss_pred ceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEE
Q 007765 342 GVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSIT 417 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 417 (590)
|..+.-..+++..+.. ||.||+.++||+..+.+.+++. .++.....-+.+.+||+|+|+..++.+.
T Consensus 208 Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~----------~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 208 GYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMF----------RLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHH----------HHHHhhhcCcceEEEEEecCccceeeec
Confidence 4444444556677777 9999999999999999887643 6777777778899999999999988764
No 80
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=96.58 E-value=0.0045 Score=55.98 Aligned_cols=69 Identities=17% Similarity=0.102 Sum_probs=51.1
Q ss_pred CcceEEEEEEeeccccccccccC-CceEEeeCCeecCCHHHHHHHHHhcCCCceEEEeCC---C--eEEEEEech
Q 007765 480 GEQLVILSQVLMDDINAGYERFA-DLQVKKVNGVEIENLKHLCQLVENCSSENLRFDLDD---D--RVVVLNYDV 548 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~-gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l~~~r---~--~~i~l~~~~ 548 (590)
.+.+.-|..|.|+|||+.+|+.+ .|.|+.+|+...++.++|.++++++.++.+.|.+-+ + +.+.+.+..
T Consensus 41 ~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~~ 115 (138)
T PF04495_consen 41 EEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVYNSKTDSVREVTITPSR 115 (138)
T ss_dssp CCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEEETTTTCEEEEEE---T
T ss_pred ccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEEECCCCeEEEEEEEcCC
Confidence 45677899999999999999998 699999999999999999999999999999998733 3 445555543
No 81
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.56 E-value=0.0028 Score=52.46 Aligned_cols=34 Identities=26% Similarity=0.354 Sum_probs=31.5
Q ss_pred cCceEEEEeCCCChhhhc-cCCCCEEEEECCEEec
Q 007765 340 VTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIA 373 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~ 373 (590)
..|++|..|..+|||+.| |+.+|.|+.+||....
T Consensus 58 D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfT 92 (124)
T KOG3553|consen 58 DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFT 92 (124)
T ss_pred CccEEEEEeccCChhhhhcceecceEEEecCceeE
Confidence 489999999999999999 9999999999997654
No 82
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=96.48 E-value=0.0036 Score=67.17 Aligned_cols=57 Identities=9% Similarity=0.056 Sum_probs=52.0
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEeCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~~r 538 (590)
..+.+..+++++||..+++++||+|++|||+++... ++.++.++..+|..++|++.|
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r 170 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILR 170 (406)
T ss_pred CCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEE
Confidence 567888899999999999999999999999999887 578889999999999999988
No 83
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=96.39 E-value=0.0057 Score=66.51 Aligned_cols=123 Identities=15% Similarity=0.159 Sum_probs=77.1
Q ss_pred EeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCCEEEEEEEEEecCCCC
Q 007765 347 KINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDGKEHEFSITLRLLQPL 424 (590)
Q Consensus 347 ~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~~l 424 (590)
...+++||++. |-.||.|++|||...-..--- ..+..+.....-..|+++|.+=--..++.|. +
T Consensus 679 nmm~~GpAarsgkLnIGDQiiaING~SLVGLPLs--------tcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~--R---- 744 (829)
T KOG3605|consen 679 NMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLS--------TCQSIIKGLKNQTAVKLNIVSCPPVTTVLIR--R---- 744 (829)
T ss_pred hcccCChhhhcCCccccceeEeecCceeccccHH--------HHHHHHhcccccceEEEEEecCCCceEEEee--c----
Confidence 34457899987 999999999999877643210 1234555555555688888664333333221 1
Q ss_pred CCCccCCCCCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCc
Q 007765 425 VPVHQFDKLPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADL 504 (590)
Q Consensus 425 ~~~~~~~~~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd 504 (590)
|+...-+||.++ .| ||-.++-++.|..-|.+.|-
T Consensus 745 ---------Pd~kyQLGFSVQ------------------------------------NG-iICSLlRGGIAERGGVRVGH 778 (829)
T KOG3605|consen 745 ---------PDLRYQLGFSVQ------------------------------------NG-IICSLLRGGIAERGGVRVGH 778 (829)
T ss_pred ---------ccchhhccceee------------------------------------Cc-EeehhhcccchhccCceeee
Confidence 111223566533 22 44457788999999999999
Q ss_pred eEEeeCCeecCCHHH--HHHHHHhcCC
Q 007765 505 QVKKVNGVEIENLKH--LCQLVENCSS 529 (590)
Q Consensus 505 ~I~~VNG~~v~~~~~--l~~~v~~~~~ 529 (590)
+|++|||+.|--..| .+++|.+.-|
T Consensus 779 RIIEINgQSVVA~pHekIV~lLs~aVG 805 (829)
T KOG3605|consen 779 RIIEINGQSVVATPHEKIVQLLSNAVG 805 (829)
T ss_pred eEEEECCceEEeccHHHHHHHHHHhhh
Confidence 999999999854443 4455554433
No 84
>PRK11186 carboxy-terminal protease; Provisional
Probab=95.88 E-value=0.014 Score=66.06 Aligned_cols=57 Identities=12% Similarity=0.219 Sum_probs=48.3
Q ss_pred ceEEEEEEeeccccccc-cccCCceEEeeC--CeecCC-----HHHHHHHHHhcCCCceEEEeCC
Q 007765 482 QLVILSQVLMDDINAGY-ERFADLQVKKVN--GVEIEN-----LKHLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~-~~~~gd~I~~VN--G~~v~~-----~~~l~~~v~~~~~~~v~l~~~r 538 (590)
..++|..|++++||..+ ++++||+|++|| |+++.+ +++.+++++..+|..|+|++.|
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r 319 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILP 319 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEe
Confidence 35788899999999997 999999999999 555433 4588999999999999999976
No 85
>PF12812 PDZ_1: PDZ-like domain
Probab=95.45 E-value=0.024 Score=45.92 Aligned_cols=59 Identities=19% Similarity=0.180 Sum_probs=48.7
Q ss_pred cccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcc
Q 007765 316 CSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTV 378 (590)
Q Consensus 316 ~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v 378 (590)
-|.|..++++ +-+.++.++++- .| ++.....++++.+. +..|-+|++|||+++.+.+++
T Consensus 9 ~~~Ga~f~~L-s~q~aR~~~~~~--~g-v~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f 68 (78)
T PF12812_consen 9 EVCGAVFHDL-SYQQARQYGIPV--GG-VYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDF 68 (78)
T ss_pred EEcCeecccC-CHHHHHHhCCCC--CE-EEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHH
Confidence 3688999988 778888899874 34 44466889999888 999999999999999988753
No 86
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=95.36 E-value=0.031 Score=56.38 Aligned_cols=58 Identities=16% Similarity=0.126 Sum_probs=50.1
Q ss_pred CcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHh-cCCCceEEEeCC
Q 007765 480 GEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVEN-CSSENLRFDLDD 538 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~-~~~~~v~l~~~r 538 (590)
...||++..|..++++.|- +..||.|++|||+++.+.++|.+++++ ..|+.++|++.|
T Consensus 128 ~y~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r 186 (342)
T COG3480 128 TYAGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER 186 (342)
T ss_pred EEeeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence 4468888889888887664 679999999999999999999999988 567789999986
No 87
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.19 E-value=0.016 Score=55.86 Aligned_cols=135 Identities=22% Similarity=0.262 Sum_probs=47.4
Q ss_pred CCeEEEcCcccCCCcEEEEEEcCCCcEEEE---EEEEecCCCCeEEEEecCCcccC--cceeEEcCCCCCCC-CeEEEEe
Q 007765 149 GKKILTNAHVVADSTFVLVRKHGSPTKYRA---QVEAVGHECDLAILIVESDEFWE--GMHFLELGDIPFLQ-QAVAVVG 222 (590)
Q Consensus 149 ~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a---~vv~~d~~~DlAlLkv~~~~~~~--~~~~~~l~~~~~~G-~~V~~iG 222 (590)
...++|+.||..+...+.... +|.+++- +.+..+...|++||++... ++. .++.+.+....++. ..+.+.+
T Consensus 41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P~n-~~s~Lg~k~~~~~~~~~~~~g~~~~y~ 117 (203)
T PF02122_consen 41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGPPN-WESKLGVKAAQLSQNSQLAKGPVSFYG 117 (203)
T ss_dssp -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE--HH-HHHHHT-----B----SEEEEESSTTS
T ss_pred ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecCcC-HHHHhCcccccccchhhhCCCCeeeee
Confidence 459999999999866555444 4454443 3455678999999999832 211 34455553332210 0000001
Q ss_pred cCCCCCCceEEEeEEecccccccccCcceeeEEEEcccccCCCCCceEEeCCEEEEEEeeecC--CCCceeEEeehHH
Q 007765 223 YPQGGDNISVTKGVVSRVEPTQYVHGATQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQNLS--GAENIGYIIPVPV 298 (590)
Q Consensus 223 ~p~g~~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~~~G~vVGI~~~~~~--~~~~~~~aip~~~ 298 (590)
+..+. .......|... .. .+...-....+|.||.|++...+++|++.+... ..++.++..|+.-
T Consensus 118 ~~~~~--~~~~sa~i~g~--------~~--~~~~vls~T~~G~SGtp~y~g~~vvGvH~G~~~~~~~~n~n~~spip~ 183 (203)
T PF02122_consen 118 FSSGE--WPCSSAKIPGT--------EG--KFASVLSNTSPGWSGTPYYSGKNVVGVHTGSPSGSNRENNNRMSPIPP 183 (203)
T ss_dssp EEEEE--EEEEE-S------------ST--TEEEE-----TT-TT-EEE-SS-EEEEEEEE-----------------
T ss_pred ecCCC--ceeccCccccc--------cC--cCCceEcCCCCCCCCCCeEECCCceEeecCcccccccccccccccccc
Confidence 00000 01111111110 11 134444567899999999955599999998522 3455666655443
No 88
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=95.08 E-value=0.044 Score=55.15 Aligned_cols=58 Identities=7% Similarity=-0.048 Sum_probs=44.6
Q ss_pred Eeeccc---cccccccCCceEEeeCCeecCCHHHHHHHHHhcCCC-ceEEEeCCC-eEEEEEe
Q 007765 489 VLMDDI---NAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSE-NLRFDLDDD-RVVVLNY 546 (590)
Q Consensus 489 V~~~s~---a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~-~v~l~~~r~-~~i~l~~ 546 (590)
+.|+.. =..+|+++||++++|||.++.+.++..+++++..+. .++|+++|+ +...+.+
T Consensus 211 l~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i 273 (276)
T PRK09681 211 VKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISI 273 (276)
T ss_pred ECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEE
Confidence 445533 234578899999999999999999999999986655 699999984 5555443
No 89
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=94.75 E-value=0.073 Score=50.53 Aligned_cols=68 Identities=15% Similarity=0.181 Sum_probs=54.4
Q ss_pred eEEEEEEeeccccccccccCCceEEeeCCeecCC---HHHHHHHHHhcCCCceEEEeCC-CeEEEEEechhh
Q 007765 483 LVILSQVLMDDINAGYERFADLQVKKVNGVEIEN---LKHLCQLVENCSSENLRFDLDD-DRVVVLNYDVAK 550 (590)
Q Consensus 483 ~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~---~~~l~~~v~~~~~~~v~l~~~r-~~~i~l~~~~~~ 550 (590)
-++|+.|.++|||+.+|++.||.|+++....--| ++.....++.+.++.+.+++.| ++.+.|...+..
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~ 211 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLTPKK 211 (231)
T ss_pred eEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeCccc
Confidence 3589999999999999999999999887655444 5566677888999999999988 566777666543
No 90
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=94.62 E-value=0.032 Score=53.52 Aligned_cols=42 Identities=26% Similarity=0.366 Sum_probs=35.8
Q ss_pred cccccCCCCCceEEeCCEEEEEEeeecCCCCceeEEeehHHH
Q 007765 258 DAAINPGNSGGPAIMGNKVAGVAFQNLSGAENIGYIIPVPVI 299 (590)
Q Consensus 258 ~~~i~~G~SGGPl~~~G~vVGI~~~~~~~~~~~~~aip~~~i 299 (590)
...+-.|+||+|++.+|++||-++..+.+....+|.++++..
T Consensus 174 TGGIvqGMSGSPI~qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 174 TGGIVQGMSGSPIIQDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred hCCEEecccCCCEEECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 345678999999999999999999888777888999987654
No 91
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=94.45 E-value=0.29 Score=54.27 Aligned_cols=117 Identities=20% Similarity=0.231 Sum_probs=69.7
Q ss_pred CCCCeEEEEecCCc-----ccC------cceeEEcCCC--------CCCCCeEEEEecCCCCCCceEEEeEEeccccccc
Q 007765 185 HECDLAILIVESDE-----FWE------GMHFLELGDI--------PFLQQAVAVVGYPQGGDNISVTKGVVSRVEPTQY 245 (590)
Q Consensus 185 ~~~DlAlLkv~~~~-----~~~------~~~~~~l~~~--------~~~G~~V~~iG~p~g~~~~~v~~G~Vs~~~~~~~ 245 (590)
.-.|+|||+++... +.+ .-|.+.+.+. ...|.+|+=+|..-+ .|.|.|.++.-...
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg-----yT~G~lNg~klvyw 615 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG-----YTTGILNGIKLVYW 615 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC-----ccceEecceEEEEe
Confidence 34699999998542 111 2223333321 123889999987544 35566665532222
Q ss_pred ccCcce-eeEEEEc----ccccCCCCCceEE-eCC------EEEEEEeeecCCCCceeEEeehHHHHHHHHHH
Q 007765 246 VHGATQ-LMAIQID----AAINPGNSGGPAI-MGN------KVAGVAFQNLSGAENIGYIIPVPVIKHFITGV 306 (590)
Q Consensus 246 ~~~~~~-~~~i~~~----~~i~~G~SGGPl~-~~G------~vVGI~~~~~~~~~~~~~aip~~~i~~~l~~l 306 (590)
..+... ..++... .-...||||+=|+ .-+ .|+||..+.......+|.+.|+..|.+-|+++
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence 222222 2233333 1224799999998 533 49999988655556789999998877766654
No 92
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=94.41 E-value=0.0083 Score=49.74 Aligned_cols=40 Identities=13% Similarity=0.086 Sum_probs=35.4
Q ss_pred CcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHH
Q 007765 480 GEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKH 519 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~ 519 (590)
.+.+++++.|..++||+-+|++.+|+|+.|||-...-..|
T Consensus 57 tD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTH 96 (124)
T KOG3553|consen 57 TDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTH 96 (124)
T ss_pred CCccEEEEEeccCChhhhhcceecceEEEecCceeEEEEh
Confidence 5688999999999999999999999999999987755544
No 93
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.39 E-value=0.048 Score=61.74 Aligned_cols=57 Identities=23% Similarity=0.230 Sum_probs=42.6
Q ss_pred CceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEe
Q 007765 341 TGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLR 407 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 407 (590)
.-|+|..|.+|+|+...|++||+|++|||++|.+.- |-...+++.. ..+.|.|+|.+
T Consensus 75 rPviVr~VT~GGps~GKL~PGDQIl~vN~Epv~dap--------rervIdlvRa--ce~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAP--------RERVIDLVRA--CESSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCccccccCCCeEEEecCccccccc--------HHHHHHHHHH--HhhhcceEEec
Confidence 457889999999999889999999999999998642 1112244433 34567888876
No 94
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.14 E-value=0.078 Score=60.46 Aligned_cols=21 Identities=29% Similarity=0.265 Sum_probs=16.1
Q ss_pred eEEEEEEe-CCeEEEcCcccCC
Q 007765 141 TGSGFVIP-GKKILTNAHVVAD 161 (590)
Q Consensus 141 ~GsGfiI~-~g~IlT~aHvv~~ 161 (590)
-|||.+|+ +|+|+||.||+.+
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~~ 69 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGYG 69 (698)
T ss_pred ceeEEEEcCCceEEecchhhhh
Confidence 47888888 7888888888764
No 95
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment, which induces dorsoventral polarity of the Drosophila embryo [].
Probab=93.97 E-value=0.38 Score=47.09 Aligned_cols=98 Identities=17% Similarity=0.227 Sum_probs=69.6
Q ss_pred CCCCCccCCC--CCCceEEEEEEeCCeEEEcCcccCCC----cEEEEEEcCCCcEEEE------EEEEec-----CCCCe
Q 007765 127 NYGLPWQNKS--QRETTGSGFVIPGKKILTNAHVVADS----TFVLVRKHGSPTKYRA------QVEAVG-----HECDL 189 (590)
Q Consensus 127 ~~~~p~~~~~--~~~~~GsGfiI~~g~IlT~aHvv~~~----~~i~V~~~~~~~~~~a------~vv~~d-----~~~Dl 189 (590)
+|..||...- .+...++|++|+..|||++-.|+.+- ..+.+.+. .++.+.- ++..+| +..++
T Consensus 13 ~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~YvsallG-~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v 91 (267)
T PF09342_consen 13 DYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALLG-GGKTYLSVDGPHEQISRVDCFKDVPESNV 91 (267)
T ss_pred cccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEec-CcceecccCCChheEEEeeeeeeccccce
Confidence 5667776543 45678999999999999999999873 35666663 4443321 232333 68899
Q ss_pred EEEEecCC-cccCcceeEEcCCCC---CCCCeEEEEecCC
Q 007765 190 AILIVESD-EFWEGMHFLELGDIP---FLQQAVAVVGYPQ 225 (590)
Q Consensus 190 AlLkv~~~-~~~~~~~~~~l~~~~---~~G~~V~~iG~p~ 225 (590)
+||.++.+ .|...+.|+-+.+.. ...+.++++|...
T Consensus 92 ~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 92 LLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 99999987 577788888776522 2256899999876
No 96
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=93.91 E-value=0.093 Score=57.63 Aligned_cols=57 Identities=11% Similarity=0.096 Sum_probs=50.5
Q ss_pred cCCcceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCceEE
Q 007765 478 KAGEQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSENLRF 534 (590)
Q Consensus 478 ~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~v~l 534 (590)
+.+.+.|-|-.|+++++|.++.+++||++++|||+||++.++..+.++...+....+
T Consensus 394 ~~~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l 450 (1051)
T KOG3532|consen 394 KNTNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVL 450 (1051)
T ss_pred cCCceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEE
Confidence 456788889999999999999999999999999999999999999999877764444
No 97
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=93.09 E-value=0.085 Score=56.85 Aligned_cols=91 Identities=15% Similarity=0.107 Sum_probs=59.5
Q ss_pred CCcceeeccEEEeeCCHHHHHHhCCCccCCChhhhHHHHHhcCCccCCcceEEEEEEeeccccccccccCCceEEeeCCe
Q 007765 433 LPSYYIFAGLVFIPLTQPYLHEYGEDWYNTSPRRLCERALRELPKKAGEQLVILSQVLMDDINAGYERFADLQVKKVNGV 512 (590)
Q Consensus 433 ~p~~~~~~Gl~~~~l~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~ 512 (590)
.++++...||++.+..++ .--+| +.-+......+|+.|.+++||.++|+.+||.|++|||.
T Consensus 432 l~~~l~~~gL~~~~~~~~-~~~LG------------------l~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~ 492 (558)
T COG3975 432 LNPLLERFGLTFTPKPRE-AYYLG------------------LKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI 492 (558)
T ss_pred hhhhhhhcceEEEecCCC-Ccccc------------------eEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc
Confidence 345566688888887654 00111 11222334568999999999999999999999999999
Q ss_pred ecCCHHHHHHHHHhcCCCceEEEeCC-CeEEEEEechh
Q 007765 513 EIENLKHLCQLVENCSSENLRFDLDD-DRVVVLNYDVA 549 (590)
Q Consensus 513 ~v~~~~~l~~~v~~~~~~~v~l~~~r-~~~i~l~~~~~ 549 (590)
. .+.-+...+..|.+.+.+ |..+.+.++..
T Consensus 493 s-------~~l~~~~~~d~i~v~~~~~~~L~e~~v~~~ 523 (558)
T COG3975 493 S-------DQLDRYKVNDKIQVHVFREGRLREFLVKLG 523 (558)
T ss_pred c-------ccccccccccceEEEEccCCceEEeecccC
Confidence 1 112223455677777755 66666655543
No 98
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=92.85 E-value=0.13 Score=56.50 Aligned_cols=38 Identities=26% Similarity=0.398 Sum_probs=33.8
Q ss_pred CceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcc
Q 007765 341 TGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTV 378 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v 378 (590)
.-|.|..|.++++|.++ |++||++++|||.||.+..++
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~ 436 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQA 436 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHH
Confidence 44667899999999999 999999999999999987754
No 99
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=92.80 E-value=0.27 Score=44.06 Aligned_cols=51 Identities=18% Similarity=0.242 Sum_probs=39.9
Q ss_pred CCcceEEEEEEeeccccccc-cccCCceEEeeCCeecCCHHH--HHHHHHhcCC
Q 007765 479 AGEQLVILSQVLMDDINAGY-ERFADLQVKKVNGVEIENLKH--LCQLVENCSS 529 (590)
Q Consensus 479 ~~~~~vvl~~V~~~s~a~g~-~~~~gd~I~~VNG~~v~~~~~--l~~~v~~~~~ 529 (590)
+....++|+.+.|++.++.- |++.||.+++|||+.|.--.| -++++++..+
T Consensus 112 eqnspiyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g 165 (207)
T KOG3550|consen 112 EQNSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG 165 (207)
T ss_pred ccCCceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC
Confidence 34456899999999998876 689999999999999976554 4556666544
No 100
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=92.26 E-value=0.2 Score=57.07 Aligned_cols=48 Identities=25% Similarity=0.327 Sum_probs=42.4
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeCCeecCC--HHHHHHHHHhcCCC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVNGVEIEN--LKHLCQLVENCSSE 530 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~--~~~l~~~v~~~~~~ 530 (590)
+.|+|..|.+++|..| ++++||+|+.|||++|++ |++.++++++++..
T Consensus 75 rPviVr~VT~GGps~G-KL~PGDQIl~vN~Epv~daprervIdlvRace~s 124 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIG-KLQPGDQILAVNGEPVKDAPRERVIDLVRACESS 124 (1298)
T ss_pred CceEEEEecCCCCccc-cccCCCeEEEecCcccccccHHHHHHHHHHHhhh
Confidence 5689999999999887 488999999999999965 78999999998765
No 101
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=91.76 E-value=0.34 Score=43.47 Aligned_cols=37 Identities=24% Similarity=0.439 Sum_probs=32.8
Q ss_pred cCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCC
Q 007765 340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDG 376 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~ 376 (590)
.+-++|..|.|++.|++- |+-||.+++|||..|....
T Consensus 114 nspiyisriipggvadrhgglkrgdqllsvngvsvege~ 152 (207)
T KOG3550|consen 114 NSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEH 152 (207)
T ss_pred CCceEEEeecCCccccccCcccccceeEeecceeecchh
Confidence 356899999999999876 9999999999999998654
No 102
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=90.16 E-value=0.27 Score=54.12 Aligned_cols=42 Identities=31% Similarity=0.388 Sum_probs=35.3
Q ss_pred CCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCC
Q 007765 335 GMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDG 376 (590)
Q Consensus 335 gl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~ 376 (590)
|=.+..-|++|.+|.|++.|+.. |+-||.|+.|||+...+..
T Consensus 556 GGsEkGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis 598 (1283)
T KOG3542|consen 556 GGSEKGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENIS 598 (1283)
T ss_pred cCccccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhh
Confidence 33334568999999999999999 9999999999999877653
No 103
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=89.64 E-value=0.49 Score=49.17 Aligned_cols=50 Identities=8% Similarity=0.026 Sum_probs=44.6
Q ss_pred CcceEEEEEEeeccccccc-cccCCceEEeeCCeecCCHHHHHHHHHhcCC
Q 007765 480 GEQLVILSQVLMDDINAGY-ERFADLQVKKVNGVEIENLKHLCQLVENCSS 529 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~-~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~ 529 (590)
..++|.+.+|...||..|+ |+..||+|.++||-+|.+.+|+.+.++.+.+
T Consensus 218 ~g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~tsl~ 268 (484)
T KOG2921|consen 218 HGEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLATSLD 268 (484)
T ss_pred cCceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHhhcc
Confidence 3478899999999998887 7899999999999999999999999988544
No 104
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=89.60 E-value=1.2 Score=39.11 Aligned_cols=34 Identities=21% Similarity=0.290 Sum_probs=26.5
Q ss_pred ceeeEEEEcccccCCCCCceEEeCCEEEEEEeee
Q 007765 250 TQLMAIQIDAAINPGNSGGPAIMGNKVAGVAFQN 283 (590)
Q Consensus 250 ~~~~~i~~~~~i~~G~SGGPl~~~G~vVGI~~~~ 283 (590)
....++....+..||+.||+|+.+--||||++++
T Consensus 76 ~Q~~~l~g~Gp~~PGdCGg~L~C~HGViGi~Tag 109 (127)
T PF00947_consen 76 YQYNLLIGEGPAEPGDCGGILRCKHGVIGIVTAG 109 (127)
T ss_dssp EEECEEEEE-SSSTT-TCSEEEETTCEEEEEEEE
T ss_pred eecCceeecccCCCCCCCceeEeCCCeEEEEEeC
Confidence 3345666778889999999999888899999985
No 105
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=88.86 E-value=0.69 Score=40.89 Aligned_cols=29 Identities=21% Similarity=0.434 Sum_probs=23.8
Q ss_pred cccccCCCCCceEE-eCCEEEEEEeeecCC
Q 007765 258 DAAINPGNSGGPAI-MGNKVAGVAFQNLSG 286 (590)
Q Consensus 258 ~~~i~~G~SGGPl~-~~G~vVGI~~~~~~~ 286 (590)
...-.+||||-|++ ..|+||||+..+...
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence 44557999999999 999999999987553
No 106
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=88.62 E-value=1 Score=47.84 Aligned_cols=58 Identities=31% Similarity=0.453 Sum_probs=44.5
Q ss_pred EEEeCCCChhhhc-cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCE---EEEEEEe-CCEEEE
Q 007765 345 VNKINPLSDAHEI-LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEK---SLVRVLR-DGKEHE 413 (590)
Q Consensus 345 V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~---v~l~V~R-~g~~~~ 413 (590)
+.++..+|+++.+ +++||.|+++|++++.+|.++. ..+ ....+.. +.+.+.| +++...
T Consensus 133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~----------~~~-~~~~~~~~~~~~i~~~~~~~~~~~ 195 (375)
T COG0750 133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVR----------RLL-VAAAGDVFNLLTILVIRLDGEAHA 195 (375)
T ss_pred eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHH----------HHH-HhccCCcccceEEEEEeccceeee
Confidence 3478999999999 9999999999999999998653 223 2233444 7899999 776643
No 107
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=88.35 E-value=0.72 Score=49.40 Aligned_cols=37 Identities=22% Similarity=0.229 Sum_probs=31.3
Q ss_pred cCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCC
Q 007765 340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDG 376 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~ 376 (590)
..|++|..|.+++..+.- +.+||.||.||....+++.
T Consensus 276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmS 314 (626)
T KOG3571|consen 276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMS 314 (626)
T ss_pred CCceEEeeeccCceeeccCccCccceEEEeeecchhhcC
Confidence 479999999998766544 9999999999998887764
No 108
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans, and these indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=87.61 E-value=0.95 Score=40.50 Aligned_cols=30 Identities=23% Similarity=0.438 Sum_probs=21.2
Q ss_pred EEcccccCCCCCceEE-eCCEEEEEEeeecC
Q 007765 256 QIDAAINPGNSGGPAI-MGNKVAGVAFQNLS 285 (590)
Q Consensus 256 ~~~~~i~~G~SGGPl~-~~G~vVGI~~~~~~ 285 (590)
..+....+|.||+|+| .+|++|||-.....
T Consensus 89 ~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 89 AIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE 119 (132)
T ss_dssp EE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred eeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence 3444567999999999 99999999876654
No 109
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=87.47 E-value=1.5 Score=44.92 Aligned_cols=55 Identities=13% Similarity=0.273 Sum_probs=42.7
Q ss_pred ceEEEEEEeecccccccc-ccCCceEEeeCCeecCCH--HHHHHHHHhcCCCceEEEeC
Q 007765 482 QLVILSQVLMDDINAGYE-RFADLQVKKVNGVEIENL--KHLCQLVENCSSENLRFDLD 537 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~~v~l~~~ 537 (590)
-.|+|+.+..+.+|+--| +..||-|++|||..|+.- +|.+.+++ |.|+.+++++.
T Consensus 80 ~PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLR-NAGdeVtlTV~ 137 (505)
T KOG3549|consen 80 LPVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILR-NAGDEVTLTVK 137 (505)
T ss_pred ccEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHH-hcCCEEEEEeH
Confidence 357899999988877665 679999999999999754 56777777 56776666653
No 110
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=87.43 E-value=0.68 Score=53.06 Aligned_cols=62 Identities=19% Similarity=0.268 Sum_probs=46.5
Q ss_pred CCcCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEeCC
Q 007765 338 SEVTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLRDG 409 (590)
Q Consensus 338 ~~~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g 409 (590)
++.-|++|..|.+|++|+.- |+.||.+++|||+..-...+-+ ..++ ....|..|.+.|...|
T Consensus 957 q~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQEr--------AA~l--mtrtg~vV~leVaKqg 1020 (1629)
T KOG1892|consen 957 QRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQER--------AARL--MTRTGNVVHLEVAKQG 1020 (1629)
T ss_pred ccccceEEEEeccCCccccccccccCceeeeecCcccccccHHH--------HHHH--HhccCCeEEEehhhhh
Confidence 34569999999999999765 9999999999999887665421 1122 3357888899886544
No 111
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=86.52 E-value=1.5 Score=43.27 Aligned_cols=50 Identities=16% Similarity=0.285 Sum_probs=42.7
Q ss_pred CcceEEEEEEeecccccccccc-CCceEEeeCCeec--CCHHHHHHHHHhcCC
Q 007765 480 GEQLVILSQVLMDDINAGYERF-ADLQVKKVNGVEI--ENLKHLCQLVENCSS 529 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~-~gd~I~~VNG~~v--~~~~~l~~~v~~~~~ 529 (590)
+..|++|+...|++-|+.-|+. .+|.|++|||.+| +++++..+++-+|.-
T Consensus 192 kvpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANsh 244 (358)
T KOG3606|consen 192 KVPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSH 244 (358)
T ss_pred ccCceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhccc
Confidence 4567899999999999988865 5999999999998 789999998877643
No 112
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=85.20 E-value=3 Score=44.92 Aligned_cols=58 Identities=12% Similarity=0.192 Sum_probs=43.1
Q ss_pred CCcceEEEEEEeeccccccc-cccCCceEEeeCCeecCCHH--HHHHHHHhc--CCCceEEEe
Q 007765 479 AGEQLVILSQVLMDDINAGY-ERFADLQVKKVNGVEIENLK--HLCQLVENC--SSENLRFDL 536 (590)
Q Consensus 479 ~~~~~vvl~~V~~~s~a~g~-~~~~gd~I~~VNG~~v~~~~--~l~~~v~~~--~~~~v~l~~ 536 (590)
.++.+++|..++++++-+.- .+.+||.|+.||.....++. +-+++|++. +..++++++
T Consensus 274 rgDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltv 336 (626)
T KOG3571|consen 274 RGDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTV 336 (626)
T ss_pred CCCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEE
Confidence 46788999999999873333 46799999999999887764 667777773 334566654
No 113
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans, and these indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=84.76 E-value=0.49 Score=54.13 Aligned_cols=43 Identities=23% Similarity=0.315 Sum_probs=26.0
Q ss_pred CCCeEEEEecCC------ccc---Ccce---eEEcCCC-CCCCCeEEEEecCCCCC
Q 007765 186 ECDLAILIVESD------EFW---EGMH---FLELGDI-PFLQQAVAVVGYPQGGD 228 (590)
Q Consensus 186 ~~DlAlLkv~~~------~~~---~~~~---~~~l~~~-~~~G~~V~~iG~p~g~~ 228 (590)
..|++++|+=.. .++ .+++ .+++... .+.|+-|+++|||....
T Consensus 199 tgDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~T~ 254 (698)
T PF10459_consen 199 TGDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGRTN 254 (698)
T ss_pred CCceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCccc
Confidence 459999999322 111 1222 2333333 25699999999997644
No 114
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=84.63 E-value=1.9 Score=42.12 Aligned_cols=46 Identities=7% Similarity=0.068 Sum_probs=38.5
Q ss_pred cccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCC-ceEEEeCC
Q 007765 493 DINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSE-NLRFDLDD 538 (590)
Q Consensus 493 s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~-~v~l~~~r 538 (590)
+.=...|++.||+.+++|+..+++.++..++++...+. .+.+++.|
T Consensus 218 slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R 264 (275)
T COG3031 218 SLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIR 264 (275)
T ss_pred chhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEe
Confidence 33445578899999999999999999999999996665 58888876
No 115
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=84.59 E-value=6.9 Score=33.47 Aligned_cols=55 Identities=15% Similarity=0.111 Sum_probs=37.5
Q ss_pred EEEEEeCCeEEEcCcccCCCcEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCC
Q 007765 143 SGFVIPGKKILTNAHVVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDI 211 (590)
Q Consensus 143 sGfiI~~g~IlT~aHvv~~~~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~ 211 (590)
=++-|.+|.++|+.||.+.++.+. +..+ +++. ..-|+|+++.+.. .++.+++++.
T Consensus 2 ~avHIGnG~~vt~tHva~~~~~v~------g~~f--~~~~--~~ge~~~v~~~~~----~~p~~~ig~g 56 (105)
T PF03510_consen 2 WAVHIGNGRYVTVTHVAKSSDSVD------GQPF--KIVK--TDGELCWVQSPLV----HLPAAQIGTG 56 (105)
T ss_pred ceEEeCCCEEEEEEEEeccCceEc------CcCc--EEEE--eccCEEEEECCCC----CCCeeEeccC
Confidence 356777999999999998776542 2222 2222 3569999998876 3677777654
No 116
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=84.08 E-value=0.99 Score=49.85 Aligned_cols=57 Identities=18% Similarity=0.200 Sum_probs=42.6
Q ss_pred cceEEEEEEeeccccccccccCCceEEeeCCeecCCHHHHH--HHHHhcCCCceEEEeCCC
Q 007765 481 EQLVILSQVLMDDINAGYERFADLQVKKVNGVEIENLKHLC--QLVENCSSENLRFDLDDD 539 (590)
Q Consensus 481 ~~~vvl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~--~~v~~~~~~~v~l~~~r~ 539 (590)
.-++++..|.|++.|+..|++.||.|++|||+..+++..-. ++++.+ -.+++.+..+
T Consensus 561 GfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrnn--thLtltvKtN 619 (1283)
T KOG3542|consen 561 GFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRNN--THLTLTVKTN 619 (1283)
T ss_pred cceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcCC--ceEEEEEecc
Confidence 34689999999999999999999999999999998876532 334332 2355555444
No 117
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified; these are grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans, and these indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, histidine as a general base, and aspartate orients and stabilises the histidine []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=83.60 E-value=4.8 Score=46.76 Aligned_cols=64 Identities=13% Similarity=0.018 Sum_probs=35.4
Q ss_pred EEEEEEeCCeEEEcCcccCCCcEEEEEEcC-CCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcC
Q 007765 142 GSGFVIPGKKILTNAHVVADSTFVLVRKHG-SPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELG 209 (590)
Q Consensus 142 GsGfiI~~g~IlT~aHvv~~~~~i~V~~~~-~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~ 209 (590)
|...+|++.||+|++|+..+... |.+.. +...|...-....+..|+.+-|++.-. -++.|++..
T Consensus 67 G~aTLigpqYiVSV~HN~~gy~~--v~FG~~g~~~Y~iV~RNn~~~~Df~~pRLnK~V--TEvaP~~~t 131 (769)
T PF02395_consen 67 GVATLIGPQYIVSVKHNGKGYNS--VSFGNEGQNTYKIVDRNNYPSGDFHMPRLNKFV--TEVAPAEMT 131 (769)
T ss_dssp SS-EEEETTEEEBETTG-TSCCE--ECESCSSTCEEEEEEEEBETTSTEBEEEESS-----SS----BB
T ss_pred ceEEEecCCeEEEEEccCCCcCc--eeecccCCceEEEEEccCCCCcccceeecCceE--EEEeccccc
Confidence 77899999999999999955444 33332 233444332233344699999998632 135555553
No 118
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=82.94 E-value=0.85 Score=47.27 Aligned_cols=54 Identities=13% Similarity=0.247 Sum_probs=39.4
Q ss_pred CcceEEEEEEeecccccccc-ccCCceEEeeCCeecCCHHH--HHHHHHhcCCCceEE
Q 007765 480 GEQLVILSQVLMDDINAGYE-RFADLQVKKVNGVEIENLKH--LCQLVENCSSENLRF 534 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~~~~--l~~~v~~~~~~~v~l 534 (590)
+.-.++|+.++++-+|+..+ ++-||.|++|||+...+..| -++.++. .|+.+.+
T Consensus 108 NkMPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr-aGkeV~l 164 (506)
T KOG3551|consen 108 NKMPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKR-AGKEVLL 164 (506)
T ss_pred cCCceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHh-hCceeee
Confidence 44567899999998887774 78899999999999987765 3444443 3444433
No 119
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=82.92 E-value=1.5 Score=48.41 Aligned_cols=105 Identities=13% Similarity=0.183 Sum_probs=70.7
Q ss_pred cCCCCCceEEeCC------EEEEEEeeecCCCCceeEEeehHHHHHHHHHHHHcCccc--cccccccceeeeccHhhhhh
Q 007765 262 NPGNSGGPAIMGN------KVAGVAFQNLSGAENIGYIIPVPVIKHFITGVVEHGKYV--GFCSLGLSCQTTENVQLRNN 333 (590)
Q Consensus 262 ~~G~SGGPl~~~G------~vVGI~~~~~~~~~~~~~aip~~~i~~~l~~l~~~g~~~--~~~~lGi~~~~~~~~~~~~~ 333 (590)
-.-++|||.-..| +++.|+-.. -..+|.+....++..+++.-.+. -.+.--+....+..|+++-.
T Consensus 678 Anmm~~GpAarsgkLnIGDQiiaING~S-------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~kyQ 750 (829)
T KOG3605|consen 678 ANMMHGGPAARSGKLNIGDQIMSINGTS-------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRYQ 750 (829)
T ss_pred HhcccCChhhhcCCccccceeEeecCce-------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchhh
Confidence 3457788876444 344444221 23479999999999887533322 01111223333447888889
Q ss_pred cCCCCCcCceEEEEeCCCChhhhc-cCCCCEEEEECCEEecCC
Q 007765 334 FGMRSEVTGVLVNKINPLSDAHEI-LKKDDIILAFDGVPIAND 375 (590)
Q Consensus 334 lgl~~~~~gv~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~ 375 (590)
||+.- +.|+ |....-|+.|++. ++.|-+|+.|||+.|--.
T Consensus 751 LGFSV-QNGi-ICSLlRGGIAERGGVRVGHRIIEINgQSVVA~ 791 (829)
T KOG3605|consen 751 LGFSV-QNGI-ICSLLRGGIAERGGVRVGHRIIEINGQSVVAT 791 (829)
T ss_pred cccee-eCcE-eehhhcccchhccCceeeeeEEEECCceEEec
Confidence 99986 6775 6678899999999 999999999999887543
No 120
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=82.87 E-value=1.8 Score=43.42 Aligned_cols=38 Identities=16% Similarity=0.271 Sum_probs=33.2
Q ss_pred CceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcc
Q 007765 341 TGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTV 378 (590)
Q Consensus 341 ~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v 378 (590)
--++|..|-.++||++- ++.||.|++|||..|.....+
T Consensus 30 PClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKv 69 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKV 69 (429)
T ss_pred CeEEEEEeccCCchhccCccccCCeeEEecceeecCccHH
Confidence 35788999999999876 999999999999999987654
No 121
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=82.47 E-value=4.9 Score=43.84 Aligned_cols=56 Identities=21% Similarity=0.312 Sum_probs=42.4
Q ss_pred ceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEEe
Q 007765 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVLR 407 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 407 (590)
-++|..|..|+.+++. |+.||.|+.|||..+.+.. +. .+..++.... ..+++.|.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~-~~-------e~q~~l~~~~--G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKS-PE-------ELQELLRNSR--GSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCC-HH-------HHHHHHHhCC--CcEEEEEcc
Confidence 5788999999999876 9999999999999998763 22 3445554443 467777764
No 122
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=82.06 E-value=8.5 Score=40.72 Aligned_cols=136 Identities=16% Similarity=0.163 Sum_probs=65.9
Q ss_pred ceEEEEEEeCCeEEEcCcccCCC-cEE-EEEEcCCCcEEEEEEEEecCCCCeEEEEecCCcccCcceeEEcCCCCCCCCe
Q 007765 140 TTGSGFVIPGKKILTNAHVVADS-TFV-LVRKHGSPTKYRAQVEAVGHECDLAILIVESDEFWEGMHFLELGDIPFLQQA 217 (590)
Q Consensus 140 ~~GsGfiI~~g~IlT~aHvv~~~-~~i-~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~~~l~~~~~~G~~ 217 (590)
++|=||.+++...+|+-||+... ..+ -| +..-+.++..-+++-++++.+- ..++.-+-|-.-.+-|.-
T Consensus 379 GsGWGfWVS~~lfITttHViP~g~~E~FGv---------~i~~i~vh~sGeF~~~rFpk~i-RPDvtgmiLEeGapEGtV 448 (535)
T PF05416_consen 379 GSGWGFWVSPTLFITTTHVIPPGAKEAFGV---------PISQIQVHKSGEFCRFRFPKPI-RPDVTGMILEEGAPEGTV 448 (535)
T ss_dssp TTEEEEESSSSEEEEEGGGS-STTSEETTE---------ECGGEEEEEETTEEEEEESS-S-STTS---EE-SS--TT-E
T ss_pred CCceeeeecceEEEEeeeecCCcchhhhCC---------ChhHeEEeeccceEEEecCCCC-CCCccceeeccCCCCceE
Confidence 67889999999999999999743 211 01 1111233445566677776652 124555555444444554
Q ss_pred EEEE-ecCCCC-CCceEEEeEEecccccccccCcceeeEEEE-------cccccCCCCCceEE-eCC---EEEEEEeeec
Q 007765 218 VAVV-GYPQGG-DNISVTKGVVSRVEPTQYVHGATQLMAIQI-------DAAINPGNSGGPAI-MGN---KVAGVAFQNL 284 (590)
Q Consensus 218 V~~i-G~p~g~-~~~~v~~G~Vs~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPl~-~~G---~vVGI~~~~~ 284 (590)
+.++ -.+.|. -.+.+..|......-.-..-+. ...++.+ |-...|||-|.|-+ ..| -|+||+.+..
T Consensus 449 ~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~G-Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAt 527 (535)
T PF05416_consen 449 CSILIKRPSGELLPLAVRMGTHASMKIQGRTVHG-QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAAT 527 (535)
T ss_dssp EEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEE-EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-
T ss_pred EEEEEEcCCccchhhhhhhccceeEEEcceeecc-eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhc
Confidence 4333 333331 1245566655433211100001 1123332 22446899999999 665 4999998865
Q ss_pred CC
Q 007765 285 SG 286 (590)
Q Consensus 285 ~~ 286 (590)
.+
T Consensus 528 r~ 529 (535)
T PF05416_consen 528 RS 529 (535)
T ss_dssp SS
T ss_pred cC
Confidence 43
No 123
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme.; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=80.70 E-value=2.2 Score=37.89 Aligned_cols=23 Identities=26% Similarity=0.466 Sum_probs=18.0
Q ss_pred cCCCCCceEE-eCCEEEEEEeeec
Q 007765 262 NPGNSGGPAI-MGNKVAGVAFQNL 284 (590)
Q Consensus 262 ~~G~SGGPl~-~~G~vVGI~~~~~ 284 (590)
-.|.||||++ .+|.+|||..+..
T Consensus 106 lkGSSGgPiLC~~GH~vG~f~aa~ 129 (148)
T PF02907_consen 106 LKGSSGGPILCPSGHAVGMFRAAV 129 (148)
T ss_dssp HTT-TT-EEEETTSEEEEEEEEEE
T ss_pred EecCCCCcccCCCCCEEEEEEEEE
Confidence 4799999999 9999999986644
No 124
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=80.61 E-value=3.4 Score=40.93 Aligned_cols=80 Identities=23% Similarity=0.391 Sum_probs=50.6
Q ss_pred eEEeehHHHHHHHHH--HHHcCccccccccccceeeeccHhhhhhcCCCCCcCceEEEEeCCCChhhhc--cCCCCEEEE
Q 007765 291 GYIIPVPVIKHFITG--VVEHGKYVGFCSLGLSCQTTENVQLRNNFGMRSEVTGVLVNKINPLSDAHEI--LKKDDIILA 366 (590)
Q Consensus 291 ~~aip~~~i~~~l~~--l~~~g~~~~~~~lGi~~~~~~~~~~~~~lgl~~~~~gv~V~~V~~~s~A~~a--L~~GD~Il~ 366 (590)
+-.|-++.+-+.=.+ |-++|.-. -||+.+.+-..- --.--||+. ..|+.|....||+.|+.. |-..|.|++
T Consensus 147 SsIIDVDivPEtHRRVRL~khG~ek---PLGFYIRDG~SV-RVtp~Glek-vpGIFISRlVpGGLAeSTGLLaVnDEVlE 221 (358)
T KOG3606|consen 147 SSIIDVDIVPETHRRVRLHKHGSEK---PLGFYIRDGTSV-RVTPHGLEK-VPGIFISRLVPGGLAESTGLLAVNDEVLE 221 (358)
T ss_pred ceeeeecccchhhhheehhhcCCCC---CceEEEecCceE-Eeccccccc-cCceEEEeecCCccccccceeeecceeEE
Confidence 344444444333333 22344422 266665443111 111246655 579999999999999987 889999999
Q ss_pred ECCEEecCC
Q 007765 367 FDGVPIAND 375 (590)
Q Consensus 367 VnG~~v~~~ 375 (590)
|||.+|...
T Consensus 222 VNGIEVaGK 230 (358)
T KOG3606|consen 222 VNGIEVAGK 230 (358)
T ss_pred EcCEEeccc
Confidence 999999865
No 125
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=80.18 E-value=2.7 Score=44.66 Aligned_cols=52 Identities=12% Similarity=0.077 Sum_probs=46.8
Q ss_pred EEEeeccccccccccCCceEEeeCCeecCCHHHHHHHHHhcCCCc---eEEEeCC
Q 007765 487 SQVLMDDINAGYERFADLQVKKVNGVEIENLKHLCQLVENCSSEN---LRFDLDD 538 (590)
Q Consensus 487 ~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~~l~~~v~~~~~~~---v~l~~~r 538 (590)
..+..+++++.++++.||.++++|++++.+|++..+.+..+.+.. +.+.+.|
T Consensus 134 ~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~ 188 (375)
T COG0750 134 GEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIR 188 (375)
T ss_pred eecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEe
Confidence 357788899999999999999999999999999999999888887 7888877
No 126
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=76.62 E-value=4.2 Score=40.91 Aligned_cols=49 Identities=14% Similarity=0.175 Sum_probs=40.7
Q ss_pred ceEEEEEEeecccccccc-ccCCceEEeeCCeecCCH--HHHHHHHHhcCCC
Q 007765 482 QLVILSQVLMDDINAGYE-RFADLQVKKVNGVEIENL--KHLCQLVENCSSE 530 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~~--~~l~~~v~~~~~~ 530 (590)
..++|.+|..++||+.-| ++.||.|+.|||..|+-- -+..++|+...++
T Consensus 30 PClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~e 81 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLNE 81 (429)
T ss_pred CeEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhccc
Confidence 467899999999988876 678999999999999654 4777888887665
No 127
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=76.28 E-value=1.8 Score=45.18 Aligned_cols=38 Identities=26% Similarity=0.415 Sum_probs=33.8
Q ss_pred cCceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCc
Q 007765 340 VTGVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGT 377 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~ 377 (590)
..|+.|.+|...||+..- |++||+|+++||-+|.+.++
T Consensus 219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~d 258 (484)
T KOG2921|consen 219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSD 258 (484)
T ss_pred CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHH
Confidence 478999999999999765 99999999999999988764
No 128
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=75.91 E-value=3.2 Score=42.56 Aligned_cols=55 Identities=18% Similarity=0.219 Sum_probs=41.3
Q ss_pred ceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEE
Q 007765 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVL 406 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~ 406 (590)
-++|..|..+-.|+.. |-.||-|+.|||..|..-.+- +...++ .+.|+.+++||.
T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~He--------evV~iL--RNAGdeVtlTV~ 137 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHE--------EVVNIL--RNAGDEVTLTVK 137 (505)
T ss_pred cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChH--------HHHHHH--HhcCCEEEEEeH
Confidence 4678888888888766 889999999999999875432 111333 357999999985
No 129
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=73.71 E-value=1.9 Score=44.83 Aligned_cols=36 Identities=25% Similarity=0.287 Sum_probs=30.9
Q ss_pred ceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCc
Q 007765 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGT 377 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~ 377 (590)
-++|.+|.++-.|++. |..||.|++|||....+..+
T Consensus 111 PIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtH 148 (506)
T KOG3551|consen 111 PILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATH 148 (506)
T ss_pred ceehhHhccccccccccceeeccEEEEecchhhhhcch
Confidence 3678899999888887 99999999999999887654
No 130
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=73.38 E-value=2.8 Score=49.41 Aligned_cols=35 Identities=26% Similarity=0.272 Sum_probs=31.4
Q ss_pred eEEEEeCCCChhhhc-cCCCCEEEEECCEEecCCCc
Q 007765 343 VLVNKINPLSDAHEI-LKKDDIILAFDGVPIANDGT 377 (590)
Q Consensus 343 v~V~~V~~~s~A~~a-L~~GD~Il~VnG~~v~~~~~ 377 (590)
.+|..|..+|||..+ +++||.|+.|||+++....+
T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H 695 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVH 695 (1205)
T ss_pred eeeeeecCCCCccccCCCccceeEeccCcccchhhH
Confidence 567899999999999 99999999999999987643
No 131
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=71.28 E-value=6.2 Score=43.03 Aligned_cols=55 Identities=16% Similarity=0.334 Sum_probs=45.5
Q ss_pred eEEEEEEeecccccccc-ccCCceEEeeCCeecCC--HHHHHHHHHhcCCCceEEEeCC
Q 007765 483 LVILSQVLMDDINAGYE-RFADLQVKKVNGVEIEN--LKHLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 483 ~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~--~~~l~~~v~~~~~~~v~l~~~r 538 (590)
-+++..++.++.+..-+ +..||.|.+|||..|.+ ..++.+++++.. +.++|.+..
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC-CcEEEEEcc
Confidence 36889999999877776 46899999999999965 689999999887 678887643
No 132
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=68.29 E-value=4.4 Score=40.13 Aligned_cols=56 Identities=18% Similarity=0.227 Sum_probs=45.1
Q ss_pred eEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCEEEEEEE
Q 007765 343 VLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEKSLVRVL 406 (590)
Q Consensus 343 v~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~ 406 (590)
+.|..+.++|...+. ++.||-|-+|||+.+-.|.++. ...++.....|++.++.+.
T Consensus 151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYe--------VArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYE--------VARMLKELPRGETFTLRLI 208 (334)
T ss_pred eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHH--------HHHHHHhcccCCeeEEEee
Confidence 578889999999877 9999999999999999988653 3456666677887777664
No 133
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=64.26 E-value=15 Score=36.52 Aligned_cols=52 Identities=12% Similarity=0.266 Sum_probs=34.7
Q ss_pred EEEEEeeccccccc-cccCCceEEeeCCeecCCHHHH--HHHHHhc-CCCceEEEe
Q 007765 485 ILSQVLMDDINAGY-ERFADLQVKKVNGVEIENLKHL--CQLVENC-SSENLRFDL 536 (590)
Q Consensus 485 vl~~V~~~s~a~g~-~~~~gd~I~~VNG~~v~~~~~l--~~~v~~~-~~~~v~l~~ 536 (590)
+|..+-++|.-... ....||.|.+|||+.|--+.|. .+++++. +++..++.+
T Consensus 152 FIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrL 207 (334)
T KOG3938|consen 152 FIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRL 207 (334)
T ss_pred eeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEe
Confidence 45556666663333 3468999999999999998865 4667773 344444443
No 134
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=61.52 E-value=32 Score=32.68 Aligned_cols=29 Identities=17% Similarity=0.109 Sum_probs=24.2
Q ss_pred ceEEEEEEeeccccccccccCCceEEeeC
Q 007765 482 QLVILSQVLMDDINAGYERFADLQVKKVN 510 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~~~~gd~I~~VN 510 (590)
..++|..|..+|++++.++.-++.|.+|-
T Consensus 122 ~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 34689999999999999988888887764
No 135
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=57.74 E-value=18 Score=38.51 Aligned_cols=58 Identities=14% Similarity=0.122 Sum_probs=47.7
Q ss_pred CcceEEEEEEeeccccccccccC-CceEEeeCCeecC-CHHHHHHHHHhcCCCceEEEeCC
Q 007765 480 GEQLVILSQVLMDDINAGYERFA-DLQVKKVNGVEIE-NLKHLCQLVENCSSENLRFDLDD 538 (590)
Q Consensus 480 ~~~~vvl~~V~~~s~a~g~~~~~-gd~I~~VNG~~v~-~~~~l~~~v~~~~~~~v~l~~~r 538 (590)
+.++.-+-+|..++++..+|+.+ -|.|++|||.-++ +-+.|.+.++.+-++ |++++-+
T Consensus 13 gteg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n 72 (462)
T KOG3834|consen 13 GTEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYN 72 (462)
T ss_pred CceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEe
Confidence 45666788999999999998876 6799999999996 556788888888777 8888755
No 136
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=45.73 E-value=29 Score=38.76 Aligned_cols=35 Identities=17% Similarity=0.087 Sum_probs=31.2
Q ss_pred ceEEEEeCCCChhhhc--cCCCCEEEEECCEEecCCC
Q 007765 342 GVLVNKINPLSDAHEI--LKKDDIILAFDGVPIANDG 376 (590)
Q Consensus 342 gv~V~~V~~~s~A~~a--L~~GD~Il~VnG~~v~~~~ 376 (590)
-++|.++.++|||..- |..||.|+.||+..+-.|+
T Consensus 226 ~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 226 PHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred ceeccccccCChHHHhhcccCccceeeecccccccch
Confidence 3567899999999876 9999999999999998995
No 137
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=43.99 E-value=30 Score=41.30 Aligned_cols=50 Identities=14% Similarity=0.112 Sum_probs=38.1
Q ss_pred EEEEEeeccccccccccCCceEEeeCCeecCCHH--HHHHHHHhcCCCceEEE
Q 007765 485 ILSQVLMDDINAGYERFADLQVKKVNGVEIENLK--HLCQLVENCSSENLRFD 535 (590)
Q Consensus 485 vl~~V~~~s~a~g~~~~~gd~I~~VNG~~v~~~~--~l~~~v~~~~~~~v~l~ 535 (590)
++..|..++||.-+++.++|.|+.|||++|..+. ++.+++.++ +..+.+.
T Consensus 661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~~-gn~v~~~ 712 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLKS-GNKVTLR 712 (1205)
T ss_pred eeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHhc-CCeeEEE
Confidence 5667899999999999999999999999998776 455555543 3334443
No 138
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=40.23 E-value=20 Score=38.19 Aligned_cols=22 Identities=32% Similarity=0.617 Sum_probs=19.5
Q ss_pred cccCCCCCceEE-eCCEEEEEEe
Q 007765 260 AINPGNSGGPAI-MGNKVAGVAF 281 (590)
Q Consensus 260 ~i~~G~SGGPl~-~~G~vVGI~~ 281 (590)
.+..|.||+.|+ .+|++|||..
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 456899999999 9999999975
No 139
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.40 E-value=57 Score=26.97 Aligned_cols=36 Identities=25% Similarity=0.401 Sum_probs=29.9
Q ss_pred ccCCCcEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 158 VVADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 158 vv~~~~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
++.....+.|.+. +++.+.+++.++|.+.++.|=..
T Consensus 10 ~~~~~~~V~V~lr-~~r~~~G~L~~fD~hmNlvL~d~ 45 (87)
T cd01720 10 AVKNNTQVLINCR-NNKKLLGRVKAFDRHCNMVLENV 45 (87)
T ss_pred HHcCCCEEEEEEc-CCCEEEEEEEEecCccEEEEcce
Confidence 4445678899997 89999999999999999987654
No 140
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=38.47 E-value=32 Score=33.37 Aligned_cols=54 Identities=15% Similarity=0.347 Sum_probs=38.0
Q ss_pred eEEEEcccccCCCCCceEE-eC----CEEEEEEeeecCCCCceeEEeeh--HHHHHHHHHHH
Q 007765 253 MAIQIDAAINPGNSGGPAI-MG----NKVAGVAFQNLSGAENIGYIIPV--PVIKHFITGVV 307 (590)
Q Consensus 253 ~~i~~~~~i~~G~SGGPl~-~~----G~vVGI~~~~~~~~~~~~~aip~--~~i~~~l~~l~ 307 (590)
..++...+...|+-|||++ .+ -+++||+.++..+ ...+||=++ +.+++.+++|.
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~-~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN-HAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc-ccceehhhhhHHHHHHHHHhhc
Confidence 3466777888999999999 44 4699999987642 345666444 56666666654
No 141
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=36.97 E-value=1.6e+02 Score=22.10 Aligned_cols=38 Identities=26% Similarity=0.299 Sum_probs=29.9
Q ss_pred CCCcEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecCC
Q 007765 160 ADSTFVLVRKHGSPTKYRAQVEAVGHECDLAILIVESD 197 (590)
Q Consensus 160 ~~~~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~~ 197 (590)
.....+.++..++...|++++..+|...++.-++.+.-
T Consensus 7 ~~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 7 AIGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp -SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred cCCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 34567899999888888999999999999999988764
No 142
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=35.68 E-value=57 Score=30.98 Aligned_cols=29 Identities=21% Similarity=0.121 Sum_probs=25.4
Q ss_pred cCceEEEEeCCCChhhhc-cCCCCEEEEEC
Q 007765 340 VTGVLVNKINPLSDAHEI-LKKDDIILAFD 368 (590)
Q Consensus 340 ~~gv~V~~V~~~s~A~~a-L~~GD~Il~Vn 368 (590)
.+.++|..|..||+|+++ +.-|+.|++|-
T Consensus 121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 121 GGKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 356889999999999999 99999998873
No 143
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=33.72 E-value=1.5e+02 Score=23.87 Aligned_cols=47 Identities=23% Similarity=0.244 Sum_probs=31.2
Q ss_pred CCCEEEEECCEEecCCCcccccccccchHHHHhhccCCCCE----EEEEEEeCCEEEEEE
Q 007765 360 KDDIILAFDGVPIANDGTVAFRNRERITFDHLVSMKKPNEK----SLVRVLRDGKEHEFS 415 (590)
Q Consensus 360 ~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~~~~~~g~~----v~l~V~R~g~~~~~~ 415 (590)
|-|-.+.+||++..+.+..+- | .-.....|.. +..++.|||+..+.+
T Consensus 10 PadAkl~v~G~~t~~~G~~R~-------F--~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~ 60 (75)
T TIGR03000 10 PADAKLKVDGKETNGTGTVRT-------F--TTPPLEAGKEYEYTVTAEYDRDGRILTRT 60 (75)
T ss_pred CCCCEEEECCeEcccCccEEE-------E--ECCCCCCCCEEEEEEEEEEecCCcEEEEE
Confidence 578889999999999887651 1 1123345554 555667899776544
No 144
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=32.99 E-value=64 Score=37.95 Aligned_cols=54 Identities=11% Similarity=0.123 Sum_probs=37.4
Q ss_pred ceEEEEEEeecccccccc-ccCCceEEeeCCeecCCHHH--HHHHHHhcCCCceEEEe
Q 007765 482 QLVILSQVLMDDINAGYE-RFADLQVKKVNGVEIENLKH--LCQLVENCSSENLRFDL 536 (590)
Q Consensus 482 ~~vvl~~V~~~s~a~g~~-~~~gd~I~~VNG~~v~~~~~--l~~~v~~~~~~~v~l~~ 536 (590)
-|++|..|.++++|+-.| |..||.+++|||...--+.+ -.+++ ...+..|.|++
T Consensus 960 lGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lm-trtg~vV~leV 1016 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLM-TRTGNVVHLEV 1016 (1629)
T ss_pred cceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHHHH-hccCCeEEEeh
Confidence 478899999999877664 77999999999998755432 22233 33455555554
No 145
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=31.80 E-value=1.3e+02 Score=22.60 Aligned_cols=32 Identities=19% Similarity=0.061 Sum_probs=27.1
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +++.+.+.+.+.|...++.|-...
T Consensus 7 ~~V~V~l~-~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 7 KTVRVELK-DGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred CEEEEEEC-CCcEEEEEEEEECCCCCEEECCEE
Confidence 46788886 899999999999999998876554
No 146
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=31.39 E-value=43 Score=36.87 Aligned_cols=56 Identities=13% Similarity=0.128 Sum_probs=35.5
Q ss_pred hhcCCCCCcCceEEEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHh
Q 007765 332 NNFGMRSEVTGVLVNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLV 392 (590)
Q Consensus 332 ~~lgl~~~~~gv~V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~ 392 (590)
..++++. .=+.|.....|-|...+.+|||+|++.||+.|+=.++ .--+||-+.+.+
T Consensus 291 a~l~l~v--nv~~vl~~~ENm~~g~A~rPGDVits~~GkTVEV~NT---DAEGRLVLADaL 346 (485)
T COG0260 291 AELKLPV--NVVGVLPAVENMPSGNAYRPGDVITSMNGKTVEVLNT---DAEGRLVLADAL 346 (485)
T ss_pred HHcCCCc--eEEEEEeeeccCCCCCCCCCCCeEEecCCcEEEEccc---CccHHHHHHHHH
Confidence 3445654 2233445556777777799999999999988863322 112676665544
No 147
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=30.67 E-value=1.8e+02 Score=22.35 Aligned_cols=33 Identities=21% Similarity=0.184 Sum_probs=27.5
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEecC
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVES 196 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~~ 196 (590)
..+.++.. .|.+++++|+++|....+.+|+.+.
T Consensus 7 s~V~~kTc-~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTC-FEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEec-CCceEEEEEEEecCCCcEEEEECcc
Confidence 45666665 6899999999999999999998655
No 148
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=28.20 E-value=2.1e+02 Score=22.03 Aligned_cols=44 Identities=23% Similarity=0.237 Sum_probs=28.2
Q ss_pred CceEEeeCCeecCCHH-HHHHHHHh-cCCCceEEEeCCCeEEEEEe
Q 007765 503 DLQVKKVNGVEIENLK-HLCQLVEN-CSSENLRFDLDDDRVVVLNY 546 (590)
Q Consensus 503 gd~I~~VNG~~v~~~~-~l~~~v~~-~~~~~v~l~~~r~~~i~l~~ 546 (590)
.-..+.|||+.+++.+ |+...+.. +-|+..++-|..++..+++.
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~FkS~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFKSQRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEETTS-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEecCcEEEEEc
Confidence 4467899999999998 55555544 78888888888887777665
No 149
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=25.76 E-value=1.7e+02 Score=22.67 Aligned_cols=32 Identities=19% Similarity=0.252 Sum_probs=27.9
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +|+.+.+++.++|...++.|-...
T Consensus 11 ~~V~V~l~-~g~~~~G~L~~~D~~mNlvL~~~~ 42 (68)
T cd01731 11 KPVLVKLK-GGKEVRGRLKSYDQHMNLVLEDAE 42 (68)
T ss_pred CEEEEEEC-CCCEEEEEEEEECCcceEEEeeEE
Confidence 56888886 899999999999999999887664
No 150
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.76 E-value=1.5e+02 Score=22.86 Aligned_cols=32 Identities=22% Similarity=0.282 Sum_probs=27.2
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +|+.|.+++.++|+..++.|=...
T Consensus 11 ~~V~V~Lk-~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLN-SGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEEC-CCCEEEEEEEEEccceeeEEeeEE
Confidence 46888887 889999999999999999876553
No 151
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=25.60 E-value=1.7e+02 Score=23.08 Aligned_cols=32 Identities=22% Similarity=0.312 Sum_probs=27.5
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +|+.|.+++.++|...++.|-...
T Consensus 15 k~V~V~lk-~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 15 SPVLVRLK-GGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CEEEEEEC-CCCEEEEEEEEEcccceeEEeeEE
Confidence 46888886 899999999999999999887654
No 152
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.91 E-value=1.8e+02 Score=23.32 Aligned_cols=31 Identities=3% Similarity=0.093 Sum_probs=26.4
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. ||+.+.+++.++|...+|.|=..
T Consensus 11 ~~v~V~l~-dgR~~~G~l~~~D~~~NivL~~~ 41 (75)
T cd06168 11 RTMRIHMT-DGRTLVGVFLCTDRDCNIILGSA 41 (75)
T ss_pred CeEEEEEc-CCeEEEEEEEEEcCCCcEEecCc
Confidence 46788886 99999999999999999876544
No 153
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.80 E-value=1.6e+02 Score=23.65 Aligned_cols=31 Identities=13% Similarity=0.072 Sum_probs=26.4
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +|+.+.+.+.++|...+|.|=..
T Consensus 11 ~~V~V~l~-dgR~~~G~L~~~D~~~NlVL~~~ 41 (79)
T cd01717 11 YRLRVTLQ-DGRQFVGQFLAFDKHMNLVLSDC 41 (79)
T ss_pred CEEEEEEC-CCcEEEEEEEEEcCccCEEcCCE
Confidence 56788886 89999999999999999876554
No 154
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=24.69 E-value=1.5e+02 Score=22.97 Aligned_cols=31 Identities=16% Similarity=0.209 Sum_probs=26.6
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +|+.+.+++.++|...+|.|=.+
T Consensus 12 ~~V~V~Lk-~g~~~~G~L~~~D~~mNi~L~~~ 42 (68)
T cd01722 12 KPVIVKLK-WGMEYKGTLVSVDSYMNLQLANT 42 (68)
T ss_pred CEEEEEEC-CCcEEEEEEEEECCCEEEEEeeE
Confidence 46888886 89999999999999999887554
No 155
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.41 E-value=1.4e+02 Score=24.19 Aligned_cols=31 Identities=16% Similarity=0.195 Sum_probs=26.2
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +|+.+.+++.++|...+|.|=..
T Consensus 12 k~V~V~l~-~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLR-GDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEEC-CCCEEEEEEEEEccceEEeccce
Confidence 57888886 88999999999999999876543
No 156
>PRK05015 aminopeptidase B; Provisional
Probab=23.78 E-value=86 Score=33.83 Aligned_cols=45 Identities=16% Similarity=0.051 Sum_probs=30.4
Q ss_pred EEEeCCCChhhhccCCCCEEEEECCEEecCCCcccccccccchHHHHh
Q 007765 345 VNKINPLSDAHEILKKDDIILAFDGVPIANDGTVAFRNRERITFDHLV 392 (590)
Q Consensus 345 V~~V~~~s~A~~aL~~GD~Il~VnG~~v~~~~~v~~~~~~~~~~~~~~ 392 (590)
|-....|.+...+.++||+|++-||+.|+-.++ .--+|+-+.+.+
T Consensus 240 il~~aENmisg~A~kpgDVIt~~nGkTVEI~NT---DAEGRLVLAD~L 284 (424)
T PRK05015 240 FLCCAENLISGNAFKLGDIITYRNGKTVEVMNT---DAEGRLVLADGL 284 (424)
T ss_pred EEEecccCCCCCCCCCCCEEEecCCcEEeeecc---CccceeeehhHH
Confidence 445567777777899999999999988863322 112566554444
No 157
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.80 E-value=1.7e+02 Score=23.41 Aligned_cols=31 Identities=6% Similarity=0.104 Sum_probs=26.4
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +++.+.+++.++|...++.|=..
T Consensus 14 ~~V~V~l~-~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMK-SDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEEC-CCeEEEEEEEEeccceEEEEccE
Confidence 57888886 89999999999999999876544
No 158
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.34 E-value=2.1e+02 Score=22.83 Aligned_cols=32 Identities=13% Similarity=0.033 Sum_probs=26.9
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +++.+.+.+.++|+..++.|=...
T Consensus 13 k~v~V~l~-~gr~~~G~L~~fD~~~NlvL~d~~ 44 (74)
T cd01728 13 KKVVVLLR-DGRKLIGILRSFDQFANLVLQDTV 44 (74)
T ss_pred CEEEEEEc-CCeEEEEEEEEECCcccEEecceE
Confidence 56788886 899999999999999998876543
No 159
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.18 E-value=3.6e+02 Score=21.28 Aligned_cols=32 Identities=9% Similarity=-0.025 Sum_probs=26.9
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEec
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIVE 195 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv~ 195 (590)
..+.|.+. +++.+.+++.++|...++.|=...
T Consensus 10 ~~V~V~l~-dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITV-DGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEEC-CCcEEEEEEEEEccccCEEccceE
Confidence 46788886 999999999999999988876643
No 160
>COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]
Probab=22.16 E-value=6.7e+02 Score=25.36 Aligned_cols=21 Identities=24% Similarity=0.518 Sum_probs=18.2
Q ss_pred cCCCCCceEEeCCEEEEEEee
Q 007765 262 NPGNSGGPAIMGNKVAGVAFQ 282 (590)
Q Consensus 262 ~~G~SGGPl~~~G~vVGI~~~ 282 (590)
..|-.|.|++.++++|||.+.
T Consensus 200 ~~~i~GaPVvd~dk~vGiit~ 220 (294)
T COG2524 200 EKGIRGAPVVDDDKIVGIITL 220 (294)
T ss_pred HcCccCCceecCCceEEEEEH
Confidence 468999999977799999975
No 161
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.83 E-value=2.1e+02 Score=23.20 Aligned_cols=31 Identities=6% Similarity=-0.003 Sum_probs=26.2
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +|+.+.+++.++|...+|.|=..
T Consensus 13 k~V~V~l~-~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQ-GGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEEC-CCcEEEEEEEEEcCcccEEecCE
Confidence 56888886 89999999999999999877544
No 162
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.98 E-value=2.3e+02 Score=22.32 Aligned_cols=31 Identities=6% Similarity=-0.016 Sum_probs=26.2
Q ss_pred cEEEEEEcCCCcEEEEEEEEecCCCCeEEEEe
Q 007765 163 TFVLVRKHGSPTKYRAQVEAVGHECDLAILIV 194 (590)
Q Consensus 163 ~~i~V~~~~~~~~~~a~vv~~d~~~DlAlLkv 194 (590)
..+.|.+. +|+.+.+++.++|...+|.|=..
T Consensus 11 k~V~V~L~-~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKLN-GNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEEC-CCeEEEEEEEEEcccccEEeccE
Confidence 56788886 89999999999999998877544
Done!