Query 001444
Match_columns 1076
No_of_seqs 788 out of 5616
Neff 7.9
Searched_HMMs 46136
Date Fri Mar 29 00:58:55 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/001444.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/001444hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 KOG1421 Predicted signaling-as 100.0 3E-152 5E-157 1270.2 75.3 918 18-1068 34-953 (955)
2 PRK10139 serine endoprotease; 100.0 3.3E-52 7.2E-57 487.8 47.8 377 36-455 40-446 (455)
3 TIGR02037 degP_htrA_DO peripla 100.0 4.4E-52 9.5E-57 489.1 47.7 384 37-456 2-421 (428)
4 PRK10139 serine endoprotease; 100.0 4.7E-52 1E-56 486.5 45.3 388 587-1042 44-454 (455)
5 TIGR02037 degP_htrA_DO peripla 100.0 3E-51 6.5E-56 482.0 43.5 395 588-1041 6-427 (428)
6 PRK10942 serine endoprotease; 100.0 5E-50 1.1E-54 471.6 48.1 378 36-456 38-465 (473)
7 PRK10942 serine endoprotease; 100.0 1.5E-49 3.2E-54 467.6 44.3 359 616-1042 110-472 (473)
8 KOG1421 Predicted signaling-as 100.0 3.2E-49 6.9E-54 443.1 37.7 437 588-1074 57-504 (955)
9 TIGR02038 protease_degS peripl 100.0 1.1E-43 2.5E-48 404.8 37.7 300 32-367 41-349 (351)
10 PRK10898 serine endoprotease; 100.0 5.4E-43 1.2E-47 398.7 38.8 300 33-368 42-351 (353)
11 TIGR02038 protease_degS peripl 100.0 1.2E-40 2.7E-45 379.9 34.5 298 587-941 49-349 (351)
12 PRK10898 serine endoprotease; 100.0 9.7E-40 2.1E-44 372.0 33.5 300 587-942 49-351 (353)
13 COG0265 DegQ Trypsin-like seri 100.0 6.5E-33 1.4E-37 318.4 33.8 295 36-367 33-341 (347)
14 COG0265 DegQ Trypsin-like seri 100.0 5.7E-30 1.2E-34 294.0 29.4 293 588-940 38-340 (347)
15 KOG1320 Serine protease [Postt 99.9 2.6E-22 5.7E-27 228.0 23.8 317 34-366 126-468 (473)
16 KOG1320 Serine protease [Postt 99.8 4.7E-20 1E-24 209.8 21.9 310 588-940 133-468 (473)
17 KOG3209 WW domain-containing p 99.8 1.2E-18 2.6E-23 198.2 25.7 178 257-457 349-566 (984)
18 PRK10779 zinc metallopeptidase 99.7 1.2E-16 2.5E-21 189.2 19.3 157 870-1045 129-289 (449)
19 TIGR00054 RIP metalloprotease 99.6 3E-14 6.4E-19 167.1 17.9 144 867-1046 128-272 (420)
20 PRK10779 zinc metallopeptidase 99.5 7.9E-14 1.7E-18 165.2 17.2 143 302-456 130-279 (449)
21 PF12812 PDZ_1: PDZ-like domai 99.5 2.4E-14 5.3E-19 125.5 7.7 77 369-445 1-78 (78)
22 PF13365 Trypsin_2: Trypsin-li 99.5 2.2E-13 4.8E-18 131.3 12.3 110 70-216 1-120 (120)
23 TIGR00054 RIP metalloprotease 99.4 5.8E-12 1.3E-16 147.8 15.9 131 297-456 128-261 (420)
24 PF13180 PDZ_2: PDZ domain; PD 99.2 9.5E-11 2.1E-15 105.3 11.2 68 296-364 13-82 (82)
25 PF13180 PDZ_2: PDZ domain; PD 99.2 1.8E-10 3.9E-15 103.5 11.7 80 832-938 2-82 (82)
26 PF13365 Trypsin_2: Trypsin-li 99.1 6.5E-10 1.4E-14 107.1 11.7 55 619-677 1-65 (120)
27 PF12812 PDZ_1: PDZ-like domai 99.1 5.5E-10 1.2E-14 98.1 8.5 76 944-1023 2-78 (78)
28 cd00987 PDZ_serine_protease PD 98.9 6E-09 1.3E-13 95.2 11.2 88 952-1041 2-89 (90)
29 cd00987 PDZ_serine_protease PD 98.9 9.2E-09 2E-13 94.0 11.3 87 832-935 2-89 (90)
30 cd00991 PDZ_archaeal_metallopr 98.9 1.9E-08 4.1E-13 89.7 10.6 68 975-1043 10-77 (79)
31 cd00991 PDZ_archaeal_metallopr 98.8 2.4E-08 5.2E-13 89.0 10.5 68 866-937 9-77 (79)
32 cd00986 PDZ_LON_protease PDZ d 98.8 3E-08 6.4E-13 88.4 10.8 70 297-367 8-78 (79)
33 cd00986 PDZ_LON_protease PDZ d 98.7 7.4E-08 1.6E-12 85.8 10.7 69 975-1045 8-76 (79)
34 cd00989 PDZ_metalloprotease PD 98.7 1E-07 2.2E-12 84.8 10.2 66 976-1043 13-78 (79)
35 PF00089 Trypsin: Trypsin; In 98.7 2.4E-07 5.3E-12 98.9 14.4 176 47-238 13-219 (220)
36 cd00990 PDZ_glycyl_aminopeptid 98.7 1.3E-07 2.9E-12 84.3 10.3 65 297-365 12-78 (80)
37 TIGR01713 typeII_sec_gspC gene 98.6 4E-07 8.6E-12 99.5 14.4 99 233-364 159-259 (259)
38 TIGR01713 typeII_sec_gspC gene 98.6 2.9E-07 6.3E-12 100.5 13.1 100 800-938 159-259 (259)
39 cd00988 PDZ_CTP_protease PDZ d 98.6 3.4E-07 7.4E-12 82.7 10.5 66 297-363 13-82 (85)
40 KOG3580 Tight junction protein 98.6 3E-07 6.5E-12 104.4 12.0 158 297-457 219-491 (1027)
41 cd00989 PDZ_metalloprotease PD 98.6 2.3E-07 4.9E-12 82.5 9.0 63 300-362 14-77 (79)
42 cd00990 PDZ_glycyl_aminopeptid 98.6 3.7E-07 8E-12 81.4 10.0 66 976-1045 13-78 (80)
43 KOG3580 Tight junction protein 98.5 4.2E-07 9E-12 103.2 11.7 57 975-1031 429-487 (1027)
44 cd00988 PDZ_CTP_protease PDZ d 98.5 6.2E-07 1.3E-11 81.0 9.7 68 975-1044 13-83 (85)
45 KOG3209 WW domain-containing p 98.5 2.3E-07 5E-12 107.5 8.4 149 302-452 782-979 (984)
46 cd00190 Tryp_SPc Trypsin-like 98.4 3.7E-06 7.9E-11 90.5 15.5 164 46-221 12-208 (232)
47 cd00136 PDZ PDZ domain, also c 98.3 2.2E-06 4.8E-11 74.2 8.2 53 298-351 14-69 (70)
48 smart00020 Tryp_SPc Trypsin-li 98.3 2.1E-05 4.6E-10 84.6 16.8 164 46-221 13-208 (229)
49 cd00136 PDZ PDZ domain, also c 98.3 2.4E-06 5.2E-11 74.0 7.5 53 976-1029 14-68 (70)
50 COG3591 V8-like Glu-specific e 98.2 1.9E-05 4.2E-10 84.4 14.3 158 67-242 63-249 (251)
51 PF00863 Peptidase_C4: Peptida 98.1 9.6E-05 2.1E-09 78.5 15.5 166 43-233 14-185 (235)
52 PF14685 Tricorn_PDZ: Tricorn 98.0 3.5E-05 7.5E-10 69.3 9.6 64 297-360 11-86 (88)
53 PF00595 PDZ: PDZ domain (Also 97.9 2.5E-05 5.5E-10 69.8 7.3 53 975-1029 25-79 (81)
54 TIGR00225 prc C-terminal pepti 97.9 3E-05 6.5E-10 89.0 9.0 69 297-366 62-133 (334)
55 TIGR02860 spore_IV_B stage IV 97.9 7.9E-05 1.7E-09 85.3 12.2 68 975-1044 105-180 (402)
56 cd00992 PDZ_signaling PDZ doma 97.9 4.1E-05 8.9E-10 68.4 8.0 53 975-1029 26-80 (82)
57 smart00228 PDZ Domain present 97.9 3.6E-05 7.8E-10 69.1 7.7 59 975-1034 26-84 (85)
58 PLN00049 carboxyl-terminal pro 97.9 5.8E-05 1.3E-09 88.1 11.3 75 976-1052 103-179 (389)
59 TIGR03279 cyano_FeS_chp putati 97.9 3E-05 6.6E-10 89.1 8.0 72 979-1056 2-73 (433)
60 smart00228 PDZ Domain present 97.8 5.8E-05 1.3E-09 67.7 7.9 59 867-929 26-85 (85)
61 TIGR00225 prc C-terminal pepti 97.8 8.8E-05 1.9E-09 85.1 10.9 77 976-1054 63-143 (334)
62 KOG3834 Golgi reassembly stack 97.8 0.00019 4E-09 80.5 12.5 162 867-1048 15-183 (462)
63 TIGR02860 spore_IV_B stage IV 97.8 0.00011 2.4E-09 84.1 10.2 69 296-364 104-180 (402)
64 TIGR03279 cyano_FeS_chp putati 97.8 6.3E-05 1.4E-09 86.6 8.3 61 871-939 2-64 (433)
65 PF00595 PDZ: PDZ domain (Also 97.7 8.3E-05 1.8E-09 66.5 7.2 56 867-926 25-81 (81)
66 PLN00049 carboxyl-terminal pro 97.7 0.00011 2.4E-09 85.9 9.8 66 298-364 103-171 (389)
67 cd00992 PDZ_signaling PDZ doma 97.7 0.00011 2.3E-09 65.7 7.3 53 297-351 26-81 (82)
68 PF04495 GRASP55_65: GRASP55/6 97.6 0.00024 5.1E-09 70.0 8.8 72 975-1047 43-116 (138)
69 COG0793 Prc Periplasmic protea 97.6 0.0002 4.2E-09 83.7 9.6 78 976-1055 113-194 (406)
70 PF14685 Tricorn_PDZ: Tricorn 97.6 0.00056 1.2E-08 61.6 10.0 65 867-936 12-88 (88)
71 PF00089 Trypsin: Trypsin; In 97.6 0.0016 3.4E-08 69.4 15.4 169 616-806 24-220 (220)
72 KOG3605 Beta amyloid precursor 97.5 0.00016 3.4E-09 84.1 6.8 121 870-1019 676-802 (829)
73 COG0793 Prc Periplasmic protea 97.4 0.00047 1E-08 80.5 9.6 67 297-364 112-181 (406)
74 KOG3129 26S proteasome regulat 97.4 0.00055 1.2E-08 69.7 8.3 73 297-369 138-214 (231)
75 PF04495 GRASP55_65: GRASP55/6 97.3 0.00054 1.2E-08 67.5 7.1 86 260-365 25-114 (138)
76 PRK09681 putative type II secr 97.3 0.00074 1.6E-08 73.6 8.3 62 873-938 210-275 (276)
77 COG3480 SdrC Predicted secrete 97.3 0.00095 2.1E-08 72.4 8.8 71 867-941 130-201 (342)
78 KOG3834 Golgi reassembly stack 97.2 0.0043 9.4E-08 69.9 13.6 165 297-475 15-185 (462)
79 KOG3553 Tax interaction protei 97.1 0.00035 7.7E-09 62.2 3.3 58 975-1036 59-118 (124)
80 KOG3129 26S proteasome regulat 97.1 0.0015 3.3E-08 66.6 8.3 76 866-943 138-214 (231)
81 COG3480 SdrC Predicted secrete 97.1 0.0015 3.4E-08 70.9 8.5 69 975-1044 130-198 (342)
82 KOG3605 Beta amyloid precursor 97.1 0.00089 1.9E-08 78.1 6.8 115 304-441 679-802 (829)
83 PRK11186 carboxy-terminal prot 97.0 0.0019 4E-08 79.5 9.2 69 976-1046 256-335 (667)
84 PRK09681 putative type II secr 97.0 0.0023 4.9E-08 69.9 8.8 55 988-1043 220-274 (276)
85 cd00190 Tryp_SPc Trypsin-like 97.0 0.01 2.2E-07 63.6 13.7 93 615-715 23-133 (232)
86 PF00863 Peptidase_C4: Peptida 97.0 0.011 2.4E-07 63.1 13.1 179 589-810 13-197 (235)
87 PRK11186 carboxy-terminal prot 96.9 0.0023 5.1E-08 78.7 9.1 66 298-363 255-332 (667)
88 COG3975 Predicted protease wit 96.9 0.0013 2.8E-08 76.0 6.2 86 263-368 439-526 (558)
89 smart00020 Tryp_SPc Trypsin-li 96.9 0.016 3.5E-07 62.2 13.9 93 615-715 24-133 (229)
90 COG3031 PulC Type II secretory 96.7 0.004 8.7E-08 64.8 6.8 67 867-937 207-274 (275)
91 PF10459 Peptidase_S46: Peptid 96.4 0.014 3E-07 72.3 10.4 42 117-158 200-252 (698)
92 COG3031 PulC Type II secretory 96.1 0.01 2.2E-07 61.9 6.2 61 303-363 212-274 (275)
93 KOG3553 Tax interaction protei 96.1 0.013 2.9E-07 52.4 6.0 48 394-441 54-105 (124)
94 COG3975 Predicted protease wit 95.9 0.015 3.3E-07 67.5 6.8 90 830-942 436-526 (558)
95 PF05579 Peptidase_S32: Equine 95.6 0.12 2.5E-06 55.3 11.5 116 67-221 111-229 (297)
96 PF05580 Peptidase_S55: SpoIVB 95.3 0.2 4.2E-06 52.4 11.8 167 66-235 18-215 (218)
97 PF10459 Peptidase_S46: Peptid 94.8 0.034 7.3E-07 69.0 5.3 54 190-243 623-687 (698)
98 PF02122 Peptidase_S39: Peptid 94.7 0.0091 2E-07 62.6 0.1 144 66-232 28-181 (203)
99 KOG3550 Receptor targeting pro 94.6 0.15 3.2E-06 49.3 8.0 53 297-351 115-171 (207)
100 KOG3532 Predicted protein kina 94.4 0.088 1.9E-06 62.2 7.3 57 296-353 396-453 (1051)
101 KOG3532 Predicted protein kina 94.3 0.094 2E-06 62.0 7.1 46 867-912 398-444 (1051)
102 KOG3542 cAMP-regulated guanine 94.1 0.036 7.8E-07 65.1 3.3 54 975-1029 562-615 (1283)
103 COG3591 V8-like Glu-specific e 94.1 0.94 2E-05 49.1 13.8 146 617-779 64-225 (251)
104 PF00949 Peptidase_S7: Peptida 93.8 0.056 1.2E-06 52.6 3.5 31 191-221 88-118 (132)
105 KOG3552 FERM domain protein FR 93.8 0.062 1.3E-06 65.6 4.4 53 300-353 77-131 (1298)
106 KOG3550 Receptor targeting pro 93.2 0.22 4.7E-06 48.2 6.2 52 976-1029 116-170 (207)
107 PF00548 Peptidase_C3: 3C cyst 93.0 0.54 1.2E-05 48.5 9.5 139 66-220 23-170 (172)
108 KOG3542 cAMP-regulated guanine 92.8 0.092 2E-06 61.8 3.7 54 399-453 562-617 (1283)
109 KOG1892 Actin filament-binding 92.2 0.18 3.8E-06 61.9 5.2 61 295-356 958-1021(1629)
110 PF08192 Peptidase_S64: Peptid 91.8 0.81 1.8E-05 55.2 9.9 93 142-242 585-688 (695)
111 PF03761 DUF316: Domain of unk 91.8 5.2 0.00011 44.7 16.3 106 115-236 159-272 (282)
112 KOG3552 FERM domain protein FR 90.6 0.3 6.4E-06 60.0 4.9 55 868-928 76-132 (1298)
113 KOG3571 Dishevelled 3 and rela 90.6 0.64 1.4E-05 53.8 7.2 87 376-484 260-351 (626)
114 KOG1892 Actin filament-binding 90.0 0.19 4E-06 61.6 2.5 66 861-930 954-1021(1629)
115 KOG3606 Cell polarity protein 89.6 0.59 1.3E-05 49.8 5.6 53 976-1028 195-250 (358)
116 KOG3627 Trypsin [Amino acid tr 89.6 8.7 0.00019 42.0 15.3 147 69-222 39-229 (256)
117 PF11874 DUF3394: Domain of un 88.9 1.5 3.4E-05 45.1 7.9 83 901-1002 62-149 (183)
118 COG0750 Predicted membrane-ass 87.0 1.8 3.8E-05 50.6 8.2 56 304-359 135-195 (375)
119 KOG3651 Protein kinase C, alph 86.9 1.2 2.5E-05 48.3 5.7 54 297-352 30-87 (429)
120 KOG0609 Calcium/calmodulin-dep 86.4 1.2 2.6E-05 52.5 6.0 74 976-1051 147-226 (542)
121 KOG0606 Microtubule-associated 86.4 0.93 2E-05 57.5 5.5 51 978-1030 661-713 (1205)
122 KOG0609 Calcium/calmodulin-dep 85.7 3.7 8E-05 48.5 9.5 53 300-353 148-204 (542)
123 KOG3571 Dishevelled 3 and rela 85.6 2.5 5.3E-05 49.2 7.8 63 950-1020 260-325 (626)
124 PF00548 Peptidase_C3: 3C cyst 85.1 11 0.00023 39.0 11.7 151 590-776 9-169 (172)
125 COG0750 Predicted membrane-ass 84.9 3.1 6.6E-05 48.6 8.7 58 978-1037 132-193 (375)
126 KOG2921 Intramembrane metallop 84.6 1.3 2.9E-05 49.8 5.0 45 975-1019 220-265 (484)
127 KOG3551 Syntrophins (type beta 84.4 0.71 1.5E-05 51.7 2.8 56 295-351 108-166 (506)
128 PF00944 Peptidase_S3: Alphavi 84.3 0.95 2.1E-05 43.3 3.2 32 191-222 97-128 (158)
129 KOG3606 Cell polarity protein 83.4 1.7 3.6E-05 46.5 4.9 46 399-444 194-244 (358)
130 KOG3549 Syntrophins (type gamm 80.5 2.1 4.5E-05 47.3 4.5 53 300-352 82-137 (505)
131 KOG3651 Protein kinase C, alph 79.8 3.5 7.6E-05 44.8 5.8 55 866-926 29-87 (429)
132 PF08192 Peptidase_S64: Peptid 78.9 15 0.00033 44.8 11.3 125 666-809 540-688 (695)
133 KOG2921 Intramembrane metallop 77.1 3.1 6.7E-05 47.1 4.6 46 295-341 218-265 (484)
134 KOG3938 RGS-GAIP interacting p 74.0 7.5 0.00016 41.7 6.2 51 405-455 157-210 (334)
135 PF11874 DUF3394: Domain of un 73.3 17 0.00037 37.6 8.5 80 332-424 62-149 (183)
136 KOG3551 Syntrophins (type beta 72.5 3 6.5E-05 46.9 3.1 59 867-929 110-172 (506)
137 PF01732 DUF31: Putative pepti 71.8 2.5 5.5E-05 49.4 2.5 30 190-219 345-374 (374)
138 KOG3549 Syntrophins (type gamm 71.3 3.5 7.5E-05 45.6 3.2 68 977-1046 82-152 (505)
139 KOG0606 Microtubule-associated 67.8 6.3 0.00014 50.5 4.8 36 870-905 661-697 (1205)
140 PF03510 Peptidase_C24: 2C end 66.4 18 0.00039 33.8 6.3 52 72-135 3-54 (105)
141 KOG3938 RGS-GAIP interacting p 65.2 16 0.00034 39.4 6.4 55 978-1032 152-209 (334)
142 PF02907 Peptidase_S29: Hepati 59.2 6.9 0.00015 37.8 2.3 114 71-220 15-128 (148)
143 PF05416 Peptidase_C37: Southa 57.3 53 0.0012 37.9 9.1 136 68-221 379-527 (535)
144 PF12381 Peptidase_C3G: Tungro 53.4 21 0.00045 37.6 4.8 55 189-243 169-229 (231)
145 PF00949 Peptidase_S7: Peptida 49.5 15 0.00032 36.0 2.9 28 748-776 89-116 (132)
146 COG0298 HypC Hydrogenase matur 46.2 36 0.00078 30.0 4.4 47 657-711 4-51 (82)
147 COG5233 GRH1 Peripheral Golgi 44.8 49 0.0011 36.7 6.1 71 980-1051 191-266 (417)
148 KOG4371 Membrane-associated pr 39.8 73 0.0016 40.9 7.3 117 314-433 1186-1307(1332)
149 PF05579 Peptidase_S32: Equine 38.9 1.1E+02 0.0024 33.5 7.6 19 758-776 209-227 (297)
150 KOG4407 Predicted Rho GTPase-a 38.0 21 0.00045 46.4 2.4 87 303-444 101-192 (1973)
151 PF01455 HupF_HypC: HupF/HypC 36.2 78 0.0017 27.3 5.0 44 657-709 4-47 (68)
152 cd01735 LSm12_N LSm12 belongs 35.8 79 0.0017 26.7 4.7 32 646-678 8-39 (61)
153 PF00944 Peptidase_S3: Alphavi 29.0 83 0.0018 30.6 4.3 23 758-780 107-129 (158)
154 KOG1738 Membrane-associated gu 28.1 86 0.0019 38.3 5.2 37 297-333 224-262 (638)
155 cd00600 Sm_like The eukaryotic 26.8 1.1E+02 0.0024 25.3 4.4 31 645-676 7-37 (63)
156 PF09342 DUF1986: Domain of un 26.6 5.2E+02 0.011 28.2 10.1 90 66-157 26-131 (267)
157 cd01726 LSm6 The eukaryotic Sm 26.3 97 0.0021 26.4 4.0 31 645-676 11-41 (67)
158 cd01732 LSm5 The eukaryotic Sm 25.8 1.1E+02 0.0023 27.0 4.2 30 645-675 14-43 (76)
159 cd01735 LSm12_N LSm12 belongs 25.0 1.9E+02 0.0041 24.4 5.2 32 95-126 8-39 (61)
160 cd06168 LSm9 The eukaryotic Sm 24.8 1.2E+02 0.0026 26.6 4.4 30 645-675 11-40 (75)
161 cd01717 Sm_B The eukaryotic Sm 24.7 1E+02 0.0022 27.3 3.9 30 645-675 11-40 (79)
162 KOG1738 Membrane-associated gu 24.6 84 0.0018 38.4 4.3 46 867-913 225-272 (638)
163 cd01722 Sm_F The eukaryotic Sm 24.2 1.1E+02 0.0024 26.1 4.0 31 645-676 12-42 (68)
164 PF15436 PGBA_N: Plasminogen-b 23.2 5E+02 0.011 27.8 9.3 55 96-152 33-88 (218)
165 COG5640 Secreted trypsin-like 23.1 2.1E+02 0.0046 32.7 6.7 27 196-222 224-253 (413)
166 cd01730 LSm3 The eukaryotic Sm 22.8 1.1E+02 0.0023 27.4 3.7 29 645-674 12-40 (82)
167 PF14275 DUF4362: Domain of un 22.3 2.4E+02 0.0052 26.2 5.9 53 416-472 1-53 (98)
168 PF00947 Pico_P2A: Picornaviru 22.1 1.1E+02 0.0023 29.8 3.6 33 188-221 78-110 (127)
169 cd01729 LSm7 The eukaryotic Sm 22.0 1.4E+02 0.003 26.7 4.2 30 645-675 13-42 (81)
170 cd01719 Sm_G The eukaryotic Sm 21.9 1.5E+02 0.0032 25.8 4.3 57 645-710 11-67 (72)
171 PRK00737 small nuclear ribonuc 21.7 1.4E+02 0.003 25.9 4.1 31 645-676 15-45 (72)
172 cd01731 archaeal_Sm1 The archa 21.5 1.4E+02 0.0031 25.4 4.1 31 645-676 11-41 (68)
173 cd01728 LSm1 The eukaryotic Sm 21.4 1.5E+02 0.0034 25.9 4.3 60 645-710 13-72 (74)
No 1
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=2.5e-152 Score=1270.18 Aligned_cols=918 Identities=45% Similarity=0.714 Sum_probs=855.8
Q ss_pred ccccccCCCCCccCccCcchHHHHHHHhCCceEEEEEeeeeccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEE
Q 001444 18 EDMCMEVDPPLRENVATADDWRKALNKVVPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAE 97 (1076)
Q Consensus 18 ~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~svV~I~~~~~~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~ 97 (1076)
++++.+..++....-+...+|+..+..+.+|||+|+++.++.||++.++.+.||||++++..|||||||||+.+++....
T Consensus 34 ~el~~e~~p~~~~s~~~~e~w~~~ia~VvksvVsI~~S~v~~fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~ 113 (955)
T KOG1421|consen 34 EELVIEPDPPLNESLATSEDWRNTIANVVKSVVSIRFSAVRAFDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVAS 113 (955)
T ss_pred cccccccCCCCCcccchhhhhhhhhhhhcccEEEEEehheeecccccccccceeEEEEecccceEEEeccccCCCCceeE
Confidence 33555555555555666779999999999999999999999999999999999999999999999999999999999999
Q ss_pred EEecCCcEEEEEEEEecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCC
Q 001444 98 AMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPH 177 (1076)
Q Consensus 98 v~~~~~~~~~a~vv~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~ 177 (1076)
+.|.|.++++..++|+||.|||+++|++|+.+.+..+.+++++++..++|.+++++||++++..++..|.+++++|++|.
T Consensus 114 avf~n~ee~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDagEklsIlagflSrldr~apd 193 (955)
T KOG1421|consen 114 AVFDNHEEIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPD 193 (955)
T ss_pred EEecccccCCcccccCCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCccceEEeehhhhhhccCCCcc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999
Q ss_pred CCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecccCCCCCccccCHHHHHHHHHHHHhcCCccccccceec
Q 001444 178 YKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSSSASAFFLPLERVVRALRFLQERRDCNIHNWEAVS 257 (1076)
Q Consensus 178 ~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~~~~~falP~~~i~~~l~~l~~~~~~~~~~~~~~~ 257 (1076)
|++..|+|||++|+|..+..++|+||+||+|.+|.+|+++++++..++.+|+||+++++|+|.++++++ +
T Consensus 194 yg~~~yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~ssas~ffLpLdrV~RaL~clq~n~----------P 263 (955)
T KOG1421|consen 194 YGEDTYNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSISSASDFFLPLDRVVRALRCLQNNT----------P 263 (955)
T ss_pred ccccccccccceeeeehhcCCCCCCCCceecccceEEeeecCCcccccccceeeccchhhhhhhhhcCC----------C
Confidence 999999999999999999999999999999999999999999999999999999999999999999998 8
Q ss_pred cCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCccccCCCCCCEEEEECCEEeCChhHHHH
Q 001444 258 IPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLET 337 (1076)
Q Consensus 258 ~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~gL~~GD~Il~VnG~~v~~~~~l~~ 337 (1076)
++||+|+++|.++.+||||||||+.|||+.+|.++| ..+|||||+.|+++|||++.|++||++++||+..+.+|..+.+
T Consensus 264 ItRGtLqvefl~k~~de~rrlGL~sE~eqv~r~k~P-~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~ 342 (955)
T KOG1421|consen 264 ITRGTLQVEFLHKLFDECRRLGLSSEWEQVVRTKFP-ERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQ 342 (955)
T ss_pred cccceEEEEEehhhhHHHHhcCCcHHHHHHHHhcCc-ccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHH
Confidence 999999999999999999999999999999999999 7999999999999999999999999999999999999999999
Q ss_pred HHhcCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcccccCceEEecCCHHHHhccCCCCCeEEEEcCCChhhHcCCCC
Q 001444 338 LLDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAEPGYMLFRAGVPR 417 (1076)
Q Consensus 338 ~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~~gs~a~~aGl~~ 417 (1076)
+|++.+|+.++|+|+|+|++.+++++++++|.++|+||++++|++||+++||++|.|.+|++|+||+++++.+...+-.-
T Consensus 343 iLDegvgk~l~LtI~Rggqelel~vtvqdlh~itp~R~levcGav~hdlsyq~ar~y~lP~~GvyVa~~~gsf~~~~~~y 422 (955)
T KOG1421|consen 343 ILDEGVGKNLELTIQRGGQELELTVTVQDLHGITPDRFLEVCGAVFHDLSYQLARLYALPVEGVYVASPGGSFRHRGPRY 422 (955)
T ss_pred HHhhccCceEEEEEEeCCEEEEEEEEeccccCCCCceEEEEcceEecCCCHHHHhhcccccCcEEEccCCCCccccCCcc
Confidence 99999999999999999999999999999999999999999999999999999999999999999999754444444333
Q ss_pred CCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEeccccccceEEEEEEecCCCCCCCeeeecCCCCCCccccccCCC
Q 001444 418 HAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHEWYAPPQIYTRNDSSGLWSANPAILS 497 (1076)
Q Consensus 418 GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~~~~~~~~~~~~l~i~r~~~~~~~~~~~r~d~~g~w~~~~~~~~ 497 (1076)
|++|.+||++++++|++|+++++++++|+||+++|+++.|+|+.+...++||| ||||++++++|||++|+||++++.+|
T Consensus 423 ~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~vry~hl~dkh~p~v~~v~iDr-Hwy~p~~~~trndetglWdrk~L~~p 501 (955)
T KOG1421|consen 423 GQIIDSVANKPTPDLDAFIDVLKELPDGARVPVRYHHLTDKHSPRVTTVTIDR-HWYWPFREYTRNDETGLWDRKNLKDP 501 (955)
T ss_pred eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeEEEEEecCCCCceEEEEEEec-cccccceeeeeCCCcccccccccCCC
Confidence 99999999999999999999999999999999999999999999999999999 99999999999999999999999999
Q ss_pred cCCCCCCCCCCCccccccccccccccccCCCCcccccccccccccccCcccccCCCCCCCCccccccccccccCCCCCCC
Q 001444 498 EVLMPSSGINGGVQGVASQTVSICESISRGESDNGRKKRRVEENISADGVVADCSPHESGDARLEDSSTMENAGSRDYFG 577 (1076)
Q Consensus 498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g 577 (1076)
.|+.+.+|.. .+.+ .++
T Consensus 502 qPa~~~kP~s------------------------------------~~ip------------~i~--------------- 518 (955)
T KOG1421|consen 502 QPAISIKPAS------------------------------------VSIP------------SIG--------------- 518 (955)
T ss_pred CcccccCCcc------------------------------------ccCC------------CcC---------------
Confidence 9988777666 1110 111
Q ss_pred CCccCCccccccceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeE
Q 001444 578 APAATTNASFAESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIE 657 (1076)
Q Consensus 578 ~~~~~~~~~~~~~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~ 657 (1076)
..++..+.+..|+|.|.+.+| ..+||+.+....|||+|+| ...|+++++|.+|+.+++|++|+++| +..
T Consensus 519 ------~~~~~~~~i~~~~~~v~~~~~--~~l~g~s~~i~kgt~~i~d--~~~g~~vvsr~~vp~d~~d~~vt~~d-S~~ 587 (955)
T KOG1421|consen 519 ------VNNFPSADISNCLVDVEPMMP--VNLDGVSSDIYKGTALIMD--TSKGLGVVSRSVVPSDAKDQRVTEAD-SDG 587 (955)
T ss_pred ------cCCcchhHHhhhhhhheecee--eccccchhhhhcCceEEEE--ccCCceeEecccCCchhhceEEeecc-ccc
Confidence 113345678899999999999 4899999988999999999 57999999999999999999999999 999
Q ss_pred EeEEEEEeeCCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCC
Q 001444 658 IPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSAD 737 (1076)
Q Consensus 658 ~~a~vv~~dp~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~ 737 (1076)
++|.+.|+||.+|+|.+||||++.-. ++|.+.. +++||++.+.|+..+.+.++++++|++++ ..++++..
T Consensus 588 i~a~~~fL~~t~n~a~~kydp~~~~~--------~kl~~~~-v~~gD~~~f~g~~~~~r~ltaktsv~dvs-~~~~ps~~ 657 (955)
T KOG1421|consen 588 IPANVSFLHPTENVASFKYDPALEVQ--------LKLTDTT-VLRGDECTFEGFTEDLRALTAKTSVTDVS-VVIIPSSV 657 (955)
T ss_pred ccceeeEecCccceeEeccChhHhhh--------hccceee-EecCCceeEecccccchhhcccceeeeeE-EEEecCCC
Confidence 99999999999999999999987644 7887765 99999999999999999999999999987 77889999
Q ss_pred CCcccccceeEEEEecccCCCc-CceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCC
Q 001444 738 CPRYRAMNMEVIELDTDFGSTF-SGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGP 816 (1076)
Q Consensus 738 ~~~~~~~~~~~I~~d~~ig~~s-GGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~ 816 (1076)
+||||++|+++|+++++++..| .|.|.|.+|+|+|+|.++.++. .++.+..|.+|+.+.+++++|++|+.+.+
T Consensus 658 ~pr~r~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~----~~~kd~~y~~gl~~~~~l~vl~rlk~g~~-- 731 (955)
T KOG1421|consen 658 MPRFRATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGED----VGGKDYTYKYGLSMSYILPVLERLKLGPS-- 731 (955)
T ss_pred CcceeecceEEEEEeccccccccceEEECCCCeEEEEEeeeeccc----cCCceeEEEeccchHHHHHHHHHHhcCCC--
Confidence 9999999999999999998767 9999999999999999988872 35678889999999999999999999863
Q ss_pred cccccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhhccCCCEEEEECCE
Q 001444 817 SLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENMLEQGDMMLAINKQ 896 (1076)
Q Consensus 817 ~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~aL~~GDiIlsVnG~ 896 (1076)
+..+++|++|..++..+||.+|+|.||+-+++.....++|.++|+.|.+..+ +.|..||+|+++|||
T Consensus 732 -----------~rp~i~~vef~~i~laqar~lglp~e~imk~e~es~~~~ql~~ishv~~~~~--kil~~gdiilsvngk 798 (955)
T KOG1421|consen 732 -----------ARPTIAGVEFSHITLAQARTLGLPSEFIMKSEEESTIPRQLYVISHVRPLLH--KILGVGDIILSVNGK 798 (955)
T ss_pred -----------CCceeeccceeeEEeehhhccCCCHHHHhhhhhcCCCcceEEEEEeeccCcc--cccccccEEEEecCe
Confidence 3346999999999999999999999999999999999999999999988664 239999999999999
Q ss_pred EcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCCCcceeeecCccccCCcHhHhhc-CCCCCCCC
Q 001444 897 PVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNGTTRVINWCGCIVQDPHPAVRAL-GFLPEEGH 975 (1076)
Q Consensus 897 ~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~p~~~~~~~-~~~p~~~~ 975 (1076)
.|+.++||.. + ..++++|+|+|.++++++++.... +++|.+.|+|.++|+||++++++ ..+|.
T Consensus 799 ~itr~~dl~d-~----------~eid~~ilrdg~~~~ikipt~p~~--et~r~vi~~gailq~ph~av~~q~edlp~--- 862 (955)
T KOG1421|consen 799 MITRLSDLHD-F----------EEIDAVILRDGIEMEIKIPTYPEY--ETSRAVIWMGAILQPPHSAVFEQVEDLPE--- 862 (955)
T ss_pred EEeeehhhhh-h----------hhhheeeeecCcEEEEEecccccc--ccceEEEEEeccccCchHHHHHHHhccCC---
Confidence 9999999986 3 358899999999999999997655 89999999999999999999985 78897
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeCCccCcceEEE
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQDLHYWPTWELI 1055 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~~~y~pt~e~~ 1055 (1076)
||||+....||||.+ +|.+..||++|||..++++++|..+++++|++.||+++.++|+|.|..+++|+|+|||||.+|.
T Consensus 863 gvyvt~rg~gspalq-~l~aa~fitavng~~t~~lddf~~~~~~ipdnsyv~v~~mtfd~vp~~~s~k~n~hyfpt~~l~ 941 (955)
T KOG1421|consen 863 GVYVTSRGYGSPALQ-MLRAAHFITAVNGHDTNTLDDFYHMLLEIPDNSYVQVKQMTFDGVPSIVSVKPNPHYFPTCILE 941 (955)
T ss_pred ceEEeecccCChhHh-hcchheeEEEecccccCcHHHHHHHHhhCCCCceEEEEEeccCCCceEEEeccCCccCceeEEE
Confidence 999999999999999 9999999999999999999999999999999999999999999999999999999999999999
Q ss_pred EcCCCCCeEEEEe
Q 001444 1056 FDPDTALWRRKSV 1068 (1076)
Q Consensus 1056 ~~~~~~~w~~~~~ 1068 (1076)
+|..+. |+.++.
T Consensus 942 rd~~~~-wi~kev 953 (955)
T KOG1421|consen 942 RDSNGR-WITKEV 953 (955)
T ss_pred ecccCc-eeeeec
Confidence 998655 977653
No 2
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.3e-52 Score=487.77 Aligned_cols=377 Identities=19% Similarity=0.314 Sum_probs=310.2
Q ss_pred chHHHHHHHhCCceEEEEEeeeec------------cCC------CCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEE
Q 001444 36 DDWRKALNKVVPAVVVLRTTACRA------------FDT------EAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAE 97 (1076)
Q Consensus 36 ~~~~~~~~~v~~svV~I~~~~~~~------------~d~------~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~ 97 (1076)
.+|.++++++.||||.|.+..... |.. .....+.||||+|++++||||||+|||. +...+.
T Consensus 40 ~~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~-~a~~i~ 118 (455)
T PRK10139 40 PSLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVIN-QAQKIS 118 (455)
T ss_pred ccHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhC-CCCEEE
Confidence 479999999999999999864321 211 1123578999999865799999999999 567899
Q ss_pred EEecCCcEEEEEEEEecCCCcEEEEEEcC-CCCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCC
Q 001444 98 AMFVNREEIPVYPIYRDPVHDFGFFRYDP-SAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAP 176 (1076)
Q Consensus 98 v~~~~~~~~~a~vv~~d~~~DlAlLk~~~-~~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~ 176 (1076)
|++.|+++++|++++.|+.+||||||++. ..++++.+++ ++.+++||+|+++|||+|...+++.|+||++.|...
T Consensus 119 V~~~dg~~~~a~vvg~D~~~DlAvlkv~~~~~l~~~~lg~----s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~ 194 (455)
T PRK10139 119 IQLNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIAD----SDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGL 194 (455)
T ss_pred EEECCCCEEEEEEEEEcCCCCEEEEEecCCCCCceeEecC----ccccCCCCEEEEEecCCCCCCceEEEEEcccccccc
Confidence 99999999999999999999999999974 5666666665 788999999999999999999999999999988632
Q ss_pred CCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecccCC----CCCccccCHHHHHHHHHHHHhcCCccccc
Q 001444 177 HYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS----SASAFFLPLERVVRALRFLQERRDCNIHN 252 (1076)
Q Consensus 177 ~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~----~~~~falP~~~i~~~l~~l~~~~~~~~~~ 252 (1076)
.. .+|. .+||+|+++++|||||||||.+|+||||+++.... .+++|+||++.+++++++|.+++
T Consensus 195 ~~--~~~~----~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g------ 262 (455)
T PRK10139 195 NL--EGLE----NFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFG------ 262 (455)
T ss_pred CC--CCcc----eEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcC------
Confidence 22 1233 46999999999999999999999999999875432 57899999999999999999988
Q ss_pred cceeccCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCC
Q 001444 253 WEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQ 331 (1076)
Q Consensus 253 ~~~~~~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~ 331 (1076)
.+.|||||+.++.++.+.++.|||+ ...|++| ..|.++|||++ ||++||+|++|||++|.+
T Consensus 263 ----~v~r~~LGv~~~~l~~~~~~~lgl~-------------~~~Gv~V-~~V~~~SpA~~AGL~~GDvIl~InG~~V~s 324 (455)
T PRK10139 263 ----EIKRGLLGIKGTEMSADIAKAFNLD-------------VQRGAFV-SEVLPNSGSAKAGVKAGDIITSLNGKPLNS 324 (455)
T ss_pred ----cccccceeEEEEECCHHHHHhcCCC-------------CCCceEE-EEECCCChHHHCCCCCCCEEEEECCEECCC
Confidence 7899999999999999999999987 4578887 79999999999 999999999999999999
Q ss_pred hhHHHHHHhc-CCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcc---cccCceEEecCCHHHHhccCCCCCeEEEEc--
Q 001444 332 FLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQDLHSITPDYF---LEVSGAVIHPLSYQQARNFRFPCGLVYVAE-- 405 (1076)
Q Consensus 332 ~~~l~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~---~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~-- 405 (1076)
|.++...+.. .+|+++.++|.|+|+.+++++++...+....... ..+.|+.+.+. ..+. ...|++|..
T Consensus 325 ~~dl~~~l~~~~~g~~v~l~V~R~G~~~~l~v~~~~~~~~~~~~~~~~~~~~g~~l~~~---~~~~---~~~Gv~V~~V~ 398 (455)
T PRK10139 325 FAELRSRIATTEPGTKVKLGLLRNGKPLEVEVTLDTSTSSSASAEMITPALQGATLSDG---QLKD---GTKGIKIDEVV 398 (455)
T ss_pred HHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEECCCCCcccccccccccccccEeccc---cccc---CCCceEEEEeC
Confidence 9999988854 7899999999999999999998754432111111 11234444431 1111 125788874
Q ss_pred CCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEec
Q 001444 406 PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSY 455 (1076)
Q Consensus 406 ~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~~ 455 (1076)
++++|+++||++||+|++|||+++.+|++|.+.+++.+ +.+.|++.|-
T Consensus 399 ~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~--~~v~l~v~R~ 446 (455)
T PRK10139 399 KGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP--AIIALQIVRG 446 (455)
T ss_pred CCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC--CeEEEEEEEC
Confidence 89999999999999999999999999999999998843 4666666554
No 3
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=4.4e-52 Score=489.10 Aligned_cols=384 Identities=25% Similarity=0.377 Sum_probs=332.6
Q ss_pred hHHHHHHHhCCceEEEEEeeee---------------ccCC----------CCCCCcEEEEEEEeCCCcEEEeCccccCC
Q 001444 37 DWRKALNKVVPAVVVLRTTACR---------------AFDT----------EAAGASYATGFVVDKRRGIILTNRHVVKP 91 (1076)
Q Consensus 37 ~~~~~~~~v~~svV~I~~~~~~---------------~~d~----------~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~ 91 (1076)
++.++++++.||||.|.+.... .|.. .....+.||||+|++ +||||||+||+.
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~-~G~IlTn~Hvv~- 79 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISA-DGYILTNNHVVD- 79 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECC-CCEEEEcHHHcC-
Confidence 4789999999999999986421 1211 012457899999998 699999999999
Q ss_pred CCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEcCC-CCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEE
Q 001444 92 GPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPS-AIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLAR 170 (1076)
Q Consensus 92 ~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~~-~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~ 170 (1076)
++..+.|++.|++.++|++++.|+.+||||||++.. .++++.+.. ++.+++||+|+++|||++...+++.|+|++
T Consensus 80 ~~~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~~~~~~~~l~~----~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~ 155 (428)
T TIGR02037 80 GADEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAKKNLPVIKLGD----SDKLRVGDWVLAIGNPFGLGQTVTSGIVSA 155 (428)
T ss_pred CCCeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCCCCceEEEccC----CCCCCCCCEEEEEECCCcCCCcEEEEEEEe
Confidence 678899999999999999999999999999999865 566666654 678999999999999999999999999999
Q ss_pred ecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecccC----CCCCccccCHHHHHHHHHHHHhcC
Q 001444 171 LDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKS----SSASAFFLPLERVVRALRFLQERR 246 (1076)
Q Consensus 171 ~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~----~~~~~falP~~~i~~~l~~l~~~~ 246 (1076)
..+... ....|.+ +||+++++++|+|||||||.+|+||||+++... ..+.+|+||++.+++++++|++++
T Consensus 156 ~~~~~~--~~~~~~~----~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g 229 (428)
T TIGR02037 156 LGRSGL--GIGDYEN----FIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGG 229 (428)
T ss_pred cccCcc--CCCCccc----eEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcC
Confidence 987632 1223333 599999999999999999999999999977543 256899999999999999999998
Q ss_pred CccccccceeccCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEEC
Q 001444 247 DCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVN 325 (1076)
Q Consensus 247 ~~~~~~~~~~~~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~Vn 325 (1076)
.+.|+|||+.++.++.+.++.||++ ...|++| ..|.++|||++ ||++||+|++||
T Consensus 230 ----------~~~~~~lGi~~~~~~~~~~~~lgl~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Vn 285 (428)
T TIGR02037 230 ----------KVQRGWLGVTIQEVTSDLAKSLGLE-------------KQRGALV-AQVLPGSPAEKAGLKAGDVILSVN 285 (428)
T ss_pred ----------cCcCCcCceEeecCCHHHHHHcCCC-------------CCCceEE-EEccCCCChHHcCCCCCCEEEEEC
Confidence 7889999999999999999999997 4578777 79999999999 999999999999
Q ss_pred CEEeCChhHHHHHHh-cCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcccccCceEEecCCHHHHhccCCCC--CeEE
Q 001444 326 GEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPC--GLVY 402 (1076)
Q Consensus 326 G~~v~~~~~l~~~l~-~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~l~~~~~~~~~~~~--~gv~ 402 (1076)
|+++.++.++..++. ..+|+.++++|.|+|+..++++++..++.....+...+.|+.+++++....+.++++. .|++
T Consensus 286 g~~i~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~ 365 (428)
T TIGR02037 286 GKPISSFADLRRAIGTLKPGKKVTLGILRKGKEKTITVTLGASPEEQASSSNPFLGLTVANLSPEIRKELRLKGDVKGVV 365 (428)
T ss_pred CEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEECcCCCccccccccccceEEecCCHHHHHHcCCCcCcCceE
Confidence 999999999998884 4788999999999999999999988766544455667899999999998888888875 6999
Q ss_pred EEc--CCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEecc
Q 001444 403 VAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (1076)
Q Consensus 403 v~~--~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~~~ 456 (1076)
|.+ ++|+|+++||++||+|++|||+++.++++|.+++++.+.++.+.|++.|-.
T Consensus 366 V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g 421 (428)
T TIGR02037 366 VTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGG 421 (428)
T ss_pred EEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECC
Confidence 984 899999999999999999999999999999999999877888888887654
No 4
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=4.7e-52 Score=486.54 Aligned_cols=388 Identities=20% Similarity=0.266 Sum_probs=311.8
Q ss_pred cccceeeeeEEEEEEEcCccc----------ccCC------cccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEE
Q 001444 587 FAESVIEPTLVMFEVHVPPSC----------MIDG------VHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLS 650 (1076)
Q Consensus 587 ~~~~~~~~S~V~V~~~~~~~~----------~~dg------~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~ 650 (1076)
..++++.||||.|.+...... ++.. .....+.||||||+ +++|||||| +||+.++..+.|+
T Consensus 44 ~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~--~~~g~IlTn-~HVv~~a~~i~V~ 120 (455)
T PRK10139 44 PMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIID--AAKGYVLTN-NHVINQAQKISIQ 120 (455)
T ss_pred HHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEE--CCCCEEEeC-hHHhCCCCEEEEE
Confidence 378899999999987532110 1110 01224689999998 247999999 6677888999999
Q ss_pred eecCCeEEeEEEEEeeCCCcEEEEEEC-CCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccc
Q 001444 651 FAAFPIEIPGEVVFLHPVHNFALIAYD-PSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCA 729 (1076)
Q Consensus 651 ~~d~~~~~~a~vv~~dp~~dlAvlk~d-~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~ 729 (1076)
+.| +++++|++++.|+.+||||||++ +..+++ ++|+++..+++||+|++||||+++.. ++|
T Consensus 121 ~~d-g~~~~a~vvg~D~~~DlAvlkv~~~~~l~~--------~~lg~s~~~~~G~~V~aiG~P~g~~~-----tvt---- 182 (455)
T PRK10139 121 LND-GREFDAKLIGSDDQSDIALLQIQNPSKLTQ--------IAIADSDKLRVGDFAVAVGNPFGLGQ-----TAT---- 182 (455)
T ss_pred ECC-CCEEEEEEEEEcCCCCEEEEEecCCCCCce--------eEecCccccCCCCEEEEEecCCCCCC-----ceE----
Confidence 998 99999999999999999999996 456665 89999999999999999999999854 555
Q ss_pred eeecCCCCCCccc-ccceeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHH
Q 001444 730 ALNISSADCPRYR-AMNMEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLD 807 (1076)
Q Consensus 730 ~~~i~~~~~~~~~-~~~~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~ 807 (1076)
.|++++..+..+. ..+..+||+|++|++ ||||||+|.+|+||||+++.... + ....+. .||||++.++++++
T Consensus 183 ~GivS~~~r~~~~~~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~---~-~~~~gi--gfaIP~~~~~~v~~ 256 (455)
T PRK10139 183 SGIISALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAP---G-GGSVGI--GFAIPSNMARTLAQ 256 (455)
T ss_pred EEEEccccccccCCCCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcC---C-CCccce--EEEEEhHHHHHHHH
Confidence 6666666553221 123579999999976 89999999999999999985433 1 123344 48899999999999
Q ss_pred HHhcCCCCCcccccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccC
Q 001444 808 KIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQ 886 (1076)
Q Consensus 808 ~l~~~~~~~~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~ 886 (1076)
+|+++|+ +.|+ |||+.++.++...++.+|++.. .|++|..|.++|||+++ |++
T Consensus 257 ~l~~~g~--------v~r~-----~LGv~~~~l~~~~~~~lgl~~~-------------~Gv~V~~V~~~SpA~~AGL~~ 310 (455)
T PRK10139 257 QLIDFGE--------IKRG-----LLGIKGTEMSADIAKAFNLDVQ-------------RGAFVSEVLPNSGSAKAGVKA 310 (455)
T ss_pred HHhhcCc--------cccc-----ceeEEEEECCHHHHHhcCCCCC-------------CceEEEEECCCChHHHCCCCC
Confidence 9999987 8888 9999999999998888888654 67889999999999999 999
Q ss_pred CCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCCCcce---eeecCccccCCcHh
Q 001444 887 GDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNGTTRV---INWCGCIVQDPHPA 963 (1076)
Q Consensus 887 GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~---~~~~G~~~~~p~~~ 963 (1076)
||+|++|||++|.+|+++...+....++ +++.++|.|+|+++++.+++...+....... ..+.|+.++..
T Consensus 311 GDvIl~InG~~V~s~~dl~~~l~~~~~g----~~v~l~V~R~G~~~~l~v~~~~~~~~~~~~~~~~~~~~g~~l~~~--- 383 (455)
T PRK10139 311 GDIITSLNGKPLNSFAELRSRIATTEPG----TKVKLGLLRNGKPLEVEVTLDTSTSSSASAEMITPALQGATLSDG--- 383 (455)
T ss_pred CCEEEEECCEECCCHHHHHHHHHhcCCC----CEEEEEEEECCEEEEEEEEECCCCCcccccccccccccccEeccc---
Confidence 9999999999999999999998666665 7899999999999999998754432211111 12345544431
Q ss_pred HhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEE
Q 001444 964 VRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTL 1042 (1076)
Q Consensus 964 ~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tl 1042 (1076)
+ ++....|++|..|.++|||+++||++||+|++|||+++.+|++|.+++++. .+.+.|+ +.|+|+.+++.+
T Consensus 384 --~---~~~~~~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~--~~~v~l~-v~R~g~~~~~~~ 454 (455)
T PRK10139 384 --Q---LKDGTKGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAK--PAIIALQ-IVRGNESIYLLL 454 (455)
T ss_pred --c---cccCCCceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhC--CCeEEEE-EEECCEEEEEEe
Confidence 1 111224899999999999999999999999999999999999999999874 3678888 799999888876
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls the mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=3e-51 Score=482.01 Aligned_cols=395 Identities=24% Similarity=0.320 Sum_probs=327.0
Q ss_pred ccceeeeeEEEEEEEcCccc-------------ccCC----c------ccceeeEEEEEEEeeCCceEEEEeCccccCCC
Q 001444 588 AESVIEPTLVMFEVHVPPSC-------------MIDG----V------HSQHFFGTGVIIYHSQSMGLVVVDKNTVAISA 644 (1076)
Q Consensus 588 ~~~~~~~S~V~V~~~~~~~~-------------~~dg----~------~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~ 644 (1076)
.++++.||||.|.+...... .+.. . ....+.||||+|+ ++|||||| +||+.++
T Consensus 6 ~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~---~~G~IlTn-~Hvv~~~ 81 (428)
T TIGR02037 6 LVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIIS---ADGYILTN-NHVVDGA 81 (428)
T ss_pred HHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEEC---CCCEEEEc-HHHcCCC
Confidence 67889999999987542100 1111 0 1234679999999 67999999 6678889
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEECCC-CCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeee
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPS-SLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSI 723 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~d~~-~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~ 723 (1076)
+++.|++++ +..++|++++.|+.+||||||++.. .+++ ++|+++..+++||+|+++|||+++....+
T Consensus 82 ~~i~V~~~~-~~~~~a~vv~~d~~~DlAllkv~~~~~~~~--------~~l~~~~~~~~G~~v~aiG~p~g~~~~~t--- 149 (428)
T TIGR02037 82 DEITVTLSD-GREFKAKLVGKDPRTDIAVLKIDAKKNLPV--------IKLGDSDKLRVGDWVLAIGNPFGLGQTVT--- 149 (428)
T ss_pred CeEEEEeCC-CCEEEEEEEEecCCCCEEEEEecCCCCceE--------EEccCCCCCCCCCEEEEEECCCcCCCcEE---
Confidence 999999998 9999999999999999999999764 5554 99998888999999999999999855444
Q ss_pred EecccceeecCCCCCCcc-cccceeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhh
Q 001444 724 VTNPCAALNISSADCPRY-RAMNMEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYT 801 (1076)
Q Consensus 724 vt~i~~~~~i~~~~~~~~-~~~~~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~ 801 (1076)
.|++++..+... ...+..+||+|+++++ +|||||+|.+|+|+|||++.... +....++.||||++.
T Consensus 150 ------~G~vs~~~~~~~~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~------~g~~~g~~faiP~~~ 217 (428)
T TIGR02037 150 ------SGIVSALGRSGLGIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSP------SGGNVGIGFAIPSNM 217 (428)
T ss_pred ------EEEEEecccCccCCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcC------CCCccceEEEEEhHH
Confidence 444443333211 1123468999999965 79999999999999999884432 112334558999999
Q ss_pred HHHHHHHHhcCCCCCcccccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHh
Q 001444 802 ISRVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAE 881 (1076)
Q Consensus 802 v~~~l~~l~~~~~~~~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~ 881 (1076)
+++++++|+++++ +.|+ |||+.++.++...++.+|++.. .+++|.+|.++|||+
T Consensus 218 ~~~~~~~l~~~g~--------~~~~-----~lGi~~~~~~~~~~~~lgl~~~-------------~Gv~V~~V~~~spA~ 271 (428)
T TIGR02037 218 AKNVVDQLIEGGK--------VQRG-----WLGVTIQEVTSDLAKSLGLEKQ-------------RGALVAQVLPGSPAE 271 (428)
T ss_pred HHHHHHHHHhcCc--------CcCC-----cCceEeecCCHHHHHHcCCCCC-------------CceEEEEccCCCChH
Confidence 9999999999987 7788 9999999999999999999765 678899999999999
Q ss_pred hh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCCCcceeeecCccccCC
Q 001444 882 NM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNGTTRVINWCGCIVQDP 960 (1076)
Q Consensus 882 ~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~p 960 (1076)
++ |+.||+|++|||++|.++.++..++....++ ++++++|.|+|+.+++++++...+.....+...|+|+.++.+
T Consensus 272 ~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g----~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~lGi~~~~l 347 (428)
T TIGR02037 272 KAGLKAGDVILSVNGKPISSFADLRRAIGTLKPG----KKVTLGILRKGKEKTITVTLGASPEEQASSSNPFLGLTVANL 347 (428)
T ss_pred HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCC----CEEEEEEEECCEEEEEEEEECcCCCccccccccccceEEecC
Confidence 99 9999999999999999999999998766665 889999999999999999987665444445667899999999
Q ss_pred cHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEE
Q 001444 961 HPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVL 1040 (1076)
Q Consensus 961 ~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~ 1040 (1076)
+...++...++....|++|++|.++|||+++||++||+|++|||+++.++++|.+++++.+.++.++|+ +.|+|+.+.+
T Consensus 348 ~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~-v~R~g~~~~~ 426 (428)
T TIGR02037 348 SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALL-ILRGGATIFV 426 (428)
T ss_pred CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEE-EEECCEEEEE
Confidence 988877555554345999999999999999999999999999999999999999999987778999999 7899988766
Q ss_pred E
Q 001444 1041 T 1041 (1076)
Q Consensus 1041 t 1041 (1076)
+
T Consensus 427 ~ 427 (428)
T TIGR02037 427 T 427 (428)
T ss_pred E
Confidence 5
No 6
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=5e-50 Score=471.64 Aligned_cols=378 Identities=20% Similarity=0.324 Sum_probs=310.2
Q ss_pred chHHHHHHHhCCceEEEEEeeee-------------ccCC----------------------------CCCCCcEEEEEE
Q 001444 36 DDWRKALNKVVPAVVVLRTTACR-------------AFDT----------------------------EAAGASYATGFV 74 (1076)
Q Consensus 36 ~~~~~~~~~v~~svV~I~~~~~~-------------~~d~----------------------------~~~~~~~GTGfv 74 (1076)
.++.++++++.||||.|.+.... .|.. .....+.||||+
T Consensus 38 ~~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~i 117 (473)
T PRK10942 38 PSLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVI 117 (473)
T ss_pred ccHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEE
Confidence 36999999999999999876521 0210 001246899999
Q ss_pred EeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEc-CCCCccccccCCCCCCcCCCCCCEEEEE
Q 001444 75 VDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYD-PSAIQFLNYDEIPLAPEAACVGLEIRVV 153 (1076)
Q Consensus 75 V~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~-~~~~~~~~~~~l~l~~~~~~~G~~V~~i 153 (1076)
|++++||||||+|||. +.+.+.|++.|+++|+|++++.|+.+||||||++ +..++++.++. ++.+++||+|+++
T Consensus 118 i~~~~G~IlTn~HVv~-~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~~l~~~~lg~----s~~l~~G~~V~ai 192 (473)
T PRK10942 118 IDADKGYVVTNNHVVD-NATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPKNLTAIKMAD----SDALRVGDYTVAI 192 (473)
T ss_pred EECCCCEEEeChhhcC-CCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCCCCceeEecC----ccccCCCCEEEEE
Confidence 9975699999999999 6778999999999999999999999999999996 45566666655 7889999999999
Q ss_pred ecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecccCC----CCCccc
Q 001444 154 GNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS----SASAFF 229 (1076)
Q Consensus 154 G~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~----~~~~fa 229 (1076)
|||++...+++.|+|+++.+.... ...|.+ +||+|+++++|||||||+|.+|+||||+++.... .+.+|+
T Consensus 193 G~P~g~~~tvt~GiVs~~~r~~~~--~~~~~~----~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfa 266 (473)
T PRK10942 193 GNPYGLGETVTSGIVSALGRSGLN--VENYEN----FIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFA 266 (473)
T ss_pred cCCCCCCcceeEEEEEEeecccCC--cccccc----eEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEE
Confidence 999999999999999999875321 123333 5999999999999999999999999999875432 468999
Q ss_pred cCHHHHHHHHHHHHhcCCccccccceeccCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCC
Q 001444 230 LPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGG 309 (1076)
Q Consensus 230 lP~~~i~~~l~~l~~~~~~~~~~~~~~~~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~s 309 (1076)
||++.+++++++|.+++ .+.|||||+.++.++.+.++.|||+ ...|++| ..|.++|
T Consensus 267 IP~~~~~~v~~~l~~~g----------~v~rg~lGv~~~~l~~~~a~~~~l~-------------~~~GvlV-~~V~~~S 322 (473)
T PRK10942 267 IPSNMVKNLTSQMVEYG----------QVKRGELGIMGTELNSELAKAMKVD-------------AQRGAFV-SQVLPNS 322 (473)
T ss_pred EEHHHHHHHHHHHHhcc----------ccccceeeeEeeecCHHHHHhcCCC-------------CCCceEE-EEECCCC
Confidence 99999999999999988 7899999999999999999989987 4689888 6999999
Q ss_pred cccc-CCCCCCEEEEECCEEeCChhHHHHHHh-cCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcccccCceEEecCC
Q 001444 310 PAHL-RLEPGDVLVRVNGEVITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLS 387 (1076)
Q Consensus 310 pA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~-~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~l~ 387 (1076)
||++ ||++||+|++|||++|.++.++...+. ..+|+.+.++|.|+|+.+++.+++.........+...++|+....+.
T Consensus 323 pA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~~~~~~~~~lGl~g~~l~ 402 (473)
T PRK10942 323 SAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQNQVDSSNIFNGIEGAELS 402 (473)
T ss_pred hHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCcccccccccccccceeeecc
Confidence 9999 999999999999999999999998884 47889999999999999999988765422111121223444333333
Q ss_pred HHHHhccCCCCCeEEEEc--CCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEecc
Q 001444 388 YQQARNFRFPCGLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYT 456 (1076)
Q Consensus 388 ~~~~~~~~~~~~gv~v~~--~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~~~ 456 (1076)
... ...|++|.+ ++|+|+++||++||+|++|||++|.++++|.+++++.+ +.+.|+++|-.
T Consensus 403 ~~~------~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~--~~v~l~V~R~g 465 (473)
T PRK10942 403 NKG------GDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP--SVLALNIQRGD 465 (473)
T ss_pred ccc------CCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC--CeEEEEEEECC
Confidence 210 114788874 89999999999999999999999999999999999833 56666666543
No 7
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.5e-49 Score=467.62 Aligned_cols=359 Identities=19% Similarity=0.254 Sum_probs=293.7
Q ss_pred eeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEEEEEeeCCCcEEEEEEC-CCCCCcccccceeeeec
Q 001444 616 HFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYD-PSSLGVAGASVVRAAEL 694 (1076)
Q Consensus 616 ~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~d-~~~~~~~~~~~v~~~~l 694 (1076)
.+.||||||+ .++|||||| +||+.+++++.|+++| +++++|+|++.|+.+||||||++ +..+++ ++|
T Consensus 110 ~~~GSG~ii~--~~~G~IlTn-~HVv~~a~~i~V~~~d-g~~~~a~vv~~D~~~DlAvlki~~~~~l~~--------~~l 177 (473)
T PRK10942 110 MALGSGVIID--ADKGYVVTN-NHVVDNATKIKVQLSD-GRKFDAKVVGKDPRSDIALIQLQNPKNLTA--------IKM 177 (473)
T ss_pred cceEEEEEEE--CCCCEEEeC-hhhcCCCCEEEEEECC-CCEEEEEEEEecCCCCEEEEEecCCCCCce--------eEe
Confidence 4679999999 246999999 5677888999999998 99999999999999999999995 556665 899
Q ss_pred cCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcccc-cceeEEEEecccCC-CcCceEECCCceEEE
Q 001444 695 LPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRA-MNMEVIELDTDFGS-TFSGVLTDEHGRVQA 772 (1076)
Q Consensus 695 ~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~~-~~~~~I~~d~~ig~-~sGGpL~d~~G~VvG 772 (1076)
+++..+++||+|++||||+++....+ .|+|++..+..+.. .+..+||+|+++++ ||||||+|.+|+|||
T Consensus 178 g~s~~l~~G~~V~aiG~P~g~~~tvt---------~GiVs~~~r~~~~~~~~~~~iqtda~i~~GnSGGpL~n~~GeviG 248 (473)
T PRK10942 178 ADSDALRVGDYTVAIGNPYGLGETVT---------SGIVSALGRSGLNVENYENFIQTDAAINRGNSGGALVNLNGELIG 248 (473)
T ss_pred cCccccCCCCEEEEEcCCCCCCccee---------EEEEEEeecccCCcccccceEEeccccCCCCCcCccCCCCCeEEE
Confidence 99989999999999999999855433 44444433321111 12478999999966 799999999999999
Q ss_pred EEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCCcccccccccCCCceeeeeeEEEEcChHhHHHcCCCH
Q 001444 773 IWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSD 852 (1076)
Q Consensus 773 i~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~ 852 (1076)
||+++... +....++.||||++.+++++++|+++++ +.|+ |||+.++.++...++.+|++.
T Consensus 249 I~t~~~~~------~g~~~g~gfaIP~~~~~~v~~~l~~~g~--------v~rg-----~lGv~~~~l~~~~a~~~~l~~ 309 (473)
T PRK10942 249 INTAILAP------DGGNIGIGFAIPSNMVKNLTSQMVEYGQ--------VKRG-----ELGIMGTELNSELAKAMKVDA 309 (473)
T ss_pred EEEEEEcC------CCCcccEEEEEEHHHHHHHHHHHHhccc--------cccc-----eeeeEeeecCHHHHHhcCCCC
Confidence 99985543 1122345589999999999999999987 8888 999999999998888888875
Q ss_pred HHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEE
Q 001444 853 DWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGRE 931 (1076)
Q Consensus 853 ~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~ 931 (1076)
. .|++|.+|.++|||+++ |+.||+|++|||++|.+++++...+....++ +++.++|.|+|+.
T Consensus 310 ~-------------~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g----~~v~l~v~R~G~~ 372 (473)
T PRK10942 310 Q-------------RGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVG----SKLTLGLLRDGKP 372 (473)
T ss_pred C-------------CceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhcCCC----CEEEEEEEECCeE
Confidence 5 67889999999999999 9999999999999999999999998766665 7899999999999
Q ss_pred EEEEEeccccCCCCCcceeeecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHH
Q 001444 932 IELQVGTDVRDGNGTTRVINWCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLE 1011 (1076)
Q Consensus 932 ~~~~v~l~~~~~~~~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~ 1011 (1076)
+++.+++.............++|+........ ....+++|++|.++|||+++||++||+|++|||++|.+++
T Consensus 373 ~~v~v~l~~~~~~~~~~~~~~lGl~g~~l~~~--------~~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~ 444 (473)
T PRK10942 373 VNVNVELQQSSQNQVDSSNIFNGIEGAELSNK--------GGDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIA 444 (473)
T ss_pred EEEEEEeCcCcccccccccccccceeeecccc--------cCCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHH
Confidence 99999886643222222233455544332210 0113899999999999999999999999999999999999
Q ss_pred HHHHHHHhCCCCCeEEEEEEEeCCeEEEEEE
Q 001444 1012 AFVNVTKEIEHGEFVRVRTVHLNGKPRVLTL 1042 (1076)
Q Consensus 1012 ~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tl 1042 (1076)
+|.+++++. ++.+.|+ +.|+|..+++.+
T Consensus 445 dl~~~l~~~--~~~v~l~-V~R~g~~~~v~~ 472 (473)
T PRK10942 445 ELRKILDSK--PSVLALN-IQRGDSSIYLLM 472 (473)
T ss_pred HHHHHHHhC--CCeEEEE-EEECCEEEEEEe
Confidence 999999883 3678888 789999888765
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=3.2e-49 Score=443.07 Aligned_cols=437 Identities=31% Similarity=0.439 Sum_probs=384.7
Q ss_pred ccceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEEEEEeeC
Q 001444 588 AESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLHP 667 (1076)
Q Consensus 588 ~~~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~vv~~dp 667 (1076)
..+.+.+|+|.|+.... ..+|........|+|||+| +..||||||||+|..+...-.+.|.+ -.+++--.++.||
T Consensus 57 ~ia~VvksvVsI~~S~v--~~fdtesag~~~atgfvvd--~~~gyiLtnrhvv~pgP~va~avf~n-~ee~ei~pvyrDp 131 (955)
T KOG1421|consen 57 TIANVVKSVVSIRFSAV--RAFDTESAGESEATGFVVD--KKLGYILTNRHVVAPGPFVASAVFDN-HEEIEIYPVYRDP 131 (955)
T ss_pred hhhhhcccEEEEEehhe--eecccccccccceeEEEEe--cccceEEEeccccCCCCceeEEEecc-cccCCcccccCCc
Confidence 67779999999997665 3678778888999999999 58999999999999999888999998 8888889999999
Q ss_pred CCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCC--Ccc----
Q 001444 668 VHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADC--PRY---- 741 (1076)
Q Consensus 668 ~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~--~~~---- 741 (1076)
.||+.+++|||+.... ..++.+.++++. .++|-++..+||..+.-+. .. .|.++..++ |.|
T Consensus 132 VhdfGf~r~dps~ir~---s~vt~i~lap~~-akvgseirvvgNDagEkls----Il-----agflSrldr~apdyg~~~ 198 (955)
T KOG1421|consen 132 VHDFGFFRYDPSTIRF---SIVTEICLAPEL-AKVGSEIRVVGNDAGEKLS----IL-----AGFLSRLDRNAPDYGEDT 198 (955)
T ss_pred hhhcceeecChhhcce---eeeeccccCccc-cccCCceEEecCCccceEE----ee-----hhhhhhccCCCccccccc
Confidence 9999999999997765 677788888866 6999999999997776432 22 334443322 444
Q ss_pred -cccceeEEEEeccc-CCCcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCCccc
Q 001444 742 -RAMNMEVIELDTDF-GSTFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLL 819 (1076)
Q Consensus 742 -~~~~~~~I~~d~~i-g~~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~~~~ 819 (1076)
...|..++|.-++- |++||.|++|-.|..|+++..- +...... |++|++.+++.|.-|+++..
T Consensus 199 yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg--------~~ssas~--ffLpLdrV~RaL~clq~n~P----- 263 (955)
T KOG1421|consen 199 YNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGG--------SISSASD--FFLPLDRVVRALRCLQNNTP----- 263 (955)
T ss_pred cccccceeeeehhcCCCCCCCCceecccceEEeeecCC--------ccccccc--ceeeccchhhhhhhhhcCCC-----
Confidence 33567888887665 4467999999999999997652 2334556 56999999999999998765
Q ss_pred ccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhhccCCCEEEEECCEEcC
Q 001444 820 INGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENMLEQGDMMLAINKQPVT 899 (1076)
Q Consensus 820 ~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~aL~~GDiIlsVnG~~V~ 899 (1076)
++|+ +|.++|..-..+++|.+|++.||++.+..++|.+++.++|..|.++|||++.|++||++++||+.-+.
T Consensus 264 ---ItRG-----tLqvefl~k~~de~rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~ 335 (955)
T KOG1421|consen 264 ---ITRG-----TLQVEFLHKLFDECRRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLN 335 (955)
T ss_pred ---cccc-----eEEEEEehhhhHHHHhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehH
Confidence 8888 99999999999999999999999999999999999999999999999999999999999999988888
Q ss_pred ChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCCCcceeeecCccccCCcHhHhhcCCCCCCCCcEEE
Q 001444 900 CFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNGTTRVINWCGCIVQDPHPAVRALGFLPEEGHGVYV 979 (1076)
Q Consensus 900 ~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V 979 (1076)
++..+...+ ..+ .++.+.|+|.|+|++.++++++..+++..+.|++.|||+++|.|+.+++.+..+|- +|+||
T Consensus 336 df~~l~~iL---Deg--vgk~l~LtI~Rggqelel~vtvqdlh~itp~R~levcGav~hdlsyq~ar~y~lP~--~GvyV 408 (955)
T KOG1421|consen 336 DFEALEQIL---DEG--VGKNLELTIQRGGQELELTVTVQDLHGITPDRFLEVCGAVFHDLSYQLARLYALPV--EGVYV 408 (955)
T ss_pred HHHHHHHHH---hhc--cCceEEEEEEeCCEEEEEEEEeccccCCCCceEEEEcceEecCCCHHHHhhccccc--CcEEE
Confidence 777777666 233 45889999999999999999999999999999999999999999999999988886 59999
Q ss_pred EEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeC--CeEEEEEEEeCCc-cCcceEEEE
Q 001444 980 ARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLN--GKPRVLTLKQDLH-YWPTWELIF 1056 (1076)
Q Consensus 980 ~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~--g~~~~~tlk~~~~-y~pt~e~~~ 1056 (1076)
++.. ||++++++++ +-+|.+||++++++|++|.++++++++|+.|.++++..+ +++++++++.|.| |||+|++.|
T Consensus 409 a~~~-gsf~~~~~~y-~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~vry~hl~dkh~p~v~~v~iDrHwy~p~~~~tr 486 (955)
T KOG1421|consen 409 ASPG-GSFRHRGPRY-GQIIDSVANKPTPDLDAFIDVLKELPDGARVPVRYHHLTDKHSPRVTTVTIDRHWYWPFREYTR 486 (955)
T ss_pred ccCC-CCccccCCcc-eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeEEEEEecCCCCceEEEEEEeccccccceeeee
Confidence 9999 9999999999 999999999999999999999999999999999988887 7899999999999 999999999
Q ss_pred cCCCCCeEEEEecccCCC
Q 001444 1057 DPDTALWRRKSVKALNSS 1074 (1076)
Q Consensus 1057 ~~~~~~w~~~~~~~~~~~ 1074 (1076)
|++++.|++...+.|||.
T Consensus 487 ndetglWdrk~L~~pqPa 504 (955)
T KOG1421|consen 487 NDETGLWDRKNLKDPQPA 504 (955)
T ss_pred CCCcccccccccCCCCcc
Confidence 999999999999999986
No 9
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=1.1e-43 Score=404.83 Aligned_cols=300 Identities=19% Similarity=0.296 Sum_probs=257.5
Q ss_pred ccCcchHHHHHHHhCCceEEEEEeeeec-cCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEE
Q 001444 32 VATADDWRKALNKVVPAVVVLRTTACRA-FDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYP 110 (1076)
Q Consensus 32 ~~~~~~~~~~~~~v~~svV~I~~~~~~~-~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~v 110 (1076)
.+.+.++.++++++.||||.|+...... ........+.||||+|++ +||||||+|||. +...+.++|.|++.++|++
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~-~G~IlTn~HVV~-~~~~i~V~~~dg~~~~a~v 118 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSK-EGYILTNYHVIK-KADQIVVALQDGRKFEAEL 118 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeC-CeEEEecccEeC-CCCEEEEEECCCCEEEEEE
Confidence 3445589999999999999999864321 111223457899999997 799999999999 5678999999999999999
Q ss_pred EEecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceee
Q 001444 111 IYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFY 190 (1076)
Q Consensus 111 v~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~ 190 (1076)
+++|+.+||||||++...++.+++.. +..+++||+|+++|||++...+++.|+|+++.+... ...++ ..+
T Consensus 119 v~~d~~~DlAvlkv~~~~~~~~~l~~----s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~--~~~~~----~~~ 188 (351)
T TIGR02038 119 VGSDPLTDLAVLKIEGDNLPTIPVNL----DRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGL--SSVGR----QNF 188 (351)
T ss_pred EEecCCCCEEEEEecCCCCceEeccC----cCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCccc--CCCCc----ceE
Confidence 99999999999999987766665554 678999999999999999999999999999988642 11122 346
Q ss_pred EEEeeccCCCCCCCceecCCCcEEEEeecccC------CCCCccccCHHHHHHHHHHHHhcCCccccccceeccCCCccc
Q 001444 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSKS------SSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQ 264 (1076)
Q Consensus 191 i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~------~~~~~falP~~~i~~~l~~l~~~~~~~~~~~~~~~~~rg~lg 264 (1076)
||+|+.+++|||||||||.+|+||||+++... ..+.+|+||++.+++++++|++++ .+.|+|||
T Consensus 189 iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g----------~~~r~~lG 258 (351)
T TIGR02038 189 IQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDG----------RVIRGYIG 258 (351)
T ss_pred EEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcC----------cccceEee
Confidence 99999999999999999999999999976432 146899999999999999999888 78899999
Q ss_pred eEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhc-C
Q 001444 265 VTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-G 342 (1076)
Q Consensus 265 ~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~ 342 (1076)
+.++.+....++.||++ ...|++| ..|.++|||++ ||++||+|++|||++|.++.++...+.. .
T Consensus 259 v~~~~~~~~~~~~lgl~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~ 324 (351)
T TIGR02038 259 VSGEDINSVVAQGLGLP-------------DLRGIVI-TGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETR 324 (351)
T ss_pred eEEEECCHHHHHhcCCC-------------ccccceE-eecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcC
Confidence 99999988888889987 4578887 69999999999 9999999999999999999999988854 7
Q ss_pred CCCeEEEEEEeCCeEEEEEEEeccC
Q 001444 343 VDKNIELLIERGGISMTVNLVVQDL 367 (1076)
Q Consensus 343 ~g~~v~l~v~R~g~~~~~~v~l~~~ 367 (1076)
+|+.+.++|+|+|+.+++.+++..+
T Consensus 325 ~g~~v~l~v~R~g~~~~~~v~l~~~ 349 (351)
T TIGR02038 325 PGSKVMVTVLRQGKQLELPVTIDEK 349 (351)
T ss_pred CCCEEEEEEEECCEEEEEEEEecCC
Confidence 8999999999999999998888654
No 10
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=5.4e-43 Score=398.73 Aligned_cols=300 Identities=18% Similarity=0.302 Sum_probs=255.4
Q ss_pred cCcchHHHHHHHhCCceEEEEEeeeeccC-CCCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEE
Q 001444 33 ATADDWRKALNKVVPAVVVLRTTACRAFD-TEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPI 111 (1076)
Q Consensus 33 ~~~~~~~~~~~~v~~svV~I~~~~~~~~d-~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv 111 (1076)
....++.++++++.||||.|.......+. ......+.||||+|++ +||||||+|||. +...+.|++.|++.++|+++
T Consensus 42 ~~~~~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~-~G~IlTn~HVv~-~a~~i~V~~~dg~~~~a~vv 119 (353)
T PRK10898 42 ETPASYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQ-RGYILTNKHVIN-DADQIIVALQDGRVFEALLV 119 (353)
T ss_pred cccchHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeC-CeEEEecccEeC-CCCEEEEEeCCCCEEEEEEE
Confidence 34458999999999999999987643322 2233457899999997 799999999999 66789999999999999999
Q ss_pred EecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeE
Q 001444 112 YRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYM 191 (1076)
Q Consensus 112 ~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i 191 (1076)
++|+.+||||||++...++++++.+ +..+++||+|+++|||++...+++.|+|++..+.... ..++ ..+|
T Consensus 120 ~~d~~~DlAvl~v~~~~l~~~~l~~----~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~--~~~~----~~~i 189 (353)
T PRK10898 120 GSDSLTDLAVLKINATNLPVIPINP----KRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS--PTGR----QNFL 189 (353)
T ss_pred EEcCCCCEEEEEEcCCCCCeeeccC----cCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC--Cccc----cceE
Confidence 9999999999999987777666654 5678999999999999999999999999998875321 1122 2469
Q ss_pred EEeeccCCCCCCCceecCCCcEEEEeecccC-------CCCCccccCHHHHHHHHHHHHhcCCccccccceeccCCCccc
Q 001444 192 QAASGTKGGSSGSPVIDWQGRAVALNAGSKS-------SSASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTLQ 264 (1076)
Q Consensus 192 ~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~-------~~~~~falP~~~i~~~l~~l~~~~~~~~~~~~~~~~~rg~lg 264 (1076)
|+|+.+++|||||||+|.+|+||||+++... ..+.+|+||++.+++++++|.+++ .+.|+|||
T Consensus 190 qtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G----------~~~~~~lG 259 (353)
T PRK10898 190 QTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDG----------RVIRGYIG 259 (353)
T ss_pred EeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcC----------cccccccc
Confidence 9999999999999999999999999976432 146899999999999999999888 78899999
Q ss_pred eEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhc-C
Q 001444 265 VTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-G 342 (1076)
Q Consensus 265 ~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~ 342 (1076)
+..+.++......+|++ ...|++| ..|.++|||++ ||++||+|++|||++|.++.++...+.. .
T Consensus 260 i~~~~~~~~~~~~~~~~-------------~~~Gv~V-~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~ 325 (353)
T PRK10898 260 IGGREIAPLHAQGGGID-------------QLQGIVV-NEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIR 325 (353)
T ss_pred eEEEECCHHHHHhcCCC-------------CCCeEEE-EEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcC
Confidence 99998776666666665 4578887 69999999999 9999999999999999999999888744 8
Q ss_pred CCCeEEEEEEeCCeEEEEEEEeccCC
Q 001444 343 VDKNIELLIERGGISMTVNLVVQDLH 368 (1076)
Q Consensus 343 ~g~~v~l~v~R~g~~~~~~v~l~~~~ 368 (1076)
+|+.+.+++.|+|+.+++.+++..++
T Consensus 326 ~g~~v~l~v~R~g~~~~~~v~l~~~p 351 (353)
T PRK10898 326 PGSVIPVVVMRDDKQLTLQVTIQEYP 351 (353)
T ss_pred CCCEEEEEEEECCEEEEEEEEeccCC
Confidence 89999999999999999999887654
No 11
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=1.2e-40 Score=379.87 Aligned_cols=298 Identities=20% Similarity=0.243 Sum_probs=242.9
Q ss_pred cccceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEEEEEee
Q 001444 587 FAESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLH 666 (1076)
Q Consensus 587 ~~~~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~vv~~d 666 (1076)
..++++.||+|.|.+....... .......+.||||||+ ++|||||| +||+.++..+.|+|.| ++.++|++++.|
T Consensus 49 ~~~~~~~psVV~I~~~~~~~~~-~~~~~~~~~GSG~vi~---~~G~IlTn-~HVV~~~~~i~V~~~d-g~~~~a~vv~~d 122 (351)
T TIGR02038 49 KAVRRAAPAVVNIYNRSISQNS-LNQLSIQGLGSGVIMS---KEGYILTN-YHVIKKADQIVVALQD-GRKFEAELVGSD 122 (351)
T ss_pred HHHHhcCCcEEEEEeEeccccc-cccccccceEEEEEEe---CCeEEEec-ccEeCCCCEEEEEECC-CCEEEEEEEEec
Confidence 3788999999999876542111 1112234679999999 78999999 5566778889999998 999999999999
Q ss_pred CCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcccc-cc
Q 001444 667 PVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRA-MN 745 (1076)
Q Consensus 667 p~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~~-~~ 745 (1076)
+.+||||||++...++. ++++++..+++||+|+++|||+++... ++ .|+|++..+..+.. ..
T Consensus 123 ~~~DlAvlkv~~~~~~~--------~~l~~s~~~~~G~~V~aiG~P~~~~~s-----~t----~GiIs~~~r~~~~~~~~ 185 (351)
T TIGR02038 123 PLTDLAVLKIEGDNLPT--------IPVNLDRPPHVGDVVLAIGNPYNLGQT-----IT----QGIISATGRNGLSSVGR 185 (351)
T ss_pred CCCCEEEEEecCCCCce--------EeccCcCccCCCCEEEEEeCCCCCCCc-----EE----EEEEEeccCcccCCCCc
Confidence 99999999998877765 889888889999999999999998654 33 45554443322211 23
Q ss_pred eeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCCcccccccc
Q 001444 746 MEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVK 824 (1076)
Q Consensus 746 ~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~~~~~~~v~ 824 (1076)
.++||+|+++++ ||||||+|.+|+|||||++.... + ......++.||||++.+++++++|+++++ +.
T Consensus 186 ~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~---~-~~~~~~g~~faIP~~~~~~vl~~l~~~g~--------~~ 253 (351)
T TIGR02038 186 QNFIQTDAAINAGNSGGALINTNGELVGINTASFQK---G-GDEGGEGINFAIPIKLAHKIMGKIIRDGR--------VI 253 (351)
T ss_pred ceEEEECCccCCCCCcceEECCCCeEEEEEeeeecc---c-CCCCccceEEEecHHHHHHHHHHHhhcCc--------cc
Confidence 578999999976 79999999999999999874322 0 11223445589999999999999999886 67
Q ss_pred cCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhH
Q 001444 825 RPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHD 903 (1076)
Q Consensus 825 r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~d 903 (1076)
|+ |||+.+++++...++.+|++.. ++++|.+|.++|||+++ |++||+|++|||++|.++.+
T Consensus 254 r~-----~lGv~~~~~~~~~~~~lgl~~~-------------~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d 315 (351)
T TIGR02038 254 RG-----YIGVSGEDINSVVAQGLGLPDL-------------RGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE 315 (351)
T ss_pred ce-----EeeeEEEECCHHHHHhcCCCcc-------------ccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH
Confidence 77 9999999999988888998754 67889999999999999 99999999999999999999
Q ss_pred HHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEecccc
Q 001444 904 IENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVR 941 (1076)
Q Consensus 904 l~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~ 941 (1076)
+...+...+++ +++.++|.|+|+.+++.+++.+.
T Consensus 316 l~~~l~~~~~g----~~v~l~v~R~g~~~~~~v~l~~~ 349 (351)
T TIGR02038 316 LMDRIAETRPG----SKVMVTVLRQGKQLELPVTIDEK 349 (351)
T ss_pred HHHHHHhcCCC----CEEEEEEEECCEEEEEEEEecCC
Confidence 99998766665 78999999999999999987643
No 12
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=9.7e-40 Score=372.04 Aligned_cols=300 Identities=16% Similarity=0.205 Sum_probs=240.2
Q ss_pred cccceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEEEEEee
Q 001444 587 FAESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEVVFLH 666 (1076)
Q Consensus 587 ~~~~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~vv~~d 666 (1076)
..++++.+|+|.|.+.... +.........+.||||||+ ++|||||| +||+.++.++.|++.| +..++|++++.|
T Consensus 49 ~~~~~~~psvV~v~~~~~~-~~~~~~~~~~~~GSGfvi~---~~G~IlTn-~HVv~~a~~i~V~~~d-g~~~~a~vv~~d 122 (353)
T PRK10898 49 QAVRRAAPAVVNVYNRSLN-STSHNQLEIRTLGSGVIMD---QRGYILTN-KHVINDADQIIVALQD-GRVFEALLVGSD 122 (353)
T ss_pred HHHHHhCCcEEEEEeEecc-ccCcccccccceeeEEEEe---CCeEEEec-ccEeCCCCEEEEEeCC-CCEEEEEEEEEc
Confidence 3788999999999986642 1111222334789999999 78999999 5566788899999998 999999999999
Q ss_pred CCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCccc-ccc
Q 001444 667 PVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYR-AMN 745 (1076)
Q Consensus 667 p~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~-~~~ 745 (1076)
|.+||||||++...++. ++|+++..+++||+|+++|||+++....+ .|++++..+..+. ...
T Consensus 123 ~~~DlAvl~v~~~~l~~--------~~l~~~~~~~~G~~V~aiG~P~g~~~~~t---------~Giis~~~r~~~~~~~~ 185 (353)
T PRK10898 123 SLTDLAVLKINATNLPV--------IPINPKRVPHIGDVVLAIGNPYNLGQTIT---------QGIISATGRIGLSPTGR 185 (353)
T ss_pred CCCCEEEEEEcCCCCCe--------eeccCcCcCCCCCEEEEEeCCCCcCCCcc---------eeEEEeccccccCCccc
Confidence 99999999998877776 88988888999999999999998765433 4444443332111 112
Q ss_pred eeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCCcccccccc
Q 001444 746 MEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSLLINGVK 824 (1076)
Q Consensus 746 ~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~~~~~~~v~ 824 (1076)
.++||+|+++++ ||||||+|.+|+||||+++..... + ......++.||||++.+++++++|+++|+ +.
T Consensus 186 ~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~--~-~~~~~~g~~faIP~~~~~~~~~~l~~~G~--------~~ 254 (353)
T PRK10898 186 QNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKS--N-DGETPEGIGFAIPTQLATKIMDKLIRDGR--------VI 254 (353)
T ss_pred cceEEeccccCCCCCcceEECCCCeEEEEEEEEeccc--C-CCCcccceEEEEchHHHHHHHHHHhhcCc--------cc
Confidence 478999999976 799999999999999998744330 0 01112345589999999999999999887 77
Q ss_pred cCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhH
Q 001444 825 RPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHD 903 (1076)
Q Consensus 825 r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~d 903 (1076)
|+ |||+..+.++...+..++++.. ++++|.+|.++|||+++ |++||+|++|||++|.++.+
T Consensus 255 ~~-----~lGi~~~~~~~~~~~~~~~~~~-------------~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~ 316 (353)
T PRK10898 255 RG-----YIGIGGREIAPLHAQGGGIDQL-------------QGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALE 316 (353)
T ss_pred cc-----ccceEEEECCHHHHHhcCCCCC-------------CeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHH
Confidence 88 9999999887666655555433 78899999999999999 99999999999999999999
Q ss_pred HHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccC
Q 001444 904 IENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRD 942 (1076)
Q Consensus 904 l~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~ 942 (1076)
+...+...+++ +++.++|.|+|+.+++.+++.+.+
T Consensus 317 l~~~l~~~~~g----~~v~l~v~R~g~~~~~~v~l~~~p 351 (353)
T PRK10898 317 TMDQVAEIRPG----SVIPVVVMRDDKQLTLQVTIQEYP 351 (353)
T ss_pred HHHHHHhcCCC----CEEEEEEEECCEEEEEEEEeccCC
Confidence 99888766665 789999999999999999886543
No 13
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6.5e-33 Score=318.35 Aligned_cols=295 Identities=24% Similarity=0.416 Sum_probs=255.0
Q ss_pred chHHHHHHHhCCceEEEEEeeeecc----CCCCC---CCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEE
Q 001444 36 DDWRKALNKVVPAVVVLRTTACRAF----DTEAA---GASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPV 108 (1076)
Q Consensus 36 ~~~~~~~~~v~~svV~I~~~~~~~~----d~~~~---~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a 108 (1076)
..+...++++.|+||.|........ ..... ..+.||||+++. +|||+||.||+.. +..+.+.+.|++++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~-~g~ivTn~hVi~~-a~~i~v~l~dg~~~~a 110 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISS-DGYIVTNNHVIAG-AEEITVTLADGREVPA 110 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcC-CeEEEecceecCC-cceEEEEeCCCCEEEE
Confidence 6999999999999999998754321 11110 148899999996 8999999999995 8899999999999999
Q ss_pred EEEEecCCCcEEEEEEcCCC-CccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccc
Q 001444 109 YPIYRDPVHDFGFFRYDPSA-IQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFN 187 (1076)
Q Consensus 109 ~vv~~d~~~DlAlLk~~~~~-~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~ 187 (1076)
++++.|+..|+|+||++... ++.+.+.. +..+++|++++++|+|++...+++.|+++.+.|. .+....+ .
T Consensus 111 ~~vg~d~~~dlavlki~~~~~~~~~~~~~----s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~--~v~~~~~---~ 181 (347)
T COG0265 111 KLVGKDPISDLAVLKIDGAGGLPVIALGD----SDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT--GVGSAGG---Y 181 (347)
T ss_pred EEEecCCccCEEEEEeccCCCCceeeccC----CCCcccCCEEEEecCCCCcccceeccEEeccccc--cccCccc---c
Confidence 99999999999999999765 66666666 7888999999999999999999999999999997 2222111 2
Q ss_pred eeeEEEeeccCCCCCCCceecCCCcEEEEeecccCCC----CCccccCHHHHHHHHHHHHhcCCccccccceeccCCCcc
Q 001444 188 TFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSSS----ASAFFLPLERVVRALRFLQERRDCNIHNWEAVSIPRGTL 263 (1076)
Q Consensus 188 ~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~~----~~~falP~~~i~~~l~~l~~~~~~~~~~~~~~~~~rg~l 263 (1076)
..+||+|+.+++|+||||++|.+|++|||++...... +.+|++|++.+++++.++...+ ++.|+++
T Consensus 182 ~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G----------~v~~~~l 251 (347)
T COG0265 182 VNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKG----------KVVRGYL 251 (347)
T ss_pred cchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcC----------Ccccccc
Confidence 3459999999999999999999999999998876553 4799999999999999999866 7899999
Q ss_pred ceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHH-hc
Q 001444 264 QVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-DD 341 (1076)
Q Consensus 264 g~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l-~~ 341 (1076)
|+.+..+..+.+ +|++ ...|.+| ..|.+++||++ |++.||+|+++||+++.+..++...+ ..
T Consensus 252 gv~~~~~~~~~~--~g~~-------------~~~G~~V-~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~ 315 (347)
T COG0265 252 GVIGEPLTADIA--LGLP-------------VAAGAVV-LGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASN 315 (347)
T ss_pred ceEEEEcccccc--cCCC-------------CCCceEE-EecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhcc
Confidence 999998887777 7766 5688776 79999999999 99999999999999999999999887 45
Q ss_pred CCCCeEEEEEEeCCeEEEEEEEeccC
Q 001444 342 GVDKNIELLIERGGISMTVNLVVQDL 367 (1076)
Q Consensus 342 ~~g~~v~l~v~R~g~~~~~~v~l~~~ 367 (1076)
.+|+.+.+.+.|+|+.+++.+++.++
T Consensus 316 ~~g~~v~~~~~r~g~~~~~~v~l~~~ 341 (347)
T COG0265 316 RPGDEVALKLLRGGKERELAVTLGDR 341 (347)
T ss_pred CCCCEEEEEEEECCEEEEEEEEecCc
Confidence 78999999999999999999998873
No 14
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=99.97 E-value=5.7e-30 Score=293.99 Aligned_cols=293 Identities=27% Similarity=0.356 Sum_probs=241.3
Q ss_pred ccceeeeeEEEEEEEcCccc--ccCC---cccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEEE
Q 001444 588 AESVIEPTLVMFEVHVPPSC--MIDG---VHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGEV 662 (1076)
Q Consensus 588 ~~~~~~~S~V~V~~~~~~~~--~~dg---~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~v 662 (1076)
..+++.+++|.+........ .+.. .....+.||||+++ +.|||+|| +||+.++..+.+++.| ++.++|++
T Consensus 38 ~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~---~~g~ivTn-~hVi~~a~~i~v~l~d-g~~~~a~~ 112 (347)
T COG0265 38 AVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIIS---SDGYIVTN-NHVIAGAEEITVTLAD-GREVPAKL 112 (347)
T ss_pred HHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEc---CCeEEEec-ceecCCcceEEEEeCC-CCEEEEEE
Confidence 67788999999987654211 0000 01115889999999 89999999 6666669999999988 99999999
Q ss_pred EEeeCCCcEEEEEECCCC-CCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcc
Q 001444 663 VFLHPVHNFALIAYDPSS-LGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRY 741 (1076)
Q Consensus 663 v~~dp~~dlAvlk~d~~~-~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~ 741 (1076)
++.|+.+|+|++|++... ++. +.++++..++.||+++++|+|+++ ..+|+ .|+++...+..+
T Consensus 113 vg~d~~~dlavlki~~~~~~~~--------~~~~~s~~l~vg~~v~aiGnp~g~-----~~tvt----~Givs~~~r~~v 175 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGGLPV--------IALGDSDKLRVGDVVVAIGNPFGL-----GQTVT----SGIVSALGRTGV 175 (347)
T ss_pred EecCCccCEEEEEeccCCCCce--------eeccCCCCcccCCEEEEecCCCCc-----cccee----ccEEeccccccc
Confidence 999999999999998654 665 899999999999999999999996 35666 667776666422
Q ss_pred c--ccceeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHhcCCCCCcc
Q 001444 742 R--AMNMEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKIISGASGPSL 818 (1076)
Q Consensus 742 ~--~~~~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~~~~~~~~~ 818 (1076)
. ..+.++||+|+++++ +||||++|.+|+++||+++..... + ...+++ ||||++.+++++.++...++
T Consensus 176 ~~~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~--~--~~~gig--faiP~~~~~~v~~~l~~~G~---- 245 (347)
T COG0265 176 GSAGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPS--G--GSSGIG--FAIPVNLVAPVLDELISKGK---- 245 (347)
T ss_pred cCcccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCC--C--CcceeE--EEecHHHHHHHHHHHHHcCC----
Confidence 1 125689999999977 899999999999999999854431 1 134555 88999999999999998665
Q ss_pred cccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEE
Q 001444 819 LINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQP 897 (1076)
Q Consensus 819 ~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~ 897 (1076)
+.|+ ++|+.+.+++...+ +|++.. .|.+|..|.+++||+++ ++.||+|+++||++
T Consensus 246 ----v~~~-----~lgv~~~~~~~~~~--~g~~~~-------------~G~~V~~v~~~spa~~agi~~Gdii~~vng~~ 301 (347)
T COG0265 246 ----VVRG-----YLGVIGEPLTADIA--LGLPVA-------------AGAVVLGVLPGSPAAKAGIKAGDIITAVNGKP 301 (347)
T ss_pred ----cccc-----ccceEEEEcccccc--cCCCCC-------------CceEEEecCCCChHHHcCCCCCCEEEEECCEE
Confidence 8888 89999998887666 665533 67889999999999999 99999999999999
Q ss_pred cCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccc
Q 001444 898 VTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDV 940 (1076)
Q Consensus 898 V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~ 940 (1076)
+.+..++...+....++ ..+.+++.|+|+++++.+++.+
T Consensus 302 v~~~~~l~~~v~~~~~g----~~v~~~~~r~g~~~~~~v~l~~ 340 (347)
T COG0265 302 VASLSDLVAAVASNRPG----DEVALKLLRGGKERELAVTLGD 340 (347)
T ss_pred ccCHHHHHHHHhccCCC----CEEEEEEEECCEEEEEEEEecC
Confidence 99999999999777765 8999999999999999999876
No 15
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.90 E-value=2.6e-22 Score=227.98 Aligned_cols=317 Identities=19% Similarity=0.244 Sum_probs=236.6
Q ss_pred CcchHHHHHHHhCCceEEEEEeee----eccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCCc----------EEEEE
Q 001444 34 TADDWRKALNKVVPAVVVLRTTAC----RAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPV----------VAEAM 99 (1076)
Q Consensus 34 ~~~~~~~~~~~v~~svV~I~~~~~----~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~----------~~~v~ 99 (1076)
..+..+...++...|+|.|....- .+|...+-....|||||++. +|+++||+||+..... .+.+.
T Consensus 126 ~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~ 204 (473)
T KOG1320|consen 126 YKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGG-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQID 204 (473)
T ss_pred hhhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcC-CcEEEEeeEEEEEEeccccCCCcceeeEEEE
Confidence 356788899999999999997421 22555566778899999997 8999999999985433 36666
Q ss_pred ecCC--cEEEEEEEEecCCCcEEEEEEcCCC--CccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCC
Q 001444 100 FVNR--EEIPVYPIYRDPVHDFGFFRYDPSA--IQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDA 175 (1076)
Q Consensus 100 ~~~~--~~~~a~vv~~d~~~DlAlLk~~~~~--~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~ 175 (1076)
..++ ..+.+.+.+.|+..|+|+++++..+ ++.+++.. ...++.|+++..+|+|++.....+.|+++...|..
T Consensus 205 aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~----~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~ 280 (473)
T KOG1320|consen 205 AAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGV----SSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKS 280 (473)
T ss_pred EeecCCccCCCeEEccccccceEEEEEecCCcccceeecce----eeeecccceeeccccCceeeeeeeecccccccccc
Confidence 6666 8899999999999999999996442 34444443 78899999999999999999999999999999986
Q ss_pred CCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecccCC----CCCccccCHHHHHHHHHHHHhcCCcccc
Q 001444 176 PHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS----SASAFFLPLERVVRALRFLQERRDCNIH 251 (1076)
Q Consensus 176 ~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~----~~~~falP~~~i~~~l~~l~~~~~~~~~ 251 (1076)
...+.. ......+++|+++.++.|+||+|++|.||++||+++..... .+.+|++|.+.++.++.+.-+.+. -..
T Consensus 281 ~~lg~~-~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~-~lr 358 (473)
T KOG1320|consen 281 FKLGLE-TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQI-SLR 358 (473)
T ss_pred cccCcc-cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhce-eec
Confidence 554433 22334568999999999999999999999999998876653 678999999999988876632220 000
Q ss_pred ccceeccCCCccceEEEEcChhHHH-HcCCChhHHHhhhcCCCCCC-CcEEEEEEecCCCcccc-CCCCCCEEEEECCEE
Q 001444 252 NWEAVSIPRGTLQVTFVHKGFDETR-RLGLQSATEQMVRHASPPGE-TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEV 328 (1076)
Q Consensus 252 ~~~~~~~~rg~lg~~~~~~~~~~~~-~lGl~~~~~~~~~~~~~~~~-~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~ 328 (1076)
........+.++|+....+....+- .++.++ .+|+.. .++++ ..|.|++++.. ++++||+|++|||++
T Consensus 359 ~~~~~~p~~~~~g~~s~~i~~g~vf~~~~~~~--------~~~~~~~q~v~i-s~Vlp~~~~~~~~~~~g~~V~~vng~~ 429 (473)
T KOG1320|consen 359 PVKPLVPVHQYIGLPSYYIFAGLVFVPLTKSY--------IFPSGVVQLVLV-SQVLPGSINGGYGLKPGDQVVKVNGKP 429 (473)
T ss_pred cccCcccccccCCceeEEEecceEEeecCCCc--------cccccceeEEEE-EEeccCCCcccccccCCCEEEEECCEE
Confidence 1111122344555554432211111 123321 233222 35555 79999999999 999999999999999
Q ss_pred eCChhHHHHHHh-cCCCCeEEEEEEeCCeEEEEEEEecc
Q 001444 329 ITQFLKLETLLD-DGVDKNIELLIERGGISMTVNLVVQD 366 (1076)
Q Consensus 329 v~~~~~l~~~l~-~~~g~~v~l~v~R~g~~~~~~v~l~~ 366 (1076)
|.+..++..++. ...+++|.+..+|+.+..++.+....
T Consensus 430 V~n~~~l~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~~ 468 (473)
T KOG1320|consen 430 VKNLKHLYELIEECSTEDKVAVLDRRSAEDATLEILPEH 468 (473)
T ss_pred eechHHHHHHHHhcCcCceEEEEEecCccceeEEecccc
Confidence 999999999995 46678899998999888888776543
No 16
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.85 E-value=4.7e-20 Score=209.78 Aligned_cols=310 Identities=17% Similarity=0.156 Sum_probs=221.5
Q ss_pred ccceeeeeEEEEEEE--cCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCc----------cEEEEeecCC
Q 001444 588 AESVIEPTLVMFEVH--VPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISAS----------DVMLSFAAFP 655 (1076)
Q Consensus 588 ~~~~~~~S~V~V~~~--~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~----------~i~v~~~d~~ 655 (1076)
.+++-..++|.|+.. -....++.+.......||||||+ .+|+++||.|++..... .|.++.++ +
T Consensus 133 ~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~-~ 208 (473)
T KOG1320|consen 133 VFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI-G 208 (473)
T ss_pred hhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee-c
Confidence 455566778888752 12222466777777899999999 79999999777654433 48888888 6
Q ss_pred --eEEeEEEEEeeCCCcEEEEEECCC--CCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEeccccee
Q 001444 656 --IEIPGEVVFLHPVHNFALIAYDPS--SLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAAL 731 (1076)
Q Consensus 656 --~~~~a~vv~~dp~~dlAvlk~d~~--~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~ 731 (1076)
...++.+++.|+..|+|+++++.. .++. ++++-+..++.|+++.++|+|+++. .+++ .|
T Consensus 209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~--------i~~~~~~~~~~G~~~~a~~~~f~~~-----nt~t----~g 271 (473)
T KOG1320|consen 209 PGNSGEPVIVGVDKVAGVAFLKIKTPENILYV--------IPLGVSSHFRTGVEVSAIGNGFGLL-----NTLT----QG 271 (473)
T ss_pred CCccCCCeEEccccccceEEEEEecCCcccce--------eecceeeeecccceeeccccCceee-----eeee----ec
Confidence 899999999999999999999533 2333 7787778899999999999999994 4555 66
Q ss_pred ecCCCCCCccc------ccceeEEEEecccCC-CcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHH
Q 001444 732 NISSADCPRYR------AMNMEVIELDTDFGS-TFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISR 804 (1076)
Q Consensus 732 ~i~~~~~~~~~------~~~~~~I~~d~~ig~-~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~ 804 (1076)
+++...|..|. -...+++|+|++++. ++||||+|.+|++||++++...... -..+.+ |++|.+.++.
T Consensus 272 ~vs~~~R~~~~lg~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~----~~~~iS--f~~p~d~vl~ 345 (473)
T KOG1320|consen 272 MVSGQLRKSFKLGLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIG----FSHGIS--FKIPIDTVLV 345 (473)
T ss_pred ccccccccccccCcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEee----ccccce--eccCchHhhh
Confidence 66666554443 245789999999976 7999999999999999887332110 123444 8899999999
Q ss_pred HHHHHhcCCCCCcccccccccCCCceeeeeeEEEEcChHhHHH-cCCCHHHHHHHHhcCC-CccceEEEEEecCCCHHhh
Q 001444 805 VLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARS-FGLSDDWVQALVKKDP-VRRQVLRVKGCLAGSKAEN 882 (1076)
Q Consensus 805 ~l~~l~~~~~~~~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~-~g~~~~wi~~~~~~~~-~~~~~~~V~~V~~~s~A~~ 882 (1076)
++.+..+.+..- ..+++-.|.-+|+|.....+.....-. .+.+.. ++ ...|+.+|..|.+++++..
T Consensus 346 ~v~r~~e~~~~l----r~~~~~~p~~~~~g~~s~~i~~g~vf~~~~~~~~--------~~~~~~q~v~is~Vlp~~~~~~ 413 (473)
T KOG1320|consen 346 IVLRLGEFQISL----RPVKPLVPVHQYIGLPSYYIFAGLVFVPLTKSYI--------FPSGVVQLVLVSQVLPGSINGG 413 (473)
T ss_pred hhhhhhhhceee----ccccCcccccccCCceeEEEecceEEeecCCCcc--------ccccceeEEEEEEeccCCCccc
Confidence 988874332200 001222233345555444333221110 111111 11 2347899999999999999
Q ss_pred h-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccc
Q 001444 883 M-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDV 940 (1076)
Q Consensus 883 a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~ 940 (1076)
. ++.||+|++|||++|.+..++..++.....+ +++.+...|..|..++.+....
T Consensus 414 ~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~----~~v~vl~~~~~e~~tl~Il~~~ 468 (473)
T KOG1320|consen 414 YGLKPGDQVVKVNGKPVKNLKHLYELIEECSTE----DKVAVLDRRSAEDATLEILPEH 468 (473)
T ss_pred ccccCCCEEEEECCEEeechHHHHHHHHhcCcC----ceEEEEEecCccceeEEecccc
Confidence 9 9999999999999999999999999655543 6888888888888888886543
No 17
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=99.82 E-value=1.2e-18 Score=198.19 Aligned_cols=178 Identities=20% Similarity=0.254 Sum_probs=111.0
Q ss_pred ccCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChhH
Q 001444 257 SIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLK 334 (1076)
Q Consensus 257 ~~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~~ 334 (1076)
...+++.|+-|....-|.. ...-.|.|+.|+.++||++ .|+.||+|+.|||+.+-..++
T Consensus 349 ~LvKg~~GFGfTliGGdd~-------------------~gDefLqVKsvl~DGPAa~dGkle~GDviV~INg~cvlGhTH 409 (984)
T KOG3209|consen 349 KLVKGYMGFGFTLIGGDDV-------------------RGDEFLQVKSVLKDGPAAQDGKLETGDVIVHINGECVLGHTH 409 (984)
T ss_pred EEeecccccceEEecCCcC-------------------CCCceeeeeecccCCchhhcCccccCcEEEEECCceeccccH
Confidence 4566677777775442221 2345677899999999999 599999999999999988877
Q ss_pred HH--HHHh-cCCCCeEEEEEEeCCeE-----E----EEEEEeccCCCCCCCcccccCceEEe------------cCCHH-
Q 001444 335 LE--TLLD-DGVDKNIELLIERGGIS-----M----TVNLVVQDLHSITPDYFLEVSGAVIH------------PLSYQ- 389 (1076)
Q Consensus 335 l~--~~l~-~~~g~~v~l~v~R~g~~-----~----~~~v~l~~~~~~~~~~~~~~~G~~~~------------~l~~~- 389 (1076)
.+ +++. ..+|..|.|++.|+=.. - ...--+..|+. +-....|..+. +-+..
T Consensus 410 AqaV~~fqaiPvg~~V~L~lcRgyelp~dp~dp~~sp~~~iv~~~P~----~~~~~~gp~v~~~~sss~~~a~~~~~~el 485 (984)
T KOG3209|consen 410 AQAVKRFQAIPVGQSVDLVLCRGYELPFDPEDPVGSPRVAIVPSWPD----SSTDKGGPMVTGRPSSSTHLAQHDGPPEL 485 (984)
T ss_pred HHHHHHhhccccCCeeeEEEecCccCCCCCcccCCCCccccccCCCC----CCCCCCCCeeecCCCCccccccCCCCccc
Confidence 65 4443 36799999999995211 0 00001111221 11011111110 00000
Q ss_pred -------HHhccCCC----CCeEEEEcCCChhhHcCCCCCCEEEEcCCeecCCHH--HHHHHHHhcCCCCeEeEEEEecc
Q 001444 390 -------QARNFRFP----CGLVYVAEPGYMLFRAGVPRHAIIKKFAGEEISRLE--DLISVLSKLSRGARVPIEYSSYT 456 (1076)
Q Consensus 390 -------~~~~~~~~----~~gv~v~~~gs~a~~aGl~~GD~I~~Vng~~v~~l~--~~~~~l~~~~~g~~v~l~~~~~~ 456 (1076)
-...|++. ..|.-+...-.+-++-||+.||+|+++|++.+..|. +++++++..|-|.++.|.++|..
T Consensus 486 ~ti~i~kgpegfgftiADsPtgqrvK~ilDp~~c~gl~eGd~IVei~~rnvr~L~h~qvvdmlke~piG~r~~Llv~RGg 565 (984)
T KOG3209|consen 486 TTIKIVKGPEGFGFTIADSPTGQRVKQILDPQDCPGLSEGDLIVEINERNVRALTHTQVVDMLKECPIGSRVHLLVKRGG 565 (984)
T ss_pred EEEeeecCCCCCCceeccCCCCCceeeecCcccCCCCCCCCeEEecccccccccchHHHHHHHHhccCCcceeEEEecCC
Confidence 00112211 112222222234455689999999999999999985 89999999999999999888876
Q ss_pred c
Q 001444 457 D 457 (1076)
Q Consensus 457 ~ 457 (1076)
-
T Consensus 566 p 566 (984)
T KOG3209|consen 566 P 566 (984)
T ss_pred C
Confidence 5
No 18
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.72 E-value=1.2e-16 Score=189.17 Aligned_cols=157 Identities=20% Similarity=0.212 Sum_probs=127.7
Q ss_pred EEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCC---
Q 001444 870 RVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNG--- 945 (1076)
Q Consensus 870 ~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~--- 945 (1076)
+|..|.++|||++| ||+||+|++|||++|.+++++...+....++ ++++++|.|+|+.+++++++...+...
T Consensus 129 lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g----~~v~v~v~R~gk~~~~~v~l~~~~~~~~~~ 204 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGD----ESTTITVAPFGSDQRRDKTLDLRHWAFEPD 204 (449)
T ss_pred cccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccC----CceEEEEEeCCccceEEEEecccccccCcc
Confidence 48999999999999 9999999999999999999999988666665 789999999999998888885442111
Q ss_pred CcceeeecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCe
Q 001444 946 TTRVINWCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEF 1025 (1076)
Q Consensus 946 ~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~ 1025 (1076)
........|+....+ ..+++|..|.++|||+++||++||.|++|||+++++++++.+.++.. .++.
T Consensus 205 ~~~~~~~lGl~~~~~-------------~~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~~-~~~~ 270 (449)
T PRK10779 205 KQDPVSSLGIRPRGP-------------QIEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRDN-PGKP 270 (449)
T ss_pred ccchhhcccccccCC-------------CcCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhC-CCCE
Confidence 111112233321111 12468999999999999999999999999999999999999999884 5788
Q ss_pred EEEEEEEeCCeEEEEEEEeC
Q 001444 1026 VRVRTVHLNGKPRVLTLKQD 1045 (1076)
Q Consensus 1026 v~l~~v~r~g~~~~~tlk~~ 1045 (1076)
+.++ +.|+|+...++++++
T Consensus 271 v~l~-v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 271 LALE-IERQGSPLSLTLTPD 289 (449)
T ss_pred EEEE-EEECCEEEEEEEEee
Confidence 9998 789999999888875
No 19
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.58 E-value=3e-14 Score=167.06 Aligned_cols=144 Identities=22% Similarity=0.316 Sum_probs=119.5
Q ss_pred ceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCC
Q 001444 867 QVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNG 945 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~ 945 (1076)
.+.+|.+|.++|||+++ |++||+|+++||+++.+++++...+.... +++.+++.|+++...+.+++.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~------~~v~~~I~r~g~~~~l~v~l~------ 195 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA------GEPMVEILAERENWTFEVMKE------ 195 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc------ccceEEEEEecCceEeccccc------
Confidence 34569999999999999 99999999999999999999998885433 357899999988766544432
Q ss_pred CcceeeecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCe
Q 001444 946 TTRVINWCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEF 1025 (1076)
Q Consensus 946 ~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~ 1025 (1076)
+....| ..+++|..|.++|||+++||++||.|++|||+++.+++++.+.+++. +++.
T Consensus 196 ---------~~~~~~-------------~~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~-~~~~ 252 (420)
T TIGR00054 196 ---------LIPRGP-------------KIEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKEN-PGKS 252 (420)
T ss_pred ---------ceecCC-------------CcCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhC-CCCc
Confidence 110011 12578999999999999999999999999999999999999999985 4778
Q ss_pred EEEEEEEeCCeEEEEEEEeCC
Q 001444 1026 VRVRTVHLNGKPRVLTLKQDL 1046 (1076)
Q Consensus 1026 v~l~~v~r~g~~~~~tlk~~~ 1046 (1076)
+.++ +.|+|+.+.++++++.
T Consensus 253 v~l~-v~R~g~~~~~~v~~~~ 272 (420)
T TIGR00054 253 MDIK-VERNGETLSISLTPEA 272 (420)
T ss_pred eEEE-EEECCEEEEEEEEEcC
Confidence 9998 7899999999998854
No 20
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.54 E-value=7.9e-14 Score=165.20 Aligned_cols=143 Identities=18% Similarity=0.244 Sum_probs=111.3
Q ss_pred EEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHH-hcCCCCeEEEEEEeCCeEEEEEEEeccCCCC-CC--Cccc
Q 001444 302 VDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLIERGGISMTVNLVVQDLHSI-TP--DYFL 376 (1076)
Q Consensus 302 v~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l-~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~-~~--~~~~ 376 (1076)
|..|.++|||++ |||+||+|++|||++|.+|.+++..+ ...+|++++++|.|+|+..+.++++...+.. .+ ....
T Consensus 130 V~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~~~~~~~~~~~~~ 209 (449)
T PRK10779 130 VGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLDLRHWAFEPDKQDPV 209 (449)
T ss_pred ccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEecccccccCccccchh
Confidence 479999999999 99999999999999999999999777 4577789999999999988888877533211 00 1111
Q ss_pred ccCceEEecCCHHHHhccCCCCCeEEEE--cCCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEe
Q 001444 377 EVSGAVIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSS 454 (1076)
Q Consensus 377 ~~~G~~~~~l~~~~~~~~~~~~~gv~v~--~~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~ 454 (1076)
...| +.++.++ .++.|. .++|||++|||++||+|++|||+++.+|+++.+.++. ..++.+.+++.|
T Consensus 210 ~~lG--l~~~~~~---------~~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R 277 (449)
T PRK10779 210 SSLG--IRPRGPQ---------IEPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIER 277 (449)
T ss_pred hccc--ccccCCC---------cCcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEE
Confidence 1223 2232221 134555 4899999999999999999999999999999999988 456788888877
Q ss_pred cc
Q 001444 455 YT 456 (1076)
Q Consensus 455 ~~ 456 (1076)
..
T Consensus 278 ~g 279 (449)
T PRK10779 278 QG 279 (449)
T ss_pred CC
Confidence 54
No 21
>PF12812 PDZ_1: PDZ-like domain
Probab=99.52 E-value=2.4e-14 Score=125.48 Aligned_cols=77 Identities=45% Similarity=0.699 Sum_probs=71.1
Q ss_pred CCCCCcccccCceEEecCCHHHHhccCCCCCeEEEEcCCChhhHcC-CCCCCEEEEcCCeecCCHHHHHHHHHhcCCC
Q 001444 369 SITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAEPGYMLFRAG-VPRHAIIKKFAGEEISRLEDLISVLSKLSRG 445 (1076)
Q Consensus 369 ~~~~~~~~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~~gs~a~~aG-l~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g 445 (1076)
+++|+|+++++|+.||+|+||++|.|+++++|+|++.++++...++ +.+|++|++|||+||+||++|+++|+++||+
T Consensus 1 ~itp~r~v~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ipd~ 78 (78)
T PF12812_consen 1 AITPSRFVEVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIPDN 78 (78)
T ss_pred CccCCEEEEEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCCCC
Confidence 3689999999999999999999999999999999998766666666 9999999999999999999999999999983
No 22
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.49 E-value=2.2e-13 Score=131.34 Aligned_cols=110 Identities=33% Similarity=0.576 Sum_probs=75.0
Q ss_pred EEEEEEeCCCcEEEeCccccCC-------CCcEEEEEecCCcEEE--EEEEEecCC-CcEEEEEEcCCCCccccccCCCC
Q 001444 70 ATGFVVDKRRGIILTNRHVVKP-------GPVVAEAMFVNREEIP--VYPIYRDPV-HDFGFFRYDPSAIQFLNYDEIPL 139 (1076)
Q Consensus 70 GTGfvV~~~~G~IlTn~Hvv~~-------~~~~~~v~~~~~~~~~--a~vv~~d~~-~DlAlLk~~~~~~~~~~~~~l~l 139 (1076)
||||+|++ +|+||||+||+.+ ....+.+.+.++..+. +++++.|+. +|+|||+++.
T Consensus 1 GTGf~i~~-~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~------------- 66 (120)
T PF13365_consen 1 GTGFLIGP-DGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP------------- 66 (120)
T ss_dssp EEEEEEET-TTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------
T ss_pred CEEEEEcC-CceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------
Confidence 79999997 6899999999995 3456888888988888 999999999 9999999980
Q ss_pred CCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEE
Q 001444 140 APEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVAL 216 (1076)
Q Consensus 140 ~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi 216 (1076)
....+.. ....+......... .......+| +++.+.+|+|||||||.+|+||||
T Consensus 67 ----------~~~~~~~-----~~~~~~~~~~~~~~-------~~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 67 ----------WTGVGGG-----VRVPGSTSGVSPTS-------TNDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ----------EEEEEEE-----EEEEEEEEEEEEEE-------EEETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ----------ccceeee-----eEeeeecccccccc-------CcccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000 00000001110000 000011124 899999999999999999999997
No 23
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.37 E-value=5.8e-12 Score=147.83 Aligned_cols=131 Identities=19% Similarity=0.242 Sum_probs=105.4
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcc
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYF 375 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~ 375 (1076)
.|.+| ..|.++|||++ ||++||+|++|||+++.++.++...+.... .++.+++.|+++..++.+++.
T Consensus 128 ~g~~V-~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~-~~v~~~I~r~g~~~~l~v~l~---------- 195 (420)
T TIGR00054 128 VGPVI-ELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA-GEPMVEILAERENWTFEVMKE---------- 195 (420)
T ss_pred CCcee-eccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc-ccceEEEEEecCceEeccccc----------
Confidence 56676 79999999999 999999999999999999999998875544 678899999888766443322
Q ss_pred cccCceEEecCCHHHHhccCCCCCeEEEE--cCCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEE
Q 001444 376 LEVSGAVIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYS 453 (1076)
Q Consensus 376 ~~~~G~~~~~l~~~~~~~~~~~~~gv~v~--~~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~ 453 (1076)
+.+..+ ..++.|. .++|||++|||++||+|++|||+++.+|+++.+.++.. .++.+.+++.
T Consensus 196 -------~~~~~~---------~~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~~-~~~~v~l~v~ 258 (420)
T TIGR00054 196 -------LIPRGP---------KIEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKEN-PGKSMDIKVE 258 (420)
T ss_pred -------ceecCC---------CcCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhC-CCCceEEEEE
Confidence 111111 1245565 48999999999999999999999999999999999984 4667888887
Q ss_pred ecc
Q 001444 454 SYT 456 (1076)
Q Consensus 454 ~~~ 456 (1076)
|-+
T Consensus 259 R~g 261 (420)
T TIGR00054 259 RNG 261 (420)
T ss_pred ECC
Confidence 754
No 24
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.21 E-value=9.5e-11 Score=105.30 Aligned_cols=68 Identities=31% Similarity=0.581 Sum_probs=61.0
Q ss_pred CCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHH-hcCCCCeEEEEEEeCCeEEEEEEEe
Q 001444 296 ETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLIERGGISMTVNLVV 364 (1076)
Q Consensus 296 ~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l-~~~~g~~v~l~v~R~g~~~~~~v~l 364 (1076)
..|++| ..|.++|||++ ||++||+|++|||+++.++.++...+ ...+|++++|+|.|+|+.++++++|
T Consensus 13 ~~g~~V-~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 13 TGGVVV-VSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp SSSEEE-EEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEEEEEEE-
T ss_pred CCeEEE-EEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEC
Confidence 368887 58999999999 99999999999999999999999888 5689999999999999999998875
No 25
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.19 E-value=1.8e-10 Score=103.53 Aligned_cols=80 Identities=28% Similarity=0.372 Sum_probs=69.5
Q ss_pred eeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHh
Q 001444 832 ILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQA 910 (1076)
Q Consensus 832 ~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~ 910 (1076)
+||+.+...+. .++++|.+|.++|||+++ |++||+|++|||++|+++.++...+..
T Consensus 2 ~lGv~~~~~~~-----------------------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~ 58 (82)
T PF13180_consen 2 GLGVTVQNLSD-----------------------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSK 58 (82)
T ss_dssp E-SEEEEECSC-----------------------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHC
T ss_pred EECeEEEEccC-----------------------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHh
Confidence 78998887552 267889999999999999 999999999999999999999999976
Q ss_pred ccCCCCCCCeEEEEEEeCCEEEEEEEec
Q 001444 911 LDKDGEDNGKLDITIFRQGREIELQVGT 938 (1076)
Q Consensus 911 ~~~g~~~~~~v~l~V~R~g~~~~~~v~l 938 (1076)
..++ ++++++|.|+|+.+++++++
T Consensus 59 ~~~g----~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 59 GKPG----DTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp SSTT----SEEEEEEEETTEEEEEEEE-
T ss_pred CCCC----CEEEEEEEECCEEEEEEEEC
Confidence 6776 99999999999999999875
No 26
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.10 E-value=6.5e-10 Score=107.07 Aligned_cols=55 Identities=31% Similarity=0.512 Sum_probs=47.3
Q ss_pred EEEEEEEeeCCceEEEEeCccccC-------CCccEEEEeecCCeEEe--EEEEEeeCC-CcEEEEEEC
Q 001444 619 GTGVIIYHSQSMGLVVVDKNTVAI-------SASDVMLSFAAFPIEIP--GEVVFLHPV-HNFALIAYD 677 (1076)
Q Consensus 619 GsG~vId~~~~~G~IlTn~~~V~~-------~~~~i~v~~~d~~~~~~--a~vv~~dp~-~dlAvlk~d 677 (1076)
||||+|+ ++|||||++|++.. ....+.+.+.+ +...+ |++++.|+. +|+|||+++
T Consensus 1 GTGf~i~---~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~D~All~v~ 65 (120)
T PF13365_consen 1 GTGFLIG---PDGYILTAAHVVEDWNDGKQPDNSSVEVVFPD-GRRVPPVAEVVYFDPDDYDLALLKVD 65 (120)
T ss_dssp EEEEEEE---TTTEEEEEHHHHTCCTT--G-TCSEEEEEETT-SCEEETEEEEEEEETT-TTEEEEEES
T ss_pred CEEEEEc---CCceEEEchhheecccccccCCCCEEEEEecC-CCEEeeeEEEEEECCccccEEEEEEe
Confidence 8999999 67799999776653 45668888887 77788 999999999 999999998
No 27
>PF12812 PDZ_1: PDZ-like domain
Probab=99.06 E-value=5.5e-10 Score=98.14 Aligned_cols=76 Identities=30% Similarity=0.484 Sum_probs=65.7
Q ss_pred CCCcceeeecCccccC-CcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCC
Q 001444 944 NGTTRVINWCGCIVQD-PHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEH 1022 (1076)
Q Consensus 944 ~~~~~~~~~~G~~~~~-p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~ 1022 (1076)
..++|++.|+|+.+++ +++.+|++. ++. ++++.....|+++.++|+.+|-+|++|||+||+|+++|.++++++|+
T Consensus 2 itp~r~v~~~Ga~f~~Ls~q~aR~~~-~~~---~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ipd 77 (78)
T PF12812_consen 2 ITPSRFVEVCGAVFHDLSYQQARQYG-IPV---GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDDFIKVVKKIPD 77 (78)
T ss_pred ccCCEEEEEcCeecccCCHHHHHHhC-CCC---CEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCCC
Confidence 4578999999999999 667777765 444 45666778999999999999999999999999999999999999986
Q ss_pred C
Q 001444 1023 G 1023 (1076)
Q Consensus 1023 ~ 1023 (1076)
+
T Consensus 78 ~ 78 (78)
T PF12812_consen 78 N 78 (78)
T ss_pred C
Confidence 4
No 28
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.95 E-value=6e-09 Score=95.19 Aligned_cols=88 Identities=31% Similarity=0.404 Sum_probs=74.6
Q ss_pred ecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEE
Q 001444 952 WCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTV 1031 (1076)
Q Consensus 952 ~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v 1031 (1076)
|+|+.++.++...+....++ ...|++|..+.++|||+++||++||+|++|||+++.++.++.+++.....++.+.+. +
T Consensus 2 ~~G~~~~~~~~~~~~~~~~~-~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~i~l~-v 79 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGLK-DTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDKVTLT-V 79 (90)
T ss_pred ccceEEeECCHHHHHHcCCC-CCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEE-E
Confidence 78999999887666542222 245899999999999999999999999999999999999999999887668899999 6
Q ss_pred EeCCeEEEEE
Q 001444 1032 HLNGKPRVLT 1041 (1076)
Q Consensus 1032 ~r~g~~~~~t 1041 (1076)
.|+|+.+.++
T Consensus 80 ~r~g~~~~~~ 89 (90)
T cd00987 80 LRGGKELTVT 89 (90)
T ss_pred EECCEEEEee
Confidence 7899876654
No 29
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.92 E-value=9.2e-09 Score=93.96 Aligned_cols=87 Identities=30% Similarity=0.461 Sum_probs=71.9
Q ss_pred eeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHh
Q 001444 832 ILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQA 910 (1076)
Q Consensus 832 ~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~ 910 (1076)
|+|+.++.++....+.++++. ..+++|.+|.++|||+++ |++||+|++|||+++.++.++...+..
T Consensus 2 ~~G~~~~~~~~~~~~~~~~~~-------------~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~ 68 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGLKD-------------TKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAE 68 (90)
T ss_pred ccceEEeECCHHHHHHcCCCC-------------CCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHh
Confidence 799999998865555444432 368899999999999999 999999999999999999999988865
Q ss_pred ccCCCCCCCeEEEEEEeCCEEEEEE
Q 001444 911 LDKDGEDNGKLDITIFRQGREIELQ 935 (1076)
Q Consensus 911 ~~~g~~~~~~v~l~V~R~g~~~~~~ 935 (1076)
...+ ..+.+++.|+|+...+.
T Consensus 69 ~~~~----~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 69 LKPG----DKVTLTVLRGGKELTVT 89 (90)
T ss_pred cCCC----CEEEEEEEECCEEEEee
Confidence 4444 78999999999876654
No 30
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.86 E-value=1.9e-08 Score=89.67 Aligned_cols=68 Identities=26% Similarity=0.268 Sum_probs=62.0
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEE
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLK 1043 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk 1043 (1076)
.|++|..+.++|||+++||++||.|++|||+++.+|++|.+.+.....++.+.+. +.|+|+...+++.
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~-v~r~g~~~~~~~~ 77 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVT-VLPSTTKLTNVST 77 (79)
T ss_pred CcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEE-EEECCEEEEEEEE
Confidence 4899999999999999999999999999999999999999999886568889998 7799998877764
No 31
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.83 E-value=2.4e-08 Score=88.99 Aligned_cols=68 Identities=24% Similarity=0.381 Sum_probs=60.4
Q ss_pred cceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEe
Q 001444 866 RQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVG 937 (1076)
Q Consensus 866 ~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~ 937 (1076)
.++++|.+|.++|||+++ |++||+|++|||+++.+|.++...+....++ +.+.+++.|+|+.+++.++
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g----~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPG----EVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCC----CEEEEEEEECCEEEEEEEE
Confidence 467889999999999999 9999999999999999999999988655555 7899999999998887765
No 32
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.82 E-value=3e-08 Score=88.41 Aligned_cols=70 Identities=24% Similarity=0.440 Sum_probs=62.9
Q ss_pred CcEEEEEEecCCCccccCCCCCCEEEEECCEEeCChhHHHHHHhc-CCCCeEEEEEEeCCeEEEEEEEeccC
Q 001444 297 TGLLVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQDL 367 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~~~ 367 (1076)
.|++| ..|.++|||+.+|++||+|++|||.++.+|.++..++.. ..|+.+.+++.|+|+..++++++..+
T Consensus 8 ~Gv~V-~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~~~ 78 (79)
T cd00986 8 HGVYV-TSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLILKTF 78 (79)
T ss_pred cCEEE-EEECCCCchhhCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEEecc
Confidence 57666 799999999889999999999999999999999988864 77889999999999999999888755
No 33
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.74 E-value=7.4e-08 Score=85.82 Aligned_cols=69 Identities=26% Similarity=0.398 Sum_probs=62.3
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeC
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQD 1045 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~ 1045 (1076)
.|++|..|.++|||++ ||++||.|++|||+++.++++|.+.++....++.+.|. +.|+|+...+++++.
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~-v~r~g~~~~~~v~l~ 76 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLK-VKREEKELPEDLILK 76 (79)
T ss_pred cCEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEE-EEECCEEEEEEEEEe
Confidence 4899999999999997 89999999999999999999999999876668889998 689999988887764
No 34
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.70 E-value=1e-07 Score=84.81 Aligned_cols=66 Identities=30% Similarity=0.421 Sum_probs=58.9
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEE
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLK 1043 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk 1043 (1076)
.++|..+.++|||+++||++||.|++|||+++.+++++...++... ++.+.+. +.|+|+...++++
T Consensus 13 ~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~-~~~~~l~-v~r~~~~~~~~l~ 78 (79)
T cd00989 13 EPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENP-GKPLTLT-VERNGETITLTLT 78 (79)
T ss_pred CcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCC-CceEEEE-EEECCEEEEEEec
Confidence 4689999999999999999999999999999999999999998854 7788888 6889988777764
No 35
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine proteases belong to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum []. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules []. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=98.68 E-value=2.4e-07 Score=98.92 Aligned_cols=176 Identities=21% Similarity=0.270 Sum_probs=107.2
Q ss_pred CceEEEEEeeeeccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEec-------CC--cEEEEEEEEecC--
Q 001444 47 PAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFV-------NR--EEIPVYPIYRDP-- 115 (1076)
Q Consensus 47 ~svV~I~~~~~~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~-------~~--~~~~a~vv~~d~-- 115 (1076)
|.+|.|.... ....|+|++|+++ +|||++||+.. ...+.+.+. ++ ..+...-+..++
T Consensus 13 p~~v~i~~~~---------~~~~C~G~li~~~--~vLTaahC~~~-~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~ 80 (220)
T PF00089_consen 13 PWVVSIRYSN---------GRFFCTGTLISPR--WVLTAAHCVDG-ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKY 80 (220)
T ss_dssp TTEEEEEETT---------TEEEEEEEEEETT--EEEEEGGGHTS-GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTS
T ss_pred CeEEEEeeCC---------CCeeEeEEecccc--ccccccccccc-cccccccccccccccccccccccccccccccccc
Confidence 6677777742 1678999999975 99999999995 333444332 22 355555554533
Q ss_pred -----CCcEEEEEEcCCCCccccccCCCCCC--cCCCCCCEEEEEecCCCCC----CeEEEEEEEEecCC-C-CCCCCCC
Q 001444 116 -----VHDFGFFRYDPSAIQFLNYDEIPLAP--EAACVGLEIRVVGNDSGEK----VSILAGTLARLDRD-A-PHYKKDG 182 (1076)
Q Consensus 116 -----~~DlAlLk~~~~~~~~~~~~~l~l~~--~~~~~G~~V~~iG~p~g~~----~s~~~G~is~~~~~-~-~~~~~~~ 182 (1076)
.+|+|||+++..-.....+.++.+.. ..++.|+.+.++|++.... ..+....+....+. . ..+...
T Consensus 81 ~~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~- 159 (220)
T PF00089_consen 81 DPSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDN- 159 (220)
T ss_dssp BTTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTT-
T ss_pred ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc-
Confidence 46999999986511111223333333 4468999999999987533 23444344333221 0 001110
Q ss_pred ccccceeeEEEee----ccCCCCCCCceecCCCcEEEEeecccCCCC---CccccCHHHHHHH
Q 001444 183 YNDFNTFYMQAAS----GTKGGSSGSPVIDWQGRAVALNAGSKSSSA---SAFFLPLERVVRA 238 (1076)
Q Consensus 183 ~~~~~~~~i~~~a----~~~~G~SGgPv~n~~G~vVGi~~~~~~~~~---~~falP~~~i~~~ 238 (1076)
....++.+.. ....|+|||||++.++.++||.+.+..... ..++.++...+++
T Consensus 160 ---~~~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~W 219 (220)
T PF00089_consen 160 ---LTPNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDW 219 (220)
T ss_dssp ---STTTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHH
T ss_pred ---cccccccccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhcc
Confidence 1122355554 678899999999988899999988743322 3666776655443
No 36
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.68 E-value=1.3e-07 Score=84.33 Aligned_cols=65 Identities=28% Similarity=0.260 Sum_probs=53.7
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhc-CCCCeEEEEEEeCCeEEEEEEEec
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLVVQ 365 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~ 365 (1076)
.|++| ..|.++|||+. ||++||+|++|||+++.+|.++ +.. ..++.+.+++.|+|+..++.+++.
T Consensus 12 ~~~~V-~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~---l~~~~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 12 GLGKV-TFVRDDSPADKAGLVAGDELVAVNGWRVDALQDR---LKEYQAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred CcEEE-EEECCCChHHHhCCCCCCEEEEECCEEhHHHHHH---HHhcCCCCEEEEEEEECCEEEEEEEEec
Confidence 44554 79999999999 9999999999999999986544 432 467899999999999888877654
No 37
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.63 E-value=4e-07 Score=99.50 Aligned_cols=99 Identities=17% Similarity=0.117 Sum_probs=81.6
Q ss_pred HHHHHHHHHHHhcCCccccccceeccCCCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCccc
Q 001444 233 ERVVRALRFLQERRDCNIHNWEAVSIPRGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAH 312 (1076)
Q Consensus 233 ~~i~~~l~~l~~~~~~~~~~~~~~~~~rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~ 312 (1076)
..+++++++|.+.+ .+-++++|+.-.... . ...|+++ ..+.+++||+
T Consensus 159 ~~~~~v~~~l~~~g----------~~~~~~lgi~p~~~~---------------------g-~~~G~~v-~~v~~~s~a~ 205 (259)
T TIGR01713 159 VVSRRIIEELTKDP----------QKMFDYIRLSPVMKN---------------------D-KLEGYRL-NPGKDPSLFY 205 (259)
T ss_pred hhHHHHHHHHHHCH----------HhhhheEeEEEEEeC---------------------C-ceeEEEE-EecCCCCHHH
Confidence 45677888888776 677888888865321 0 2367777 6999999999
Q ss_pred c-CCCCCCEEEEECCEEeCChhHHHHHHhc-CCCCeEEEEEEeCCeEEEEEEEe
Q 001444 313 L-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLVV 364 (1076)
Q Consensus 313 ~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l 364 (1076)
+ ||++||+|++|||+++.++.++..++.. ..++.++|+|+|+|+.+++.+.+
T Consensus 206 ~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 206 KSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred HcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 9 9999999999999999999999988865 67889999999999998887653
No 38
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.63 E-value=2.9e-07 Score=100.54 Aligned_cols=100 Identities=16% Similarity=0.157 Sum_probs=82.5
Q ss_pred hhHHHHHHHHhcCCCCCcccccccccCCCceeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCH
Q 001444 800 YTISRVLDKIISGASGPSLLINGVKRPMPLVRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSK 879 (1076)
Q Consensus 800 ~~v~~~l~~l~~~~~~~~~~~~~v~r~~p~~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~ 879 (1076)
..+++++++|.+.++ +-|. |+|+...... ....|++|..+.++++
T Consensus 159 ~~~~~v~~~l~~~g~--------~~~~-----~lgi~p~~~~----------------------g~~~G~~v~~v~~~s~ 203 (259)
T TIGR01713 159 VVSRRIIEELTKDPQ--------KMFD-----YIRLSPVMKN----------------------DKLEGYRLNPGKDPSL 203 (259)
T ss_pred hhHHHHHHHHHHCHH--------hhhh-----eEeEEEEEeC----------------------CceeEEEEEecCCCCH
Confidence 456788899988764 4455 8888764321 1236899999999999
Q ss_pred Hhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEec
Q 001444 880 AENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGT 938 (1076)
Q Consensus 880 A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l 938 (1076)
|+++ |++||+|++|||++++++.++..++.....+ +++.++|.|+|+.+++.+.+
T Consensus 204 a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~----~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 204 FYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREE----TNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred HHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCC----CeEEEEEEECCEEEEEEEEC
Confidence 9999 9999999999999999999999999777665 78999999999999888753
No 39
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.59 E-value=3.4e-07 Score=82.67 Aligned_cols=66 Identities=32% Similarity=0.617 Sum_probs=57.0
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEEEeC-CeEEEEEEE
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERG-GISMTVNLV 363 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v~R~-g~~~~~~v~ 363 (1076)
.+++| ..|.+++||++ ||++||+|++|||+.+.++ .++..++....|+.+.+++.|+ |+..++++.
T Consensus 13 ~~~~V-~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~~~~~~~~ 82 (85)
T cd00988 13 GGLVI-TSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGEPREVTLT 82 (85)
T ss_pred CeEEE-EEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCCEEEEEEE
Confidence 45555 79999999999 9999999999999999999 8888877666788999999998 887777664
No 40
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=98.59 E-value=3e-07 Score=104.36 Aligned_cols=158 Identities=22% Similarity=0.331 Sum_probs=109.5
Q ss_pred CcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChh--HHHHHHhcCCCCeEEEEEEeCCeEEEEEEEeccCC--CC
Q 001444 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFL--KLETLLDDGVDKNIELLIERGGISMTVNLVVQDLH--SI 370 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~--~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~--~~ 370 (1076)
..++| +.+...+.|++ +|++||+|+.|||....|+. +.+.++..+- .++.|.|+|+.....+.|+..... .+
T Consensus 219 SqIFv-Keit~~gLAardgnlqEGDiiLkINGtvteNmSLtDar~LIEkS~-GKL~lvVlRD~~qtLiNiP~l~d~dSe~ 296 (1027)
T KOG3580|consen 219 SQIFV-KEITRTGLAARDGNLQEGDIILKINGTVTENMSLTDARKLIEKSR-GKLQLVVLRDSQQTLINIPSLNDSDSEI 296 (1027)
T ss_pred chhhh-hhhcccchhhccCCcccccEEEEECcEeeccccchhHHHHHHhcc-CceEEEEEecCCceeeecCCCccccccc
Confidence 44555 79999999998 79999999999999888763 4455554444 468999999877766666422110 00
Q ss_pred --------------------------------------------CCCcccccCce-------------------------
Q 001444 371 --------------------------------------------TPDYFLEVSGA------------------------- 381 (1076)
Q Consensus 371 --------------------------------------------~~~~~~~~~G~------------------------- 381 (1076)
++.| +...|+
T Consensus 297 ~disEi~tms~rs~spp~rrs~~~s~d~~s~s~h~p~~Ps~r~~~~~R-~s~~gat~tPvks~~d~~~~~V~e~t~e~~~ 375 (1027)
T KOG3580|consen 297 EDISEIETMSDRSFSPPERRSQYSSYDYHSSSEHLPERPSSREDTPSR-LSRMGATPTPVKSTGDIAGTVVPETTKEPRY 375 (1027)
T ss_pred cchhhhhccccccCCCchhhhhccCccccCchhcCCCCCCccccchhh-cccCCCCCCCccCccccCCccccccccCccc
Confidence 0000 000111
Q ss_pred --------------EEecCCHHHHhccCCC----------------------CCeEEEE--cCCChhhHcCCCCCCEEEE
Q 001444 382 --------------VIHPLSYQQARNFRFP----------------------CGLVYVA--EPGYMLFRAGVPRHAIIKK 423 (1076)
Q Consensus 382 --------------~~~~l~~~~~~~~~~~----------------------~~gv~v~--~~gs~a~~aGl~~GD~I~~ 423 (1076)
+|.-++.+....|+.+ .-|+||+ ..|+||+..||+.||+|+.
T Consensus 376 ~q~p~lP~pk~~~~~~~~pS~~~m~~ygysP~tk~VrF~KGdSvGLRLAGGNDVGIFVaGvqegspA~~eGlqEGDQIL~ 455 (1027)
T KOG3580|consen 376 QQEPPLPQPKAAPRTFLRPSPEDMAIYGYSPNTKMVRFKKGDSVGLRLAGGNDVGIFVAGVQEGSPAEQEGLQEGDQILK 455 (1027)
T ss_pred ccCCCCCCcccCcceeeecCHHHHHHhcCCCCceeEEeecCCeeeeEeccCCceeEEEeecccCCchhhccccccceeEE
Confidence 1222333333334422 1289998 5899999999999999999
Q ss_pred cCCeecCCH--HHHHHHHHhcCCCCeEeEEEEeccc
Q 001444 424 FAGEEISRL--EDLISVLSKLSRGARVPIEYSSYTD 457 (1076)
Q Consensus 424 Vng~~v~~l--~~~~~~l~~~~~g~~v~l~~~~~~~ 457 (1076)
||.+++.|+ ++.+..|-.+|.|..|+|...+-.|
T Consensus 456 VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~k~D 491 (1027)
T KOG3580|consen 456 VNTVDFRNLVREEAVLFLLELPKGEEVTILAQSKAD 491 (1027)
T ss_pred eccccchhhhHHHHHHHHhcCCCCcEEeehhhhhhH
Confidence 999999998 5899999999999999997654433
No 41
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.58 E-value=2.3e-07 Score=82.54 Aligned_cols=63 Identities=33% Similarity=0.588 Sum_probs=55.3
Q ss_pred EEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEeCCeEEEEEE
Q 001444 300 LVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNL 362 (1076)
Q Consensus 300 lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v 362 (1076)
++|..|.++|||++ ||++||+|++|||+++.++.++...+....++.+.+++.|+++..++.+
T Consensus 14 ~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~~~~~~~~l 77 (79)
T cd00989 14 PVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGETITLTL 77 (79)
T ss_pred cEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEECCEEEEEEe
Confidence 34579999999998 9999999999999999999999988866567889999999998766654
No 42
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.57 E-value=3.7e-07 Score=81.43 Aligned_cols=66 Identities=26% Similarity=0.224 Sum_probs=56.2
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeC
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQD 1045 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~ 1045 (1076)
+++|..+.++|||+++||++||.|++|||+++.++.++ ++....++.+.+. +.|+|+.+.+++++.
T Consensus 13 ~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~---l~~~~~~~~v~l~-v~r~g~~~~~~v~~~ 78 (80)
T cd00990 13 LGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDR---LKEYQAGDPVELT-VFRDDRLIEVPLTLA 78 (80)
T ss_pred cEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHH---HHhcCCCCEEEEE-EEECCEEEEEEEEec
Confidence 68999999999999999999999999999999986655 4443457789998 688999988888764
No 43
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=98.54 E-value=4.2e-07 Score=103.22 Aligned_cols=57 Identities=30% Similarity=0.350 Sum_probs=52.7
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEE
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTV 1031 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v 1031 (1076)
-|++|+.|.+||||++.||+.||.|+.||.++..++ ++.+..+..+|+|+.++|...
T Consensus 429 VGIFVaGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ 487 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQ 487 (1027)
T ss_pred eeEEEeecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhh
Confidence 389999999999999999999999999999999998 788888899999999998643
No 44
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.50 E-value=6.2e-07 Score=80.98 Aligned_cols=68 Identities=31% Similarity=0.478 Sum_probs=60.7
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeC-CeEEEEEEEe
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLN-GKPRVLTLKQ 1044 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~-g~~~~~tlk~ 1044 (1076)
.+++|..+.++|||+++||++||+|++|||+++.++ +++...++.. .++.+.|. +.|+ |..+.+++++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~-~~~~i~l~-v~r~~~~~~~~~~~~ 83 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGK-AGTKVRLT-LKRGDGEPREVTLTR 83 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCC-CCCEEEEE-EEcCCCCEEEEEEEE
Confidence 378999999999999999999999999999999999 9999999774 48889998 6777 8888888875
No 45
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.50 E-value=2.3e-07 Score=107.54 Aligned_cols=149 Identities=21% Similarity=0.280 Sum_probs=94.6
Q ss_pred EEEecCCCcccc--CCCCCCEEEEECCEEeCChhHHHHH-HhcCCCCeEEEEEEeCCeEE----------EEEEEecc--
Q 001444 302 VDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIERGGISM----------TVNLVVQD-- 366 (1076)
Q Consensus 302 v~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~~l~~~-l~~~~g~~v~l~v~R~g~~~----------~~~v~l~~-- 366 (1076)
|..|.+||||+. .|+.||+|++|||..|.+..+-+-+ |-+..|-+|+|+|.-..+.. .-.++...
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKdaGlsVtLtIip~ee~~~~~~~~sa~~~s~~t~~~~~ 861 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKDAGLSVTLTIIPPEEAGPPTSMTSAEKQSPFTQNGPY 861 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHhcCceEEEEEcChhccCCCCCCcchhhcCcccccCCH
Confidence 478899999999 6999999999999999998877644 34567889999987533221 01111100
Q ss_pred -----CCCCCC----C--cccccCceEEecCCHH-----------HHhccCCCC-------CeEEEE--cCCChhhHcC-
Q 001444 367 -----LHSITP----D--YFLEVSGAVIHPLSYQ-----------QARNFRFPC-------GLVYVA--EPGYMLFRAG- 414 (1076)
Q Consensus 367 -----~~~~~~----~--~~~~~~G~~~~~l~~~-----------~~~~~~~~~-------~gv~v~--~~gs~a~~aG- 414 (1076)
++...+ . ....+.|..+.+...| -++.||++. -+.||. ..++||.+.|
T Consensus 862 ~q~~glp~~~~s~~~~~pqpdt~~~~~~~~r~~qn~~~~~VelErG~kGFGFSiRGGreynM~LfVLRlAeDGPA~rdGr 941 (984)
T KOG3209|consen 862 EQQYGLPGPRPSVYEEHPQPDTFQGLSINDRMSQNGDLYTVELERGAKGFGFSIRGGREYNMDLFVLRLAEDGPAIRDGR 941 (984)
T ss_pred hHccCCCCCCccccccCCCCccccceeccccccccCCeeEEEeeccccccceEeecccccccceEEEEeccCCCccccCc
Confidence 000000 0 0111223333222211 134566553 257776 3789999999
Q ss_pred CCCCCEEEEcCCeecCCHH--HHHHHHHhcCCCCeEeEEE
Q 001444 415 VPRHAIIKKFAGEEISRLE--DLISVLSKLSRGARVPIEY 452 (1076)
Q Consensus 415 l~~GD~I~~Vng~~v~~l~--~~~~~l~~~~~g~~v~l~~ 452 (1076)
++.||+|++|||+++.+.. ..++.||+ +|-+|.+..
T Consensus 942 m~VGDqi~eINGesTkgmtH~rAIelIk~--gg~~vll~L 979 (984)
T KOG3209|consen 942 MRVGDQITEINGESTKGMTHDRAIELIKQ--GGRRVLLLL 979 (984)
T ss_pred eeecceEEEecCcccCCCcHHHHHHHHHh--CCeEEEEEe
Confidence 9999999999999999984 66777777 334444433
No 46
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=98.44 E-value=3.7e-06 Score=90.52 Aligned_cols=164 Identities=19% Similarity=0.224 Sum_probs=93.8
Q ss_pred CCceEEEEEeeeeccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCC-CcEEEEEecC---------CcEEEEEEEEecC
Q 001444 46 VPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPG-PVVAEAMFVN---------REEIPVYPIYRDP 115 (1076)
Q Consensus 46 ~~svV~I~~~~~~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~-~~~~~v~~~~---------~~~~~a~vv~~d~ 115 (1076)
.|.+|.|.... ....|+|++|+++ +|||+||++... .....+.+.. ...+.++-+..+|
T Consensus 12 ~Pw~v~i~~~~---------~~~~C~GtlIs~~--~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp 80 (232)
T cd00190 12 FPWQVSLQYTG---------GRHFCGGSLISPR--WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHP 80 (232)
T ss_pred CCCEEEEEccC---------CcEEEEEEEeeCC--EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECC
Confidence 45677776531 3578999999965 999999999853 1334444331 2334555555554
Q ss_pred -------CCcEEEEEEcCCCCccccccCCCCCCc--CCCCCCEEEEEecCCCCCC-----eEEEEEEEEecCC--CCCCC
Q 001444 116 -------VHDFGFFRYDPSAIQFLNYDEIPLAPE--AACVGLEIRVVGNDSGEKV-----SILAGTLARLDRD--APHYK 179 (1076)
Q Consensus 116 -------~~DlAlLk~~~~~~~~~~~~~l~l~~~--~~~~G~~V~~iG~p~g~~~-----s~~~G~is~~~~~--~~~~~ 179 (1076)
.+|+|||+++..--....+.++.|... .+..|+.+.+.|+...... ......+..+.+. ...+.
T Consensus 81 ~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~ 160 (232)
T cd00190 81 NYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYS 160 (232)
T ss_pred CCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhcc
Confidence 479999999843211112445555544 6788999999998754321 1222222222211 00010
Q ss_pred C-CCccccceeeEEE---eeccCCCCCCCceecCC---CcEEEEeeccc
Q 001444 180 K-DGYNDFNTFYMQA---ASGTKGGSSGSPVIDWQ---GRAVALNAGSK 221 (1076)
Q Consensus 180 ~-~~~~~~~~~~i~~---~a~~~~G~SGgPv~n~~---G~vVGi~~~~~ 221 (1076)
. ....+ +...... ......|.|||||+... +.++||.+.+.
T Consensus 161 ~~~~~~~-~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 161 YGGTITD-NMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred CcccCCC-ceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 0 00001 0000110 23455799999999765 78999998764
No 47
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.31 E-value=2.2e-06 Score=74.21 Aligned_cols=53 Identities=36% Similarity=0.619 Sum_probs=47.2
Q ss_pred cEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEE
Q 001444 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLI 351 (1076)
Q Consensus 298 G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v 351 (1076)
|++| ..|.++|||+. ||++||+|++|||+++.++ .++..++....|+.++|++
T Consensus 14 ~~~V-~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 14 GVVV-LSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CEEE-EEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 5555 79999999999 9999999999999999999 8888888776688888876
No 48
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=98.28 E-value=2.1e-05 Score=84.65 Aligned_cols=164 Identities=16% Similarity=0.173 Sum_probs=94.0
Q ss_pred CCceEEEEEeeeeccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCC-cEEEEEecCC--------cEEEEEEEEec--
Q 001444 46 VPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGP-VVAEAMFVNR--------EEIPVYPIYRD-- 114 (1076)
Q Consensus 46 ~~svV~I~~~~~~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~-~~~~v~~~~~--------~~~~a~vv~~d-- 114 (1076)
.|-+|.|.... ....|+|.+|+++ +|||++|++.... ....+.+... ..+.+.-+..+
T Consensus 13 ~Pw~~~i~~~~---------~~~~C~GtlIs~~--~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~ 81 (229)
T smart00020 13 FPWQVSLQYRG---------GRHFCGGSLISPR--WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPN 81 (229)
T ss_pred CCcEEEEEEcC---------CCcEEEEEEecCC--EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCC
Confidence 35566665421 3678999999964 9999999999542 3555655432 34555555544
Q ss_pred -----CCCcEEEEEEcCCC-CccccccCCCCCC--cCCCCCCEEEEEecCCCCC------CeEEEEEEEEecCC--CCCC
Q 001444 115 -----PVHDFGFFRYDPSA-IQFLNYDEIPLAP--EAACVGLEIRVVGNDSGEK------VSILAGTLARLDRD--APHY 178 (1076)
Q Consensus 115 -----~~~DlAlLk~~~~~-~~~~~~~~l~l~~--~~~~~G~~V~~iG~p~g~~------~s~~~G~is~~~~~--~~~~ 178 (1076)
..+|+|||+++..- +.. .+.++.|.. ..+..|+.+.+.|++.... .......+..+.+. ...+
T Consensus 82 ~~~~~~~~DiAll~L~~~i~~~~-~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~ 160 (229)
T smart00020 82 YNPSTYDNDIALLKLKSPVTLSD-NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAY 160 (229)
T ss_pred CCCCCCcCCEEEEEECcccCCCC-ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhh
Confidence 35799999997531 111 234444443 3677899999999876542 12222222222220 0001
Q ss_pred CCC-CccccceeeEE--EeeccCCCCCCCceecCCC--cEEEEeeccc
Q 001444 179 KKD-GYNDFNTFYMQ--AASGTKGGSSGSPVIDWQG--RAVALNAGSK 221 (1076)
Q Consensus 179 ~~~-~~~~~~~~~i~--~~a~~~~G~SGgPv~n~~G--~vVGi~~~~~ 221 (1076)
... ........... ......+|.||||++...+ .++||.+.+.
T Consensus 161 ~~~~~~~~~~~C~~~~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 161 SGGGAITDNMLCAGGLEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred ccccccCCCcEeecCCCCCCcccCCCCCCeeEEECCCEEEEEEEEECC
Confidence 000 01110001111 1244567999999997664 8999998764
No 49
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.28 E-value=2.4e-06 Score=74.02 Aligned_cols=53 Identities=36% Similarity=0.441 Sum_probs=50.0
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEE
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVR 1029 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~ 1029 (1076)
+++|..+.++|||+++||++||.|++|||+++.++ +++.+.++..+ ++.++|+
T Consensus 14 ~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~-g~~v~l~ 68 (70)
T cd00136 14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEV-GEKVTLT 68 (70)
T ss_pred CEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCC-CCeEEEE
Confidence 78999999999999999999999999999999999 99999999865 8888887
No 50
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.22 E-value=1.9e-05 Score=84.45 Aligned_cols=158 Identities=14% Similarity=0.099 Sum_probs=92.3
Q ss_pred CcEEEEEEEeCCCcEEEeCccccCCCCcE-EE-EEec-----CCc---EEEEEEEEecC----CCcEEEEEEcCCCCc--
Q 001444 67 ASYATGFVVDKRRGIILTNRHVVKPGPVV-AE-AMFV-----NRE---EIPVYPIYRDP----VHDFGFFRYDPSAIQ-- 130 (1076)
Q Consensus 67 ~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~-~~-v~~~-----~~~---~~~a~vv~~d~----~~DlAlLk~~~~~~~-- 130 (1076)
...+++|+|.++ .+||++||+-..... .. ..++ ++. .+........+ ..|.+...+.+..+.
T Consensus 63 ~~~~~~~lI~pn--tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g 140 (251)
T COG3591 63 RLCTAATLIGPN--TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESG 140 (251)
T ss_pred cceeeEEEEcCc--eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccC
Confidence 334566999976 999999999854321 11 1121 111 12222222222 346666666543222
Q ss_pred --cccc---cCCCCCCcCCCCCCEEEEEecCCCCCCe----EEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCC
Q 001444 131 --FLNY---DEIPLAPEAACVGLEIRVVGNDSGEKVS----ILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGS 201 (1076)
Q Consensus 131 --~~~~---~~l~l~~~~~~~G~~V~~iG~p~g~~~s----~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~ 201 (1076)
+..+ ...++ ....+.++.+.++|||.+.... ...+.+..... .+++.++.+.+|+
T Consensus 141 ~~~~~~~~~~~~~~-~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~---------------~~l~y~~dT~pG~ 204 (251)
T COG3591 141 INIGDVVNYLKRNT-ASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG---------------NKLFYDADTLPGS 204 (251)
T ss_pred CCcccccccccccc-ccccccCceeEEEeccCCCCcceeEeeecceeEEEec---------------ceEEEEecccCCC
Confidence 1111 11111 4567899999999999876532 23344433322 2489999999999
Q ss_pred CCCceecCCCcEEEEeecccCCC---CCccc-cCHHHHHHHHHHH
Q 001444 202 SGSPVIDWQGRAVALNAGSKSSS---ASAFF-LPLERVVRALRFL 242 (1076)
Q Consensus 202 SGgPv~n~~G~vVGi~~~~~~~~---~~~fa-lP~~~i~~~l~~l 242 (1076)
||+||++.+.++||+++.+.... ..+++ .-...++++++++
T Consensus 205 SGSpv~~~~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~ 249 (251)
T COG3591 205 SGSPVLISKDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQN 249 (251)
T ss_pred CCCceEecCceEEEEEecCCCcccccccCcceEecHHHHHHHHHh
Confidence 99999999999999998776432 22332 2234555555443
No 51
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.06 E-value=9.6e-05 Score=78.51 Aligned_cols=166 Identities=20% Similarity=0.241 Sum_probs=91.8
Q ss_pred HHhCCceEEEEEeeeeccCCCCCCCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEE-----EEEecCCC
Q 001444 43 NKVVPAVVVLRTTACRAFDTEAAGASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVY-----PIYRDPVH 117 (1076)
Q Consensus 43 ~~v~~svV~I~~~~~~~~d~~~~~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~-----vv~~d~~~ 117 (1076)
.-+...|+.|.... ......=-||.... +|+||+|........+.+...-|. |... -+..=+..
T Consensus 14 n~Ia~~ic~l~n~s-------~~~~~~l~gigyG~---~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~i~~~ 82 (235)
T PF00863_consen 14 NPIASNICRLTNES-------DGGTRSLYGIGYGS---YIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHPIEGR 82 (235)
T ss_dssp HHHHTTEEEEEEEE-------TTEEEEEEEEEETT---EEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE-TCS
T ss_pred chhhheEEEEEEEe-------CCCeEEEEEEeECC---EEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEEeCCc
Confidence 34556778887542 11223335666664 999999999876667777765553 3222 24445688
Q ss_pred cEEEEEEcCCCCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeecc
Q 001444 118 DFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGT 197 (1076)
Q Consensus 118 DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~ 197 (1076)
||.++|+. +++|+.+- .+.-..++.|++|.+||.-+....... .+|......|. . +..+...-.++
T Consensus 83 DiviirmP-kDfpPf~~---kl~FR~P~~~e~v~mVg~~fq~k~~~s--~vSesS~i~p~----~----~~~fWkHwIsT 148 (235)
T PF00863_consen 83 DIVIIRMP-KDFPPFPQ---KLKFRAPKEGERVCMVGSNFQEKSISS--TVSESSWIYPE----E----NSHFWKHWIST 148 (235)
T ss_dssp SEEEEE---TTS----S------B----TT-EEEEEEEECSSCCCEE--EEEEEEEEEEE----T----TTTEEEE-C--
T ss_pred cEEEEeCC-cccCCcch---hhhccCCCCCCEEEEEEEEEEcCCeeE--EECCceEEeec----C----CCCeeEEEecC
Confidence 99999995 44443221 122356789999999998666543211 22222111110 1 12247777888
Q ss_pred CCCCCCCceec-CCCcEEEEeecccCCCCCccccCHH
Q 001444 198 KGGSSGSPVID-WQGRAVALNAGSKSSSASAFFLPLE 233 (1076)
Q Consensus 198 ~~G~SGgPv~n-~~G~vVGi~~~~~~~~~~~falP~~ 233 (1076)
..|+-|+|+++ .||++|||++........+|+.|+.
T Consensus 149 k~G~CG~PlVs~~Dg~IVGiHsl~~~~~~~N~F~~f~ 185 (235)
T PF00863_consen 149 KDGDCGLPLVSTKDGKIVGIHSLTSNTSSRNYFTPFP 185 (235)
T ss_dssp -TT-TT-EEEETTT--EEEEEEEEETTTSSEEEEE--
T ss_pred CCCccCCcEEEcCCCcEEEEEcCccCCCCeEEEEcCC
Confidence 99999999998 6899999999888888889997764
No 52
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.02 E-value=3.5e-05 Score=69.33 Aligned_cols=64 Identities=25% Similarity=0.372 Sum_probs=48.2
Q ss_pred CcEEEEEEecCC--------Ccccc-C--CCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEeCCe-EEEE
Q 001444 297 TGLLVVDSVVPG--------GPAHL-R--LEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGI-SMTV 360 (1076)
Q Consensus 297 ~G~lvv~~V~~~--------spA~~-g--L~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R~g~-~~~~ 360 (1076)
.|.+.|.+|.++ ||..+ | +++||+|++|||+++..-.++..+|....|+.|.|+|.+.+. .+++
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~~~~R~v 86 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKPGGARTV 86 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-STT-EEEE
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCCCCceEE
Confidence 577888899986 88888 6 559999999999999999999999999999999999999663 4444
No 53
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.94 E-value=2.5e-05 Score=69.82 Aligned_cols=53 Identities=32% Similarity=0.494 Sum_probs=46.5
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEE
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVR 1029 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~ 1029 (1076)
.++||+++.++|||+++||++||.|++|||+++.++ ++..++++..+ ..++|+
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~--~~v~L~ 79 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSAS--NPVTLT 79 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHST--SEEEEE
T ss_pred CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCC--CcEEEE
Confidence 389999999999999999999999999999999977 77788888743 377776
No 54
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.90 E-value=3e-05 Score=88.95 Aligned_cols=69 Identities=22% Similarity=0.423 Sum_probs=56.6
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEEEeCCeEEEEEEEecc
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERGGISMTVNLVVQD 366 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~ 366 (1076)
.+++| ..|.++|||++ ||++||+|++|||+++.+| .++...+....|..+.+++.|+|+...+++++..
T Consensus 62 ~~~~V-~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~ 133 (334)
T TIGR00225 62 GEIVI-VSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRAGKSKPLTFTLKR 133 (334)
T ss_pred CEEEE-EEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeCCCCceEEEEEEE
Confidence 34444 79999999999 9999999999999999987 4555666666789999999999877766666654
No 55
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.90 E-value=7.9e-05 Score=85.27 Aligned_cols=68 Identities=24% Similarity=0.385 Sum_probs=59.4
Q ss_pred CcEEEEEec--------CCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEe
Q 001444 975 HGVYVARWC--------HGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQ 1044 (1076)
Q Consensus 975 ~gv~V~~v~--------~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~ 1044 (1076)
.||+|.... .+|||+++||++||.|++|||+++.++++|.+++++.. ++.+.|+ +.|+|+.+.+++++
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~Lt-V~R~Ge~~tv~V~P 180 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAG-GEKLTLT-IERGGKIIETVIKP 180 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEE-EEECCEEEEEEEEE
Confidence 477775542 36999999999999999999999999999999999875 8889999 78999999888874
No 56
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.90 E-value=4.1e-05 Score=68.37 Aligned_cols=53 Identities=30% Similarity=0.396 Sum_probs=46.9
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecC--CHHHHHHHHHhCCCCCeEEEE
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTP--DLEAFVNVTKEIEHGEFVRVR 1029 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~--~l~~f~~~v~~~~~~~~v~l~ 1029 (1076)
.+++|..+.++|||+++||++||.|++|||+++. +++++.+.++... + .++|.
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~-~-~v~l~ 80 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSG-D-EVTLT 80 (82)
T ss_pred CCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCC-C-eEEEE
Confidence 3789999999999999999999999999999999 9999999998743 2 55554
No 57
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.90 E-value=3.6e-05 Score=69.08 Aligned_cols=59 Identities=29% Similarity=0.317 Sum_probs=47.1
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeC
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLN 1034 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~ 1034 (1076)
.+++|..+.++|||+++||++||.|++|||+++.++.+..........+..++|. +.|+
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~-i~r~ 84 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLT-VLRG 84 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEE-EEeC
Confidence 3899999999999999999999999999999999876554444433335578887 4554
No 58
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.89 E-value=5.8e-05 Score=88.15 Aligned_cols=75 Identities=21% Similarity=0.271 Sum_probs=63.2
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeCCccCcce
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQDLHYWPTW 1052 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~~~y~pt~ 1052 (1076)
+++|..|.++|||+++||++||+|++|||+++.++ .++...++. +.++.+.|+ +.|+|..+.++|+...-..+++
T Consensus 103 g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~lt-v~r~g~~~~~~l~r~~v~~~~v 179 (389)
T PLN00049 103 GLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELT-LRRGPETRLVTLTREKVSLNPV 179 (389)
T ss_pred cEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEE-EEECCEEEEEEEEeeeEeccce
Confidence 78999999999999999999999999999999864 778888865 458899998 6799998888887655444444
No 59
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.86 E-value=3e-05 Score=89.10 Aligned_cols=72 Identities=21% Similarity=0.278 Sum_probs=58.3
Q ss_pred EEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeCCccCcceEEEE
Q 001444 979 VARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQDLHYWPTWELIF 1056 (1076)
Q Consensus 979 V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~~~y~pt~e~~~ 1056 (1076)
|..|.++|||+++||++||.|++|||+++++|.++...+. ++.+.+++..|+|+.+.+++.++ |...+-+.+
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~--~dedlG~~f 73 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKD--LDEDLGLEF 73 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecC--CCCCCcEEe
Confidence 5778999999999999999999999999999999988874 35688884458999888888875 334444433
No 60
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.83 E-value=5.8e-05 Score=67.71 Aligned_cols=59 Identities=27% Similarity=0.428 Sum_probs=49.3
Q ss_pred ceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCC
Q 001444 867 QVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQG 929 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g 929 (1076)
.+++|..|.++|||+.+ |++||+|++|||+.+.++.+..........+ +.+.+++.|++
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~----~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAG----GKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCC----CeEEEEEEeCC
Confidence 46889999999999999 9999999999999999988776665443433 57899998864
No 61
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.82 E-value=8.8e-05 Score=85.12 Aligned_cols=77 Identities=26% Similarity=0.365 Sum_probs=61.6
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeCCeEEEE--EEEeCCccCcc
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLNGKPRVL--TLKQDLHYWPT 1051 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~--tlk~~~~y~pt 1051 (1076)
+++|..|.++|||+++||++||+|++|||+++.+| +++...++. +.++.+.|+ +.|+|+...+ ++.....+.|+
T Consensus 63 ~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g~~v~l~-v~R~g~~~~~~v~l~~~~~~~~~ 140 (334)
T TIGR00225 63 EIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRG-KKGTKVSLE-ILRAGKSKPLTFTLKRDRIELQT 140 (334)
T ss_pred EEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccC-CCCCEEEEE-EEeCCCCceEEEEEEEEEeeccc
Confidence 68999999999999999999999999999999986 577777766 458899999 6788765544 45444455566
Q ss_pred eEE
Q 001444 1052 WEL 1054 (1076)
Q Consensus 1052 ~e~ 1054 (1076)
...
T Consensus 141 v~~ 143 (334)
T TIGR00225 141 VKA 143 (334)
T ss_pred eEE
Confidence 554
No 62
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=97.80 E-value=0.00019 Score=80.47 Aligned_cols=162 Identities=17% Similarity=0.119 Sum_probs=109.6
Q ss_pred ceEEEEEecCCCHHhhh-ccC-CCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeC--CEEEEEEEeccccC
Q 001444 867 QVLRVKGCLAGSKAENM-LEQ-GDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQ--GREIELQVGTDVRD 942 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~-GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~--g~~~~~~v~l~~~~ 942 (1076)
.++-|.+|..+|+|.++ |.+ =|.|++|||..+..-.|..+++..... ++|+++|.-. -....+.|+.....
T Consensus 15 eg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~s-----ekVkltv~n~kt~~~R~v~I~ps~~w 89 (462)
T KOG3834|consen 15 EGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANS-----EKVKLTVYNSKTQEVRIVEIVPSNNW 89 (462)
T ss_pred eeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcc-----cceEEEEEecccceeEEEEecccccc
Confidence 56779999999999999 655 799999999999987776666532222 5599999843 22333444332211
Q ss_pred CCCCcceeeecCccccCCcHhHhhcCCCCCCCCcE-EEEEecCCChhhhcCCC-CCCeEEEECCeecCCHHHHHHHHHhC
Q 001444 943 GNGTTRVINWCGCIVQDPHPAVRALGFLPEEGHGV-YVARWCHGSPVHRYGLY-ALQWIVEINGKRTPDLEAFVNVTKEI 1020 (1076)
Q Consensus 943 ~~~~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv-~V~~v~~gSpA~~~GL~-~gD~I~~VNg~~v~~l~~f~~~v~~~ 1020 (1076)
... ++|+.++-=... .+. .-+ =|-+|.+.|||+++||+ -+|.|+-+-+.-..+.+||...|...
T Consensus 90 ---ggq---llGvsvrFcsf~------~A~--~~vwHvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIesh 155 (462)
T KOG3834|consen 90 ---GGQ---LLGVSVRFCSFD------GAV--ESVWHVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESH 155 (462)
T ss_pred ---ccc---ccceEEEeccCc------cch--hheeeeeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhc
Confidence 101 233333211100 000 012 27789999999999998 79999999555566778888888875
Q ss_pred CCCCeEEEEEEEeC-CeEEEEEEEeCCcc
Q 001444 1021 EHGEFVRVRTVHLN-GKPRVLTLKQDLHY 1048 (1076)
Q Consensus 1021 ~~~~~v~l~~v~r~-g~~~~~tlk~~~~y 1048 (1076)
.++.++|-+.+-| ...+.+++++|.++
T Consensus 156 -e~kpLklyVYN~D~d~~ReVti~pn~aw 183 (462)
T KOG3834|consen 156 -EGKPLKLYVYNHDTDSCREVTITPNSAW 183 (462)
T ss_pred -cCCCcceeEeecCCCccceEEeeccccc
Confidence 4899998766655 45889999998874
No 63
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.75 E-value=0.00011 Score=84.13 Aligned_cols=69 Identities=28% Similarity=0.437 Sum_probs=59.2
Q ss_pred CCcEEEEEEec-------CCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEeCCeEEEEEEEe
Q 001444 296 ETGLLVVDSVV-------PGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVV 364 (1076)
Q Consensus 296 ~~G~lvv~~V~-------~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l 364 (1076)
..|+||+..-. .++||++ ||++||+|++|||+++.+|.++.+++....++.+.++|.|+|+..++.+++
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~tv~V~P 180 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKIIETVIKP 180 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEEEEEEEE
Confidence 48999985422 2589999 999999999999999999999999987766889999999999988877753
No 64
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.75 E-value=6.3e-05 Score=86.56 Aligned_cols=61 Identities=25% Similarity=0.517 Sum_probs=52.9
Q ss_pred EEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEE-eCCEEEEEEEecc
Q 001444 871 VKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIF-RQGREIELQVGTD 939 (1076)
Q Consensus 871 V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~-R~g~~~~~~v~l~ 939 (1076)
|..|.++|+|+++ |++||+|++|||++|.+|.|+...+. ++.+.++|. |+|+..++.+...
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~--------~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA--------DEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc--------CCcEEEEEEcCCCeEEEEEEecC
Confidence 6789999999999 99999999999999999999887772 256889997 8998888887654
No 65
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.74 E-value=8.3e-05 Score=66.46 Aligned_cols=56 Identities=27% Similarity=0.285 Sum_probs=45.0
Q ss_pred ceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEE
Q 001444 867 QVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIF 926 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~ 926 (1076)
.+++|.+|.++|+|+++ |+.||+|++|||+.+.++...+......... ..++|+|.
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~----~~v~L~V~ 81 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSAS----NPVTLTVQ 81 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHST----SEEEEEEE
T ss_pred CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCC----CcEEEEEC
Confidence 57889999999999999 9999999999999999986655444333333 57888763
No 66
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.72 E-value=0.00011 Score=85.89 Aligned_cols=66 Identities=30% Similarity=0.527 Sum_probs=55.3
Q ss_pred cEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEEEeCCeEEEEEEEe
Q 001444 298 GLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIERGGISMTVNLVV 364 (1076)
Q Consensus 298 G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l 364 (1076)
|++| ..|.++|||++ ||++||+|++|||++|.++ .++..++....|..+.|+|.|+|+..+++++-
T Consensus 103 g~~V-~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 103 GLVV-VAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred cEEE-EEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEe
Confidence 6655 69999999999 9999999999999999865 56666666677889999999999877666553
No 67
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.70 E-value=0.00011 Score=65.66 Aligned_cols=53 Identities=34% Similarity=0.577 Sum_probs=43.6
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeC--ChhHHHHHHhcCCCCeEEEEE
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVIT--QFLKLETLLDDGVDKNIELLI 351 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~--~~~~l~~~l~~~~g~~v~l~v 351 (1076)
.|++| ..|.++|||+. ||++||+|++|||+++. ++.++...+....+ .+++++
T Consensus 26 ~~~~V-~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 26 GGIFV-SRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CCeEE-EEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 46666 79999999999 99999999999999999 78888887765433 566654
No 68
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.61 E-value=0.00024 Score=69.96 Aligned_cols=72 Identities=19% Similarity=0.295 Sum_probs=53.5
Q ss_pred CcEEEEEecCCChhhhcCCCC-CCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEe-CCeEEEEEEEeCCc
Q 001444 975 HGVYVARWCHGSPVHRYGLYA-LQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHL-NGKPRVLTLKQDLH 1047 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~-gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r-~g~~~~~tlk~~~~ 1047 (1076)
.+.-|.+|.++|||+++||.+ .|+|+.+++....+.++|.+.++... ++.+.|.+.+. ....+.+++.|+.+
T Consensus 43 ~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~-~~~l~L~Vyns~~~~vR~V~i~P~~~ 116 (138)
T PF04495_consen 43 EGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANE-NKPLQLYVYNSKTDSVREVTITPSRN 116 (138)
T ss_dssp CEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTT-TS-EEEEEEETTTTCEEEEEE---TT
T ss_pred ceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcC-CCcEEEEEEECCCCeEEEEEEEcCCC
Confidence 477899999999999999998 69999999999999999999999954 88999985443 34577888888654
No 69
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.60 E-value=0.0002 Score=83.70 Aligned_cols=78 Identities=24% Similarity=0.318 Sum_probs=66.6
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeC--CeEEEEEEEeCCccCcc
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLN--GKPRVLTLKQDLHYWPT 1051 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~--g~~~~~tlk~~~~y~pt 1051 (1076)
++.|.+..+|+||+++|+++||.|+.|||+++.+. ++.++.++. +.|+.|+|+ +.|. ++++.++++-+.-+-++
T Consensus 113 ~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG-~~Gt~V~L~-i~r~~~~k~~~v~l~Re~i~l~~ 190 (406)
T COG0793 113 GVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRG-KPGTKVTLT-ILRAGGGKPFTVTLTREEIELED 190 (406)
T ss_pred CcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCC-CCCCeEEEE-EEEcCCCceeEEEEEEEEEeccc
Confidence 78899999999999999999999999999999988 678888887 569999999 6775 67899998887666665
Q ss_pred eEEE
Q 001444 1052 WELI 1055 (1076)
Q Consensus 1052 ~e~~ 1055 (1076)
..++
T Consensus 191 v~~~ 194 (406)
T COG0793 191 VAAK 194 (406)
T ss_pred eeee
Confidence 5553
No 70
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.58 E-value=0.00056 Score=61.63 Aligned_cols=65 Identities=25% Similarity=0.252 Sum_probs=44.1
Q ss_pred ceEEEEEecCC--------CHHhhh---ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCC-EEEEE
Q 001444 867 QVLRVKGCLAG--------SKAENM---LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQG-REIEL 934 (1076)
Q Consensus 867 ~~~~V~~V~~~--------s~A~~a---L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g-~~~~~ 934 (1076)
+++.|.++.++ ||..+. +++||+|++|||++++.-.++..++. -+.+ ..+.|+|.+.+ +.+++
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~-~~ag----k~V~Ltv~~~~~~~R~v 86 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADANPYRLLE-GKAG----KQVLLTVNRKPGGARTV 86 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHH-TTTT----SEEEEEEE-STT-EEEE
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhc-ccCC----CEEEEEEecCCCCceEE
Confidence 67889999875 676655 67999999999999999999988883 3443 88999999865 45555
Q ss_pred EE
Q 001444 935 QV 936 (1076)
Q Consensus 935 ~v 936 (1076)
.|
T Consensus 87 ~V 88 (88)
T PF14685_consen 87 VV 88 (88)
T ss_dssp EE
T ss_pred EC
Confidence 43
No 71
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=97.57 E-value=0.0016 Score=69.40 Aligned_cols=169 Identities=20% Similarity=0.226 Sum_probs=96.5
Q ss_pred eeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeec------CC--eEEeEEEEEeeC-------CCcEEEEEECCC-
Q 001444 616 HFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAA------FP--IEIPGEVVFLHP-------VHNFALIAYDPS- 679 (1076)
Q Consensus 616 ~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d------~~--~~~~a~vv~~dp-------~~dlAvlk~d~~- 679 (1076)
...++|++|. .-+|||++|.+.. ..++.+.+.. .+ ..+..+-+..|| .+|+|||+++..
T Consensus 24 ~~~C~G~li~----~~~vLTaahC~~~-~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~ 98 (220)
T PF00089_consen 24 RFFCTGTLIS----PRWVLTAAHCVDG-ASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPI 98 (220)
T ss_dssp EEEEEEEEEE----TTEEEEEGGGHTS-GGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSS
T ss_pred CeeEeEEecc----ccccccccccccc-cccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 6789999999 4599999887766 4455554331 02 244444455554 369999999765
Q ss_pred CCCcccccceeeeeccC-CccCCCCCEEEEEeeCCCCcee----eeeeeEecccceeecCCCCCCc-ccc-cceeEEEEe
Q 001444 680 SLGVAGASVVRAAELLP-EPALRRGDSVYLVGLSRSLQAT----SRKSIVTNPCAALNISSADCPR-YRA-MNMEVIELD 752 (1076)
Q Consensus 680 ~~~~~~~~~v~~~~l~~-~~~l~~G~~V~~iG~p~~~~~~----~~~~~vt~i~~~~~i~~~~~~~-~~~-~~~~~I~~d 752 (1076)
.+. ..+.++.+.. ...++.|+.+.++|++...... .....+. .++...... |.. .....+...
T Consensus 99 ~~~----~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~------~~~~~~c~~~~~~~~~~~~~c~~ 168 (220)
T PF00089_consen 99 TFG----DNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVP------VVSRKTCRSSYNDNLTPNMICAG 168 (220)
T ss_dssp EHB----SSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEE------EEEHHHHHHHTTTTSTTTEEEEE
T ss_pred ccc----cccccccccccccccccccccccccccccccccccccccccccc------ccccccccccccccccccccccc
Confidence 222 4566677766 3346899999999998753221 1111111 111110000 110 122334444
Q ss_pred c----cc-CCCcCceEECCCceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHH
Q 001444 753 T----DF-GSTFSGVLTDEHGRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVL 806 (1076)
Q Consensus 753 ~----~i-g~~sGGpL~d~~G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l 806 (1076)
. .. ...|||||+..++.++||......- ....... +..++....++|
T Consensus 169 ~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c-----~~~~~~~--v~~~v~~~~~WI 220 (220)
T PF00089_consen 169 SSGSGDACQGDSGGPLICNNNYLVGIVSFGENC-----GSPNYPG--VYTRVSSYLDWI 220 (220)
T ss_dssp TTSSSBGGTTTTTSEEEETTEEEEEEEEEESSS-----SBTTSEE--EEEEGGGGHHHH
T ss_pred ccccccccccccccccccceeeecceeeecCCC-----CCCCcCE--EEEEHHHhhccC
Confidence 3 22 2358999999999999998763111 0111223 347777666654
No 72
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.50 E-value=0.00016 Score=84.12 Aligned_cols=121 Identities=19% Similarity=0.293 Sum_probs=81.4
Q ss_pred EEEEecCCCHHhhh--ccCCCEEEEECCEEcCCh--hHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCCCC
Q 001444 870 RVKGCLAGSKAENM--LEQGDMMLAINKQPVTCF--HDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDGNG 945 (1076)
Q Consensus 870 ~V~~V~~~s~A~~a--L~~GDiIlsVnG~~V~~~--~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~~~ 945 (1076)
+|.....++||++. |.-||.|++|||...-.. +.-+.+++..+.- ..|+|+|++=--..++.|.-.
T Consensus 676 ViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQ----T~VkltiV~cpPV~~V~I~RP------ 745 (829)
T KOG3605|consen 676 VIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQ----TAVKLNIVSCPPVTTVLIRRP------ 745 (829)
T ss_pred HHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhccccc----ceEEEEEecCCCceEEEeecc------
Confidence 35567789999977 999999999999887653 3445555444442 668888887444444443311
Q ss_pred CcceeeecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHh
Q 001444 946 TTRVINWCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKE 1019 (1076)
Q Consensus 946 ~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~ 1019 (1076)
+...-+||.+|. || |.++..|+-|++-|++.|-+|++|||+.|--. +-.++++..
T Consensus 746 --d~kyQLGFSVQN----------------Gi-ICSLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~ 802 (829)
T KOG3605|consen 746 --DLRYQLGFSVQN----------------GI-ICSLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSN 802 (829)
T ss_pred --cchhhccceeeC----------------cE-eehhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHH
Confidence 111223444432 55 56678999999999999999999999997543 445555543
No 73
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.42 E-value=0.00047 Score=80.52 Aligned_cols=67 Identities=24% Similarity=0.439 Sum_probs=52.3
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChh--HHHHHHhcCCCCeEEEEEEeCCeEEEEEEEe
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFL--KLETLLDDGVDKNIELLIERGGISMTVNLVV 364 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~--~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l 364 (1076)
.++.| .++.+++||++ ||++||+|++|||+++.... +....+....|..|+|++.|.+....+.+++
T Consensus 112 ~~~~V-~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~~~~k~~~v~l 181 (406)
T COG0793 112 GGVKV-VSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRAGGGKPFTVTL 181 (406)
T ss_pred CCcEE-EecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEcCCCceeEEEE
Confidence 44444 69999999999 99999999999999998874 3555667789999999999974333333333
No 74
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.40 E-value=0.00055 Score=69.71 Aligned_cols=73 Identities=27% Similarity=0.445 Sum_probs=62.9
Q ss_pred CcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHH---HHhcCCCCeEEEEEEeCCeEEEEEEEeccCCC
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLET---LLDDGVDKNIELLIERGGISMTVNLVVQDLHS 369 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~---~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~ 369 (1076)
..+.+|..|.|+|||+. ||+.||.|+++....-.+|..++. ....+.++.+.++|.|.|+.+.+.+++..|..
T Consensus 138 ~~Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~W~G 214 (231)
T KOG3129|consen 138 RPFAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLTPKKWQG 214 (231)
T ss_pred cceEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeCcccccC
Confidence 55777899999999999 999999999999888777776653 34668889999999999999999998888764
No 75
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.32 E-value=0.00054 Score=67.47 Aligned_cols=86 Identities=23% Similarity=0.371 Sum_probs=58.5
Q ss_pred CCccceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCC-CCEEEEECCEEeCChhHHHH
Q 001444 260 RGTLQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEP-GDVLVRVNGEVITQFLKLET 337 (1076)
Q Consensus 260 rg~lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~-GD~Il~VnG~~v~~~~~l~~ 337 (1076)
.|.||+.++.-.+..+ ...+.-| -.|.|+|||++ ||++ .|.|+.+|+..+.+.+++.+
T Consensus 25 ~g~LG~sv~~~~~~~~-------------------~~~~~~V-l~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~ 84 (138)
T PF04495_consen 25 QGLLGISVRFESFEGA-------------------EEEGWHV-LRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFE 84 (138)
T ss_dssp SSSS-EEEEEEE-TTG-------------------CCCEEEE-EEE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHH
T ss_pred CCCCcEEEEEeccccc-------------------ccceEEE-eEecCCCHHHHCCccccccEEEEccceecCCHHHHHH
Confidence 4778888886544322 2355555 59999999999 9999 69999999999999999999
Q ss_pred HHhcCCCCeEEEEEEeCC--eEEEEEEEec
Q 001444 338 LLDDGVDKNIELLIERGG--ISMTVNLVVQ 365 (1076)
Q Consensus 338 ~l~~~~g~~v~l~v~R~g--~~~~~~v~l~ 365 (1076)
.+..+.++.+.|.|+... ..+++++.+.
T Consensus 85 ~v~~~~~~~l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 85 LVEANENKPLQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp HHHHTTTS-EEEEEEETTTTCEEEEEE---
T ss_pred HHHHcCCCcEEEEEEECCCCeEEEEEEEcC
Confidence 998889999999999754 4455555544
No 76
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.28 E-value=0.00074 Score=73.58 Aligned_cols=62 Identities=18% Similarity=0.359 Sum_probs=52.3
Q ss_pred EecCCCH---Hhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEec
Q 001444 873 GCLAGSK---AENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGT 938 (1076)
Q Consensus 873 ~V~~~s~---A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l 938 (1076)
++.|+.. ...+ ||+||++++|||..+++.++...++..+... ..++|+|+|+|+..++.+.+
T Consensus 210 rl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~----tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 210 AVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSM----DSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred EECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccC----CeEEEEEEECCEEEEEEEEc
Confidence 4556643 4466 9999999999999999999988888777765 88999999999999998865
No 77
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.26 E-value=0.00095 Score=72.44 Aligned_cols=71 Identities=15% Similarity=0.192 Sum_probs=64.0
Q ss_pred ceEEEEEecCCCHHhhhccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEe-CCEEEEEEEecccc
Q 001444 867 QVLRVKGCLAGSKAENMLEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFR-QGREIELQVGTDVR 941 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~aL~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R-~g~~~~~~v~l~~~ 941 (1076)
.|++|..+..++|+...|+.||.|++|||+++.+..++...+...++| +++++++.| +++...+++++...
T Consensus 130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~G----d~VtI~~~r~~~~~~~~~~tl~~~ 201 (342)
T COG3480 130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPG----DEVTIDYERHNETPEIVTITLIKN 201 (342)
T ss_pred eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHhccCCC----CeEEEEEEeccCCCceEEEEEEee
Confidence 467788899999999889999999999999999999999999888887 999999997 88888888888766
No 78
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=97.22 E-value=0.0043 Score=69.86 Aligned_cols=165 Identities=22% Similarity=0.253 Sum_probs=105.2
Q ss_pred CcEEEEEEecCCCcccc-CCCCC-CEEEEECCEEeCChhH-HHHHHhcCCCCeEEEEEEeCCeEE--EEEEEeccCCCCC
Q 001444 297 TGLLVVDSVVPGGPAHL-RLEPG-DVLVRVNGEVITQFLK-LETLLDDGVDKNIELLIERGGISM--TVNLVVQDLHSIT 371 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-gL~~G-D~Il~VnG~~v~~~~~-l~~~l~~~~g~~v~l~v~R~g~~~--~~~v~l~~~~~~~ 371 (1076)
.|.-| -.|..+|||.+ ||++- |-|++|||..+..-.+ |.+.|..+..+ |+++|+...... .+.|+..+...
T Consensus 15 eg~hv-lkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~kt~~~R~v~I~ps~~wg-- 90 (462)
T KOG3834|consen 15 EGYHV-LKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNSKTQEVRIVEIVPSNNWG-- 90 (462)
T ss_pred eeEEE-EEeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEecccceeEEEEeccccccc--
Confidence 44444 69999999999 99985 8999999999986544 45556555544 999998754433 33333322211
Q ss_pred CCcccccCceEEecCCHHHHhccCCCCCeEEEEcCCChhhHcCCC-CCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeE
Q 001444 372 PDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAEPGYMLFRAGVP-RHAIIKKFAGEEISRLEDLISVLSKLSRGARVPI 450 (1076)
Q Consensus 372 ~~~~~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~~gs~a~~aGl~-~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l 450 (1076)
. .++|.+++..+...+-.. .-=++-..+++||+.|||. -+|.|+-+-+.-...-+|+...|... .++.+.+
T Consensus 91 ---g-qllGvsvrFcsf~~A~~~---vwHvl~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLkl 162 (462)
T KOG3834|consen 91 ---G-QLLGVSVRFCSFDGAVES---VWHVLSVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKL 162 (462)
T ss_pred ---c-cccceEEEeccCccchhh---eeeeeecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhc-cCCCcce
Confidence 1 145666655443111100 0112233589999999998 79999999444455666777777774 5788888
Q ss_pred EEEeccccccceEEEEEEecCCCCC
Q 001444 451 EYSSYTDRHRRKSVLVTIDRHEWYA 475 (1076)
Q Consensus 451 ~~~~~~~~~~~~~~~l~i~r~~~~~ 475 (1076)
-+.+.+. .+.+.+.|+=.+ .|-.
T Consensus 163 yVYN~D~-d~~ReVti~pn~-awGg 185 (462)
T KOG3834|consen 163 YVYNHDT-DSCREVTITPNS-AWGG 185 (462)
T ss_pred eEeecCC-CccceEEeeccc-cccc
Confidence 7777654 334455555556 6654
No 79
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.14 E-value=0.00035 Score=62.18 Aligned_cols=58 Identities=28% Similarity=0.375 Sum_probs=44.6
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecC--CHHHHHHHHHhCCCCCeEEEEEEEeCCe
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTP--DLEAFVNVTKEIEHGEFVRVRTVHLNGK 1036 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~--~l~~f~~~v~~~~~~~~v~l~~v~r~g~ 1036 (1076)
.|+||++|.+||||+.+||+.+|.|++|||-... +-+..++.+++ ++-+++. |.|.+.
T Consensus 59 ~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k---~~vl~mL-VaR~~l 118 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITK---EEVLRML-VARQSL 118 (124)
T ss_pred ccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhhH---hHHHHHH-HHhhcc
Confidence 5999999999999999999999999999998754 44666666664 4444444 455443
No 80
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.14 E-value=0.0015 Score=66.56 Aligned_cols=76 Identities=21% Similarity=0.394 Sum_probs=59.2
Q ss_pred cceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEeccccCC
Q 001444 866 RQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVGTDVRDG 943 (1076)
Q Consensus 866 ~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~~ 943 (1076)
+.+.+|.+|.++|||+.+ |+.||.|+++....-.++..|...-...... .+..+.++|.|.|+...+.++...+.+
T Consensus 138 ~~Fa~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~--e~~~v~v~v~R~g~~v~L~ltP~~W~G 214 (231)
T KOG3129|consen 138 RPFAVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSN--EDQIVSVTVIREGQKVVLSLTPKKWQG 214 (231)
T ss_pred cceEEEeecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhc--cCcceeEEEecCCCEEEEEeCcccccC
Confidence 568889999999999999 9999999999887777766554433222221 237799999999999999998876653
No 81
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.11 E-value=0.0015 Score=70.87 Aligned_cols=69 Identities=22% Similarity=0.335 Sum_probs=61.5
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEe
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQ 1044 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~ 1044 (1076)
.|||+..+..+|||..- |.+||.|++|||+++.+.++|.++++..+.|+.|+|.+-+.++.+...+++.
T Consensus 130 ~gvyv~~v~~~~~~~gk-l~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl 198 (342)
T COG3480 130 AGVYVLSVIDNSPFKGK-LEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITL 198 (342)
T ss_pred eeEEEEEccCCcchhce-eccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEE
Confidence 59999999999998854 8999999999999999999999999999999999999544588888766654
No 82
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.08 E-value=0.00089 Score=78.11 Aligned_cols=115 Identities=23% Similarity=0.354 Sum_probs=74.8
Q ss_pred EecCCCcccc--CCCCCCEEEEECCEEeCCh--hHHHHHHhcCCC-CeEEEEEEeCCeEEEEEEEeccCCCCCCCccccc
Q 001444 304 SVVPGGPAHL--RLEPGDVLVRVNGEVITQF--LKLETLLDDGVD-KNIELLIERGGISMTVNLVVQDLHSITPDYFLEV 378 (1076)
Q Consensus 304 ~V~~~spA~~--gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g-~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~ 378 (1076)
....++||++ .|..||.|++|||..+-.+ ..-+.++...-. ..|+++|.+=---.++.|.-.+
T Consensus 679 nmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd------------ 746 (829)
T KOG3605|consen 679 NMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD------------ 746 (829)
T ss_pred hcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc------------
Confidence 3567899999 6999999999999987664 234455544222 3467766553332333221111
Q ss_pred CceEEecCCHHHHhccCCCC-CeEEEE-cCCChhhHcCCCCCCEEEEcCCeecCCH--HHHHHHHHh
Q 001444 379 SGAVIHPLSYQQARNFRFPC-GLVYVA-EPGYMLFRAGVPRHAIIKKFAGEEISRL--EDLISVLSK 441 (1076)
Q Consensus 379 ~G~~~~~l~~~~~~~~~~~~-~gv~v~-~~gs~a~~aGl~~GD~I~~Vng~~v~~l--~~~~~~l~~ 441 (1076)
+.| .+||.+ .||+.+ --|+-|++.|++.|-+|++|||+.|--. +..++.|..
T Consensus 747 -------~ky----QLGFSVQNGiICSLlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~ 802 (829)
T KOG3605|consen 747 -------LRY----QLGFSVQNGIICSLLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSN 802 (829)
T ss_pred -------chh----hccceeeCcEeehhhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHH
Confidence 112 233333 477766 4899999999999999999999988543 355555555
No 83
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.02 E-value=0.0019 Score=79.54 Aligned_cols=69 Identities=28% Similarity=0.364 Sum_probs=56.3
Q ss_pred cEEEEEecCCChhhhc-CCCCCCeEEEEC--CeecCC-----HHHHHHHHHhCCCCCeEEEEEEEe---CCeEEEEEEEe
Q 001444 976 GVYVARWCHGSPVHRY-GLYALQWIVEIN--GKRTPD-----LEAFVNVTKEIEHGEFVRVRTVHL---NGKPRVLTLKQ 1044 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~-GL~~gD~I~~VN--g~~v~~-----l~~f~~~v~~~~~~~~v~l~~v~r---~g~~~~~tlk~ 1044 (1076)
+++|..+.+||||+++ ||++||+|++|| |+++.+ +++.++.++. +.|+.|+|+ +.| ++.++.++|.-
T Consensus 256 ~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG-~~Gt~V~Lt-V~r~~~~~~~~~vtl~R 333 (667)
T PRK11186 256 YTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKG-PKGSKVRLE-ILPAGKGTKTRIVTLTR 333 (667)
T ss_pred eEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcC-CCCCEEEEE-EEeCCCCCceEEEEEEe
Confidence 5789999999999998 999999999999 555443 4688888887 669999999 565 45778888775
Q ss_pred CC
Q 001444 1045 DL 1046 (1076)
Q Consensus 1045 ~~ 1046 (1076)
+.
T Consensus 334 ~~ 335 (667)
T PRK11186 334 DK 335 (667)
T ss_pred ee
Confidence 54
No 84
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.01 E-value=0.0023 Score=69.88 Aligned_cols=55 Identities=11% Similarity=0.219 Sum_probs=51.0
Q ss_pred hhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEE
Q 001444 988 VHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLK 1043 (1076)
Q Consensus 988 A~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk 1043 (1076)
-.++||++||++++|||.++.+.++..++++...+.+.++|+ |.|||++..+.+.
T Consensus 220 F~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~lt-VeRdGq~~~i~i~ 274 (276)
T PRK09681 220 FDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLT-VLRKGARHDISIA 274 (276)
T ss_pred HHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEE-EEECCEEEEEEEE
Confidence 457899999999999999999999999999999999999999 8999999888764
No 85
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=96.97 E-value=0.01 Score=63.56 Aligned_cols=93 Identities=23% Similarity=0.281 Sum_probs=62.9
Q ss_pred ceeeEEEEEEEeeCCceEEEEeCccccCC-CccEEEEeecC--------CeEEeEEEEEeeC-------CCcEEEEEECC
Q 001444 615 QHFFGTGVIIYHSQSMGLVVVDKNTVAIS-ASDVMLSFAAF--------PIEIPGEVVFLHP-------VHNFALIAYDP 678 (1076)
Q Consensus 615 ~~~~GsG~vId~~~~~G~IlTn~~~V~~~-~~~i~v~~~d~--------~~~~~a~vv~~dp-------~~dlAvlk~d~ 678 (1076)
....++|.+|+ ..+|||++|.+... ...+.|.+... ...+..+-+..|| .+|+|||+++.
T Consensus 23 ~~~~C~GtlIs----~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~ 98 (232)
T cd00190 23 GRHFCGGSLIS----PRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKR 98 (232)
T ss_pred CcEEEEEEEee----CCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECC
Confidence 34679999999 67999998887654 23455554421 2234555566775 47999999964
Q ss_pred C-CCCcccccceeeeeccCCc-cCCCCCEEEEEeeCCCC
Q 001444 679 S-SLGVAGASVVRAAELLPEP-ALRRGDSVYLVGLSRSL 715 (1076)
Q Consensus 679 ~-~~~~~~~~~v~~~~l~~~~-~l~~G~~V~~iG~p~~~ 715 (1076)
. .+. ..++++.|.... .+..|+.+.++|+....
T Consensus 99 ~~~~~----~~v~picl~~~~~~~~~~~~~~~~G~g~~~ 133 (232)
T cd00190 99 PVTLS----DNVRPICLPSSGYNLPAGTTCTVSGWGRTS 133 (232)
T ss_pred cccCC----CcccceECCCccccCCCCCEEEEEeCCcCC
Confidence 3 222 346667776552 47889999999986543
No 86
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=96.95 E-value=0.011 Score=63.09 Aligned_cols=179 Identities=11% Similarity=0.102 Sum_probs=88.6
Q ss_pred cceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEEeEE---EEEe
Q 001444 589 ESVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEIPGE---VVFL 665 (1076)
Q Consensus 589 ~~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~~a~---vv~~ 665 (1076)
..-+...++.++.... .....=.|+... .|||||+|....+...+.|+..- |.- ... -+.+
T Consensus 13 yn~Ia~~ic~l~n~s~---------~~~~~l~gigyG-----~~iItn~HLf~~nng~L~i~s~h-G~f-~v~nt~~lkv 76 (235)
T PF00863_consen 13 YNPIASNICRLTNESD---------GGTRSLYGIGYG-----SYIITNAHLFKRNNGELTIKSQH-GEF-TVPNTTQLKV 76 (235)
T ss_dssp -HHHHTTEEEEEEEET---------TEEEEEEEEEET-----TEEEEEGGGGSSTTCEEEEEETT-EEE-EECEGGGSEE
T ss_pred cchhhheEEEEEEEeC---------CCeEEEEEEeEC-----CEEEEChhhhccCCCeEEEEeCc-eEE-EcCCccccce
Confidence 3446677888875433 222334455554 49999988887877888887764 432 111 1122
Q ss_pred e--CCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcccc
Q 001444 666 H--PVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRA 743 (1076)
Q Consensus 666 d--p~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~~ 743 (1076)
+ +..|+.|+|.. +++|+ --+-++|. ..+.||+|..||..+..... ..+||+. +-+.+......++
T Consensus 77 ~~i~~~DiviirmP-kDfpP----f~~kl~FR---~P~~~e~v~mVg~~fq~k~~--~s~vSes--S~i~p~~~~~fWk- 143 (235)
T PF00863_consen 77 HPIEGRDIVIIRMP-KDFPP----FPQKLKFR---APKEGERVCMVGSNFQEKSI--SSTVSES--SWIYPEENSHFWK- 143 (235)
T ss_dssp EE-TCSSEEEEE---TTS--------S---B-------TT-EEEEEEEECSSCCC--EEEEEEE--EEEEEETTTTEEE-
T ss_pred EEeCCccEEEEeCC-cccCC----cchhhhcc---CCCCCCEEEEEEEEEEcCCe--eEEECCc--eEEeecCCCCeeE-
Confidence 2 45799999974 44443 00113342 25899999999987655332 2333321 1111111111111
Q ss_pred cceeEEEEecccCCCcCceEECCC-ceEEEEEeeccccccccCCCCCcceeEeccchhhHHHHHHHHh
Q 001444 744 MNMEVIELDTDFGSTFSGVLTDEH-GRVQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYTISRVLDKII 810 (1076)
Q Consensus 744 ~~~~~I~~d~~ig~~sGGpL~d~~-G~VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~v~~~l~~l~ 810 (1076)
. .+++.-|. ||.||++.. |.+|||..... .....+|+-++|-+....+++...
T Consensus 144 ---H--wIsTk~G~-CG~PlVs~~Dg~IVGiHsl~~--------~~~~~N~F~~f~~~f~~~~l~~~~ 197 (235)
T PF00863_consen 144 ---H--WISTKDGD-CGLPLVSTKDGKIVGIHSLTS--------NTSSRNYFTPFPDDFEEFYLENIE 197 (235)
T ss_dssp ---E---C---TT--TT-EEEETTT--EEEEEEEEE--------TTTSSEEEEE--TTHHHHHCC-CC
T ss_pred ---E--EecCCCCc-cCCcEEEcCCCcEEEEEcCcc--------CCCCeEEEEcCCHHHHHHHhcccc
Confidence 2 33444465 999999975 99999987632 234556776777776666655443
No 87
>PRK11186 carboxy-terminal protease; Provisional
Probab=96.94 E-value=0.0023 Score=78.67 Aligned_cols=66 Identities=29% Similarity=0.431 Sum_probs=49.0
Q ss_pred cEEEEEEecCCCcccc--CCCCCCEEEEEC--CEEeCC---h--hHHHHHHhcCCCCeEEEEEEeC---CeEEEEEEE
Q 001444 298 GLLVVDSVVPGGPAHL--RLEPGDVLVRVN--GEVITQ---F--LKLETLLDDGVDKNIELLIERG---GISMTVNLV 363 (1076)
Q Consensus 298 G~lvv~~V~~~spA~~--gL~~GD~Il~Vn--G~~v~~---~--~~l~~~l~~~~g~~v~l~v~R~---g~~~~~~v~ 363 (1076)
+.++|..|.|||||++ ||++||+|++|| |+++.+ | .++..++....|.+|.|+|.|+ ++..+++++
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r~~~~~~~~~vtl~ 332 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILPAGKGTKTRIVTLT 332 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEeCCCCCceEEEEEE
Confidence 3345579999999998 899999999999 554432 2 3566777778899999999984 344555443
No 88
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.92 E-value=0.0013 Score=76.03 Aligned_cols=86 Identities=30% Similarity=0.394 Sum_probs=67.8
Q ss_pred cceEEEEcChhHHHHcCCChhHHHhhhcCCCCCCCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHH-h
Q 001444 263 LQVTFVHKGFDETRRLGLQSATEQMVRHASPPGETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLL-D 340 (1076)
Q Consensus 263 lg~~~~~~~~~~~~~lGl~~~~~~~~~~~~~~~~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l-~ 340 (1076)
.|+.+..++.+ .-.||+.. ..+.|..++..|.++|||++ ||.+||.|++|||. ...+ +
T Consensus 439 ~gL~~~~~~~~-~~~LGl~v-----------~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~ 498 (558)
T COG3975 439 FGLTFTPKPRE-AYYLGLKV-----------KSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDR 498 (558)
T ss_pred cceEEEecCCC-CcccceEe-----------cccCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccccc
Confidence 67777777665 44577652 14567777789999999999 99999999999999 2233 4
Q ss_pred cCCCCeEEEEEEeCCeEEEEEEEeccCC
Q 001444 341 DGVDKNIELLIERGGISMTVNLVVQDLH 368 (1076)
Q Consensus 341 ~~~g~~v~l~v~R~g~~~~~~v~l~~~~ 368 (1076)
..+++.|++++.|.|..+++.+++....
T Consensus 499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~~ 526 (558)
T COG3975 499 YKVNDKIQVHVFREGRLREFLVKLGGDP 526 (558)
T ss_pred cccccceEEEEccCCceEEeecccCCCc
Confidence 5788999999999999999988776443
No 89
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=96.85 E-value=0.016 Score=62.18 Aligned_cols=93 Identities=22% Similarity=0.275 Sum_probs=64.4
Q ss_pred ceeeEEEEEEEeeCCceEEEEeCccccCCC-ccEEEEeecCC-------eEEeEEEEEeeC-------CCcEEEEEECCC
Q 001444 615 QHFFGTGVIIYHSQSMGLVVVDKNTVAISA-SDVMLSFAAFP-------IEIPGEVVFLHP-------VHNFALIAYDPS 679 (1076)
Q Consensus 615 ~~~~GsG~vId~~~~~G~IlTn~~~V~~~~-~~i~v~~~d~~-------~~~~a~vv~~dp-------~~dlAvlk~d~~ 679 (1076)
.....+|.+|+ +.+|||.+|.+.... ..+.|.+.... ..+...-+..|| .+|+|||+++..
T Consensus 24 ~~~~C~GtlIs----~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~ 99 (229)
T smart00020 24 GRHFCGGSLIS----PRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSP 99 (229)
T ss_pred CCcEEEEEEec----CCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcc
Confidence 35679999999 779999988877653 56777776522 334455555554 479999999654
Q ss_pred -CCCcccccceeeeeccCC-ccCCCCCEEEEEeeCCCC
Q 001444 680 -SLGVAGASVVRAAELLPE-PALRRGDSVYLVGLSRSL 715 (1076)
Q Consensus 680 -~~~~~~~~~v~~~~l~~~-~~l~~G~~V~~iG~p~~~ 715 (1076)
.+. ..++++.+... ..+..|+.+.+.|+....
T Consensus 100 i~~~----~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~ 133 (229)
T smart00020 100 VTLS----DNVRPICLPSSNYNVPAGTTCTVSGWGRTS 133 (229)
T ss_pred cCCC----CceeeccCCCcccccCCCCEEEEEeCCCCC
Confidence 232 35666777653 246789999999987654
No 90
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=96.65 E-value=0.004 Score=64.81 Aligned_cols=67 Identities=27% Similarity=0.337 Sum_probs=55.7
Q ss_pred ceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCEEEEEEEe
Q 001444 867 QVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGREIELQVG 937 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~~~~~~v~ 937 (1076)
-|+.+.-..+++..++. ||.||+-+++|+..+++.+++..+++.+..- ..++++|.|+|+..++.|.
T Consensus 207 ~Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m----~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 207 EGYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNM----PSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EEEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcC----cceEEEEEecCccceeeec
Confidence 34445445556778888 9999999999999999999999888777664 7899999999999988774
No 91
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.38 E-value=0.014 Score=72.33 Aligned_cols=42 Identities=19% Similarity=0.270 Sum_probs=33.9
Q ss_pred CcEEEEEEcCC-----------CCccccccCCCCCCcCCCCCCEEEEEecCCC
Q 001444 117 HDFGFFRYDPS-----------AIQFLNYDEIPLAPEAACVGLEIRVVGNDSG 158 (1076)
Q Consensus 117 ~DlAlLk~~~~-----------~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g 158 (1076)
-||+|+|+-.. +.|+-+-..++++...++.||.|+++|||..
T Consensus 200 gDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~ 252 (698)
T PF10459_consen 200 GDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGR 252 (698)
T ss_pred CceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCc
Confidence 39999999433 5666555668888999999999999999954
No 92
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=96.13 E-value=0.01 Score=61.87 Aligned_cols=61 Identities=21% Similarity=0.202 Sum_probs=50.9
Q ss_pred EEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhc-CCCCeEEEEEEeCCeEEEEEEE
Q 001444 303 DSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDD-GVDKNIELLIERGGISMTVNLV 363 (1076)
Q Consensus 303 ~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~-~~g~~v~l~v~R~g~~~~~~v~ 363 (1076)
+-..+++..+. |||+||+.++||+..+++-.++..+|.. ..-..++++|.|+|+.+.+.|.
T Consensus 212 ~pgkd~slF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 212 EPGKDGSLFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred cCCCCcchhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 55555677777 9999999999999999999998888743 5557899999999999887764
No 93
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.12 E-value=0.013 Score=52.44 Aligned_cols=48 Identities=21% Similarity=0.304 Sum_probs=37.7
Q ss_pred cCCCCCeEEEE--cCCChhhHcCCCCCCEEEEcCCeecC--CHHHHHHHHHh
Q 001444 394 FRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEIS--RLEDLISVLSK 441 (1076)
Q Consensus 394 ~~~~~~gv~v~--~~gs~a~~aGl~~GD~I~~Vng~~v~--~l~~~~~~l~~ 441 (1076)
|+-+..|+||. +.||||+.|||+.+|.|++|||-... +-+..++.+++
T Consensus 54 f~ytD~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k 105 (124)
T KOG3553|consen 54 FSYTDKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITK 105 (124)
T ss_pred CCcCCccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhhH
Confidence 33456899999 48999999999999999999997654 34566666665
No 94
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=95.89 E-value=0.015 Score=67.53 Aligned_cols=90 Identities=21% Similarity=0.289 Sum_probs=64.4
Q ss_pred eeeeeeEEEEcChHhHHHcCCCHHHHHHHHhcCCCccceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHH
Q 001444 830 VRILEVELYPTLLSKARSFGLSDDWVQALVKKDPVRRQVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENAC 908 (1076)
Q Consensus 830 ~~~Lgv~~~~~~~~~a~~~g~~~~wi~~~~~~~~~~~~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l 908 (1076)
+...|+.+...... +..+|+.-. +..+..+|..|.++|||++| |.+||.|++|||. ...+
T Consensus 436 l~~~gL~~~~~~~~-~~~LGl~v~----------~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l 496 (558)
T COG3975 436 LERFGLTFTPKPRE-AYYLGLKVK----------SEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI--------SDQL 496 (558)
T ss_pred hhhcceEEEecCCC-CcccceEec----------ccCCeeEEEecCCCChhHhccCCCccEEEEEcCc--------cccc
Confidence 33356666665433 334555443 23356679999999999999 9999999999999 1222
Q ss_pred HhccCCCCCCCeEEEEEEeCCEEEEEEEeccccC
Q 001444 909 QALDKDGEDNGKLDITIFRQGREIELQVGTDVRD 942 (1076)
Q Consensus 909 ~~~~~g~~~~~~v~l~V~R~g~~~~~~v~l~~~~ 942 (1076)
...+.+ +.+++++.|.|+-+++.+++....
T Consensus 497 ~~~~~~----d~i~v~~~~~~~L~e~~v~~~~~~ 526 (558)
T COG3975 497 DRYKVN----DKIQVHVFREGRLREFLVKLGGDP 526 (558)
T ss_pred cccccc----cceEEEEccCCceEEeecccCCCc
Confidence 234444 789999999999999988876543
No 95
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=95.62 E-value=0.12 Score=55.28 Aligned_cols=116 Identities=21% Similarity=0.217 Sum_probs=57.5
Q ss_pred CcEEEEEEEeCC-CcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEcC--CCCccccccCCCCCCcC
Q 001444 67 ASYATGFVVDKR-RGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDP--SAIQFLNYDEIPLAPEA 143 (1076)
Q Consensus 67 ~~~GTGfvV~~~-~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~--~~~~~~~~~~l~l~~~~ 143 (1076)
++.|||=++.-+ +-.|+|+.||+..+ ..++... +.... .-+...-|||.-.++. -..|.+++.+ .
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg~~--~a~v~~~-g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~----~-- 178 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLGGN--TARVSGV-GTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQ----N-- 178 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCBTT--EEEEEET-TEEEE---EEEEEETTEEEEEETTS-S---B--B-T----T--
T ss_pred ecccccceEEECCeEEEEEEEEEcCCC--eEEEEec-ceEEE---EEEeccCcEEEEECCCCCCCCCceeecC----C--
Confidence 455666555431 34999999999943 3444332 22222 3445566999999943 1222222221 0
Q ss_pred CCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeeccc
Q 001444 144 ACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSK 221 (1076)
Q Consensus 144 ~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~ 221 (1076)
..|---.. + ..-+..|.|..- -.=+=+.+|.||+||+..+|.+||++++..
T Consensus 179 -~~GrAyW~---t---~tGvE~G~ig~~--------------------~~~~fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 179 -YTGRAYWL---T---STGVEPGFIGGG--------------------GAVCFTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp --SEEEEEE---E---TTEEEEEEEETT--------------------EEEESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred -cccceEEE---c---ccCcccceecCc--------------------eEEEEcCCCCCCCccCcCCCCEEEEEecCC
Confidence 11111000 0 011223322111 111235679999999999999999999854
No 96
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=95.31 E-value=0.2 Score=52.45 Aligned_cols=167 Identities=15% Similarity=0.090 Sum_probs=86.1
Q ss_pred CCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCC----------------CcEEEEEEc----
Q 001444 66 GASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPV----------------HDFGFFRYD---- 125 (1076)
Q Consensus 66 ~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~----------------~DlAlLk~~---- 125 (1076)
..+.||=-+++++++..--=.|.+.+......+.+.+|+.+++++....+. .-++-+.-+
T Consensus 18 ~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt~~G 97 (218)
T PF05580_consen 18 TAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNTQFG 97 (218)
T ss_pred CcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEeccccc
Confidence 456778888887666666667777754444566677888888777655322 112222221
Q ss_pred ------CCC-CccccccCCCCC-CcCCCCCCEEEEEecCCCCCC-eEEEEEEEEecCCC-CCCCCCCccccceeeEEEee
Q 001444 126 ------PSA-IQFLNYDEIPLA-PEAACVGLEIRVVGNDSGEKV-SILAGTLARLDRDA-PHYKKDGYNDFNTFYMQAAS 195 (1076)
Q Consensus 126 ------~~~-~~~~~~~~l~l~-~~~~~~G~~V~~iG~p~g~~~-s~~~G~is~~~~~~-~~~~~~~~~~~~~~~i~~~a 195 (1076)
... .....-.++|++ .+.+++|..-+.--. .|... .... .|..+.++. +.....-..-....++....
T Consensus 98 I~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~i-eI~~v~~~~~~~~k~~vi~vtd~~Ll~~TG 175 (218)
T PF05580_consen 98 IYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDI-EIEKVLPQSSPSGKGMVIKVTDPRLLEKTG 175 (218)
T ss_pred eeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEE-EEEEEccCCCCCCCcEEEEECCcchhhhhC
Confidence 110 111122333333 456777764322111 12111 1111 222222221 11110000000112344455
Q ss_pred ccCCCCCCCceecCCCcEEEEeeccc-CCCCCccccCHHHH
Q 001444 196 GTKGGSSGSPVIDWQGRAVALNAGSK-SSSASAFFLPLERV 235 (1076)
Q Consensus 196 ~~~~G~SGgPv~n~~G~vVGi~~~~~-~~~~~~falP~~~i 235 (1076)
.+-.||||||++ .+|++||=++... ++...+|.++++..
T Consensus 176 GIvqGMSGSPI~-qdGKLiGAVthvf~~dp~~Gygi~ie~M 215 (218)
T PF05580_consen 176 GIVQGMSGSPII-QDGKLIGAVTHVFVNDPTKGYGIFIEWM 215 (218)
T ss_pred CEEecccCCCEE-ECCEEEEEEEEEEecCCCceeeecHHHH
Confidence 677899999998 5999999887665 34677888887654
No 97
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=94.76 E-value=0.034 Score=68.99 Aligned_cols=54 Identities=24% Similarity=0.353 Sum_probs=41.9
Q ss_pred eEEEeeccCCCCCCCceecCCCcEEEEeecccCC-----------CCCccccCHHHHHHHHHHHH
Q 001444 190 YMQAASGTKGGSSGSPVIDWQGRAVALNAGSKSS-----------SASAFFLPLERVVRALRFLQ 243 (1076)
Q Consensus 190 ~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~~-----------~~~~falP~~~i~~~l~~l~ 243 (1076)
.+.++..+.+|||||||+|.+|++|||+.-+.-. ...+..+.+..|+.+|+++-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 3688889999999999999999999999754321 23455577788888887764
No 98
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.68 E-value=0.0091 Score=62.59 Aligned_cols=144 Identities=22% Similarity=0.272 Sum_probs=52.1
Q ss_pred CCcEEEEEEEeCCCc--EEEeCccccCCCCcEEEEEecCCcEEEE---EEEEecCCCcEEEEEEcCCCCccccccCCCCC
Q 001444 66 GASYATGFVVDKRRG--IILTNRHVVKPGPVVAEAMFVNREEIPV---YPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLA 140 (1076)
Q Consensus 66 ~~~~GTGfvV~~~~G--~IlTn~Hvv~~~~~~~~v~~~~~~~~~a---~vv~~d~~~DlAlLk~~~~~~~~~~~~~l~l~ 140 (1076)
..++++. |...+| .++|++||... +... ..+.+|+.++. +.++.+...|++||+..+.-...+....+.+.
T Consensus 28 hvGya~c--v~l~~g~~~L~ta~Hv~~~-~~~~-~~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~ 103 (203)
T PF02122_consen 28 HVGYATC--VRLFDGEDALLTARHVWSR-PSKV-TSLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLS 103 (203)
T ss_dssp ------E--EEE----EEEEE-HHHHTS-SS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B-
T ss_pred ccccceE--EECcCCccceecccccCCC-ccce-eEcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCccccccc
Confidence 3455555 432255 99999999995 4443 34456666554 35677889999999998431111122221111
Q ss_pred -CcCCCCCCEEEEEecCCCCCCeEEEE-EEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEee
Q 001444 141 -PEAACVGLEIRVVGNDSGEKVSILAG-TLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNA 218 (1076)
Q Consensus 141 -~~~~~~G~~V~~iG~p~g~~~s~~~G-~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~ 218 (1076)
...+. -| +.. ......+ ..+.-..-.+. .+ .++..-+.+.+|.||.|.++.+ ++||++.
T Consensus 104 ~~~~~~-------~g-~~~-~y~~~~~~~~~~sa~i~g~------~~---~~~~vls~T~~G~SGtp~y~g~-~vvGvH~ 164 (203)
T PF02122_consen 104 QNSQLA-------KG-PVS-FYGFSSGEWPCSSAKIPGT------EG---KFASVLSNTSPGWSGTPYYSGK-NVVGVHT 164 (203)
T ss_dssp ---SEE-------EE-ESS-TTSEEEEEEEEEE-S----------ST---TEEEE-----TT-TT-EEE-SS--EEEEEE
T ss_pred chhhhC-------CC-Cee-eeeecCCCceeccCccccc------cC---cCCceEcCCCCCCCCCCeEECC-CceEeec
Confidence 11100 00 000 0111111 11222111111 11 1467778999999999999999 9999999
Q ss_pred cc---cCCCCCccccCH
Q 001444 219 GS---KSSSASAFFLPL 232 (1076)
Q Consensus 219 ~~---~~~~~~~falP~ 232 (1076)
+. ....+.++.-|+
T Consensus 165 G~~~~~~~~n~n~~spi 181 (203)
T PF02122_consen 165 GSPSGSNRENNNRMSPI 181 (203)
T ss_dssp EE---------------
T ss_pred Ccccccccccccccccc
Confidence 84 333444444333
No 99
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=94.60 E-value=0.15 Score=49.34 Aligned_cols=53 Identities=28% Similarity=0.522 Sum_probs=40.6
Q ss_pred CcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChhHHH--HHHhcCCCCeEEEEE
Q 001444 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLE--TLLDDGVDKNIELLI 351 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~~l~--~~l~~~~g~~v~l~v 351 (1076)
..+++ +.+.||+.|+. ||+.||.+++|||..+..-.+-. ++|....| +|+|.|
T Consensus 115 spiyi-sriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~g-svklvv 171 (207)
T KOG3550|consen 115 SPIYI-SRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVG-SVKLVV 171 (207)
T ss_pred CceEE-EeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcC-cEEEEE
Confidence 44565 89999999999 89999999999999998765543 44555554 566654
No 100
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.45 E-value=0.088 Score=62.24 Aligned_cols=57 Identities=26% Similarity=0.352 Sum_probs=46.1
Q ss_pred CCcEEEEEEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEe
Q 001444 296 ETGLLVVDSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIER 353 (1076)
Q Consensus 296 ~~G~lvv~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R 353 (1076)
...++-|..|.+++||.+ .|++||++++|||.+|++..+....+....|. +.+.+.|
T Consensus 396 ~~~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~-~~~l~~~ 453 (1051)
T KOG3532|consen 396 TNRAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGD-LTVLVER 453 (1051)
T ss_pred CceEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccc-eEEEEee
Confidence 345566689999999999 99999999999999999999998888765553 4444444
No 101
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.27 E-value=0.094 Score=61.98 Aligned_cols=46 Identities=22% Similarity=0.324 Sum_probs=41.9
Q ss_pred ceEEEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHHHHHHhcc
Q 001444 867 QVLRVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIENACQALD 912 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~~~l~~~~ 912 (1076)
+.+.|..|.+++||.++ |++||++++|||.+|++.++....++...
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~ 444 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTT 444 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcc
Confidence 67789999999999999 99999999999999999999998885443
No 102
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=94.12 E-value=0.036 Score=65.06 Aligned_cols=54 Identities=24% Similarity=0.388 Sum_probs=42.3
Q ss_pred CcEEEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCeEEEE
Q 001444 975 HGVYVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEFVRVR 1029 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~v~l~ 1029 (1076)
.|+||..|.+||-|++.||+.||.|++|||+..+++.. .++..-...+..+.|.
T Consensus 562 fgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~-~KA~eiLrnnthLtlt 615 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISA-KKAEEILRNNTHLTLT 615 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhH-HHHHHHhcCCceEEEE
Confidence 58999999999999999999999999999999988753 3333333335555554
No 103
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=94.10 E-value=0.94 Score=49.09 Aligned_cols=146 Identities=12% Similarity=0.040 Sum_probs=77.1
Q ss_pred eeEEEEEEEeeCCceEEEEeCccccCCC-c-cEEEEeecCCe--------EEeEEEEEeeC----CCcEEEEEECCCCCC
Q 001444 617 FFGTGVIIYHSQSMGLVVVDKNTVAISA-S-DVMLSFAAFPI--------EIPGEVVFLHP----VHNFALIAYDPSSLG 682 (1076)
Q Consensus 617 ~~GsG~vId~~~~~G~IlTn~~~V~~~~-~-~i~v~~~d~~~--------~~~a~vv~~dp----~~dlAvlk~d~~~~~ 682 (1076)
...++|+|. +..+||+.|++.... . +....+.. +. .+.....+..+ ..|.+...+.+..+.
T Consensus 64 ~~~~~~lI~----pntvLTa~Hc~~s~~~G~~~~~~~p~-g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~ 138 (251)
T COG3591 64 LCTAATLIG----PNTVLTAGHCIYSPDYGEDDIAAAPP-GVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALE 138 (251)
T ss_pred ceeeEEEEc----CceEEEeeeEEecCCCChhhhhhcCC-cccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhc
Confidence 345669999 779999977765433 1 11122221 11 22233332232 346666666444332
Q ss_pred --cccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcccccceeEEEEecccCCCcC
Q 001444 683 --VAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRAMNMEVIELDTDFGSTFS 760 (1076)
Q Consensus 683 --~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~~~~~~~I~~d~~ig~~sG 760 (1076)
...-.-+....+.....++.+|.+-.+|||.........-.-+ +-+ .... ...-..++|+-.|. ||
T Consensus 139 ~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t-----~~v-----~~~~-~~~l~y~~dT~pG~-SG 206 (251)
T COG3591 139 SGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWEST-----GKV-----NSIK-GNKLFYDADTLPGS-SG 206 (251)
T ss_pred cCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeec-----cee-----EEEe-cceEEEEecccCCC-CC
Confidence 1001122222344445579999999999998775211111111 111 0110 11223355554455 99
Q ss_pred ceEECCCceEEEEEeeccc
Q 001444 761 GVLTDEHGRVQAIWGSFST 779 (1076)
Q Consensus 761 GpL~d~~G~VvGi~~~~~~ 779 (1076)
-|+++.+.+|+|++..-..
T Consensus 207 Spv~~~~~~vigv~~~g~~ 225 (251)
T COG3591 207 SPVLISKDEVIGVHYNGPG 225 (251)
T ss_pred CceEecCceEEEEEecCCC
Confidence 9999999999999887444
No 104
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=93.81 E-value=0.056 Score=52.59 Aligned_cols=31 Identities=29% Similarity=0.359 Sum_probs=22.0
Q ss_pred EEEeeccCCCCCCCceecCCCcEEEEeeccc
Q 001444 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSK 221 (1076)
Q Consensus 191 i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~ 221 (1076)
...+....+|+||+|+||.+|++|||...+.
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~ 118 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGV 118 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccce
Confidence 3444556889999999999999999987654
No 105
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=93.75 E-value=0.062 Score=65.60 Aligned_cols=53 Identities=28% Similarity=0.552 Sum_probs=40.6
Q ss_pred EEEEEecCCCccccCCCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEEEe
Q 001444 300 LVVDSVVPGGPAHLRLEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIER 353 (1076)
Q Consensus 300 lvv~~V~~~spA~~gL~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v~R 353 (1076)
+||+.|.+|||+..+|++||.|+.|||+++.+. .++-.++. .....|.|+|.+
T Consensus 77 viVr~VT~GGps~GKL~PGDQIl~vN~Epv~daprervIdlvR-ace~sv~ltV~q 131 (1298)
T KOG3552|consen 77 VIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVR-ACESSVNLTVCQ 131 (1298)
T ss_pred eEEEEecCCCCccccccCCCeEEEecCcccccccHHHHHHHHH-HHhhhcceEEec
Confidence 445799999999999999999999999999874 33434443 334567777765
No 106
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.19 E-value=0.22 Score=48.23 Aligned_cols=52 Identities=23% Similarity=0.286 Sum_probs=42.0
Q ss_pred cEEEEEecCCChhhhc-CCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEE
Q 001444 976 GVYVARWCHGSPVHRY-GLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVR 1029 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~-GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~ 1029 (1076)
-+||+++.||+-|++- ||+.||.+++|||..+..- +..+++++.. -..|+|.
T Consensus 116 piyisriipggvadrhgglkrgdqllsvngvsvege~hekavellkaa--~gsvklv 170 (207)
T KOG3550|consen 116 PIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAA--VGSVKLV 170 (207)
T ss_pred ceEEEeecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHh--cCcEEEE
Confidence 4699999999999986 8999999999999998865 4566777763 3446665
No 107
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=93.03 E-value=0.54 Score=48.47 Aligned_cols=139 Identities=14% Similarity=0.155 Sum_probs=77.4
Q ss_pred CCcEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEE--EEEecCC---CcEEEEEEcCCCCccccccC-CCC
Q 001444 66 GASYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVY--PIYRDPV---HDFGFFRYDPSAIQFLNYDE-IPL 139 (1076)
Q Consensus 66 ~~~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~--vv~~d~~---~DlAlLk~~~~~~~~~~~~~-l~l 139 (1076)
+...++++.|.. .++|.++|.-. ...+.+ ++..++.. +...+.. .|+++++++... ++-++.. ++
T Consensus 23 g~~t~l~~gi~~--~~~lvp~H~~~----~~~i~i-~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~- 93 (172)
T PF00548_consen 23 GEFTMLALGIYD--RYFLVPTHEEP----EDTIYI-DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFP- 93 (172)
T ss_dssp EEEEEEEEEEEB--TEEEEEGGGGG----CSEEEE-TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSB-
T ss_pred ceEEEecceEee--eEEEEECcCCC----cEEEEE-CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhc-
Confidence 456788888874 49999999222 223333 45555433 2334444 599999996421 1212221 11
Q ss_pred CCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceec---CCCcEEEE
Q 001444 140 APEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVID---WQGRAVAL 216 (1076)
Q Consensus 140 ~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n---~~G~vVGi 216 (1076)
.......+...++-++......+..+.+...+.-. . .. ......+...+++.+|+-||||+. ..++++||
T Consensus 94 -~~~~~~~~~~l~v~~~~~~~~~~~v~~v~~~~~i~-~-~g----~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~Gi 166 (172)
T PF00548_consen 94 -ESIPEYPECVLLVNSTKFPRMIVEVGFVTNFGFIN-L-SG----TTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGI 166 (172)
T ss_dssp -SSGGTEEEEEEEEESSSSTCEEEEEEEEEEEEEEE-E-TT----EEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEE
T ss_pred -cccccCCCcEEEEECCCCccEEEEEEEEeecCccc-c-CC----CEeeEEEEEccCCCCCccCCeEEEeeccCccEEEE
Confidence 11223455555554443333333444444443310 0 00 112346888999999999999984 35789999
Q ss_pred eecc
Q 001444 217 NAGS 220 (1076)
Q Consensus 217 ~~~~ 220 (1076)
+.++
T Consensus 167 HvaG 170 (172)
T PF00548_consen 167 HVAG 170 (172)
T ss_dssp EEEE
T ss_pred Eecc
Confidence 9885
No 108
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=92.75 E-value=0.092 Score=61.82 Aligned_cols=54 Identities=20% Similarity=0.294 Sum_probs=43.0
Q ss_pred CeEEEEc--CCChhhHcCCCCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEE
Q 001444 399 GLVYVAE--PGYMLFRAGVPRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYS 453 (1076)
Q Consensus 399 ~gv~v~~--~gs~a~~aGl~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~ 453 (1076)
-|+||.+ ||+.|.++|||.||.|++||||..+++ .+.+.+.-+.++...+|+++
T Consensus 562 fgifV~~V~pgskAa~~GlKRgDqilEVNgQnfeni-s~~KA~eiLrnnthLtltvK 617 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENI-SAKKAEEILRNNTHLTLTVK 617 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhhhhhhhhccccchhhh-hHHHHHHHhcCCceEEEEEe
Confidence 3799984 999999999999999999999999999 55555555555555555543
No 109
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=92.25 E-value=0.18 Score=61.85 Aligned_cols=61 Identities=30% Similarity=0.442 Sum_probs=50.1
Q ss_pred CCCcEEEEEEecCCCcccc-C-CCCCCEEEEECCEEeCChhHHH-HHHhcCCCCeEEEEEEeCCe
Q 001444 295 GETGLLVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQFLKLE-TLLDDGVDKNIELLIERGGI 356 (1076)
Q Consensus 295 ~~~G~lvv~~V~~~spA~~-g-L~~GD~Il~VnG~~v~~~~~l~-~~l~~~~g~~v~l~v~R~g~ 356 (1076)
..-|+|| +.|.+|++|+. | |+.||.+++|||..+-...+-+ ..|+...|..|.+.|...|.
T Consensus 958 ~klGIYv-KsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 958 RKLGIYV-KSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEVAKQGA 1021 (1629)
T ss_pred cccceEE-EEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEehhhhhh
Confidence 4678888 89999999999 5 9999999999999887776655 34566788899999876543
No 110
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=91.77 E-value=0.81 Score=55.19 Aligned_cols=93 Identities=22% Similarity=0.325 Sum_probs=59.9
Q ss_pred cCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEe----eccCCCCCCCceecCCCc-----
Q 001444 142 EAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAA----SGTKGGSSGSPVIDWQGR----- 212 (1076)
Q Consensus 142 ~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~----a~~~~G~SGgPv~n~~G~----- 212 (1076)
..+..|.+|+=+|.-.|. +.|.+.+..- .|...+-.. -.+++... +=..+|.||+=|++.-+.
T Consensus 585 ~~~~~G~~VfK~GrTTgy----T~G~lNg~kl---vyw~dG~i~-s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gL 656 (695)
T PF08192_consen 585 SNLVPGMEVFKVGRTTGY----TTGILNGIKL---VYWADGKIQ-SSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGL 656 (695)
T ss_pred hccCCCCeEEEecccCCc----cceEecceEE---EEecCCCeE-EEEEEEecCCCccccCCCCcccEEEecccccccCc
Confidence 346789999999988775 4555544421 111111111 13344444 345679999999997555
Q ss_pred -EEEEeecccCC-CCCccccCHHHHHHHHHHH
Q 001444 213 -AVALNAGSKSS-SASAFFLPLERVVRALRFL 242 (1076)
Q Consensus 213 -vVGi~~~~~~~-~~~~falP~~~i~~~l~~l 242 (1076)
|+||..+.... ..++++.|+..|+.-|+..
T Consensus 657 gvvGMlhsydge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 657 GVVGMLHSYDGEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred eeeEEeeecCCccceeeccCcHHHHHHHHHHh
Confidence 99998774433 5788899999888766654
No 111
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=91.76 E-value=5.2 Score=44.68 Aligned_cols=106 Identities=11% Similarity=0.167 Sum_probs=57.6
Q ss_pred CCCcEEEEEEcCC---CCccccccCCCCCCcCCCCCCEEEEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeE
Q 001444 115 PVHDFGFFRYDPS---AIQFLNYDEIPLAPEAACVGLEIRVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYM 191 (1076)
Q Consensus 115 ~~~DlAlLk~~~~---~~~~~~~~~l~l~~~~~~~G~~V~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i 191 (1076)
...++.||.++.. ...+.=+.. .+.....|+.+.+.|+..........-.+..... ....+
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~---~~~~~~~~~~~~~yg~~~~~~~~~~~~~i~~~~~-------------~~~~~ 222 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLAD---SSTNWEKGDEVDVYGFNSTGKLKHRKLKITNCTK-------------CAYSI 222 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCC---CccccccCceEEEeecCCCCeEEEEEEEEEEeec-------------cceeE
Confidence 3468899999865 222222222 3445778999999998322221111111111111 11124
Q ss_pred EEeeccCCCCCCCcee-cCCC--cEEEEeecccCCC--CCccccCHHHHH
Q 001444 192 QAASGTKGGSSGSPVI-DWQG--RAVALNAGSKSSS--ASAFFLPLERVV 236 (1076)
Q Consensus 192 ~~~a~~~~G~SGgPv~-n~~G--~vVGi~~~~~~~~--~~~falP~~~i~ 236 (1076)
........|.+|||++ +.+| .|||+.+.+.... ...+++.+...+
T Consensus 223 ~~~~~~~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~~~~~~~f~~v~~~~ 272 (282)
T PF03761_consen 223 CTKQYSCKGDRGGPLVKNINGRWTLIGVGASGNYECNKNNSYFFNVSWYQ 272 (282)
T ss_pred ecccccCCCCccCeEEEEECCCEEEEEEEccCCCcccccccEEEEHHHhh
Confidence 4555667889999997 3455 4999987654321 245566655543
No 112
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=90.59 E-value=0.3 Score=60.01 Aligned_cols=55 Identities=22% Similarity=0.341 Sum_probs=42.7
Q ss_pred eEEEEEecCCCHHhhhccCCCEEEEECCEEcCC--hhHHHHHHHhccCCCCCCCeEEEEEEeC
Q 001444 868 VLRVKGCLAGSKAENMLEQGDMMLAINKQPVTC--FHDIENACQALDKDGEDNGKLDITIFRQ 928 (1076)
Q Consensus 868 ~~~V~~V~~~s~A~~aL~~GDiIlsVnG~~V~~--~~dl~~~l~~~~~g~~~~~~v~l~V~R~ 928 (1076)
-++|..|.+|+|+...|++||.|++|||++|.. ++.+..+++.- .+.+.++|.+-
T Consensus 76 PviVr~VT~GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRac------e~sv~ltV~qP 132 (1298)
T KOG3552|consen 76 PVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC------ESSVNLTVCQP 132 (1298)
T ss_pred ceEEEEecCCCCccccccCCCeEEEecCcccccccHHHHHHHHHHH------hhhcceEEecc
Confidence 367999999999998899999999999999996 44444444221 25688888874
No 113
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=90.56 E-value=0.64 Score=53.76 Aligned_cols=87 Identities=22% Similarity=0.310 Sum_probs=61.5
Q ss_pred cccCceEEecCCHHHHhccCCCCCeEEEEc--CCChhhHcC-CCCCCEEEEcCCeecCCH--HHHHHHHHhcCCCCeEeE
Q 001444 376 LEVSGAVIHPLSYQQARNFRFPCGLVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEISRL--EDLISVLSKLSRGARVPI 450 (1076)
Q Consensus 376 ~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~--~gs~a~~aG-l~~GD~I~~Vng~~v~~l--~~~~~~l~~~~~g~~v~l 450 (1076)
+.|+|+.+.--+. .+ ...|+||.+ ++++-+..| +.+||.|++||....+|+ ++.+++|+..-.
T Consensus 260 vnfLGiSivgqsn--~r----gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~------ 327 (626)
T KOG3571|consen 260 VNFLGISIVGQSN--AR----GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVS------ 327 (626)
T ss_pred cccceeEeecccC--cC----CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhc------
Confidence 3567777654332 11 126999995 788887887 999999999999999998 588999988532
Q ss_pred EEEeccccccceEEEEEEecCCCCCCCeeeecCC
Q 001444 451 EYSSYTDRHRRKSVLVTIDRHEWYAPPQIYTRND 484 (1076)
Q Consensus 451 ~~~~~~~~~~~~~~~l~i~r~~~~~~~~~~~r~d 484 (1076)
+..++.|++-. .|....+-+.+.+
T Consensus 328 ---------~~gPi~ltvAk-~~DP~~q~~fTip 351 (626)
T KOG3571|consen 328 ---------RPGPIKLTVAK-CWDPNPQSYFTIP 351 (626)
T ss_pred ---------cCCCeEEEEee-ccCCCCcccccCC
Confidence 33566777766 7766555554443
No 114
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.96 E-value=0.19 Score=61.64 Aligned_cols=66 Identities=20% Similarity=0.303 Sum_probs=50.3
Q ss_pred cCCCccceEEEEEecCCCHHhhh--ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeEEEEEEeCCE
Q 001444 861 KDPVRRQVLRVKGCLAGSKAENM--LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKLDITIFRQGR 930 (1076)
Q Consensus 861 ~~~~~~~~~~V~~V~~~s~A~~a--L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v~l~V~R~g~ 930 (1076)
...++.-|++|.+|.+|++|+.- |+.||.+|+|||+..-..++-+.+-...+.| ..|.+.|...|.
T Consensus 954 GaGq~klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg----~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 954 GAGQRKLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTG----NVVHLEVAKQGA 1021 (1629)
T ss_pred cCCccccceEEEEeccCCccccccccccCceeeeecCcccccccHHHHHHHHhccC----CeEEEehhhhhh
Confidence 34466679999999999999854 9999999999999888766544433334444 788898876554
No 115
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=89.64 E-value=0.59 Score=49.82 Aligned_cols=53 Identities=21% Similarity=0.399 Sum_probs=45.5
Q ss_pred cEEEEEecCCChhhhcCC-CCCCeEEEECCeec--CCHHHHHHHHHhCCCCCeEEE
Q 001444 976 GVYVARWCHGSPVHRYGL-YALQWIVEINGKRT--PDLEAFVNVTKEIEHGEFVRV 1028 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL-~~gD~I~~VNg~~v--~~l~~f~~~v~~~~~~~~v~l 1028 (1076)
|++|++..+|+-|+--|| ..+|.|++|||..| +++|+..+++-+...|--+++
T Consensus 195 GIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANshNLIiTV 250 (358)
T KOG3606|consen 195 GIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHNLIITV 250 (358)
T ss_pred ceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcccceEEEe
Confidence 899999999999999987 78999999999987 589999999987655544444
No 116
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=89.58 E-value=8.7 Score=41.96 Aligned_cols=147 Identities=17% Similarity=0.143 Sum_probs=73.6
Q ss_pred EEEEEEEeCCCcEEEeCccccCCCC-cEEEEEec---------CC---cEEEE-EEEEecC-------C-CcEEEEEEcC
Q 001444 69 YATGFVVDKRRGIILTNRHVVKPGP-VVAEAMFV---------NR---EEIPV-YPIYRDP-------V-HDFGFFRYDP 126 (1076)
Q Consensus 69 ~GTGfvV~~~~G~IlTn~Hvv~~~~-~~~~v~~~---------~~---~~~~a-~vv~~d~-------~-~DlAlLk~~~ 126 (1076)
.+-|.+|++ .+|||++|++.... ....|.+. ++ ..... +++ .++ . +|||||+++.
T Consensus 39 ~Cggsli~~--~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~ 115 (256)
T KOG3627|consen 39 LCGGSLISP--RWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSE 115 (256)
T ss_pred eeeeEEeeC--CEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECC
Confidence 566767765 39999999999531 03334332 11 11111 222 332 2 7999999985
Q ss_pred C-CCccccccCCCCCCc----CCCCCCEEEEEecCCCCC-----C-eEEEEEEEEecCC--CCCCCCC-CccccceeeEE
Q 001444 127 S-AIQFLNYDEIPLAPE----AACVGLEIRVVGNDSGEK-----V-SILAGTLARLDRD--APHYKKD-GYNDFNTFYMQ 192 (1076)
Q Consensus 127 ~-~~~~~~~~~l~l~~~----~~~~G~~V~~iG~p~g~~-----~-s~~~G~is~~~~~--~~~~~~~-~~~~~~~~~i~ 192 (1076)
. .+. -.+.++.|... ....+..+.+.|+..... . .+....+..+... ...|... ...+ ..+-
T Consensus 116 ~v~~~-~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~---~~~C 191 (256)
T KOG3627|consen 116 PVTFS-SHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITD---TMLC 191 (256)
T ss_pred CcccC-CcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCC---CEEe
Confidence 3 221 13344444322 134458888888653211 1 1222222222210 1111110 0001 1121
Q ss_pred E-----eeccCCCCCCCceecCC---CcEEEEeecccC
Q 001444 193 A-----ASGTKGGSSGSPVIDWQ---GRAVALNAGSKS 222 (1076)
Q Consensus 193 ~-----~a~~~~G~SGgPv~n~~---G~vVGi~~~~~~ 222 (1076)
+ ...+-.|.|||||+-.+ ..++||.+.+..
T Consensus 192 a~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 192 AGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred eCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 1 12245699999998654 699999988654
No 117
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=88.90 E-value=1.5 Score=45.07 Aligned_cols=83 Identities=18% Similarity=0.234 Sum_probs=55.9
Q ss_pred hhHHHHHHHhccCCCCCCCeEEEEEEe---CCEE--EEEEEeccccCCCCCcceeeecCccccCCcHhHhhcCCCCCCCC
Q 001444 901 FHDIENACQALDKDGEDNGKLDITIFR---QGRE--IELQVGTDVRDGNGTTRVINWCGCIVQDPHPAVRALGFLPEEGH 975 (1076)
Q Consensus 901 ~~dl~~~l~~~~~g~~~~~~v~l~V~R---~g~~--~~~~v~l~~~~~~~~~~~~~~~G~~~~~p~~~~~~~~~~p~~~~ 975 (1076)
..++...+....+| +.+.++|.+ .|+. .++.+++.+.. ....| ..-+|+.+..- ..
T Consensus 62 ~~~~~~~~~~~~~g----~~lrl~V~G~~~~G~~~~k~v~lpl~~~~-~g~eR-L~~~GL~l~~e-------------~~ 122 (183)
T PF11874_consen 62 PSELVQVAEQLPPG----SSLRLRVEGPDFEGDPVTKTVLLPLGDGA-DGEER-LEAAGLTLMEE-------------GG 122 (183)
T ss_pred HHHHHHHHhcCCCC----CEEEEEEEccCCCCCceEEEEEEEcCCCC-CHHHH-HHhCCCEEEee-------------CC
Confidence 45666666556665 889999987 3554 44555554332 22222 33457766441 23
Q ss_pred cEEEEEecCCChhhhcCCCCCCeEEEE
Q 001444 976 GVYVARWCHGSPVHRYGLYALQWIVEI 1002 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL~~gD~I~~V 1002 (1076)
.+.|..+..||||+++|+..+..|++|
T Consensus 123 ~~~Vd~v~fgS~A~~~g~d~d~~I~~v 149 (183)
T PF11874_consen 123 KVIVDEVEFGSPAEKAGIDFDWEITEV 149 (183)
T ss_pred EEEEEecCCCCHHHHcCCCCCcEEEEE
Confidence 689999999999999999999999887
No 118
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=87.04 E-value=1.8 Score=50.61 Aligned_cols=56 Identities=30% Similarity=0.509 Sum_probs=48.2
Q ss_pred EecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCe---EEEEEEe-CCeEEE
Q 001444 304 SVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKN---IELLIER-GGISMT 359 (1076)
Q Consensus 304 ~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~---v~l~v~R-~g~~~~ 359 (1076)
.+..+++|.. +|++||.++++|++++.+|.++...+....+.. +.+.+.| ++..+.
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~ 195 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEAHA 195 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEeccceeee
Confidence 6889999999 999999999999999999999998886666655 8899999 666543
No 119
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=86.95 E-value=1.2 Score=48.33 Aligned_cols=54 Identities=24% Similarity=0.371 Sum_probs=40.2
Q ss_pred CcEEEEEEecCCCcccc-C-CCCCCEEEEECCEEeCChhHHH--HHHhcCCCCeEEEEEE
Q 001444 297 TGLLVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQFLKLE--TLLDDGVDKNIELLIE 352 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~-g-L~~GD~Il~VnG~~v~~~~~l~--~~l~~~~g~~v~l~v~ 352 (1076)
.-+++| .|..++||++ | ++.||.|++|||..|..-..+. .++.... +.|.+++.
T Consensus 30 PClYiV-QvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~-~eV~IhyN 87 (429)
T KOG3651|consen 30 PCLYIV-QVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSL-NEVKIHYN 87 (429)
T ss_pred CeEEEE-EeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhc-cceEEEeh
Confidence 346665 8999999999 5 9999999999999998876554 4444433 34566543
No 120
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=86.45 E-value=1.2 Score=52.45 Aligned_cols=74 Identities=23% Similarity=0.315 Sum_probs=58.5
Q ss_pred cEEEEEecCCChhhhcCC-CCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeC---CeEEEEEEEeCCccC
Q 001444 976 GVYVARWCHGSPVHRYGL-YALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLN---GKPRVLTLKQDLHYW 1049 (1076)
Q Consensus 976 gv~V~~v~~gSpA~~~GL-~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~---g~~~~~tlk~~~~y~ 1049 (1076)
.++|.++..|+.+++.|+ +.||.|.+|||..+.+. +++.++++... ..++++++--- .....+-++.-.+|+
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--G~itfkiiP~~~~~~~~~~~~vra~FdYd 224 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR--GSITFKIIPSYRPPPQQQVVFVRALFDYD 224 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC--CcEEEEEcccccCCCceeeeeehhhcCcC
Confidence 469999999999999886 89999999999998875 89999999854 55777744322 233346788888998
Q ss_pred cc
Q 001444 1050 PT 1051 (1076)
Q Consensus 1050 pt 1051 (1076)
|-
T Consensus 225 P~ 226 (542)
T KOG0609|consen 225 PK 226 (542)
T ss_pred cc
Confidence 86
No 121
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=86.42 E-value=0.93 Score=57.53 Aligned_cols=51 Identities=29% Similarity=0.378 Sum_probs=44.2
Q ss_pred EEEEecCCChhhhcCCCCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEE
Q 001444 978 YVARWCHGSPVHRYGLYALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRT 1030 (1076)
Q Consensus 978 ~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~ 1030 (1076)
.|..|..||||..+||+++|.|++|||+++..+ .++.+.+.+ .|..|.+++
T Consensus 661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~t 713 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLRT 713 (1205)
T ss_pred eeeeecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh--cCCeeEEEe
Confidence 689999999999999999999999999999988 567777764 477787773
No 122
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=85.69 E-value=3.7 Score=48.53 Aligned_cols=53 Identities=30% Similarity=0.428 Sum_probs=41.7
Q ss_pred EEEEEecCCCcccc-C-CCCCCEEEEECCEEeCCh--hHHHHHHhcCCCCeEEEEEEe
Q 001444 300 LVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQF--LKLETLLDDGVDKNIELLIER 353 (1076)
Q Consensus 300 lvv~~V~~~spA~~-g-L~~GD~Il~VnG~~v~~~--~~l~~~l~~~~g~~v~l~v~R 353 (1076)
++|..+..|+.+++ | |..||.|++|||..+.+- .+++.+|....| .|++.+.-
T Consensus 148 ~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~G-~itfkiiP 204 (542)
T KOG0609|consen 148 VVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSRG-SITFKIIP 204 (542)
T ss_pred cEEeeeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCCC-cEEEEEcc
Confidence 44579999999999 5 999999999999999874 577788866544 56666543
No 123
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=85.62 E-value=2.5 Score=49.18 Aligned_cols=63 Identities=21% Similarity=0.327 Sum_probs=47.9
Q ss_pred eeecCccccCCcHhHhhcCCCCCCCCcEEEEEecCCChhhhcC-CCCCCeEEEECCeecCCH--HHHHHHHHhC
Q 001444 950 INWCGCIVQDPHPAVRALGFLPEEGHGVYVARWCHGSPVHRYG-LYALQWIVEINGKRTPDL--EAFVNVTKEI 1020 (1076)
Q Consensus 950 ~~~~G~~~~~p~~~~~~~~~~p~~~~gv~V~~v~~gSpA~~~G-L~~gD~I~~VNg~~v~~l--~~f~~~v~~~ 1020 (1076)
+.|+|+.+.--. -..+..|+||.++.+|++-+.-| +.+||.|++||.....++ ++.+.+++++
T Consensus 260 vnfLGiSivgqs--------n~rgDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREa 325 (626)
T KOG3571|consen 260 VNFLGISIVGQS--------NARGDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREA 325 (626)
T ss_pred cccceeEeeccc--------CcCCCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHH
Confidence 457787764322 11123589999999999877665 699999999999999988 7888888774
No 124
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=85.13 E-value=11 Score=38.95 Aligned_cols=151 Identities=14% Similarity=0.139 Sum_probs=78.3
Q ss_pred ceeeeeEEEEEEEcCcccccCCcccceeeEEEEEEEeeCCceEEEEeCccccCCCccEEEEeecCCeEE--eEEEEEeeC
Q 001444 590 SVIEPTLVMFEVHVPPSCMIDGVHSQHFFGTGVIIYHSQSMGLVVVDKNTVAISASDVMLSFAAFPIEI--PGEVVFLHP 667 (1076)
Q Consensus 590 ~~~~~S~V~V~~~~~~~~~~dg~~~~~~~GsG~vId~~~~~G~IlTn~~~V~~~~~~i~v~~~d~~~~~--~a~vv~~dp 667 (1076)
..+++-++.|++ ......++++-|. +.+.|.++| ......+.+. ++.+ ...+...+.
T Consensus 9 ~~~~~N~~~v~~-----------~~g~~t~l~~gi~----~~~~lvp~H----~~~~~~i~i~--g~~~~~~d~~~lv~~ 67 (172)
T PF00548_consen 9 SLIKKNVVPVTT-----------GKGEFTMLALGIY----DRYFLVPTH----EEPEDTIYID--GVEYKVDDSVVLVDR 67 (172)
T ss_dssp HHHHHHEEEEEE-----------TTEEEEEEEEEEE----BTEEEEEGG----GGGCSEEEET--TEEEEEEEEEEEEET
T ss_pred HHHhccEEEEEe-----------CCceEEEecceEe----eeEEEEECc----CCCcEEEEEC--CEEEEeeeeEEEecC
Confidence 335566666664 2667889988898 568888877 2222334443 4433 334445565
Q ss_pred C---CcEEEEEECCC-CCCcccccceeeeeccCCccCCCCCEEEEEeeCCCCceeeeeeeEecccceeecCCCCCCcccc
Q 001444 668 V---HNFALIAYDPS-SLGVAGASVVRAAELLPEPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALNISSADCPRYRA 743 (1076)
Q Consensus 668 ~---~dlAvlk~d~~-~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~i~~~~~~~~~~ 743 (1076)
. .|+++++++.. .+. ...+-+. +.. -...+.+.++=++...+.....+.|+. .+.+.....+.+
T Consensus 68 ~~~~~Dl~~v~l~~~~kfr----DIrk~~~--~~~-~~~~~~~l~v~~~~~~~~~~~v~~v~~---~~~i~~~g~~~~-- 135 (172)
T PF00548_consen 68 DGVDTDLTLVKLPRNPKFR----DIRKFFP--ESI-PEYPECVLLVNSTKFPRMIVEVGFVTN---FGFINLSGTTTP-- 135 (172)
T ss_dssp TSSEEEEEEEEEESSS-B------GGGGSB--SSG-GTEEEEEEEEESSSSTCEEEEEEEEEE---EEEEEETTEEEE--
T ss_pred CCcceeEEEEEccCCcccC----chhhhhc--ccc-ccCCCcEEEEECCCCccEEEEEEEEee---cCccccCCCEee--
Confidence 4 59999999432 221 2222222 211 134555555554444333333344442 333311111111
Q ss_pred cceeEEEEecc-cCCCcCceEECC---CceEEEEEee
Q 001444 744 MNMEVIELDTD-FGSTFSGVLTDE---HGRVQAIWGS 776 (1076)
Q Consensus 744 ~~~~~I~~d~~-ig~~sGGpL~d~---~G~VvGi~~~ 776 (1076)
..+.-++. ..+.|||+|+.. .++++||..+
T Consensus 136 ---~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 136 ---RSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp ---EEEEEESEEETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred ---EEEEEccCCCCCccCCeEEEeeccCccEEEEEec
Confidence 23333332 234599999864 4799999876
No 125
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=84.94 E-value=3.1 Score=48.61 Aligned_cols=58 Identities=17% Similarity=0.252 Sum_probs=48.8
Q ss_pred EEEEecCCChhhhcCCCCCCeEEEECCeecCCHHHHHHHHHhCCCCCe---EEEEEEEe-CCeE
Q 001444 978 YVARWCHGSPVHRYGLYALQWIVEINGKRTPDLEAFVNVTKEIEHGEF---VRVRTVHL-NGKP 1037 (1076)
Q Consensus 978 ~V~~v~~gSpA~~~GL~~gD~I~~VNg~~v~~l~~f~~~v~~~~~~~~---v~l~~v~r-~g~~ 1037 (1076)
++..+..+|+|.++|+++||.|+++|++++.++++..+.+.... +.. +.+. +.| ++..
T Consensus 132 ~~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~-~~~~~~~~i~-~~~~~~~~ 193 (375)
T COG0750 132 VVGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAA-GDVFNLLTIL-VIRLDGEA 193 (375)
T ss_pred eeeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhcc-CCcccceEEE-EEecccee
Confidence 34578999999999999999999999999999999999988855 444 6777 555 7776
No 126
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=84.64 E-value=1.3 Score=49.83 Aligned_cols=45 Identities=20% Similarity=0.235 Sum_probs=41.9
Q ss_pred CcEEEEEecCCChhhhc-CCCCCCeEEEECCeecCCHHHHHHHHHh
Q 001444 975 HGVYVARWCHGSPVHRY-GLYALQWIVEINGKRTPDLEAFVNVTKE 1019 (1076)
Q Consensus 975 ~gv~V~~v~~gSpA~~~-GL~~gD~I~~VNg~~v~~l~~f~~~v~~ 1019 (1076)
.||.|++|...||+..+ ||.+||+|+++||-|+.+.+|+.+-++.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHh
Confidence 58999999999999866 9999999999999999999999998876
No 127
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=84.36 E-value=0.71 Score=51.66 Aligned_cols=56 Identities=23% Similarity=0.293 Sum_probs=43.3
Q ss_pred CCCcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChhHHHHHH-hcCCCCeEEEEE
Q 001444 295 GETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETLL-DDGVDKNIELLI 351 (1076)
Q Consensus 295 ~~~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~~l~~~l-~~~~g~~v~l~v 351 (1076)
+...+|+ +.+.+|=.|++ -|-.||.|++|||+.+.+..+-+..- .++.|+.|.++|
T Consensus 108 NkMPIlI-SKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKraGkeV~lev 166 (506)
T KOG3551|consen 108 NKMPILI-SKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRAGKEVLLEV 166 (506)
T ss_pred cCCceeh-hHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHhhCceeeeee
Confidence 3444555 89999999999 59999999999999999887766543 456788765554
No 128
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=84.30 E-value=0.95 Score=43.35 Aligned_cols=32 Identities=34% Similarity=0.509 Sum_probs=26.1
Q ss_pred EEEeeccCCCCCCCceecCCCcEEEEeecccC
Q 001444 191 MQAASGTKGGSSGSPVIDWQGRAVALNAGSKS 222 (1076)
Q Consensus 191 i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~~ 222 (1076)
..-...-.+|.||-|++|..|+||||+.++.+
T Consensus 97 tip~g~g~~GDSGRpi~DNsGrVVaIVLGG~n 128 (158)
T PF00944_consen 97 TIPTGVGKPGDSGRPIFDNSGRVVAIVLGGAN 128 (158)
T ss_dssp EEETTS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred EeccCCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence 34455678999999999999999999988754
No 129
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=83.42 E-value=1.7 Score=46.53 Aligned_cols=46 Identities=15% Similarity=0.332 Sum_probs=40.2
Q ss_pred CeEEEEc--CCChhhHcC-CCCCCEEEEcCCeecC--CHHHHHHHHHhcCC
Q 001444 399 GLVYVAE--PGYMLFRAG-VPRHAIIKKFAGEEIS--RLEDLISVLSKLSR 444 (1076)
Q Consensus 399 ~gv~v~~--~gs~a~~aG-l~~GD~I~~Vng~~v~--~l~~~~~~l~~~~~ 444 (1076)
.|+|++. ||+.|+..| |...|.|++|||.+|. +|+++.++|-+...
T Consensus 194 pGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANsh 244 (358)
T KOG3606|consen 194 PGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSH 244 (358)
T ss_pred CceEEEeecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhccc
Confidence 5899994 999999999 6689999999999994 68999999998653
No 130
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=80.54 E-value=2.1 Score=47.29 Aligned_cols=53 Identities=21% Similarity=0.354 Sum_probs=42.1
Q ss_pred EEEEEecCCCcccc-C-CCCCCEEEEECCEEeCChhHHHHH-HhcCCCCeEEEEEE
Q 001444 300 LVVDSVVPGGPAHL-R-LEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIE 352 (1076)
Q Consensus 300 lvv~~V~~~spA~~-g-L~~GD~Il~VnG~~v~~~~~l~~~-l~~~~g~~v~l~v~ 352 (1076)
+||+.+..+-.|+. | |=.||-|+.|||..|+.-.+-+.. +..+.|+.|+|+|.
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNAGdeVtlTV~ 137 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRNAGDEVTLTVK 137 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHhcCCEEEEEeH
Confidence 55689999999999 5 889999999999999886654422 34567888888875
No 131
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=79.78 E-value=3.5 Score=44.76 Aligned_cols=55 Identities=16% Similarity=0.161 Sum_probs=40.9
Q ss_pred cceEEEEEecCCCHHhhh--ccCCCEEEEECCEEcCChh--HHHHHHHhccCCCCCCCeEEEEEE
Q 001444 866 RQVLRVKGCLAGSKAENM--LEQGDMMLAINKQPVTCFH--DIENACQALDKDGEDNGKLDITIF 926 (1076)
Q Consensus 866 ~~~~~V~~V~~~s~A~~a--L~~GDiIlsVnG~~V~~~~--dl~~~l~~~~~g~~~~~~v~l~V~ 926 (1076)
...++|..|..++||++- ++.||.|++|||..|..-. ++..+++... +.+++++-
T Consensus 29 CPClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~------~eV~IhyN 87 (429)
T KOG3651|consen 29 CPCLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSL------NEVKIHYN 87 (429)
T ss_pred CCeEEEEEeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhc------cceEEEeh
Confidence 456889999999999976 9999999999999998644 4445553211 44666664
No 132
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=78.94 E-value=15 Score=44.79 Aligned_cols=125 Identities=18% Similarity=0.210 Sum_probs=69.0
Q ss_pred eCCCcEEEEEECCCC-----CCcccc--cceeeeeccC------CccCCCCCEEEEEeeCCCCceeeeeeeEecccceee
Q 001444 666 HPVHNFALIAYDPSS-----LGVAGA--SVVRAAELLP------EPALRRGDSVYLVGLSRSLQATSRKSIVTNPCAALN 732 (1076)
Q Consensus 666 dp~~dlAvlk~d~~~-----~~~~~~--~~v~~~~l~~------~~~l~~G~~V~~iG~p~~~~~~~~~~~vt~i~~~~~ 732 (1076)
....|+||||+++.. +..... ..=..+.|.+ ...+..|..|+=+|..-++. .|.+.++. ...
T Consensus 540 ~~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTgyT----~G~lNg~k-lvy 614 (695)
T PF08192_consen 540 KRLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTGYT----TGILNGIK-LVY 614 (695)
T ss_pred ccccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCCcc----ceEecceE-EEE
Confidence 455799999998664 221000 0001122322 13467899999999887774 56666443 111
Q ss_pred cCCCCCCcccccceeE-EEEe--ccc--CCCcCceEECCCce------EEEEEeeccccccccCCCCCcceeEeccchhh
Q 001444 733 ISSADCPRYRAMNMEV-IELD--TDF--GSTFSGVLTDEHGR------VQAIWGSFSTQVKFGCSSSEDHQFVRGIPIYT 801 (1076)
Q Consensus 733 i~~~~~~~~~~~~~~~-I~~d--~~i--g~~sGGpL~d~~G~------VvGi~~~~~~~~~~g~~~~~~~~~~~aipi~~ 801 (1076)
...... ...++ |..+ ..+ ++.||..+++.-+. |+||..+|.++ -..|.+-.|+..
T Consensus 615 w~dG~i-----~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge---------~kqfglftPi~~ 680 (695)
T PF08192_consen 615 WADGKI-----QSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE---------QKQFGLFTPINE 680 (695)
T ss_pred ecCCCe-----EEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc---------cceeeccCcHHH
Confidence 111111 11223 2221 223 33567777776554 99999998876 334445578887
Q ss_pred HHHHHHHH
Q 001444 802 ISRVLDKI 809 (1076)
Q Consensus 802 v~~~l~~l 809 (1076)
|+.-|++.
T Consensus 681 il~rl~~v 688 (695)
T PF08192_consen 681 ILDRLEEV 688 (695)
T ss_pred HHHHHHHh
Confidence 77655553
No 133
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=77.05 E-value=3.1 Score=47.06 Aligned_cols=46 Identities=22% Similarity=0.258 Sum_probs=40.1
Q ss_pred CCCcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChhHHHHHHhc
Q 001444 295 GETGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFLKLETLLDD 341 (1076)
Q Consensus 295 ~~~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~~l~~~l~~ 341 (1076)
...|+.| ..|...||+.. ||.+||+|.++||-+|.+..+..+-++.
T Consensus 218 ~g~gV~V-tev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 218 HGEGVTV-TEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred cCceEEE-EeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHh
Confidence 3467777 69999999998 9999999999999999999988877754
No 134
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=74.01 E-value=7.5 Score=41.74 Aligned_cols=51 Identities=20% Similarity=0.285 Sum_probs=42.7
Q ss_pred cCCChhhHcC-CCCCCEEEEcCCeecCCHH--HHHHHHHhcCCCCeEeEEEEec
Q 001444 405 EPGYMLFRAG-VPRHAIIKKFAGEEISRLE--DLISVLSKLSRGARVPIEYSSY 455 (1076)
Q Consensus 405 ~~gs~a~~aG-l~~GD~I~~Vng~~v~~l~--~~~~~l~~~~~g~~v~l~~~~~ 455 (1076)
.++|-..+-- +..||.|.+|||+.+-.+. ++.+.|+.++.|+..+|+....
T Consensus 157 kegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLieP 210 (334)
T KOG3938|consen 157 KEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIEP 210 (334)
T ss_pred cCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeecc
Confidence 3666665553 8899999999999999985 8899999999999999987755
No 135
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=73.27 E-value=17 Score=37.60 Aligned_cols=80 Identities=24% Similarity=0.169 Sum_probs=51.8
Q ss_pred hhHHHHHHh-cCCCCeEEEEEEe---CCeEEE--EEEEeccCCCCCCCcccccCceEEecCCHHHHhccCCCCCeEEEEc
Q 001444 332 FLKLETLLD-DGVDKNIELLIER---GGISMT--VNLVVQDLHSITPDYFLEVSGAVIHPLSYQQARNFRFPCGLVYVAE 405 (1076)
Q Consensus 332 ~~~l~~~l~-~~~g~~v~l~v~R---~g~~~~--~~v~l~~~~~~~~~~~~~~~G~~~~~l~~~~~~~~~~~~~gv~v~~ 405 (1076)
..++.+.+. ...|+.++++|.+ .|+..+ +.+++.+.. +...-+.-.|+.+.+.. +.+.|.+
T Consensus 62 ~~~~~~~~~~~~~g~~lrl~V~G~~~~G~~~~k~v~lpl~~~~--~g~eRL~~~GL~l~~e~-----------~~~~Vd~ 128 (183)
T PF11874_consen 62 PSELVQVAEQLPPGSSLRLRVEGPDFEGDPVTKTVLLPLGDGA--DGEERLEAAGLTLMEEG-----------GKVIVDE 128 (183)
T ss_pred HHHHHHHHhcCCCCCEEEEEEEccCCCCCceEEEEEEEcCCCC--CHHHHHHhCCCEEEeeC-----------CEEEEEe
Confidence 356666664 4789999999998 455444 444444332 22222344566655421 3466664
Q ss_pred --CCChhhHcCCCCCCEEEEc
Q 001444 406 --PGYMLFRAGVPRHAIIKKF 424 (1076)
Q Consensus 406 --~gs~a~~aGl~~GD~I~~V 424 (1076)
.||+|+++|+.-++.|++|
T Consensus 129 v~fgS~A~~~g~d~d~~I~~v 149 (183)
T PF11874_consen 129 VEFGSPAEKAGIDFDWEITEV 149 (183)
T ss_pred cCCCCHHHHcCCCCCcEEEEE
Confidence 7999999999999999876
No 136
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=72.53 E-value=3 Score=46.89 Aligned_cols=59 Identities=15% Similarity=0.196 Sum_probs=40.6
Q ss_pred ceEEEEEecCCCHHhhh--ccCCCEEEEECCEEcCChhHHHHHHHhccCCCCCCCeE--EEEEEeCC
Q 001444 867 QVLRVKGCLAGSKAENM--LEQGDMMLAINKQPVTCFHDIENACQALDKDGEDNGKL--DITIFRQG 929 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a--L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~g~~~~~~v--~l~V~R~g 929 (1076)
.-++|+++.+|-.|++. |..||.|++|||....+..+-+..-.+.+.| .+| .++++|+-
T Consensus 110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKraG----keV~levKy~REv 172 (506)
T KOG3551|consen 110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRAG----KEVLLEVKYMREV 172 (506)
T ss_pred CceehhHhccccccccccceeeccEEEEecchhhhhcchHHHHHHHHhhC----ceeeeeeeeehhc
Confidence 34668899999888876 9999999999999998765533333233444 444 44445643
No 137
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=71.77 E-value=2.5 Score=49.38 Aligned_cols=30 Identities=33% Similarity=0.489 Sum_probs=24.7
Q ss_pred eEEEeeccCCCCCCCceecCCCcEEEEeec
Q 001444 190 YMQAASGTKGGSSGSPVIDWQGRAVALNAG 219 (1076)
Q Consensus 190 ~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~ 219 (1076)
++.-.....+|+||+.|+|.+|++|||..|
T Consensus 345 y~~~~~~l~gGaSGS~V~n~~~~lvGIy~g 374 (374)
T PF01732_consen 345 YLIDNYSLGGGASGSMVINQNNELVGIYFG 374 (374)
T ss_pred hcccccCCCCCCCcCeEECCCCCEEEEeCC
Confidence 344455778999999999999999999754
No 138
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=71.32 E-value=3.5 Score=45.62 Aligned_cols=68 Identities=22% Similarity=0.335 Sum_probs=53.9
Q ss_pred EEEEEecCCChhhhcCC-CCCCeEEEECCeecCCH--HHHHHHHHhCCCCCeEEEEEEEeCCeEEEEEEEeCC
Q 001444 977 VYVARWCHGSPVHRYGL-YALQWIVEINGKRTPDL--EAFVNVTKEIEHGEFVRVRTVHLNGKPRVLTLKQDL 1046 (1076)
Q Consensus 977 v~V~~v~~gSpA~~~GL-~~gD~I~~VNg~~v~~l--~~f~~~v~~~~~~~~v~l~~v~r~g~~~~~tlk~~~ 1046 (1076)
|+|+.+..+-.|+.-|+ ..||-|++|||.-|..- ++.+.+++. .|+.|+|++....-.|-++.+..+.
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~~lr~ApaFLklpL~~ 152 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVKHLRAAPAFLKLPLTK 152 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeHhhhcCcHHhcCccCC
Confidence 58999999999999875 89999999999998865 899999985 6899998844444455555555544
No 139
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=67.79 E-value=6.3 Score=50.50 Aligned_cols=36 Identities=31% Similarity=0.415 Sum_probs=31.9
Q ss_pred EEEEecCCCHHhhh-ccCCCEEEEECCEEcCChhHHH
Q 001444 870 RVKGCLAGSKAENM-LEQGDMMLAINKQPVTCFHDIE 905 (1076)
Q Consensus 870 ~V~~V~~~s~A~~a-L~~GDiIlsVnG~~V~~~~dl~ 905 (1076)
+|..|.++|||..+ |++||.|+.+||++|....+-+
T Consensus 661 ~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H~e 697 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTE 697 (1205)
T ss_pred eeeeecCCCCccccCCCccceeEeccCcccchhhHHH
Confidence 48899999999999 9999999999999999865433
No 140
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=66.43 E-value=18 Score=33.81 Aligned_cols=52 Identities=10% Similarity=0.117 Sum_probs=32.2
Q ss_pred EEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEcCCCCcccccc
Q 001444 72 GFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYD 135 (1076)
Q Consensus 72 GfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~~~~~~~~~~ 135 (1076)
++=|. +|..+|+.||.+.... + ++.++ +++ ...-|+++++.....++.++++
T Consensus 3 avHIG--nG~~vt~tHva~~~~~-v-----~g~~f--~~~--~~~ge~~~v~~~~~~~p~~~ig 54 (105)
T PF03510_consen 3 AVHIG--NGRYVTVTHVAKSSDS-V-----DGQPF--KIV--KTDGELCWVQSPLVHLPAAQIG 54 (105)
T ss_pred eEEeC--CCEEEEEEEEeccCce-E-----cCcCc--EEE--EeccCEEEEECCCCCCCeeEec
Confidence 44455 7999999999984311 1 22222 222 3445999999987665655554
No 141
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=65.19 E-value=16 Score=39.42 Aligned_cols=55 Identities=24% Similarity=0.377 Sum_probs=47.1
Q ss_pred EEEEecCCChhhhc-CCCCCCeEEEECCeecCCHH--HHHHHHHhCCCCCeEEEEEEE
Q 001444 978 YVARWCHGSPVHRY-GLYALQWIVEINGKRTPDLE--AFVNVTKEIEHGEFVRVRTVH 1032 (1076)
Q Consensus 978 ~V~~v~~gSpA~~~-GL~~gD~I~~VNg~~v~~l~--~f~~~v~~~~~~~~v~l~~v~ 1032 (1076)
+|..+.+||--++. -+..||.|.+|||+.+-.+. +..++++.++.++.++|+++.
T Consensus 152 FIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie 209 (334)
T KOG3938|consen 152 FIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE 209 (334)
T ss_pred eeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence 68888888877765 46899999999999999884 678899999999999999765
No 142
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=59.16 E-value=6.9 Score=37.83 Aligned_cols=114 Identities=19% Similarity=0.231 Sum_probs=58.1
Q ss_pred EEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCCCEE
Q 001444 71 TGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEI 150 (1076)
Q Consensus 71 TGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V 150 (1076)
-|+.| +|.+-|.+|-.... .+.-+.| +....+.+...|+..-...+..- .+.+....+ ..+
T Consensus 15 mgt~v---nGV~wT~~HGagsr----tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~---SL~pCtCg~------~dl 75 (148)
T PF02907_consen 15 MGTCV---NGVMWTVYHGAGSR----TLAGPKG---PVNQMYTSVDDDLVGWPAPPGAR---SLTPCTCGS------SDL 75 (148)
T ss_dssp EEEEE---TTEEEEEHHHHTTS----EEEBTTS---EB-ESEEETTTTEEEEE-STTB-----BBB-SSSS------SEE
T ss_pred ehhEE---ccEEEEEEecCCcc----cccCCCC---cceEeEEcCCCCCcccccccccc---cCCccccCC------ccE
Confidence 46667 47999999976631 1211222 34556888899998887765321 223322222 346
Q ss_pred EEEecCCCCCCeEEEEEEEEecCCCCCCCCCCccccceeeEEEeeccCCCCCCCceecCCCcEEEEeecc
Q 001444 151 RVVGNDSGEKVSILAGTLARLDRDAPHYKKDGYNDFNTFYMQAASGTKGGSSGSPVIDWQGRAVALNAGS 220 (1076)
Q Consensus 151 ~~iG~p~g~~~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~ 220 (1076)
++|-+... +..+ .+.+. .+... ..-.-.+...|+||||++=.+|.+|||..+.
T Consensus 76 ylVtr~~~----v~p~--rr~gd--------~~~~L---~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa 128 (148)
T PF02907_consen 76 YLVTRDAD----VIPV--RRRGD--------SRASL---LSPRPISDLKGSSGGPILCPSGHAVGMFRAA 128 (148)
T ss_dssp EEE-TTS-----EEEE--EEEST--------TEEEE---EEEEEHHHHTT-TT-EEEETTSEEEEEEEEE
T ss_pred EEEeccCc----Eeee--EEcCC--------CceEe---cCCceeEEEecCCCCcccCCCCCEEEEEEEE
Confidence 66644432 1221 11111 00000 0111223457999999999999999997553
No 143
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=57.30 E-value=53 Score=37.94 Aligned_cols=136 Identities=23% Similarity=0.318 Sum_probs=68.7
Q ss_pred cEEEEEEEeCCCcEEEeCccccCCCCcEEEEEecCCcEEEEEEEEecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCC
Q 001444 68 SYATGFVVDKRRGIILTNRHVVKPGPVVAEAMFVNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVG 147 (1076)
Q Consensus 68 ~~GTGfvV~~~~G~IlTn~Hvv~~~~~~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G 147 (1076)
+.|-||.|+++ +.+|+-||+..+...+.= .+..-+..+..-+|.-++|..+--| ++.-+-| .+....|
T Consensus 379 GsGWGfWVS~~--lfITttHViP~g~~E~FG-------v~i~~i~vh~sGeF~~~rFpk~iRP--DvtgmiL-EeGapEG 446 (535)
T PF05416_consen 379 GSGWGFWVSPT--LFITTTHVIPPGAKEAFG-------VPISQIQVHKSGEFCRFRFPKPIRP--DVTGMIL-EEGAPEG 446 (535)
T ss_dssp TTEEEEESSSS--EEEEEGGGS-STTSEETT-------EECGGEEEEEETTEEEEEESS-SST--TS---EE--SS--TT
T ss_pred CCceeeeecce--EEEEeeeecCCcchhhhC-------CChhHeEEeeccceEEEecCCCCCC--Cccceee-ccCCCCc
Confidence 56889999976 999999999976432210 1112244555668888888632211 2222212 2334456
Q ss_pred CEEEE-EecCCCCC--CeEEEEEEEEecCCCCCCCCCCccccceeeEE-------EeeccCCCCCCCceecCCCc---EE
Q 001444 148 LEIRV-VGNDSGEK--VSILAGTLARLDRDAPHYKKDGYNDFNTFYMQ-------AASGTKGGSSGSPVIDWQGR---AV 214 (1076)
Q Consensus 148 ~~V~~-iG~p~g~~--~s~~~G~is~~~~~~~~~~~~~~~~~~~~~i~-------~~a~~~~G~SGgPv~n~~G~---vV 214 (1076)
.-+.+ |=.|.|+- +.+..|......-.- ....-.+.++. .|-.+.||.-|.|-+=..|+ |+
T Consensus 447 tV~siLiKR~sGEllpLAvRMgt~AsmkIqg------r~v~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~ 520 (535)
T PF05416_consen 447 TVCSILIKRPSGELLPLAVRMGTHASMKIQG------RTVHGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVI 520 (535)
T ss_dssp -EEEEEEE-TTSBEEEEEEEEEEEEEEEETT------EEEEEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEE
T ss_pred eEEEEEEEcCCccchhhhhhhccceeEEEcc------eeecceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEE
Confidence 65544 35677754 345555544331110 00000111111 13356789999999977775 89
Q ss_pred EEeeccc
Q 001444 215 ALNAGSK 221 (1076)
Q Consensus 215 Gi~~~~~ 221 (1076)
|++++..
T Consensus 521 GVH~AAt 527 (535)
T PF05416_consen 521 GVHAAAT 527 (535)
T ss_dssp EEEEEE-
T ss_pred EEEehhc
Confidence 9987744
No 144
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=53.40 E-value=21 Score=37.60 Aligned_cols=55 Identities=20% Similarity=0.329 Sum_probs=43.9
Q ss_pred eeEEEeeccCCCCCCCceecC----CCcEEEEeecccCCCCCccccCH--HHHHHHHHHHH
Q 001444 189 FYMQAASGTKGGSSGSPVIDW----QGRAVALNAGSKSSSASAFFLPL--ERVVRALRFLQ 243 (1076)
Q Consensus 189 ~~i~~~a~~~~G~SGgPv~n~----~G~vVGi~~~~~~~~~~~falP~--~~i~~~l~~l~ 243 (1076)
..++..++...|+-|||++-. --+++||+.++..+.+.+||-++ +.++++++.|.
T Consensus 169 ~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 169 QGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSANHAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeEECCCcCCCccceeeEcchhhhhhhheeeecccccccceehhhhhHHHHHHHHHhhc
Confidence 346788899999999998732 35799999999888889999555 67777777765
No 145
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=49.47 E-value=15 Score=36.01 Aligned_cols=28 Identities=25% Similarity=0.519 Sum_probs=17.9
Q ss_pred EEEEecccCCCcCceEECCCceEEEEEee
Q 001444 748 VIELDTDFGSTFSGVLTDEHGRVQAIWGS 776 (1076)
Q Consensus 748 ~I~~d~~ig~~sGGpL~d~~G~VvGi~~~ 776 (1076)
++.+|-.-|. ||.|++|.+|+|+||-..
T Consensus 89 ~~~~d~~~Gs-SGSpi~n~~g~ivGlYg~ 116 (132)
T PF00949_consen 89 AIDLDFPKGS-SGSPIFNQNGEIVGLYGN 116 (132)
T ss_dssp EE---S-TTG-TT-EEEETTSCEEEEEEE
T ss_pred eeecccCCCC-CCCceEcCCCcEEEEEcc
Confidence 3445543354 899999999999999554
No 146
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=46.22 E-value=36 Score=30.01 Aligned_cols=47 Identities=28% Similarity=0.499 Sum_probs=31.9
Q ss_pred EEeEEEEEeeCCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEE-Eee
Q 001444 657 EIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYL-VGL 711 (1076)
Q Consensus 657 ~~~a~vv~~dp~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~-iG~ 711 (1076)
-+|++|+.++...++|++.+-.-.-. | .+.|-+. .++.||+|++ +||
T Consensus 4 aiPgqI~~I~~~~~~A~Vd~gGvkre------V-~l~Lv~~-~v~~GdyVLVHvGf 51 (82)
T COG0298 4 AIPGQIVEIDDNNHLAIVDVGGVKRE------V-NLDLVGE-EVKVGDYVLVHVGF 51 (82)
T ss_pred ccccEEEEEeCCCceEEEEeccEeEE------E-EeeeecC-ccccCCEEEEEeeE
Confidence 37899999999888999988432111 0 0233232 4799999997 665
No 147
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=44.81 E-value=49 Score=36.71 Aligned_cols=71 Identities=13% Similarity=0.041 Sum_probs=45.0
Q ss_pred EEe-cCCChhhhcCCCC-CCeEE-EECCeecCCHH-HHHHHHHhCCCCCeEEEEEEE-eCCeEEEEEEEeCCccCcc
Q 001444 980 ARW-CHGSPVHRYGLYA-LQWIV-EINGKRTPDLE-AFVNVTKEIEHGEFVRVRTVH-LNGKPRVLTLKQDLHYWPT 1051 (1076)
Q Consensus 980 ~~v-~~gSpA~~~GL~~-gD~I~-~VNg~~v~~l~-~f~~~v~~~~~~~~v~l~~v~-r~g~~~~~tlk~~~~y~pt 1051 (1076)
-.| .+++|++.++|.+ -|+|. .=+|++..--+ ++.++++ ++-+-...|.+.+ -+...+.+|+..+.|.-|-
T Consensus 191 lnV~I~d~p~a~a~l~PdEdyi~gs~dg~~~~~ge~~l~Dv~e-s~~n~pl~Ly~yn~i~d~~R~~T~~~~~h~g~~ 266 (417)
T COG5233 191 LNVSIQDKPPAYALLSPDEDYIDGSSDGQPLEIGELDLEDVNE-SPVNLPLSLYYYNPIDDQERAKTERDGVHKGIV 266 (417)
T ss_pred eeeecCCCchhhcccCCcccccccCCCcccccchhhHHHHHhh-cccCCceEEEEEecccccccceeeccCccccCc
Confidence 344 7899999999976 45554 46788875443 4444444 4557667765333 3566777888776665443
No 148
>KOG4371 consensus Membrane-associated protein tyrosine phosphatase PTP-BAS and related proteins, contain FERM domain [Signal transduction mechanisms]
Probab=39.76 E-value=73 Score=40.95 Aligned_cols=117 Identities=15% Similarity=0.135 Sum_probs=61.4
Q ss_pred CCCCCCEEEEECCEEeCChhHHHHH-HhcCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcccccCceEEecCCHHHH-
Q 001444 314 RLEPGDVLVRVNGEVITQFLKLETL-LDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGAVIHPLSYQQA- 391 (1076)
Q Consensus 314 gL~~GD~Il~VnG~~v~~~~~l~~~-l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~~~~~l~~~~~- 391 (1076)
.|+.||+++.+||..+..-.+.... +....|+.|.|-|+|...... ...+.. ....+..+.-.-+...+...+..
T Consensus 1186 d~~~g~~l~~~n~i~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~~-d~~~~s--~~~~~~~l~~~~~~~~p~~~~~~~ 1262 (1332)
T KOG4371|consen 1186 DIRVGDVLLYVNGIAVEGKVHQEVVAMLRGGGDRVVLGVQRPPPAYS-DQHHAS--STSASAPLISVMLLKKPMATLGLS 1262 (1332)
T ss_pred CcchhhhhhhccceeeechhhHHHHHHHhccCceEEEEeecCCcccc-cchhhh--hhcccchhhhheeeeccccccccc
Confidence 7999999999999877654444332 244667889999999643211 000000 00000111101111222111100
Q ss_pred hccCCCCCeEEEEc--CCChhhHc-CCCCCCEEEEcCCeecCCHH
Q 001444 392 RNFRFPCGLVYVAE--PGYMLFRA-GVPRHAIIKKFAGEEISRLE 433 (1076)
Q Consensus 392 ~~~~~~~~gv~v~~--~gs~a~~a-Gl~~GD~I~~Vng~~v~~l~ 433 (1076)
-.-..+.+|+|+.. .++.|..- .+++||.+...+|+++...-
T Consensus 1263 ~~~~~~s~~~~~~~~~~~~~a~~~~~~r~g~~~~~~~~~~~~~~~ 1307 (1332)
T KOG4371|consen 1263 LAKRTMSDGIFIRNIAQDSAASSEGTLRVGDRLVSLDGEPVDGFT 1307 (1332)
T ss_pred ccccCcCCceeeecccccccccccccccccceeeccCCccCCCCC
Confidence 00011236788764 23333333 39999999999999997753
No 149
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=38.91 E-value=1.1e+02 Score=33.53 Aligned_cols=19 Identities=11% Similarity=0.183 Sum_probs=15.4
Q ss_pred CcCceEECCCceEEEEEee
Q 001444 758 TFSGVLTDEHGRVQAIWGS 776 (1076)
Q Consensus 758 ~sGGpL~d~~G~VvGi~~~ 776 (1076)
.||.|++..+|.+||+.+.
T Consensus 209 DSGSPVVt~dg~liGVHTG 227 (297)
T PF05579_consen 209 DSGSPVVTEDGDLIGVHTG 227 (297)
T ss_dssp CTT-EEEETTC-EEEEEEE
T ss_pred CCCCccCcCCCCEEEEEec
Confidence 5899999999999999776
No 150
>KOG4407 consensus Predicted Rho GTPase-activating protein [General function prediction only]
Probab=38.02 E-value=21 Score=46.43 Aligned_cols=87 Identities=7% Similarity=0.055 Sum_probs=62.1
Q ss_pred EEecCCCcccc-CCCCCCEEEEECCEEeCChhHHHHHHhcCCCCeEEEEEEeCCeEEEEEEEeccCCCCCCCcccccCce
Q 001444 303 DSVVPGGPAHL-RLEPGDVLVRVNGEVITQFLKLETLLDDGVDKNIELLIERGGISMTVNLVVQDLHSITPDYFLEVSGA 381 (1076)
Q Consensus 303 ~~V~~~spA~~-gL~~GD~Il~VnG~~v~~~~~l~~~l~~~~g~~v~l~v~R~g~~~~~~v~l~~~~~~~~~~~~~~~G~ 381 (1076)
..+..++|+.. |+..||.|+.|||..+.+-..+--.+..+.
T Consensus 101 ~Q~~s~~~~~nsG~~s~~~v~~itG~e~~~~TS~~~~~vk~~-------------------------------------- 142 (1973)
T KOG4407|consen 101 PQEASSAAGSNSGSSSSVGVAGITGLEPTSPTSLPPYQVKAM-------------------------------------- 142 (1973)
T ss_pred chhcccCcccccCcccccceeeecccccCCCccccHHHHhhh--------------------------------------
Confidence 45566788888 999999999999998876543322211110
Q ss_pred EEecCCHHHHhccCCCCCeEEEE--cCCChhhHcCCCCCCEEEEcCCeecCCHH--HHHHHHHhcCC
Q 001444 382 VIHPLSYQQARNFRFPCGLVYVA--EPGYMLFRAGVPRHAIIKKFAGEEISRLE--DLISVLSKLSR 444 (1076)
Q Consensus 382 ~~~~l~~~~~~~~~~~~~gv~v~--~~gs~a~~aGl~~GD~I~~Vng~~v~~l~--~~~~~l~~~~~ 444 (1076)
.-+|+. ++++++.-+.|+-||.++.||.+++..+. +.+..++..+.
T Consensus 143 -----------------eT~~~~eV~~n~~~~~a~LQ~~~~V~~v~~q~~A~i~~s~~~S~~~qt~~ 192 (1973)
T KOG4407|consen 143 -----------------ETIFIKEVQANGPAHYANLQTGDRVLMVNNQPIAGIAYSTIVSMIKQTPA 192 (1973)
T ss_pred -----------------hhhhhhhhccCChhHHHhhhccceeEEeecCcccchhhhhhhhhhccCCC
Confidence 011222 37788888999999999999999999985 67777776554
No 151
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=36.23 E-value=78 Score=27.26 Aligned_cols=44 Identities=25% Similarity=0.369 Sum_probs=29.6
Q ss_pred EEeEEEEEeeCCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEE
Q 001444 657 EIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLV 709 (1076)
Q Consensus 657 ~~~a~vv~~dp~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~i 709 (1076)
-+|++|+..+...+.|++.+....-. +.+.=-+++++||+|++-
T Consensus 4 ~iP~~Vv~v~~~~~~A~v~~~G~~~~---------V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 4 AIPGRVVEVDEDGGMAVVDFGGVRRE---------VSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp CEEEEEEEEETTTTEEEEEETTEEEE---------EEGTTCTSB-TT-EEEEE
T ss_pred cccEEEEEEeCCCCEEEEEcCCcEEE---------EEEEEeCCCCCCCEEEEe
Confidence 37999999998899999988643221 333222338999999985
No 152
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=35.84 E-value=79 Score=26.65 Aligned_cols=32 Identities=16% Similarity=0.162 Sum_probs=26.4
Q ss_pred cEEEEeecCCeEEeEEEEEeeCCCcEEEEEECC
Q 001444 646 DVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDP 678 (1076)
Q Consensus 646 ~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~d~ 678 (1076)
.+.++... +.+++|+|+++|+...+.|||-..
T Consensus 8 ~V~~kTc~-g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 8 QVSCRTCF-EQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred EEEEEecC-CceEEEEEEEecCCCcEEEEECcc
Confidence 35566665 889999999999999999998544
No 153
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=29.05 E-value=83 Score=30.64 Aligned_cols=23 Identities=26% Similarity=0.174 Sum_probs=19.6
Q ss_pred CcCceEECCCceEEEEEeecccc
Q 001444 758 TFSGVLTDEHGRVQAIWGSFSTQ 780 (1076)
Q Consensus 758 ~sGGpL~d~~G~VvGi~~~~~~~ 780 (1076)
.||=|++|..|+||||.+.-.++
T Consensus 107 DSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 107 DSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp STTEEEESTTSBEEEEEEEEEEE
T ss_pred CCCCccCcCCCCEEEEEecCCCC
Confidence 68999999999999998874444
No 154
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=28.09 E-value=86 Score=38.28 Aligned_cols=37 Identities=19% Similarity=0.427 Sum_probs=33.5
Q ss_pred CcEEEEEEecCCCcccc--CCCCCCEEEEECCEEeCChh
Q 001444 297 TGLLVVDSVVPGGPAHL--RLEPGDVLVRVNGEVITQFL 333 (1076)
Q Consensus 297 ~G~lvv~~V~~~spA~~--gL~~GD~Il~VnG~~v~~~~ 333 (1076)
.|.-++..+.+++||+. .|..||.++.||+..+-.|+
T Consensus 224 dg~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 224 DGPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred CCceeccccccCChHHHhhcccCccceeeecccccccch
Confidence 67777799999999999 79999999999999988885
No 155
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.81 E-value=1.1e+02 Score=25.33 Aligned_cols=31 Identities=16% Similarity=0.144 Sum_probs=26.8
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAY 676 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~ 676 (1076)
..+.|.+.+ ++.+.|.+.+.|+..|+.+-..
T Consensus 7 ~~V~V~l~~-g~~~~G~L~~~D~~~Ni~L~~~ 37 (63)
T cd00600 7 KTVRVELKD-GRVLEGVLVAFDKYMNLVLDDV 37 (63)
T ss_pred CEEEEEECC-CcEEEEEEEEECCCCCEEECCE
Confidence 358899998 9999999999999999877654
No 156
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=26.56 E-value=5.2e+02 Score=28.20 Aligned_cols=90 Identities=14% Similarity=0.190 Sum_probs=53.6
Q ss_pred CCcEEEEEEEeCCCcEEEeCccccCCCC---cEEEEEecCCcEEE-----------EEEEEecCCCcEEEEEEcCCC-Cc
Q 001444 66 GASYATGFVVDKRRGIILTNRHVVKPGP---VVAEAMFVNREEIP-----------VYPIYRDPVHDFGFFRYDPSA-IQ 130 (1076)
Q Consensus 66 ~~~~GTGfvV~~~~G~IlTn~Hvv~~~~---~~~~v~~~~~~~~~-----------a~vv~~d~~~DlAlLk~~~~~-~~ 130 (1076)
+...+||++|+++ |||++..+..+-. .-+.+.+..++.+. ..-+..=+..++.||.++.+. +.
T Consensus 26 G~~~CsgvLlD~~--WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~fT 103 (267)
T PF09342_consen 26 GRYWCSGVLLDPH--WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPANFT 103 (267)
T ss_pred CeEEEEEEEeccc--eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCcccce
Confidence 5678999999976 9999999998521 23445555444321 121222367899999998542 21
Q ss_pred -cccccCCCCCCcCCCCCCEEEEEecCC
Q 001444 131 -FLNYDEIPLAPEAACVGLEIRVVGNDS 157 (1076)
Q Consensus 131 -~~~~~~l~l~~~~~~~G~~V~~iG~p~ 157 (1076)
++.-.-+|=........+..++||...
T Consensus 104 r~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 104 RYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeecccccccccCCCCCCCceEEEEccc
Confidence 111111111123444556899999876
No 157
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.28 E-value=97 Score=26.41 Aligned_cols=31 Identities=19% Similarity=0.251 Sum_probs=26.8
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAY 676 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~ 676 (1076)
..+.|.+.+ ++.+.|++.++|+..|+.+=..
T Consensus 11 ~~V~V~Lk~-g~~~~G~L~~~D~~mNlvL~~~ 41 (67)
T cd01726 11 RPVVVKLNS-GVDYRGILACLDGYMNIALEQT 41 (67)
T ss_pred CeEEEEECC-CCEEEEEEEEEccceeeEEeeE
Confidence 358899998 9999999999999999977544
No 158
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.82 E-value=1.1e+02 Score=27.04 Aligned_cols=30 Identities=17% Similarity=0.203 Sum_probs=26.3
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIA 675 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk 675 (1076)
..+.|.+.+ ++++.|+++++|...|+.+=.
T Consensus 14 ~~V~V~l~~-gr~~~G~L~g~D~~mNlvL~d 43 (76)
T cd01732 14 SRIWIVMKS-DKEFVGTLLGFDDYVNMVLED 43 (76)
T ss_pred CEEEEEECC-CeEEEEEEEEeccceEEEEcc
Confidence 468899998 999999999999999998643
No 159
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=25.00 E-value=1.9e+02 Score=24.42 Aligned_cols=32 Identities=3% Similarity=-0.106 Sum_probs=28.1
Q ss_pred EEEEEecCCcEEEEEEEEecCCCcEEEEEEcC
Q 001444 95 VAEAMFVNREEIPVYPIYRDPVHDFGFFRYDP 126 (1076)
Q Consensus 95 ~~~v~~~~~~~~~a~vv~~d~~~DlAlLk~~~ 126 (1076)
.+.++.-.|++++++++.+|....+.+|+-..
T Consensus 8 ~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 8 QVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred EEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 56777788999999999999999999998654
No 160
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.84 E-value=1.2e+02 Score=26.62 Aligned_cols=30 Identities=10% Similarity=0.063 Sum_probs=26.1
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIA 675 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk 675 (1076)
..+.|++.| |+.+.|++.++|...|+.+=.
T Consensus 11 ~~v~V~l~d-gR~~~G~l~~~D~~~NivL~~ 40 (75)
T cd06168 11 RTMRIHMTD-GRTLVGVFLCTDRDCNIILGS 40 (75)
T ss_pred CeEEEEEcC-CeEEEEEEEEEcCCCcEEecC
Confidence 358899998 999999999999999986643
No 161
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.69 E-value=1e+02 Score=27.25 Aligned_cols=30 Identities=10% Similarity=0.155 Sum_probs=25.9
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIA 675 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk 675 (1076)
..+.|++.+ ++.+.|.+.++|...|+.+=.
T Consensus 11 ~~V~V~l~d-gR~~~G~L~~~D~~~NlVL~~ 40 (79)
T cd01717 11 YRLRVTLQD-GRQFVGQFLAFDKHMNLVLSD 40 (79)
T ss_pred CEEEEEECC-CcEEEEEEEEEcCccCEEcCC
Confidence 357899998 999999999999999997643
No 162
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=24.55 E-value=84 Score=38.36 Aligned_cols=46 Identities=22% Similarity=0.271 Sum_probs=36.7
Q ss_pred ceEEEEEecCCCHHhhh--ccCCCEEEEECCEEcCChhHHHHHHHhccC
Q 001444 867 QVLRVKGCLAGSKAENM--LEQGDMMLAINKQPVTCFHDIENACQALDK 913 (1076)
Q Consensus 867 ~~~~V~~V~~~s~A~~a--L~~GDiIlsVnG~~V~~~~dl~~~l~~~~~ 913 (1076)
+..+|.++.++|||..- |.+||.|+.||++.|-.| ++..++..+..
T Consensus 225 g~h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgw-qlk~vV~sL~~ 272 (638)
T KOG1738|consen 225 GPHVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGW-QLKVVVSSLRE 272 (638)
T ss_pred CceeccccccCChHHHhhcccCccceeeecccccccc-hhHhHHhhccc
Confidence 44568899999999965 999999999999999988 45666644443
No 163
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=24.17 E-value=1.1e+02 Score=26.14 Aligned_cols=31 Identities=19% Similarity=0.261 Sum_probs=26.9
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAY 676 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~ 676 (1076)
..+.|.+.+ ++.+.|+++++|...|+.+=..
T Consensus 12 ~~V~V~Lk~-g~~~~G~L~~~D~~mNi~L~~~ 42 (68)
T cd01722 12 KPVIVKLKW-GMEYKGTLVSVDSYMNLQLANT 42 (68)
T ss_pred CEEEEEECC-CcEEEEEEEEECCCEEEEEeeE
Confidence 358899998 9999999999999999987544
No 164
>PF15436 PGBA_N: Plasminogen-binding protein pgbA N-terminal
Probab=23.20 E-value=5e+02 Score=27.84 Aligned_cols=55 Identities=13% Similarity=0.114 Sum_probs=35.6
Q ss_pred EEEEe-cCCcEEEEEEEEecCCCcEEEEEEcCCCCccccccCCCCCCcCCCCCCEEEE
Q 001444 96 AEAMF-VNREEIPVYPIYRDPVHDFGFFRYDPSAIQFLNYDEIPLAPEAACVGLEIRV 152 (1076)
Q Consensus 96 ~~v~~-~~~~~~~a~vv~~d~~~DlAlLk~~~~~~~~~~~~~l~l~~~~~~~G~~V~~ 152 (1076)
+.-.| .+...+-|+.+-.....+.|.+|+.+-. .+.-..+|...-.++.||+|+.
T Consensus 33 V~h~~~~~~~~IiA~a~V~~~~~g~A~~kf~~fd--~L~Q~aLP~p~~~pk~GD~vil 88 (218)
T PF15436_consen 33 VVHKFDKDHSSIIARAVVISKKNGVAKAKFSVFD--SLKQDALPTPKMVPKKGDEVIL 88 (218)
T ss_pred EEEEecCCcceeeeEEEEEEecCCeeEEEEeehh--hhhhhcCCCCccccCCCCEEEE
Confidence 33345 4556666666666667889999986522 1223445555677899999876
No 165
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=23.06 E-value=2.1e+02 Score=32.75 Aligned_cols=27 Identities=22% Similarity=0.376 Sum_probs=19.5
Q ss_pred ccCCCCCCCceec--CCCc-EEEEeecccC
Q 001444 196 GTKGGSSGSPVID--WQGR-AVALNAGSKS 222 (1076)
Q Consensus 196 ~~~~G~SGgPv~n--~~G~-vVGi~~~~~~ 222 (1076)
....|.||||+|- .+|+ =+||++-+..
T Consensus 224 daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~ 253 (413)
T COG5640 224 DACQGDSGGPIFHKGEEGRVQRGVVSWGDG 253 (413)
T ss_pred ccccCCCCCceEEeCCCccEEEeEEEecCC
Confidence 4567999999984 3476 4788876654
No 166
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.82 E-value=1.1e+02 Score=27.37 Aligned_cols=29 Identities=17% Similarity=0.165 Sum_probs=25.8
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALI 674 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvl 674 (1076)
..|.|.+.+ ++.+.|++.++|.+.|+.+=
T Consensus 12 k~V~V~l~~-gr~~~G~L~~fD~~mNlvL~ 40 (82)
T cd01730 12 ERVYVKLRG-DRELRGRLHAYDQHLNMILG 40 (82)
T ss_pred CEEEEEECC-CCEEEEEEEEEccceEEecc
Confidence 468899998 99999999999999999763
No 167
>PF14275 DUF4362: Domain of unknown function (DUF4362)
Probab=22.28 E-value=2.4e+02 Score=26.23 Aligned_cols=53 Identities=19% Similarity=0.315 Sum_probs=37.3
Q ss_pred CCCCEEEEcCCeecCCHHHHHHHHHhcCCCCeEeEEEEeccccccceEEEEEEecCC
Q 001444 416 PRHAIIKKFAGEEISRLEDLISVLSKLSRGARVPIEYSSYTDRHRRKSVLVTIDRHE 472 (1076)
Q Consensus 416 ~~GD~I~~Vng~~v~~l~~~~~~l~~~~~g~~v~l~~~~~~~~~~~~~~~l~i~r~~ 472 (1076)
+.||+|.+ +-.+.|++.|-+.+.....++.-.|++.+++.... |+..++.-++
T Consensus 1 ~~~DVi~~--~~~i~Nl~kl~~Fi~nv~~~k~d~IrIv~yT~EGd--PI~~~L~~~G 53 (98)
T PF14275_consen 1 KNNDVINK--HGEIENLDKLDQFIENVEQGKPDKIRIVQYTIEGD--PIFQDLEYDG 53 (98)
T ss_pred CCCCEEEe--CCeEEeHHHHHHHHHHHhcCCCCEEEEEEecCCCC--CEEEEEEECC
Confidence 57899988 44489998888888887777777888888776554 4444444433
No 168
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=22.12 E-value=1.1e+02 Score=29.81 Aligned_cols=33 Identities=6% Similarity=0.023 Sum_probs=25.0
Q ss_pred eeeEEEeeccCCCCCCCceecCCCcEEEEeeccc
Q 001444 188 TFYMQAASGTKGGSSGSPVIDWQGRAVALNAGSK 221 (1076)
Q Consensus 188 ~~~i~~~a~~~~G~SGgPv~n~~G~vVGi~~~~~ 221 (1076)
..++....+..||.-||+|+=. --|+||.+++.
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C~-HGViGi~Tagg 110 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRCK-HGVIGIVTAGG 110 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEET-TCEEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeEeC-CCeEEEEEeCC
Confidence 4467778899999999999944 45999998864
No 169
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.05 E-value=1.4e+02 Score=26.66 Aligned_cols=30 Identities=17% Similarity=0.210 Sum_probs=26.1
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIA 675 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk 675 (1076)
..+.|.+.+ ++.+.+++.++|...|+.+=.
T Consensus 13 k~V~V~l~~-gr~~~G~L~~~D~~mNlvL~~ 42 (81)
T cd01729 13 KKIRVKFQG-GREVTGILKGYDQLLNLVLDD 42 (81)
T ss_pred CeEEEEECC-CcEEEEEEEEEcCcccEEecC
Confidence 457888998 999999999999999997743
No 170
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.88 E-value=1.5e+02 Score=25.84 Aligned_cols=57 Identities=14% Similarity=0.163 Sum_probs=37.1
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEe
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVG 710 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG 710 (1076)
..+.|.+.+ ++.+.|++.++|...|+.+=...... . .....+++. -+-+|+.|..|+
T Consensus 11 k~V~V~L~~-g~~~~G~L~~~D~~mNlvL~~~~E~~-~-----~~~~~~lg~--v~IRG~~I~~i~ 67 (72)
T cd01719 11 KKLSLKLNG-NRKVSGILRGFDPFMNLVLDDAVEVN-S-----GGEKNNIGM--VVIRGNSIVMLE 67 (72)
T ss_pred CeEEEEECC-CeEEEEEEEEEcccccEEeccEEEEc-c-----CCceeEece--EEECCCEEEEEE
Confidence 357888988 99999999999999999774431100 0 001123332 256777777776
No 171
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=21.68 E-value=1.4e+02 Score=25.89 Aligned_cols=31 Identities=23% Similarity=0.158 Sum_probs=27.0
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAY 676 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~ 676 (1076)
..|.|.+.+ |+.+.|++.++|+..|+.+=..
T Consensus 15 k~V~V~lk~-g~~~~G~L~~~D~~mNlvL~d~ 45 (72)
T PRK00737 15 SPVLVRLKG-GREFRGELQGYDIHMNLVLDNA 45 (72)
T ss_pred CEEEEEECC-CCEEEEEEEEEcccceeEEeeE
Confidence 358899998 9999999999999999977654
No 172
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=21.54 E-value=1.4e+02 Score=25.43 Aligned_cols=31 Identities=16% Similarity=0.191 Sum_probs=27.3
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEE
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAY 676 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~ 676 (1076)
..+.|.+.+ ++.+.|++.++|+..|+.+-..
T Consensus 11 ~~V~V~l~~-g~~~~G~L~~~D~~mNlvL~~~ 41 (68)
T cd01731 11 KPVLVKLKG-GKEVRGRLKSYDQHMNLVLEDA 41 (68)
T ss_pred CEEEEEECC-CCEEEEEEEEECCcceEEEeeE
Confidence 458899998 9999999999999999987765
No 173
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.35 E-value=1.5e+02 Score=25.90 Aligned_cols=60 Identities=15% Similarity=0.078 Sum_probs=37.7
Q ss_pred ccEEEEeecCCeEEeEEEEEeeCCCcEEEEEECCCCCCcccccceeeeeccCCccCCCCCEEEEEe
Q 001444 645 SDVMLSFAAFPIEIPGEVVFLHPVHNFALIAYDPSSLGVAGASVVRAAELLPEPALRRGDSVYLVG 710 (1076)
Q Consensus 645 ~~i~v~~~d~~~~~~a~vv~~dp~~dlAvlk~d~~~~~~~~~~~v~~~~l~~~~~l~~G~~V~~iG 710 (1076)
..+.|.+.+ ++.+.|.+.++|+..|+.+=......... .......++. -+-+|+.|..+|
T Consensus 13 k~v~V~l~~-gr~~~G~L~~fD~~~NlvL~d~~E~~~~~---~~~~~~~lG~--~viRG~~V~~ig 72 (74)
T cd01728 13 KKVVVLLRD-GRKLIGILRSFDQFANLVLQDTVERIYVG---DKYGDIPRGI--FIIRGENVVLLG 72 (74)
T ss_pred CEEEEEEcC-CeEEEEEEEEECCcccEEecceEEEEecC---CccceeEeeE--EEEECCEEEEEE
Confidence 457889998 99999999999999999774331100000 0000122322 256788888887
Done!