Query 008087
Match_columns 578
No_of_seqs 553 out of 3478
Neff 7.9
Searched_HMMs 46136
Date Thu Mar 28 19:18:07 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/008087.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/008087hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 1.3E-57 2.9E-62 490.1 40.5 387 114-550 42-451 (455)
2 TIGR02037 degP_htrA_DO peripla 100.0 1.6E-55 3.5E-60 474.6 42.9 391 115-550 4-425 (428)
3 PRK10942 serine endoprotease; 100.0 1.2E-54 2.6E-59 469.3 38.5 349 148-550 110-469 (473)
4 TIGR02038 protease_degS peripl 100.0 2.3E-47 5E-52 399.5 33.2 296 114-433 47-349 (351)
5 PRK10898 serine endoprotease; 100.0 8.7E-47 1.9E-51 394.8 33.3 297 114-434 47-351 (353)
6 KOG1421 Predicted signaling-as 100.0 1.1E-38 2.5E-43 334.7 25.6 376 114-540 54-455 (955)
7 KOG1320 Serine protease [Postt 100.0 6.2E-38 1.3E-42 328.3 15.9 410 117-571 55-473 (473)
8 COG0265 DegQ Trypsin-like seri 100.0 4.9E-36 1.1E-40 315.0 28.0 300 114-433 35-341 (347)
9 KOG1320 Serine protease [Postt 99.9 1.9E-23 4.1E-28 219.6 20.0 303 114-431 130-467 (473)
10 KOG1421 Predicted signaling-as 99.9 1.7E-19 3.6E-24 191.0 27.6 362 119-542 525-916 (955)
11 PRK10779 zinc metallopeptidase 99.7 8.4E-17 1.8E-21 174.5 10.7 153 355-549 128-282 (449)
12 PF13365 Trypsin_2: Trypsin-li 99.6 2.6E-14 5.7E-19 125.9 14.1 108 151-289 1-120 (120)
13 TIGR00054 RIP metalloprotease 99.5 2.7E-14 5.8E-19 153.5 11.6 135 353-548 128-263 (420)
14 PF00089 Trypsin: Trypsin; In 99.4 2.3E-12 5.1E-17 125.4 14.4 182 134-315 7-220 (220)
15 PF13180 PDZ_2: PDZ domain; PD 99.4 4.1E-13 8.8E-18 110.9 7.3 81 328-430 1-82 (82)
16 cd00190 Tryp_SPc Trypsin-like 99.3 4.5E-11 9.7E-16 117.2 16.9 168 148-315 24-229 (232)
17 cd00987 PDZ_serine_protease PD 99.2 4E-11 8.6E-16 100.5 8.3 88 328-427 1-89 (90)
18 smart00020 Tryp_SPc Trypsin-li 99.2 2.3E-10 5.1E-15 112.3 15.0 147 148-294 25-208 (229)
19 cd00986 PDZ_LON_protease PDZ d 99.1 5.8E-10 1.3E-14 91.3 9.0 72 352-433 7-78 (79)
20 cd00991 PDZ_archaeal_metallopr 99.0 7.6E-10 1.7E-14 90.7 8.5 68 352-429 9-77 (79)
21 cd00990 PDZ_glycyl_aminopeptid 99.0 6.9E-10 1.5E-14 90.9 7.7 77 328-431 1-78 (80)
22 TIGR01713 typeII_sec_gspC gene 99.0 9.4E-10 2E-14 110.3 8.4 100 309-430 159-259 (259)
23 cd00989 PDZ_metalloprotease PD 98.8 1.2E-08 2.6E-13 83.1 7.7 64 354-428 13-77 (79)
24 TIGR02037 degP_htrA_DO peripla 98.8 1E-08 2.2E-13 111.2 8.9 90 327-427 337-427 (428)
25 COG3591 V8-like Glu-specific e 98.7 9.4E-08 2E-12 94.0 12.4 162 148-319 63-250 (251)
26 cd00988 PDZ_CTP_protease PDZ d 98.7 4E-08 8.6E-13 81.4 8.0 67 352-429 12-82 (85)
27 cd00136 PDZ PDZ domain, also c 98.5 1.9E-07 4.2E-12 74.2 6.2 54 353-417 13-69 (70)
28 PF12812 PDZ_1: PDZ-like domai 98.5 3E-07 6.6E-12 74.7 7.4 70 446-536 5-78 (78)
29 KOG3209 WW domain-containing p 98.5 4.7E-07 1E-11 97.9 10.3 151 352-543 673-835 (984)
30 PF13180 PDZ_2: PDZ domain; PD 98.5 3.2E-07 6.9E-12 75.6 6.2 57 493-549 15-76 (82)
31 cd00991 PDZ_archaeal_metallopr 98.4 8.6E-07 1.9E-11 72.5 8.1 58 492-549 10-72 (79)
32 cd00987 PDZ_serine_protease PD 98.4 1.7E-06 3.8E-11 72.1 10.0 79 451-549 2-86 (90)
33 smart00228 PDZ Domain present 98.3 1.6E-06 3.4E-11 71.3 7.5 59 353-421 26-85 (85)
34 TIGR00054 RIP metalloprotease 98.3 1.5E-06 3.2E-11 93.9 8.1 69 353-432 203-272 (420)
35 KOG3580 Tight junction protein 98.2 6.5E-06 1.4E-10 87.6 11.3 61 352-422 39-99 (1027)
36 KOG3627 Trypsin [Amino acid tr 98.2 5E-05 1.1E-09 76.1 17.2 158 136-294 21-228 (256)
37 PRK10779 zinc metallopeptidase 98.2 2.7E-06 5.9E-11 92.7 8.1 67 354-431 222-289 (449)
38 cd00989 PDZ_metalloprotease PD 98.2 3.6E-06 7.8E-11 68.4 5.7 52 497-548 21-72 (79)
39 TIGR00225 prc C-terminal pepti 98.1 5.1E-06 1.1E-10 87.1 7.2 72 353-433 62-134 (334)
40 PF00595 PDZ: PDZ domain (Also 98.1 4E-06 8.6E-11 68.8 4.8 72 327-418 9-81 (81)
41 PRK10139 serine endoprotease; 98.1 6.3E-06 1.4E-10 89.7 7.5 64 353-428 390-454 (455)
42 cd00986 PDZ_LON_protease PDZ d 98.1 1.7E-05 3.7E-10 64.7 8.0 56 493-549 9-69 (79)
43 TIGR03279 cyano_FeS_chp putati 98.1 5.2E-06 1.1E-10 87.9 6.2 61 357-431 2-64 (433)
44 KOG3209 WW domain-containing p 98.1 2.9E-05 6.2E-10 84.5 11.6 53 357-419 782-836 (984)
45 cd00988 PDZ_CTP_protease PDZ d 98.0 2.1E-05 4.5E-10 64.9 7.0 55 493-547 14-75 (85)
46 PRK10942 serine endoprotease; 98.0 1.6E-05 3.4E-10 87.0 7.6 64 353-428 408-472 (473)
47 cd00992 PDZ_signaling PDZ doma 97.9 1.9E-05 4E-10 64.6 6.0 49 328-387 12-61 (82)
48 PLN00049 carboxyl-terminal pro 97.9 2.4E-05 5.2E-10 83.7 8.5 68 354-430 103-171 (389)
49 TIGR02038 protease_degS peripl 97.9 3.5E-05 7.6E-10 81.3 9.5 79 451-549 256-340 (351)
50 TIGR02860 spore_IV_B stage IV 97.9 1.9E-05 4.2E-10 83.1 7.4 68 352-430 104-180 (402)
51 PF14685 Tricorn_PDZ: Tricorn 97.9 6.4E-05 1.4E-09 62.4 8.6 65 352-427 11-87 (88)
52 cd00136 PDZ PDZ domain, also c 97.9 3.1E-05 6.6E-10 61.3 6.1 50 493-542 14-69 (70)
53 PF00863 Peptidase_C4: Peptida 97.9 0.0011 2.3E-08 65.2 17.9 135 158-309 40-185 (235)
54 PRK10898 serine endoprotease; 97.8 9.9E-05 2.2E-09 77.9 10.1 58 492-549 279-341 (353)
55 cd00990 PDZ_glycyl_aminopeptid 97.8 5.3E-05 1.1E-09 61.7 6.3 55 493-549 13-71 (80)
56 PF05579 Peptidase_S32: Equine 97.7 0.00036 7.7E-09 68.4 11.8 113 150-293 115-228 (297)
57 PRK09681 putative type II secr 97.7 7.2E-05 1.6E-09 75.0 7.3 67 353-430 204-275 (276)
58 TIGR01713 typeII_sec_gspC gene 97.7 9.5E-05 2.1E-09 74.4 7.4 58 492-549 191-253 (259)
59 COG0793 Prc Periplasmic protea 97.7 0.00015 3.3E-09 77.6 9.1 81 327-431 99-182 (406)
60 KOG3580 Tight junction protein 97.6 0.00059 1.3E-08 73.1 11.8 76 342-428 209-286 (1027)
61 TIGR02860 spore_IV_B stage IV 97.6 9.9E-05 2.1E-09 77.9 6.0 51 499-549 124-174 (402)
62 COG3480 SdrC Predicted secrete 97.5 0.00021 4.6E-09 71.7 6.8 71 353-433 130-201 (342)
63 cd00992 PDZ_signaling PDZ doma 97.4 0.00047 1E-08 56.2 6.9 49 493-542 27-81 (82)
64 PF00595 PDZ: PDZ domain (Also 97.4 0.00033 7.2E-09 57.3 5.4 51 492-543 25-81 (81)
65 KOG3605 Beta amyloid precursor 97.3 0.00045 9.8E-09 74.8 7.4 123 357-536 677-806 (829)
66 COG3975 Predicted protease wit 97.3 0.00044 9.5E-09 74.0 6.4 85 330-434 439-526 (558)
67 PRK11186 carboxy-terminal prot 97.2 0.0011 2.4E-08 74.9 8.8 71 353-429 255-332 (667)
68 KOG3129 26S proteasome regulat 97.2 0.00082 1.8E-08 63.5 6.4 72 355-434 141-213 (231)
69 COG3031 PulC Type II secretory 97.2 0.00038 8.3E-09 67.0 4.1 66 354-429 208-274 (275)
70 TIGR03279 cyano_FeS_chp putati 97.0 0.00068 1.5E-08 72.1 4.9 51 495-548 5-56 (433)
71 PF04495 GRASP55_65: GRASP55/6 97.0 0.0014 3.1E-08 59.3 6.0 86 327-431 25-114 (138)
72 smart00228 PDZ Domain present 97.0 0.0015 3.3E-08 53.3 5.7 53 493-546 27-85 (85)
73 TIGR00225 prc C-terminal pepti 97.0 0.0006 1.3E-08 71.5 4.1 50 497-546 71-122 (334)
74 PF12812 PDZ_1: PDZ-like domai 96.9 0.0012 2.6E-08 53.6 4.3 58 328-389 9-67 (78)
75 PF03761 DUF316: Domain of unk 96.9 0.03 6.5E-07 57.2 15.5 109 194-313 159-273 (282)
76 KOG3834 Golgi reassembly stack 96.9 0.0033 7.2E-08 65.5 8.2 143 352-542 14-164 (462)
77 PLN00049 carboxyl-terminal pro 96.9 0.0013 2.8E-08 70.5 5.3 51 497-547 111-163 (389)
78 COG5640 Secreted trypsin-like 96.8 0.02 4.3E-07 58.8 12.6 58 150-207 62-135 (413)
79 PRK09681 putative type II secr 96.8 0.0028 6E-08 63.8 6.3 50 500-549 219-269 (276)
80 PF00548 Peptidase_C3: 3C cyst 96.7 0.023 5.1E-07 53.6 12.3 138 147-292 23-169 (172)
81 PF08192 Peptidase_S64: Peptid 96.6 0.019 4.2E-07 63.3 11.8 117 195-318 542-688 (695)
82 PF14685 Tricorn_PDZ: Tricorn 96.6 0.0033 7.2E-08 52.2 4.7 48 499-546 31-80 (88)
83 PF10459 Peptidase_S46: Peptid 96.3 0.0048 1E-07 70.1 5.2 20 150-169 48-68 (698)
84 KOG3553 Tax interaction protei 96.2 0.0038 8.2E-08 51.9 3.0 34 353-386 59-93 (124)
85 COG0265 DegQ Trypsin-like seri 95.8 0.025 5.5E-07 59.6 7.6 59 492-550 270-333 (347)
86 PF09342 DUF1986: Domain of un 95.4 0.19 4.1E-06 49.3 11.1 99 135-234 12-131 (267)
87 PF02122 Peptidase_S39: Peptid 95.4 0.012 2.5E-07 56.9 2.9 138 158-311 41-184 (203)
88 COG0793 Prc Periplasmic protea 95.3 0.016 3.5E-07 62.2 4.2 48 498-545 122-171 (406)
89 PF04495 GRASP55_65: GRASP55/6 95.1 0.014 3.1E-07 52.9 2.4 49 495-543 50-99 (138)
90 KOG3550 Receptor targeting pro 94.9 0.06 1.3E-06 48.3 5.8 38 352-389 114-153 (207)
91 PRK11186 carboxy-terminal prot 94.0 0.056 1.2E-06 61.4 4.4 49 496-544 263-319 (667)
92 PF00949 Peptidase_S7: Peptida 93.7 0.079 1.7E-06 47.4 4.0 29 266-294 90-118 (132)
93 KOG3552 FERM domain protein FR 93.2 0.094 2E-06 59.6 4.4 57 353-419 75-131 (1298)
94 PF00947 Pico_P2A: Picornaviru 93.2 0.28 6.1E-06 43.2 6.5 57 248-310 65-121 (127)
95 COG3480 SdrC Predicted secrete 92.8 0.28 6.1E-06 49.8 6.7 53 492-544 130-186 (342)
96 KOG3532 Predicted protein kina 92.7 0.12 2.6E-06 57.0 4.2 37 353-389 398-435 (1051)
97 KOG3605 Beta amyloid precursor 92.7 0.14 3E-06 56.2 4.6 81 305-387 708-791 (829)
98 COG3031 PulC Type II secretory 92.4 0.19 4.1E-06 48.9 4.7 50 500-549 219-269 (275)
99 KOG3553 Tax interaction protei 92.0 0.17 3.6E-06 42.4 3.3 53 491-545 58-116 (124)
100 KOG3532 Predicted protein kina 92.0 0.37 8E-06 53.3 6.8 48 495-542 405-452 (1051)
101 PF00944 Peptidase_S3: Alphavi 91.5 0.32 6.9E-06 43.1 4.7 27 267-293 100-126 (158)
102 KOG3542 cAMP-regulated guanine 91.3 0.14 3.1E-06 56.3 2.8 39 350-388 559-598 (1283)
103 KOG3129 26S proteasome regulat 90.2 0.35 7.6E-06 46.2 4.1 56 494-549 145-203 (231)
104 KOG1892 Actin filament-binding 89.6 0.46 1E-05 54.4 5.1 61 352-422 959-1021(1629)
105 KOG2921 Intramembrane metallop 87.6 0.37 8.1E-06 50.1 2.5 38 352-389 219-258 (484)
106 PF10459 Peptidase_S46: Peptid 87.0 0.32 6.9E-06 55.7 1.8 40 196-235 200-252 (698)
107 KOG3550 Receptor targeting pro 86.3 4.8 0.0001 36.4 8.4 46 496-542 123-171 (207)
108 PF02907 Peptidase_S29: Hepati 85.8 0.72 1.6E-05 41.0 2.9 24 271-294 106-129 (148)
109 PF05580 Peptidase_S55: SpoIVB 85.6 0.52 1.1E-05 45.5 2.2 41 267-310 174-214 (218)
110 COG0750 Predicted membrane-ass 85.2 1.5 3.3E-05 46.5 5.9 57 357-424 133-194 (375)
111 COG0750 Predicted membrane-ass 83.3 1.8 3.8E-05 46.0 5.3 53 496-548 137-193 (375)
112 COG3975 Predicted protease wit 83.2 0.91 2E-05 49.3 3.0 21 498-518 472-492 (558)
113 KOG3552 FERM domain protein FR 80.1 2.3 5E-05 48.9 4.8 48 496-544 82-131 (1298)
114 KOG3606 Cell polarity protein 79.0 1.8 3.9E-05 42.9 3.1 56 330-387 173-230 (358)
115 KOG3651 Protein kinase C, alph 78.8 3.4 7.4E-05 41.6 5.0 55 354-418 31-87 (429)
116 KOG1924 RhoA GTPase effector D 76.9 9.2 0.0002 43.5 8.1 11 12-22 502-512 (1102)
117 KOG3571 Dishevelled 3 and rela 76.1 2.8 6.1E-05 45.1 3.8 38 352-389 276-315 (626)
118 PF03510 Peptidase_C24: 2C end 75.3 12 0.00025 32.2 6.7 53 153-219 3-55 (105)
119 KOG3551 Syntrophins (type beta 74.9 2.8 6E-05 43.7 3.3 55 353-418 110-167 (506)
120 KOG0606 Microtubule-associated 74.1 3 6.5E-05 49.2 3.7 35 355-389 660-695 (1205)
121 KOG0609 Calcium/calmodulin-dep 72.3 4.3 9.3E-05 44.3 4.1 49 494-543 152-203 (542)
122 KOG1924 RhoA GTPase effector D 72.2 13 0.00027 42.5 7.7 10 523-532 1043-1052(1102)
123 KOG2921 Intramembrane metallop 72.2 6.5 0.00014 41.3 5.2 43 492-534 220-267 (484)
124 KOG3606 Cell polarity protein 71.3 9.4 0.0002 38.1 5.8 46 496-542 202-250 (358)
125 PF02395 Peptidase_S6: Immunog 70.5 15 0.00033 42.8 8.3 63 151-219 67-132 (769)
126 PF01732 DUF31: Putative pepti 70.3 3.4 7.4E-05 44.0 2.9 24 269-292 351-374 (374)
127 KOG3549 Syntrophins (type gamm 68.1 8.2 0.00018 39.8 4.8 49 494-543 82-137 (505)
128 KOG3549 Syntrophins (type gamm 66.5 7 0.00015 40.3 4.0 55 354-418 81-137 (505)
129 KOG0606 Microtubule-associated 65.8 9.2 0.0002 45.4 5.2 42 495-536 665-708 (1205)
130 KOG0609 Calcium/calmodulin-dep 65.7 8.3 0.00018 42.2 4.6 56 354-419 147-204 (542)
131 KOG3938 RGS-GAIP interacting p 63.7 11 0.00024 37.4 4.7 38 505-542 167-207 (334)
132 KOG3542 cAMP-regulated guanine 62.1 7.5 0.00016 43.4 3.4 51 493-545 563-619 (1283)
133 KOG3938 RGS-GAIP interacting p 57.6 6.4 0.00014 39.1 1.8 56 355-418 151-208 (334)
134 KOG3551 Syntrophins (type beta 56.4 11 0.00024 39.5 3.3 54 499-553 121-179 (506)
135 KOG3651 Protein kinase C, alph 55.8 25 0.00055 35.7 5.7 47 495-542 37-86 (429)
136 KOG3834 Golgi reassembly stack 54.9 11 0.00023 40.2 3.1 50 495-545 22-73 (462)
137 PF05416 Peptidase_C37: Southa 53.2 1.4E+02 0.0031 32.0 10.8 136 149-294 379-527 (535)
138 KOG3571 Dishevelled 3 and rela 49.3 30 0.00066 37.6 5.4 53 490-542 275-336 (626)
139 cd01720 Sm_D2 The eukaryotic S 43.8 46 0.00099 27.6 4.6 37 167-204 10-46 (87)
140 cd00600 Sm_like The eukaryotic 40.9 75 0.0016 23.9 5.2 33 172-205 7-39 (63)
141 cd01726 LSm6 The eukaryotic Sm 36.5 82 0.0018 24.5 4.8 32 172-204 11-42 (67)
142 PRK00737 small nuclear ribonuc 35.8 91 0.002 24.7 5.0 33 172-205 15-47 (72)
143 cd01722 Sm_F The eukaryotic Sm 35.3 82 0.0018 24.6 4.7 32 172-204 12-43 (68)
144 cd01731 archaeal_Sm1 The archa 34.6 99 0.0022 24.0 5.1 33 172-205 11-43 (68)
145 cd06168 LSm9 The eukaryotic Sm 33.1 1E+02 0.0023 24.7 5.0 31 172-203 11-41 (75)
146 cd01717 Sm_B The eukaryotic Sm 32.9 96 0.0021 25.0 4.8 32 172-204 11-42 (79)
147 COG0298 HypC Hydrogenase matur 32.7 1E+02 0.0023 25.0 4.8 48 185-234 5-53 (82)
148 cd01735 LSm12_N LSm12 belongs 32.2 1.6E+02 0.0035 22.7 5.6 34 171-205 6-39 (61)
149 cd01730 LSm3 The eukaryotic Sm 32.0 89 0.0019 25.4 4.6 31 172-203 12-42 (82)
150 PF00571 CBS: CBS domain CBS d 30.5 52 0.0011 23.9 2.8 21 272-292 28-48 (57)
151 cd01729 LSm7 The eukaryotic Sm 30.2 1.2E+02 0.0026 24.7 5.0 31 172-203 13-43 (81)
152 cd01732 LSm5 The eukaryotic Sm 30.0 1.1E+02 0.0023 24.6 4.6 31 172-203 14-44 (76)
153 cd01721 Sm_D3 The eukaryotic S 29.4 1.3E+02 0.0029 23.6 5.0 33 171-204 10-42 (70)
154 KOG1738 Membrane-associated gu 28.7 29 0.00064 38.8 1.5 34 355-388 227-262 (638)
155 cd01719 Sm_G The eukaryotic Sm 28.3 1.4E+02 0.0031 23.6 5.0 31 172-203 11-41 (72)
156 KOG4371 Membrane-associated pr 28.1 1.9E+02 0.0041 34.7 7.6 155 328-547 1158-1331(1332)
157 TIGR03000 plancto_dom_1 Planct 28.0 1.3E+02 0.0028 24.3 4.5 48 372-428 10-61 (75)
158 COG1582 FlgEa Uncharacterized 27.3 1.9E+02 0.0041 22.5 5.1 52 511-565 3-54 (67)
159 PF12381 Peptidase_C3G: Tungro 27.3 87 0.0019 30.5 4.2 54 263-319 170-229 (231)
160 PF11874 DUF3394: Domain of un 26.6 1.9E+02 0.0041 27.6 6.3 19 498-516 132-150 (183)
161 smart00651 Sm snRNP Sm protein 25.9 1.7E+02 0.0036 22.3 5.0 33 172-205 9-41 (67)
162 PF12419 DUF3670: SNF2 Helicas 25.8 2.7E+02 0.0059 25.1 7.1 54 508-566 72-125 (141)
163 smart00384 AT_hook DNA binding 25.5 50 0.0011 20.8 1.4 14 5-18 1-14 (26)
164 PF11874 DUF3394: Domain of un 25.4 59 0.0013 30.9 2.7 29 352-380 121-150 (183)
165 cd01728 LSm1 The eukaryotic Sm 25.2 1.7E+02 0.0037 23.4 4.9 32 172-204 13-44 (74)
166 PF01423 LSM: LSM domain ; In 25.2 1.4E+02 0.003 22.9 4.4 35 171-206 8-42 (67)
167 COG1958 LSM1 Small nuclear rib 24.3 1.5E+02 0.0033 23.8 4.6 33 172-205 18-50 (79)
168 PF07174 FAP: Fibronectin-atta 24.1 6.4E+02 0.014 25.6 9.5 17 153-169 120-136 (297)
169 cd01727 LSm8 The eukaryotic Sm 24.1 3.3E+02 0.0071 21.5 6.5 32 172-204 10-41 (74)
170 PF09465 LBR_tudor: Lamin-B re 23.7 3E+02 0.0065 20.8 5.5 38 169-206 7-44 (55)
171 cd01723 LSm4 The eukaryotic Sm 20.9 2.5E+02 0.0053 22.4 5.1 33 171-204 11-43 (76)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=1.3e-57 Score=490.06 Aligned_cols=387 Identities=21% Similarity=0.294 Sum_probs=310.5
Q ss_pred cccccccCCCCcEEEEeeeeCCC-------------CCCccccCCCcceEEEEEEEc--CCEEEecccccCCCceEEEEE
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPN-------------FSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKK 178 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~-------------~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~ 178 (578)
....++++.+|||.|........ ...||+......+.||||||+ +||||||+|||.++..+.|++
T Consensus 42 ~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~~ 121 (455)
T PRK10139 42 LAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQL 121 (455)
T ss_pred HHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEEE
Confidence 35667788889999987653221 012343334456799999996 599999999999999999999
Q ss_pred cCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecC
Q 008087 179 RGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVH 256 (578)
Q Consensus 179 ~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~ 256 (578)
.|++.|+|++++.|+.+||||||++... .+++++|+++. .+|++|+++|||+|+.. +++.|+||++.+.....
T Consensus 122 -~dg~~~~a~vvg~D~~~DlAvlkv~~~~---~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~-tvt~GivS~~~r~~~~~ 196 (455)
T PRK10139 122 -NDGREFDAKLIGSDDQSDIALLQIQNPS---KLTQIAIADSDKLRVGDFAVAVGNPFGLGQ-TATSGIISALGRSGLNL 196 (455)
T ss_pred -CCCCEEEEEEEEEcCCCCEEEEEecCCC---CCceeEecCccccCCCCEEEEEecCCCCCC-ceEEEEEccccccccCC
Confidence 5999999999999999999999998643 68899999865 56999999999999776 89999999987753221
Q ss_pred CceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCceeccccCCcccc
Q 008087 257 GSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQ 335 (578)
Q Consensus 257 ~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~ 335 (578)
.....+||+|+++++|||||||||.+|+||||+++.+..+ +..+++||||++.+++++++|+++|++. |+|||+.++
T Consensus 197 -~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~-r~~LGv~~~ 274 (455)
T PRK10139 197 -EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIK-RGLLGIKGT 274 (455)
T ss_pred -CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCccc-ccceeEEEE
Confidence 1234689999999999999999999999999999876543 3578999999999999999999999998 999999999
Q ss_pred cccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEE
Q 008087 336 KMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAA 414 (578)
Q Consensus 336 ~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~ 414 (578)
.+ ++++++.+|++. ..|++|..|.++|||++ |||+||+|++|||++|.+|.+ +...+.....|+++.
T Consensus 275 ~l-~~~~~~~lgl~~-~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~d----------l~~~l~~~~~g~~v~ 342 (455)
T PRK10139 275 EM-SADIAKAFNLDV-QRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAE----------LRSRIATTEPGTKVK 342 (455)
T ss_pred EC-CHHHHHhcCCCC-CCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHH----------HHHHHHhcCCCCEEE
Confidence 99 899999999975 67999999999999999 999999999999999999988 556676667889999
Q ss_pred EEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceE
Q 008087 415 VKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMS 494 (578)
Q Consensus 415 l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~v 494 (578)
++|+|+|+.+++++++.......... ....| .+.|+.+.+. .+ .....++
T Consensus 343 l~V~R~G~~~~l~v~~~~~~~~~~~~-~~~~~---~~~g~~l~~~-----~~---------------------~~~~~Gv 392 (455)
T PRK10139 343 LGLLRNGKPLEVEVTLDTSTSSSASA-EMITP---ALQGATLSDG-----QL---------------------KDGTKGI 392 (455)
T ss_pred EEEEECCEEEEEEEEECCCCCccccc-ccccc---cccccEeccc-----cc---------------------ccCCCce
Confidence 99999999999999875432211100 00000 0223322210 00 0011345
Q ss_pred EEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEE
Q 008087 495 SLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVL 550 (578)
Q Consensus 495 vvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l 550 (578)
++.. ++|+++||++||+|++|||++|.+|++|.+++++.+ +.+.|++.|+++.++
T Consensus 393 ~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~~-~~v~l~v~R~g~~~~ 451 (455)
T PRK10139 393 KIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAKP-AIIALQIVRGNESIY 451 (455)
T ss_pred EEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-CeEEEEEEECCEEEE
Confidence 5554 489999999999999999999999999999999865 788999999987654
No 2
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.6e-55 Score=474.57 Aligned_cols=391 Identities=22% Similarity=0.311 Sum_probs=322.6
Q ss_pred ccccccCCCCcEEEEeeeeCCCCC----------Cccc----------cCCCcceEEEEEEEc-CCEEEecccccCCCce
Q 008087 115 VARVVPAMDAVVKVFCVHTEPNFS----------LPWQ----------RKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQ 173 (578)
Q Consensus 115 ~~~~~~~~~sVV~I~~~~~~~~~~----------~p~~----------~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~ 173 (578)
.+.++++.+|||.|.+........ ..|. ......+.||||+|+ +||||||+|||.++..
T Consensus 4 ~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~~ 83 (428)
T TIGR02037 4 APLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGADE 83 (428)
T ss_pred HHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCCe
Confidence 455677888999998765321100 0010 112345789999998 7899999999999999
Q ss_pred EEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceee
Q 008087 174 VKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEI 251 (578)
Q Consensus 174 i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~ 251 (578)
+.|++ .+++.++|++++.|+.+|||||+++... .++++.|+++. ..|++|+++|||.+... +++.|+|++..+
T Consensus 84 i~V~~-~~~~~~~a~vv~~d~~~DlAllkv~~~~---~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~-~~t~G~vs~~~~ 158 (428)
T TIGR02037 84 ITVTL-SDGREFKAKLVGKDPRTDIAVLKIDAKK---NLPVIKLGDSDKLRVGDWVLAIGNPFGLGQ-TVTSGIVSALGR 158 (428)
T ss_pred EEEEe-CCCCEEEEEEEEecCCCCEEEEEecCCC---CceEEEccCCCCCCCCCEEEEEECCCcCCC-cEEEEEEEeccc
Confidence 99999 5999999999999999999999999752 68999998744 67999999999999765 899999998876
Q ss_pred eeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCceeccccC
Q 008087 252 LSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLL 330 (578)
Q Consensus 252 ~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~l 330 (578)
... ....+..+||+|+++++|||||||||.+|+||||+++.+... +..+++||||++.+++++++|++++++. |+||
T Consensus 159 ~~~-~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~-~~~l 236 (428)
T TIGR02037 159 SGL-GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQ-RGWL 236 (428)
T ss_pred Ccc-CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCc-CCcC
Confidence 532 122233589999999999999999999999999998866532 3568899999999999999999999998 9999
Q ss_pred CcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCC
Q 008087 331 GVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYT 409 (578)
Q Consensus 331 Gi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~ 409 (578)
|+.++.+ ++++++.||++. ..|++|..|.++|||++ ||++||+|++|||++|.++.+ +...+.....
T Consensus 237 Gi~~~~~-~~~~~~~lgl~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~----------~~~~l~~~~~ 304 (428)
T TIGR02037 237 GVTIQEV-TSDLAKSLGLEK-QRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFAD----------LRRAIGTLKP 304 (428)
T ss_pred ceEeecC-CHHHHHHcCCCC-CCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCC
Confidence 9999999 899999999986 57999999999999999 999999999999999999988 5567777778
Q ss_pred CCEEEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhcccccccccccc
Q 008087 410 GDSAAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQC 488 (578)
Q Consensus 410 g~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~ 488 (578)
|+.++++|+|+|+.+++++++...+... .++...+.|+.+.++ +.+...++...
T Consensus 305 g~~v~l~v~R~g~~~~~~v~l~~~~~~~-------~~~~~~~lGi~~~~l~~~~~~~~~l~~------------------ 359 (428)
T TIGR02037 305 GKKVTLGILRKGKEKTITVTLGASPEEQ-------ASSSNPFLGLTVANLSPEIRKELRLKG------------------ 359 (428)
T ss_pred CCEEEEEEEECCEEEEEEEEECcCCCcc-------ccccccccceEEecCCHHHHHHcCCCc------------------
Confidence 9999999999999999999987654221 123345789999988 77776665211
Q ss_pred CCCceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEecCeEEEE
Q 008087 489 HNCQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEYDQVVVL 550 (578)
Q Consensus 489 ~~~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R~~~~~l 550 (578)
...++++..+ +|+++||++||+|++|||++|.++++|.+++++. .++.+.|++.|+++.++
T Consensus 360 -~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g~~~~ 425 (428)
T TIGR02037 360 -DVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGGATIF 425 (428)
T ss_pred -CcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 1146666654 7999999999999999999999999999999986 57899999999987654
No 3
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.2e-54 Score=469.32 Aligned_cols=349 Identities=24% Similarity=0.332 Sum_probs=289.6
Q ss_pred ceEEEEEEEc--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CC
Q 008087 148 SSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--AL 223 (578)
Q Consensus 148 ~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~ 223 (578)
.+.||||||+ +||||||+|||.++..+.|++ .|+++|+|++++.|+.+||||||++... .++++.|+++. ++
T Consensus 110 ~~~GSG~ii~~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~D~~~DlAvlki~~~~---~l~~~~lg~s~~l~~ 185 (473)
T PRK10942 110 MALGSGVIIDADKGYVVTNNHVVDNATKIKVQL-SDGRKFDAKVVGKDPRSDIALIQLQNPK---NLTAIKMADSDALRV 185 (473)
T ss_pred cceEEEEEEECCCCEEEeChhhcCCCCEEEEEE-CCCCEEEEEEEEecCCCCEEEEEecCCC---CCceeEecCccccCC
Confidence 5689999997 489999999999999999999 5999999999999999999999997543 68899998765 56
Q ss_pred CCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCc
Q 008087 224 QDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENI 302 (578)
Q Consensus 224 g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~ 302 (578)
|++|+++|+|+++.. +++.|+|+++.+..... ..+..+||+|+++++|||||||+|.+|+||||+++.+..+ +..++
T Consensus 186 G~~V~aiG~P~g~~~-tvt~GiVs~~~r~~~~~-~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~ 263 (473)
T PRK10942 186 GDYTVAIGNPYGLGE-TVTSGIVSALGRSGLNV-ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGI 263 (473)
T ss_pred CCEEEEEcCCCCCCc-ceeEEEEEEeecccCCc-ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccE
Confidence 999999999999766 89999999987653221 1234689999999999999999999999999999876543 34679
Q ss_pred eeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECC
Q 008087 303 GYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDG 381 (578)
Q Consensus 303 ~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG 381 (578)
+|+||++.+++++++|.++|++. |+|||+.++.+ ++++++.++++. ..|++|..|.++|||++ ||++||+|++|||
T Consensus 264 gfaIP~~~~~~v~~~l~~~g~v~-rg~lGv~~~~l-~~~~a~~~~l~~-~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG 340 (473)
T PRK10942 264 GFAIPSNMVKNLTSQMVEYGQVK-RGELGIMGTEL-NSELAKAMKVDA-QRGAFVSQVLPNSSAAKAGIKAGDVITSLNG 340 (473)
T ss_pred EEEEEHHHHHHHHHHHHhccccc-cceeeeEeeec-CHHHHHhcCCCC-CCceEEEEECCCChHHHcCCCCCCEEEEECC
Confidence 99999999999999999999998 99999999999 889999999986 67999999999999999 9999999999999
Q ss_pred EEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-H
Q 008087 382 IDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-L 460 (578)
Q Consensus 382 ~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p 460 (578)
++|.+|.+ |...+.....|+++.++|+|+|+.+++.+++......... ....+.|+....+ +
T Consensus 341 ~~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~~~~~~~-------~~~~~lGl~g~~l~~ 403 (473)
T PRK10942 341 KPISSFAA----------LRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQSSQNQVD-------SSNIFNGIEGAELSN 403 (473)
T ss_pred EECCCHHH----------HHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcCcccccc-------cccccccceeeeccc
Confidence 99999988 5567777778999999999999999999888653211000 0001123222211 0
Q ss_pred HHHHHhchhhhhhhhhccccccccccccCCCceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC
Q 008087 461 YLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE 536 (578)
Q Consensus 461 ~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~ 536 (578)
. ....++++.. ++|+++||++||+|++|||++|.++++|.+++++.+ +
T Consensus 404 ~---------------------------~~~~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~~~-~ 455 (473)
T PRK10942 404 K---------------------------GGDKGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDSKP-S 455 (473)
T ss_pred c---------------------------cCCCCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhCC-C
Confidence 0 0012455554 489999999999999999999999999999999854 7
Q ss_pred eEEEEEecCeEEEE
Q 008087 537 FLKFDLEYDQVVVL 550 (578)
Q Consensus 537 ~v~l~v~R~~~~~l 550 (578)
.+.|++.|++..++
T Consensus 456 ~v~l~V~R~g~~~~ 469 (473)
T PRK10942 456 VLALNIQRGDSSIY 469 (473)
T ss_pred eEEEEEEECCEEEE
Confidence 89999999987654
No 4
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of a PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.3e-47 Score=399.49 Aligned_cols=296 Identities=25% Similarity=0.366 Sum_probs=252.6
Q ss_pred cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEE
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAI 192 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~ 192 (578)
..+.++++.+|||.|.......+. + ......+.||||+|+ +||||||+|||.++..+.|++ .||+.++|+++++
T Consensus 47 ~~~~~~~~~psVV~I~~~~~~~~~---~-~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~-~dg~~~~a~vv~~ 121 (351)
T TIGR02038 47 FNKAVRRAAPAVVNIYNRSISQNS---L-NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVAL-QDGRKFEAELVGS 121 (351)
T ss_pred HHHHHHhcCCcEEEEEeEeccccc---c-ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEE-CCCCEEEEEEEEe
Confidence 456677888899999986544321 1 122345689999998 789999999999999999999 5899999999999
Q ss_pred cCCCCEEEEEEeeCCCCCCeeeEEcCCC--CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087 193 GTECDIAMLTVEDDEFWEGVLPVEFGEL--PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI 270 (578)
Q Consensus 193 d~~~DlAlLkv~~~~~~~~~~~l~l~~~--~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i 270 (578)
|+.+||||||++.. .+++++++++ .+.|++|+++|||++... +++.|+|+.+.+..... ..+.++||+|+++
T Consensus 122 d~~~DlAvlkv~~~----~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~-s~t~GiIs~~~r~~~~~-~~~~~~iqtda~i 195 (351)
T TIGR02038 122 DPLTDLAVLKIEGD----NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQ-TITQGIISATGRNGLSS-VGRQNFIQTDAAI 195 (351)
T ss_pred cCCCCEEEEEecCC----CCceEeccCcCccCCCCEEEEEeCCCCCCC-cEEEEEEEeccCcccCC-CCcceEEEECCcc
Confidence 99999999999976 4677788754 467999999999998765 89999999987654321 2234689999999
Q ss_pred CCCCccceEEccCCeEEEEEecccccc---ccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhc
Q 008087 271 NSGNSGGPAFNDKGKCVGIAFQSLKHE---DVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMS 347 (578)
Q Consensus 271 ~~G~SGGPlvn~~G~vVGI~~~~~~~~---~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lg 347 (578)
++|||||||||.+|+||||+++.+... ...+++|+||++.+++++++|+++|++. |+|||+.++++ ++..++.+|
T Consensus 196 ~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~-r~~lGv~~~~~-~~~~~~~lg 273 (351)
T TIGR02038 196 NAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVI-RGYIGVSGEDI-NSVVAQGLG 273 (351)
T ss_pred CCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCccc-ceEeeeEEEEC-CHHHHHhcC
Confidence 999999999999999999998765432 2367899999999999999999999998 99999999998 888899999
Q ss_pred cccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEE
Q 008087 348 MKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNF 426 (578)
Q Consensus 348 l~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~ 426 (578)
++. ..|++|..|.++|||++ ||++||+|++|||++|.++.+ |.+.+.....|+++.++|+|+|+.+++
T Consensus 274 l~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d----------l~~~l~~~~~g~~v~l~v~R~g~~~~~ 342 (351)
T TIGR02038 274 LPD-LRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE----------LMDRIAETRPGSKVMVTVLRQGKQLEL 342 (351)
T ss_pred CCc-cccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEEE
Confidence 975 57999999999999999 999999999999999999988 556777667899999999999999999
Q ss_pred EEEeccc
Q 008087 427 NITLATH 433 (578)
Q Consensus 427 ~v~l~~~ 433 (578)
.+++..+
T Consensus 343 ~v~l~~~ 349 (351)
T TIGR02038 343 PVTIDEK 349 (351)
T ss_pred EEEecCC
Confidence 9988654
No 5
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=8.7e-47 Score=394.83 Aligned_cols=297 Identities=22% Similarity=0.331 Sum_probs=250.2
Q ss_pred cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEE
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAI 192 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~ 192 (578)
....++++.+|||.|........ +.......+.||||+|+ +||||||+|||.++..+.|++ .|++.|+|+++++
T Consensus 47 ~~~~~~~~~psvV~v~~~~~~~~----~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~-~dg~~~~a~vv~~ 121 (353)
T PRK10898 47 YNQAVRRAAPAVVNVYNRSLNST----SHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVAL-QDGRVFEALLVGS 121 (353)
T ss_pred HHHHHHHhCCcEEEEEeEecccc----CcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEe-CCCCEEEEEEEEE
Confidence 35567778889999998764322 11223345789999998 789999999999999999999 5899999999999
Q ss_pred cCCCCEEEEEEeeCCCCCCeeeEEcCCC--CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087 193 GTECDIAMLTVEDDEFWEGVLPVEFGEL--PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI 270 (578)
Q Consensus 193 d~~~DlAlLkv~~~~~~~~~~~l~l~~~--~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i 270 (578)
|+.+||||||++.. .+++++|+++ ...|++|+++|||++... +++.|+|++..+..... ....++||+|+++
T Consensus 122 d~~~DlAvl~v~~~----~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~-~~t~Giis~~~r~~~~~-~~~~~~iqtda~i 195 (353)
T PRK10898 122 DSLTDLAVLKINAT----NLPVIPINPKRVPHIGDVVLAIGNPYNLGQ-TITQGIISATGRIGLSP-TGRQNFLQTDASI 195 (353)
T ss_pred cCCCCEEEEEEcCC----CCCeeeccCcCcCCCCCEEEEEeCCCCcCC-CcceeEEEeccccccCC-ccccceEEecccc
Confidence 99999999999875 4677888764 467999999999998665 89999999887643321 1223689999999
Q ss_pred CCCCccceEEccCCeEEEEEeccccccc----cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhh
Q 008087 271 NSGNSGGPAFNDKGKCVGIAFQSLKHED----VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAM 346 (578)
Q Consensus 271 ~~G~SGGPlvn~~G~vVGI~~~~~~~~~----~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~l 346 (578)
++|||||||+|.+|+||||+++.+...+ ..+++|+||++.+++++++|+++|++. ++|||+..+.+ ++.....+
T Consensus 196 ~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~-~~~lGi~~~~~-~~~~~~~~ 273 (353)
T PRK10898 196 NHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVI-RGYIGIGGREI-APLHAQGG 273 (353)
T ss_pred CCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCccc-ccccceEEEEC-CHHHHHhc
Confidence 9999999999999999999998664322 257899999999999999999999998 89999999988 66666777
Q ss_pred ccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEE
Q 008087 347 SMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILN 425 (578)
Q Consensus 347 gl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~ 425 (578)
++.. ..|++|..|.++|||++ ||++||+|++|||++|.++.+ +.+.+.....|+.+.++|+|+|+.++
T Consensus 274 ~~~~-~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~----------l~~~l~~~~~g~~v~l~v~R~g~~~~ 342 (353)
T PRK10898 274 GIDQ-LQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALE----------TMDQVAEIRPGSVIPVVVMRDDKQLT 342 (353)
T ss_pred CCCC-CCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 7765 57999999999999999 999999999999999999987 45667666789999999999999999
Q ss_pred EEEEecccc
Q 008087 426 FNITLATHR 434 (578)
Q Consensus 426 ~~v~l~~~~ 434 (578)
+.+++..++
T Consensus 343 ~~v~l~~~p 351 (353)
T PRK10898 343 LQVTIQEYP 351 (353)
T ss_pred EEEEeccCC
Confidence 999887653
No 6
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=100.00 E-value=1.1e-38 Score=334.65 Aligned_cols=376 Identities=16% Similarity=0.232 Sum_probs=309.0
Q ss_pred cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc--CCEEEecccccCCCceEEEEEcCCCcEEEEEEEE
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLA 191 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~ 191 (578)
+...+..+.+|||.|++..... |+..-...+.||||+++ .||||||+|+|..+..+.-..+.+.++.+...++
T Consensus 54 w~~~ia~VvksvVsI~~S~v~~-----fdtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy 128 (955)
T KOG1421|consen 54 WRNTIANVVKSVVSIRFSAVRA-----FDTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY 128 (955)
T ss_pred hhhhhhhhcccEEEEEehheee-----cccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence 3556667778999999987544 56666778899999998 6999999999998776544444688888999999
Q ss_pred EcCCCCEEEEEEeeCCC-CCCeeeEEcCC-CCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCC-----ceeeeEE
Q 008087 192 IGTECDIAMLTVEDDEF-WEGVLPVEFGE-LPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHG-----STELLGL 264 (578)
Q Consensus 192 ~d~~~DlAlLkv~~~~~-~~~~~~l~l~~-~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~-----~~~~~~i 264 (578)
.|+.||+.+++++++.. ...+..+.++. ..++|.+++++||..+ +..++..|.++++++....++ .++..++
T Consensus 129 rDpVhdfGf~r~dps~ir~s~vt~i~lap~~akvgseirvvgNDag-EklsIlagflSrldr~apdyg~~~yndfnTfy~ 207 (955)
T KOG1421|consen 129 RDPVHDFGFFRYDPSTIRFSIVTEICLAPELAKVGSEIRVVGNDAG-EKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI 207 (955)
T ss_pred CCchhhcceeecChhhcceeeeeccccCccccccCCceEEecCCcc-ceEEeehhhhhhccCCCccccccccccccceee
Confidence 99999999999998743 12455566653 5688999999999988 666999999999988644332 2334468
Q ss_pred EEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhh
Q 008087 265 QIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRV 344 (578)
Q Consensus 265 ~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~ 344 (578)
|..+...+|.||+||++.+|..|.++.++. ...+.+|++|++.+++.|..++++..+. |+.|.++|... ..+.++
T Consensus 208 QaasstsggssgspVv~i~gyAVAl~agg~---~ssas~ffLpLdrV~RaL~clq~n~PIt-RGtLqvefl~k-~~de~r 282 (955)
T KOG1421|consen 208 QAASSTSGGSSGSPVVDIPGYAVALNAGGS---ISSASDFFLPLDRVVRALRCLQNNTPIT-RGTLQVEFLHK-LFDECR 282 (955)
T ss_pred eehhcCCCCCCCCceecccceEEeeecCCc---ccccccceeeccchhhhhhhhhcCCCcc-cceEEEEEehh-hhHHHH
Confidence 998899999999999999999999998854 4567789999999999999999888888 99999999988 899999
Q ss_pred hhccccC-----------CCCcE-EEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE
Q 008087 345 AMSMKAD-----------QKGVR-IRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS 412 (578)
Q Consensus 345 ~lgl~~~-----------~~Gv~-V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~ 412 (578)
+|||+.+ ..|++ |..|.++|||++.|++||++++||+.-+.++.+ +.+++.. ..|+.
T Consensus 283 rlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~----------l~~iLDe-gvgk~ 351 (955)
T KOG1421|consen 283 RLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEA----------LEQILDE-GVGKN 351 (955)
T ss_pred hcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHH----------HHHHHhh-ccCce
Confidence 9999864 45654 567999999999999999999999998888766 3344544 58999
Q ss_pred EEEEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCC
Q 008087 413 AAVKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNC 491 (578)
Q Consensus 413 v~l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~ 491 (578)
+.|+|+|+|++.++++++++.+...|. +|+.|+|.+|+++ ++++..+..+.
T Consensus 352 l~LtI~Rggqelel~vtvqdlh~itp~-------R~levcGav~hdlsyq~ar~y~lP~--------------------- 403 (955)
T KOG1421|consen 352 LELTIQRGGQELELTVTVQDLHGITPD-------RFLEVCGAVFHDLSYQLARLYALPV--------------------- 403 (955)
T ss_pred EEEEEEeCCEEEEEEEEeccccCCCCc-------eEEEEcceEecCCCHHHHhhccccc---------------------
Confidence 999999999999999999999887776 8999999999999 77775554221
Q ss_pred ceEEEEe---ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEE
Q 008087 492 QMSSLLW---CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKF 540 (578)
Q Consensus 492 ~~vvvs~---v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l 540 (578)
+|++++. +++++.++. |.+|.+||++++.++++|++++++.+++ .|.+
T Consensus 404 ~GvyVa~~~gsf~~~~~~y-~~ii~~vanK~tPdLdaFidvlk~L~dg~rV~v 455 (955)
T KOG1421|consen 404 EGVYVASPGGSFRHRGPRY-GQIIDSVANKPTPDLDAFIDVLKELPDGARVPV 455 (955)
T ss_pred CcEEEccCCCCccccCCcc-eEEEEeecCCcCCCHHHHHHHHHhccCCCeeeE
Confidence 4788774 356666666 9999999999999999999999998654 4444
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=6.2e-38 Score=328.32 Aligned_cols=410 Identities=43% Similarity=0.617 Sum_probs=361.4
Q ss_pred ccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEcCCEEEecccccC---CCceEEEEEcCCCcEEEEEEEEEc
Q 008087 117 RVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIGGRRVLTNAHSVE---HYTQVKLKKRGSDTKYLATVLAIG 193 (578)
Q Consensus 117 ~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~~g~ILT~aHvV~---~~~~i~V~~~~~g~~~~a~vv~~d 193 (578)
.+..+..|++.+.+..+.+.+..||+..++..+.|+||.+....+|||+|++. +...+.+..++.-+.|.|++...-
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGKKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhcccceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 34556679999999999999999999999999999999999999999999999 566677766667778899999999
Q ss_pred CCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCC
Q 008087 194 TECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSG 273 (578)
Q Consensus 194 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G 273 (578)
..+|+|+|.++..+||....++++++.+.+.+.++++| | +..++|.|.|++.....+.++......+|+++++++|
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---g-d~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~~~ 210 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---G-DGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIGPG 210 (473)
T ss_pred hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---C-CcEEEEeeEEEEEEeccccCCCcceeeEEEEEeecCC
Confidence 99999999999999998888999999999999999999 4 5569999999999988888777777789999999999
Q ss_pred CccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCC
Q 008087 274 NSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQK 353 (578)
Q Consensus 274 ~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~ 353 (578)
+||+|++...+++.|+++..++.. .++++.||.-.+.+|.......+.+.+++++++..+.+++...+..+.|..+ +
T Consensus 211 ~s~ep~i~g~d~~~gvA~l~ik~~--~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~-~ 287 (473)
T KOG1320|consen 211 NSGEPVIVGVDKVAGVAFLKIKTP--ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE-T 287 (473)
T ss_pred ccCCCeEEccccccceEEEEEecC--CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc-c
Confidence 999999998899999999977532 2789999999999999888888888789999999999989999999999887 8
Q ss_pred CcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087 354 GVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH 433 (578)
Q Consensus 354 Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~ 433 (578)
|+.+.++.+.+.|.+-++.||.|+++||+.|. +.++..+|+.|.+++....+++++.+.|+|.+ ++.+.+...
T Consensus 288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~Ig----Vn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~---e~~~~lr~~ 360 (473)
T KOG1320|consen 288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIG----VNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG---EFQISLRPV 360 (473)
T ss_pred ceeeeeecccchhhhcccCCCcEEEecCcEee----eeeeeeEEeeccccceeccCchHhhhhhhhhh---hhceeeccc
Confidence 99999999999888899999999999999997 66778889999999999999999999999988 566777777
Q ss_pred ccccCCCCCCCCCCceeeccEEEeeh--HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee----cCCcCCCC
Q 008087 434 RRLIPSHNKGRPPSYYIIAGFVFSRC--LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC----LRSPLCLN 507 (578)
Q Consensus 434 ~~~~~~~~~~~~p~~~~~~Gl~~~~~--p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v----~A~~aGl~ 507 (578)
..+.+.+.+...|.|++++||+|.++ +++.... ..++++++.+ ++..+++.
T Consensus 361 ~~~~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~-----------------------~~q~v~is~Vlp~~~~~~~~~~ 417 (473)
T KOG1320|consen 361 KPLVPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSG-----------------------VVQLVLVSQVLPGSINGGYGLK 417 (473)
T ss_pred cCcccccccCCceeEEEecceEEeecCCCcccccc-----------------------ceeEEEEEEeccCCCccccccc
Confidence 77788888999999999999999987 4433111 1246666655 68888899
Q ss_pred CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHHcCCCcC
Q 008087 508 CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILATHCIPSA 571 (578)
Q Consensus 508 ~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~~~i~~~ 571 (578)
+||+|++|||++|+++.++.++++.+..+ +...+|+++.+|+++..|+.+|++|++
T Consensus 418 ~g~~V~~vng~~V~n~~~l~~~i~~~~~~--------~~v~vl~~~~~e~~tl~Il~~~~~p~~ 473 (473)
T KOG1320|consen 418 PGDQVVKVNGKPVKNLKHLYELIEECSTE--------DKVAVLDRRSAEDATLEILPEHKIPSA 473 (473)
T ss_pred CCCEEEEECCEEeechHHHHHHHHhcCcC--------ceEEEEEecCccceeEEecccccCCCC
Confidence 99999999999999999999999998766 788899999999999999999999985
No 8
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=4.9e-36 Score=315.02 Aligned_cols=300 Identities=25% Similarity=0.364 Sum_probs=250.8
Q ss_pred cccccccCCCCcEEEEeeeeCCC-CCCccccCCC-cceEEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEE
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPN-FSLPWQRKRQ-YSSSSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVL 190 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~-~~~p~~~~~~-~~~~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv 190 (578)
....+.++.++||.|........ ..++-..... ..+.||||+++ +|||+||.|+|.++..+.+.+ .||+.++++++
T Consensus 35 ~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l-~dg~~~~a~~v 113 (347)
T COG0265 35 FATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTL-ADGREVPAKLV 113 (347)
T ss_pred HHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEe-CCCCEEEEEEE
Confidence 35566777889999998765442 0000000001 14789999998 899999999999999999999 69999999999
Q ss_pred EEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEec
Q 008087 191 AIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDA 268 (578)
Q Consensus 191 ~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da 268 (578)
+.|+..|||+|+++... .++.+.++++. .+|++++++|+|+++.. +++.|+|+.+.+...........+||+|+
T Consensus 114 g~d~~~dlavlki~~~~---~~~~~~~~~s~~l~vg~~v~aiGnp~g~~~-tvt~Givs~~~r~~v~~~~~~~~~IqtdA 189 (347)
T COG0265 114 GKDPISDLAVLKIDGAG---GLPVIALGDSDKLRVGDVVVAIGNPFGLGQ-TVTSGIVSALGRTGVGSAGGYVNFIQTDA 189 (347)
T ss_pred ecCCccCEEEEEeccCC---CCceeeccCCCCcccCCEEEEecCCCCccc-ceeccEEeccccccccCcccccchhhccc
Confidence 99999999999999864 26777888765 45899999999999665 99999999998862222122557899999
Q ss_pred CCCCCCccceEEccCCeEEEEEeccccccc-cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhc
Q 008087 269 AINSGNSGGPAFNDKGKCVGIAFQSLKHED-VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMS 347 (578)
Q Consensus 269 ~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~-~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lg 347 (578)
++++|+||||++|.+|++|||+++.+...+ ..+++|+||++.+..++.++...|++. ++|+|+.+..+ +.+.+ +|
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~-~~~lgv~~~~~-~~~~~--~g 265 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVV-RGYLGVIGEPL-TADIA--LG 265 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCcc-ccccceEEEEc-ccccc--cC
Confidence 999999999999999999999999776443 356899999999999999999988888 99999999988 66666 77
Q ss_pred cccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEE
Q 008087 348 MKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNF 426 (578)
Q Consensus 348 l~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~ 426 (578)
++. ..|++|..|.+++||++ |++.||+|+++||+++.+..+ +...+.....|+.+.+++.|+|+++++
T Consensus 266 ~~~-~~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~----------l~~~v~~~~~g~~v~~~~~r~g~~~~~ 334 (347)
T COG0265 266 LPV-AAGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSD----------LVAAVASNRPGDEVALKLLRGGKEREL 334 (347)
T ss_pred CCC-CCceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHH----------HHHHHhccCCCCEEEEEEEECCEEEEE
Confidence 774 77899999999999999 999999999999999999887 556677777899999999999999999
Q ss_pred EEEeccc
Q 008087 427 NITLATH 433 (578)
Q Consensus 427 ~v~l~~~ 433 (578)
.+++.+.
T Consensus 335 ~v~l~~~ 341 (347)
T COG0265 335 AVTLGDR 341 (347)
T ss_pred EEEecCc
Confidence 9999874
No 9
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.91 E-value=1.9e-23 Score=219.58 Aligned_cols=303 Identities=21% Similarity=0.221 Sum_probs=225.7
Q ss_pred cccccccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc-CCEEEecccccCCCc-----------eEEEEEc-C
Q 008087 114 GVARVVPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG-GRRVLTNAHSVEHYT-----------QVKLKKR-G 180 (578)
Q Consensus 114 ~~~~~~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~-~g~ILT~aHvV~~~~-----------~i~V~~~-~ 180 (578)
..........|||.|.....-. ...|+....-....||||||+ +|+|+||+||+.... .+.+... +
T Consensus 130 v~~~~~~cd~Avv~Ie~~~f~~-~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~~ 208 (473)
T KOG1320|consen 130 VAAVFEECDLAVVYIESEEFWK-GMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAIG 208 (473)
T ss_pred HHHhhhcccceEEEEeeccccC-CCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEeec
Confidence 3455666778899998743211 122455555566789999998 999999999997432 2555552 1
Q ss_pred CCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCC--CCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCC-
Q 008087 181 SDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELP--ALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHG- 257 (578)
Q Consensus 181 ~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~--~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~- 257 (578)
.+..+++.+.+.|+..|+|+++++..+ .-..+++++-+. ..|+++.++|+|+++.+ +.+.|+++...|..+..+
T Consensus 209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~--~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~n-t~t~g~vs~~~R~~~~lg~ 285 (473)
T KOG1320|consen 209 PGNSGEPVIVGVDKVAGVAFLKIKTPE--NILYVIPLGVSSHFRTGVEVSAIGNGFGLLN-TLTQGMVSGQLRKSFKLGL 285 (473)
T ss_pred CCccCCCeEEccccccceEEEEEecCC--cccceeecceeeeecccceeeccccCceeee-eeeecccccccccccccCc
Confidence 248899999999999999999997654 125666666544 45899999999999877 899999998877655422
Q ss_pred ---ceeeeEEEEecCCCCCCccceEEccCCeEEEEEecccccc-ccCCceeeecchhHHHHHHHHHHcCc---ee-----
Q 008087 258 ---STELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHE-DVENIGYVIPTPVIMHFIQDYEKNGA---YT----- 325 (578)
Q Consensus 258 ---~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~~~~~~~aiPi~~i~~~l~~l~~~g~---v~----- 325 (578)
....+++|+|++++.|+||||++|.+|++||++++...+- -..+++|++|.+.++.++.+..+... ..
T Consensus 286 ~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p 365 (473)
T KOG1320|consen 286 ETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVP 365 (473)
T ss_pred ccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCccc
Confidence 2345689999999999999999999999999998855321 23678999999999998887743221 11
Q ss_pred ccccCCcccccccChhhh-----hhhcccc-CCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcch
Q 008087 326 GFPLLGVEWQKMENPDLR-----VAMSMKA-DQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERI 398 (578)
Q Consensus 326 ~~~~lGi~~~~~~~~~~~-----~~lgl~~-~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~ 398 (578)
-+.|+|.....+ ...+. +.+-.+. ..++++|..|.|++++.. ++++||+|++|||++|.+..+
T Consensus 366 ~~~~~g~~s~~i-~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~--------- 435 (473)
T KOG1320|consen 366 VHQYIGLPSYYI-FAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKH--------- 435 (473)
T ss_pred ccccCCceeEEE-ecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHH---------
Confidence 134677665554 22221 1121221 135899999999999999 999999999999999999998
Q ss_pred hHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 399 GFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 399 ~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
+.+++.....++++.+..+|+.+..++.+...
T Consensus 436 -l~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~ 467 (473)
T KOG1320|consen 436 -LYELIEECSTEDKVAVLDRRSAEDATLEILPE 467 (473)
T ss_pred -HHHHHHhcCcCceEEEEEecCccceeEEeccc
Confidence 55788888788889888888888888877654
No 10
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.86 E-value=1.7e-19 Score=190.97 Aligned_cols=362 Identities=14% Similarity=0.132 Sum_probs=260.2
Q ss_pred ccCCCCcEEEEeeeeCCCCCCccccCCCcceEEEEEEEc--CCEEEecccccC-CCceEEEEEcCCCcEEEEEEEEEcCC
Q 008087 119 VPAMDAVVKVFCVHTEPNFSLPWQRKRQYSSSSSGFAIG--GRRVLTNAHSVE-HYTQVKLKKRGSDTKYLATVLAIGTE 195 (578)
Q Consensus 119 ~~~~~sVV~I~~~~~~~~~~~p~~~~~~~~~~GsGfvI~--~g~ILT~aHvV~-~~~~i~V~~~~~g~~~~a~vv~~d~~ 195 (578)
+....+.|.+.+.......+. ......|||.|++ +|++++++.+|. +.....|++ +|...++|.+.+.++.
T Consensus 525 ~~i~~~~~~v~~~~~~~l~g~-----s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~-~dS~~i~a~~~fL~~t 598 (955)
T KOG1421|consen 525 ADISNCLVDVEPMMPVNLDGV-----SSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTE-ADSDGIPANVSFLHPT 598 (955)
T ss_pred hHHhhhhhhheeceeeccccc-----hhhhhcCceEEEEccCCceeEecccCCchhhceEEee-cccccccceeeEecCc
Confidence 334457777776664443221 1123469999997 799999999997 567888988 5778899999999999
Q ss_pred CCEEEEEEeeCCCCCCeeeEEcCCCC-CCCCcEEEEeecCCCCc----ceEEEeEEeceeeeeec-CCceeeeEEEEecC
Q 008087 196 CDIAMLTVEDDEFWEGVLPVEFGELP-ALQDAVTVVGYPIGGDT----ISVTSGVVSRIEILSYV-HGSTELLGLQIDAA 269 (578)
Q Consensus 196 ~DlAlLkv~~~~~~~~~~~l~l~~~~-~~g~~V~~iG~p~g~~~----~sv~~G~Is~~~~~~~~-~~~~~~~~i~~da~ 269 (578)
.++|.+|+++.- ...+.|.+.. ..|+++...|+...... .+++.-.+......... ....+++.|.+++.
T Consensus 599 ~n~a~~kydp~~----~~~~kl~~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~n 674 (955)
T KOG1421|consen 599 ENVASFKYDPAL----EVQLKLTDTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDN 674 (955)
T ss_pred cceeEeccChhH----hhhhccceeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecc
Confidence 999999999863 3344554422 45899999998865432 12222111111111111 11234567888777
Q ss_pred CCCCCccceEEccCCeEEEEEecccccc-c--cCCceeeecchhHHHHHHHHHHcCceeccccCCcccccccChhhhhhh
Q 008087 270 INSGNSGGPAFNDKGKCVGIAFQSLKHE-D--VENIGYVIPTPVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAM 346 (578)
Q Consensus 270 i~~G~SGGPlvn~~G~vVGI~~~~~~~~-~--~~~~~~aiPi~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~l 346 (578)
+..++--|-+.|.+|+|+|++...++.. + ...+-|.+.+..++..|+.|+.++... ...+|++|..+ +...++.+
T Consensus 675 lsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~r-p~i~~vef~~i-~laqar~l 752 (955)
T KOG1421|consen 675 LSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSAR-PTIAGVEFSHI-TLAQARTL 752 (955)
T ss_pred ccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCC-ceeeccceeeE-Eeehhhcc
Confidence 7766666788899999999998765432 1 122456899999999999998877765 55679999998 88888889
Q ss_pred ccccC------------CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEE
Q 008087 347 SMKAD------------QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAA 414 (578)
Q Consensus 347 gl~~~------------~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~ 414 (578)
|++.+ .+-.+|.+|.+.-+ +-|..||+|+++||+-|+...++ ++ +. .+.
T Consensus 753 glp~e~imk~e~es~~~~ql~~ishv~~~~~--kil~~gdiilsvngk~itr~~dl----------~d-~~------eid 813 (955)
T KOG1421|consen 753 GLPSEFIMKSEEESTIPRQLYVISHVRPLLH--KILGVGDIILSVNGKMITRLSDL----------HD-FE------EID 813 (955)
T ss_pred CCCHHHHhhhhhcCCCcceEEEEEeeccCcc--cccccccEEEEecCeEEeeehhh----------hh-hh------hhh
Confidence 98864 13356788877543 25999999999999999998883 33 21 567
Q ss_pred EEEEECCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCce
Q 008087 415 VKVLRDSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQM 493 (578)
Q Consensus 415 l~V~R~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~ 493 (578)
..|+|+|.++++.+++-... ..-++.+|.|..+++. .++...+- +. .++
T Consensus 814 ~~ilrdg~~~~ikipt~p~~---------et~r~vi~~gailq~ph~av~~q~e---dl------------------p~g 863 (955)
T KOG1421|consen 814 AVILRDGIEMEIKIPTYPEY---------ETSRAVIWMGAILQPPHSAVFEQVE---DL------------------PEG 863 (955)
T ss_pred eeeeecCcEEEEEecccccc---------ccceEEEEEeccccCchHHHHHHHh---cc------------------CCc
Confidence 89999999999998875432 1227889999999998 66654443 11 146
Q ss_pred EEEE----eecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEEEE
Q 008087 494 SSLL----WCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKFDL 542 (578)
Q Consensus 494 vvvs----~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l~v 542 (578)
|++. ++||.+ +|.+--.|++|||..+.++++|..++...++. ++++..
T Consensus 864 vyvt~rg~gspalq-~l~aa~fitavng~~t~~lddf~~~~~~ipdnsyv~v~~ 916 (955)
T KOG1421|consen 864 VYVTSRGYGSPALQ-MLRAAHFITAVNGHDTNTLDDFYHMLLEIPDNSYVQVKQ 916 (955)
T ss_pred eEEeecccCChhHh-hcchheeEEEecccccCcHHHHHHHHhhCCCCceEEEEE
Confidence 7776 357777 89999999999999999999999999998765 666554
No 11
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=99.68 E-value=8.4e-17 Score=174.48 Aligned_cols=153 Identities=12% Similarity=0.170 Sum_probs=113.9
Q ss_pred cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087 355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH 433 (578)
Q Consensus 355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~ 433 (578)
.+|.+|.++|||++ |||+||+|++|||++|.+|.+ +...+.....|++++++|.|+|+.+++++++...
T Consensus 128 ~lV~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~----------l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~~ 197 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQIAPGTELKAVDGIETPDWDA----------VRLALVSKIGDESTTITVAPFGSDQRRDKTLDLR 197 (449)
T ss_pred ccccccCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhhccCCceEEEEEeCCccceEEEEeccc
Confidence 46899999999999 999999999999999999988 4566777778899999999999998888888543
Q ss_pred ccccCCCCCCCCCCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcCCCCCCCEE
Q 008087 434 RRLIPSHNKGRPPSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCLNCFNKV 512 (578)
Q Consensus 434 ~~~~~~~~~~~~p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl~~GD~I 512 (578)
+........ . . ..++++.++ +..... =..|..+++|+++||++||+|
T Consensus 198 ~~~~~~~~~--~-~---~~~lGl~~~~~~~~~v--------------------------V~~V~~~SpA~~AGL~~GDvI 245 (449)
T PRK10779 198 HWAFEPDKQ--D-P---VSSLGIRPRGPQIEPV--------------------------LAEVQPNSAASKAGLQAGDRI 245 (449)
T ss_pred ccccCcccc--c-h---hhcccccccCCCcCcE--------------------------EEeeCCCCHHHHcCCCCCCEE
Confidence 221110000 0 0 111222221 100000 012334568999999999999
Q ss_pred EEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087 513 LAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 513 ~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~ 549 (578)
++|||++|++|+++.+.++..+++.+.+++.|+++.+
T Consensus 246 l~Ing~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~~ 282 (449)
T PRK10779 246 VKVDGQPLTQWQTFVTLVRDNPGKPLALEIERQGSPL 282 (449)
T ss_pred EEECCEEcCCHHHHHHHHHhCCCCEEEEEEEECCEEE
Confidence 9999999999999999999888889999999997653
No 12
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.59 E-value=2.6e-14 Score=125.93 Aligned_cols=108 Identities=33% Similarity=0.491 Sum_probs=71.8
Q ss_pred EEEEEEcC-CEEEecccccC--------CCceEEEEEcCCCcEEE--EEEEEEcCC-CCEEEEEEeeCCCCCCeeeEEcC
Q 008087 151 SSGFAIGG-RRVLTNAHSVE--------HYTQVKLKKRGSDTKYL--ATVLAIGTE-CDIAMLTVEDDEFWEGVLPVEFG 218 (578)
Q Consensus 151 GsGfvI~~-g~ILT~aHvV~--------~~~~i~V~~~~~g~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~l~ 218 (578)
||||+|++ |+||||+|||. ....+.+... ++..+. +++++.|+. +|||||+++.
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~D~All~v~~------------- 66 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFP-DGRRVPPVAEVVYFDPDDYDLALLKVDP------------- 66 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEET-TSCEEETEEEEEEEETT-TTEEEEEESC-------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEec-CCCEEeeeEEEEEECCccccEEEEEEec-------------
Confidence 89999985 59999999999 4567888874 666777 999999999 9999999990
Q ss_pred CCCCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEE
Q 008087 219 ELPALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGI 289 (578)
Q Consensus 219 ~~~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI 289 (578)
....+.. ....+.......... .......+ +++.+.+|+|||||||.+|+||||
T Consensus 67 --------~~~~~~~------~~~~~~~~~~~~~~~--~~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 67 --------WTGVGGG------VRVPGSTSGVSPTST--NDNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp --------EEEEEEE------EEEEEEEEEEEEEEE--EETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred --------ccceeee------eEeeeeccccccccC--cccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000 000000000000000 00111124 899999999999999999999997
No 13
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=99.54 E-value=2.7e-14 Score=153.52 Aligned_cols=135 Identities=17% Similarity=0.189 Sum_probs=105.1
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
.|.+|.+|.++|||++ |||+||+|++|||+++.++.++ ...+.... +++.+++.|+|+..++.+++.
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl----------~~~ia~~~--~~v~~~I~r~g~~~~l~v~l~ 195 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEPGDEILSVNGNKIPGFKDV----------RQQIADIA--GEPMVEILAERENWTFEVMKE 195 (420)
T ss_pred CCceeeccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHH----------HHHHHhhc--ccceEEEEEecCceEeccccc
Confidence 4889999999999999 9999999999999999999884 45555544 678899999988766544332
Q ss_pred ccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcCCCCCCCE
Q 008087 432 THRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCLNCFNK 511 (578)
Q Consensus 432 ~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl~~GD~ 511 (578)
-... .|. .+..+ ..+.++++|+++||++||+
T Consensus 196 ~~~~---------~~~----~g~vV------------------------------------~~V~~~SpA~~aGL~~GD~ 226 (420)
T TIGR00054 196 LIPR---------GPK----IEPVL------------------------------------SDVTPNSPAEKAGLKEGDY 226 (420)
T ss_pred ceec---------CCC----cCcEE------------------------------------EEECCCCHHHHcCCCCCCE
Confidence 1100 000 00000 1233456899999999999
Q ss_pred EEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEE
Q 008087 512 VLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVV 548 (578)
Q Consensus 512 I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~ 548 (578)
|++|||++|++|+|+.+.+++.+++.+.+++.|+++.
T Consensus 227 Iv~Vng~~V~s~~dl~~~l~~~~~~~v~l~v~R~g~~ 263 (420)
T TIGR00054 227 IQSINGEKLRSWTDFVSAVKENPGKSMDIKVERNGET 263 (420)
T ss_pred EEEECCEECCCHHHHHHHHHhCCCCceEEEEEECCEE
Confidence 9999999999999999999998888899999999765
No 14
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.42 E-value=2.3e-12 Score=125.39 Aligned_cols=182 Identities=22% Similarity=0.291 Sum_probs=116.1
Q ss_pred CCCCCCccccCCCc---ceEEEEEEEcCCEEEecccccCCCceEEEEEcC------CC--cEEEEEEEEEc----C---C
Q 008087 134 EPNFSLPWQRKRQY---SSSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRG------SD--TKYLATVLAIG----T---E 195 (578)
Q Consensus 134 ~~~~~~p~~~~~~~---~~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~------~g--~~~~a~vv~~d----~---~ 195 (578)
.....+||...... ...|+|++|++.+|||+|||+.....+.+.+.. ++ ..+..+-+..+ . .
T Consensus 7 ~~~~~~p~~v~i~~~~~~~~C~G~li~~~~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~ 86 (220)
T PF00089_consen 7 ASPGEFPWVVSIRYSNGRFFCTGTLISPRWVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYD 86 (220)
T ss_dssp CGTTSSTTEEEEEETTTEEEEEEEEEETTEEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTT
T ss_pred CCCCCCCeEEEEeeCCCCeeEeEEeccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 33445566554332 457999999999999999999996667775531 22 23343333332 2 5
Q ss_pred CCEEEEEEeeC-CCCCCeeeEEcCCC---CCCCCcEEEEeecCCCCcc---eEE---EeEEeceeeeeecCCceeeeEEE
Q 008087 196 CDIAMLTVEDD-EFWEGVLPVEFGEL---PALQDAVTVVGYPIGGDTI---SVT---SGVVSRIEILSYVHGSTELLGLQ 265 (578)
Q Consensus 196 ~DlAlLkv~~~-~~~~~~~~l~l~~~---~~~g~~V~~iG~p~g~~~~---sv~---~G~Is~~~~~~~~~~~~~~~~i~ 265 (578)
+|||||+++.+ .+...+.++.+... ...++.+.++|++...... .+. ..+++...+............++
T Consensus 87 ~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c 166 (220)
T PF00089_consen 87 NDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMIC 166 (220)
T ss_dssp TSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEE
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 79999999987 33456788888762 2568999999998753221 233 33333322222111111123466
Q ss_pred Eec----CCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhHHHHH
Q 008087 266 IDA----AINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVIMHFI 315 (578)
Q Consensus 266 ~da----~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l 315 (578)
+.. ..+.|+|||||++.++.++||++...........++++++...++++
T Consensus 167 ~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 167 AGSSGSGDACQGDSGGPLICNNNYLVGIVSFGENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EETTSSSBGGTTTTTSEEEETTEEEEEEEEEESSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccccccccccccccccccceeeecceeeecCCCCCCCcCEEEEEHHHhhccC
Confidence 655 78899999999998778999999864332223357888888777654
No 15
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.41 E-value=4.1e-13 Score=110.94 Aligned_cols=81 Identities=32% Similarity=0.520 Sum_probs=69.1
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ 406 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~ 406 (578)
||||+.+... +. ..|++|..|.++|||++ ||++||+|++|||++|.++.+ |...+..
T Consensus 1 ~~lGv~~~~~-~~-----------~~g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~----------~~~~l~~ 58 (82)
T PF13180_consen 1 GGLGVTVQNL-SD-----------TGGVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSED----------LVNILSK 58 (82)
T ss_dssp -E-SEEEEEC-SC-----------SSSEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHH----------HHHHHHC
T ss_pred CEECeEEEEc-cC-----------CCeEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHH----------HHHHHHh
Confidence 6899999876 21 35999999999999999 999999999999999999887 5677778
Q ss_pred cCCCCEEEEEEEECCEEEEEEEEe
Q 008087 407 KYTGDSAAVKVLRDSKILNFNITL 430 (578)
Q Consensus 407 ~~~g~~v~l~V~R~g~~~~~~v~l 430 (578)
...|++++|+|+|+|+.++++++|
T Consensus 59 ~~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 59 GKPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp SSTTSEEEEEEEETTEEEEEEEE-
T ss_pred CCCCCEEEEEEEECCEEEEEEEEC
Confidence 889999999999999999999875
No 16
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.33 E-value=4.5e-11 Score=117.19 Aligned_cols=168 Identities=21% Similarity=0.207 Sum_probs=98.8
Q ss_pred ceEEEEEEEcCCEEEecccccCCC--ceEEEEEcCC--------CcEEEEEEEEEc-------CCCCEEEEEEeeCC-CC
Q 008087 148 SSSSSGFAIGGRRVLTNAHSVEHY--TQVKLKKRGS--------DTKYLATVLAIG-------TECDIAMLTVEDDE-FW 209 (578)
Q Consensus 148 ~~~GsGfvI~~g~ILT~aHvV~~~--~~i~V~~~~~--------g~~~~a~vv~~d-------~~~DlAlLkv~~~~-~~ 209 (578)
...|+|++|++.+|||+|||+.+. ..+.|.+... ...+..+-+..+ ..+|||||+++.+. +.
T Consensus 24 ~~~C~GtlIs~~~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~~ 103 (232)
T cd00190 24 RHFCGGSLISPRWVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTLS 103 (232)
T ss_pred cEEEEEEEeeCCEEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccCC
Confidence 468999999999999999999875 4566665211 122333334444 35899999999763 33
Q ss_pred CCeeeEEcCCC---CCCCCcEEEEeecCCCCc-------ceEEEeEEeceeeeeecC--CceeeeEEEE-----ecCCCC
Q 008087 210 EGVLPVEFGEL---PALQDAVTVVGYPIGGDT-------ISVTSGVVSRIEILSYVH--GSTELLGLQI-----DAAINS 272 (578)
Q Consensus 210 ~~~~~l~l~~~---~~~g~~V~~iG~p~g~~~-------~sv~~G~Is~~~~~~~~~--~~~~~~~i~~-----da~i~~ 272 (578)
..+.|+.|... ...++.+.++||...... ......+++...+..... .......++. ....|.
T Consensus 104 ~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~ 183 (232)
T cd00190 104 DNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDACQ 183 (232)
T ss_pred CcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCcccc
Confidence 45788888764 244789999998764321 111222222222221111 0011122333 335788
Q ss_pred CCccceEEccC---CeEEEEEeccccccccCCceeeecchhHHHHH
Q 008087 273 GNSGGPAFNDK---GKCVGIAFQSLKHEDVENIGYVIPTPVIMHFI 315 (578)
Q Consensus 273 G~SGGPlvn~~---G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l 315 (578)
|+|||||+... +.++||.+....-......+.+..+...++++
T Consensus 184 gdsGgpl~~~~~~~~~lvGI~s~g~~c~~~~~~~~~t~v~~~~~WI 229 (232)
T cd00190 184 GDSGGPLVCNDNGRGVLVGIVSWGSGCARPNYPGVYTRVSSYLDWI 229 (232)
T ss_pred CCCCCcEEEEeCCEEEEEEEEehhhccCCCCCCCEEEEcHHhhHHh
Confidence 99999999864 79999998744211112333444455554444
No 17
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.22 E-value=4e-11 Score=100.46 Aligned_cols=88 Identities=34% Similarity=0.584 Sum_probs=74.0
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ 406 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~ 406 (578)
+|+|+.++.+ ++.....+++.. ..|++|..|.++|||++ ||++||+|++|||++|.++.+ +...+..
T Consensus 1 ~~~G~~~~~~-~~~~~~~~~~~~-~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~----------~~~~l~~ 68 (90)
T cd00987 1 PWLGVTVQDL-TPDLAEELGLKD-TKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVAD----------LRRALAE 68 (90)
T ss_pred CccceEEeEC-CHHHHHHcCCCC-CCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHH----------HHHHHHh
Confidence 5899999998 666666666643 56999999999999998 999999999999999999987 4566666
Q ss_pred cCCCCEEEEEEEECCEEEEEE
Q 008087 407 KYTGDSAAVKVLRDSKILNFN 427 (578)
Q Consensus 407 ~~~g~~v~l~V~R~g~~~~~~ 427 (578)
...++.+.+++.|+|+..++.
T Consensus 69 ~~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 69 LKPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred cCCCCEEEEEEEECCEEEEee
Confidence 556889999999999876654
No 18
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.21 E-value=2.3e-10 Score=112.29 Aligned_cols=147 Identities=22% Similarity=0.237 Sum_probs=90.8
Q ss_pred ceEEEEEEEcCCEEEecccccCCCc--eEEEEEcCCC-------cEEEEEEEEEc-------CCCCEEEEEEeeCC-CCC
Q 008087 148 SSSSSGFAIGGRRVLTNAHSVEHYT--QVKLKKRGSD-------TKYLATVLAIG-------TECDIAMLTVEDDE-FWE 210 (578)
Q Consensus 148 ~~~GsGfvI~~g~ILT~aHvV~~~~--~i~V~~~~~g-------~~~~a~vv~~d-------~~~DlAlLkv~~~~-~~~ 210 (578)
...|+|++|++.+|||+|||+.+.. .+.|.+.... ..+...-+..+ ..+|||||+++.+. +..
T Consensus 25 ~~~C~GtlIs~~~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~~ 104 (229)
T smart00020 25 RHFCGGSLISPRWVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLSD 104 (229)
T ss_pred CcEEEEEEecCCEEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCCC
Confidence 4579999999999999999998753 6777773222 22333434432 45899999998762 334
Q ss_pred CeeeEEcCCC---CCCCCcEEEEeecCCCCc-----ceEE---EeEEeceeeeeecCC--ceeeeEEEE-----ecCCCC
Q 008087 211 GVLPVEFGEL---PALQDAVTVVGYPIGGDT-----ISVT---SGVVSRIEILSYVHG--STELLGLQI-----DAAINS 272 (578)
Q Consensus 211 ~~~~l~l~~~---~~~g~~V~~iG~p~g~~~-----~sv~---~G~Is~~~~~~~~~~--~~~~~~i~~-----da~i~~ 272 (578)
.+.|+.|... ...+..+.++||+..... .... .-+++...+...... ......++. ....++
T Consensus 105 ~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c~ 184 (229)
T smart00020 105 NVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDACQ 184 (229)
T ss_pred ceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCcccC
Confidence 5788888763 345788999998765320 0111 122222111111000 001112222 345788
Q ss_pred CCccceEEccCC--eEEEEEeccc
Q 008087 273 GNSGGPAFNDKG--KCVGIAFQSL 294 (578)
Q Consensus 273 G~SGGPlvn~~G--~vVGI~~~~~ 294 (578)
|+|||||+...+ .++||++...
T Consensus 185 gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 185 GDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCCCeeEEECCCEEEEEEEEECC
Confidence 999999998654 9999998743
No 19
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.08 E-value=5.8e-10 Score=91.31 Aligned_cols=72 Identities=28% Similarity=0.371 Sum_probs=63.3
Q ss_pred CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 352 QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
..|++|..|.++|||+.||++||+|++|||+++.+|.+ +..++.....|+.+.+++.|+|+..++++++.
T Consensus 7 ~~Gv~V~~V~~~s~A~~gL~~GD~I~~Ing~~v~~~~~----------~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~ 76 (79)
T cd00986 7 YHGVYVTSVVEGMPAAGKLKAGDHIIAVDGKPFKEAEE----------LIDYIQSKKEGDTVKLKVKREEKELPEDLILK 76 (79)
T ss_pred ecCEEEEEECCCCchhhCCCCCCEEEEECCEECCCHHH----------HHHHHHhCCCCCEEEEEEEECCEEEEEEEEEe
Confidence 35899999999999988999999999999999999987 55667655678899999999999999999887
Q ss_pred cc
Q 008087 432 TH 433 (578)
Q Consensus 432 ~~ 433 (578)
.+
T Consensus 77 ~~ 78 (79)
T cd00986 77 TF 78 (79)
T ss_pred cc
Confidence 54
No 20
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.05 E-value=7.6e-10 Score=90.66 Aligned_cols=68 Identities=28% Similarity=0.317 Sum_probs=59.8
Q ss_pred CCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087 352 QKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT 429 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 429 (578)
..|++|..|.++|||++ ||++||+|++|||+++.+|.+ |...+.....|+.+.+++.|+|+..+++++
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d----------~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLED----------FMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CCcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHH----------HHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 46999999999999998 999999999999999999988 556676655688999999999998887764
No 21
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.03 E-value=6.9e-10 Score=90.93 Aligned_cols=77 Identities=22% Similarity=0.384 Sum_probs=63.2
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ 406 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~ 406 (578)
+|+|+.+..- ..|++|..|.++|||++ ||++||+|++|||+++.+|.+ ++..
T Consensus 1 ~~~G~~~~~~--------------~~~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~-------------~l~~ 53 (80)
T cd00990 1 PYLGLTLDKE--------------EGLGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQD-------------RLKE 53 (80)
T ss_pred CcccEEEEcc--------------CCcEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHH-------------HHHh
Confidence 5778777532 34799999999999999 999999999999999998654 3444
Q ss_pred cCCCCEEEEEEEECCEEEEEEEEec
Q 008087 407 KYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 407 ~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
...++.+.+++.|+|+..++.+++.
T Consensus 54 ~~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 54 YQAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred cCCCCEEEEEEEECCEEEEEEEEec
Confidence 4567899999999999988887764
No 22
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.00 E-value=9.4e-10 Score=110.27 Aligned_cols=100 Identities=15% Similarity=0.183 Sum_probs=85.5
Q ss_pred hhHHHHHHHHHHcCceeccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC
Q 008087 309 PVIMHFIQDYEKNGAYTGFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND 387 (578)
Q Consensus 309 ~~i~~~l~~l~~~g~v~~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~ 387 (578)
..+.++++++.+++.+. ++|+|+..... + ....|++|..+.++++|++ |||+||+|++|||+++.++
T Consensus 159 ~~~~~v~~~l~~~g~~~-~~~lgi~p~~~-~----------g~~~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~ 226 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKM-FDYIRLSPVMK-N----------DKLEGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDP 226 (259)
T ss_pred hhHHHHHHHHHHCHHhh-hheEeEEEEEe-C----------CceeEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCH
Confidence 45678899999999888 89999987543 1 1245999999999999999 9999999999999999999
Q ss_pred CCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Q 008087 388 GTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITL 430 (578)
Q Consensus 388 ~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l 430 (578)
.+ +..++.....++.+.|+|+|+|+.+++.+.+
T Consensus 227 ~~----------~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 227 EQ----------AFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred HH----------HHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 88 5577777778899999999999998887753
No 23
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.82 E-value=1.2e-08 Score=83.15 Aligned_cols=64 Identities=23% Similarity=0.360 Sum_probs=54.5
Q ss_pred CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087 354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI 428 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v 428 (578)
.++|..|.++|+|++ ||++||+|++|||+++.+|.+ +...+... .++.+.+++.|+|+..++.+
T Consensus 13 ~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~----------~~~~l~~~-~~~~~~l~v~r~~~~~~~~l 77 (79)
T cd00989 13 EPVIGEVVPGSPAAKAGLKAGDRILAINGQKIKSWED----------LVDAVQEN-PGKPLTLTVERNGETITLTL 77 (79)
T ss_pred CcEEEeECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHHC-CCceEEEEEEECCEEEEEEe
Confidence 588999999999998 999999999999999999987 44555554 37789999999998776665
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.81 E-value=1e-08 Score=111.15 Aligned_cols=90 Identities=24% Similarity=0.477 Sum_probs=79.4
Q ss_pred cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHh
Q 008087 327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVS 405 (578)
Q Consensus 327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~ 405 (578)
+.|+|+.+..+ ++..++.++++....|++|..|.++|||++ ||++||+|++|||++|.++.+ |..++.
T Consensus 337 ~~~lGi~~~~l-~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d----------~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANL-SPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQPGDVILSVNQQPVSSVAE----------LRKVLD 405 (428)
T ss_pred ccccceEEecC-CHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHH
Confidence 46899999998 888888899887567999999999999999 999999999999999999988 567777
Q ss_pred ccCCCCEEEEEEEECCEEEEEE
Q 008087 406 QKYTGDSAAVKVLRDSKILNFN 427 (578)
Q Consensus 406 ~~~~g~~v~l~V~R~g~~~~~~ 427 (578)
....++.+.|+|+|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 7667899999999999977654
No 25
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.75 E-value=9.4e-08 Score=94.05 Aligned_cols=162 Identities=22% Similarity=0.248 Sum_probs=93.4
Q ss_pred ceEEEEEEEcCCEEEecccccCCCc----eEEEEEc---CCCc-EE--EEEEEEEc-C---CCCEEEEEEeeCCCC----
Q 008087 148 SSSSSGFAIGGRRVLTNAHSVEHYT----QVKLKKR---GSDT-KY--LATVLAIG-T---ECDIAMLTVEDDEFW---- 209 (578)
Q Consensus 148 ~~~GsGfvI~~g~ILT~aHvV~~~~----~i~V~~~---~~g~-~~--~a~vv~~d-~---~~DlAlLkv~~~~~~---- 209 (578)
...|++|+|+++.+||++||+.... ++.+... +++. .+ ......+. . ..|.+...+....+.
T Consensus 63 ~~~~~~~lI~pntvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~ 142 (251)
T COG3591 63 RLCTAATLIGPNTVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGIN 142 (251)
T ss_pred cceeeEEEEcCceEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCC
Confidence 3456779999999999999997533 1111110 1111 11 11111112 2 345555555443221
Q ss_pred --CCee--eEEcCCCCCCCCcEEEEeecCCCCc---ceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEcc
Q 008087 210 --EGVL--PVEFGELPALQDAVTVVGYPIGGDT---ISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFND 282 (578)
Q Consensus 210 --~~~~--~l~l~~~~~~g~~V~~iG~p~g~~~---~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~ 282 (578)
.... ...+....++++.+.++|||..... .-...+.|..... ..++.++.+.+|+||+||++.
T Consensus 143 ~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~~----------~~l~y~~dT~pG~SGSpv~~~ 212 (251)
T COG3591 143 IGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIKG----------NKLFYDADTLPGSSGSPVLIS 212 (251)
T ss_pred ccccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEec----------ceEEEEecccCCCCCCceEec
Confidence 1111 2223334467888999999987441 1222333333221 257788889999999999999
Q ss_pred CCeEEEEEeccccccccCCcee-eecchhHHHHHHHHH
Q 008087 283 KGKCVGIAFQSLKHEDVENIGY-VIPTPVIMHFIQDYE 319 (578)
Q Consensus 283 ~G~vVGI~~~~~~~~~~~~~~~-aiPi~~i~~~l~~l~ 319 (578)
+.++||+++.+....+....++ +.-...++.+++++.
T Consensus 213 ~~~vigv~~~g~~~~~~~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 213 KDEVIGVHYNGPGANGGSLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CceEEEEEecCCCcccccccCcceEecHHHHHHHHHhh
Confidence 8899999998654222233343 445667778877764
No 26
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73 E-value=4e-08 Score=81.35 Aligned_cols=67 Identities=22% Similarity=0.346 Sum_probs=56.0
Q ss_pred CCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC--CCCccccCcchhHHHHHhccCCCCEEEEEEEEC-CEEEEEE
Q 008087 352 QKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND--GTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD-SKILNFN 427 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~-g~~~~~~ 427 (578)
..+++|..|.++|||++ ||++||+|++|||+++.+| .+ +...+.. ..|+.+.+++.|+ |+..+++
T Consensus 12 ~~~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~----------~~~~l~~-~~~~~i~l~v~r~~~~~~~~~ 80 (85)
T cd00988 12 DGGLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLED----------VVKLLRG-KAGTKVRLTLKRGDGEPREVT 80 (85)
T ss_pred CCeEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHH----------HHHHhcC-CCCCEEEEEEEcCCCCEEEEE
Confidence 35899999999999999 9999999999999999998 55 3344443 3578999999998 8887777
Q ss_pred EE
Q 008087 428 IT 429 (578)
Q Consensus 428 v~ 429 (578)
+.
T Consensus 81 ~~ 82 (85)
T cd00988 81 LT 82 (85)
T ss_pred EE
Confidence 65
No 27
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.53 E-value=1.9e-07 Score=74.16 Aligned_cols=54 Identities=28% Similarity=0.528 Sum_probs=45.5
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC--CCCccccCcchhHHHHHhccCCCCEEEEEE
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND--GTVPFRHGERIGFSYLVSQKYTGDSAAVKV 417 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~--~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V 417 (578)
.|++|..|.++|||+. ||++||+|++|||+++.+| .+ +..++.... |+.++|+|
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~----------~~~~l~~~~-g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLED----------VAELLKKEV-GEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHH----------HHHHHhhCC-CCeEEEEE
Confidence 3899999999999999 9999999999999999999 55 445665543 78888876
No 28
>PF12812 PDZ_1: PDZ-like domain
Probab=98.52 E-value=3e-07 Score=74.67 Aligned_cols=70 Identities=14% Similarity=0.142 Sum_probs=56.8
Q ss_pred CCceeeccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee---cCCcCCCCCCCEEEEeCCeecC
Q 008087 446 PSYYIIAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC---LRSPLCLNCFNKVLAFNGNPVK 521 (578)
Q Consensus 446 p~~~~~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v---~A~~aGl~~GD~I~~VNG~~V~ 521 (578)
-+|+.|+|..|+++ +.....++..+ ++++++.. ++...++..|.+|.+|||++++
T Consensus 5 ~r~v~~~Ga~f~~Ls~q~aR~~~~~~---------------------~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~ 63 (78)
T PF12812_consen 5 SRFVEVCGAVFHDLSYQQARQYGIPV---------------------GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTP 63 (78)
T ss_pred CEEEEEcCeecccCCHHHHHHhCCCC---------------------CEEEEEecCCChhhhCCCCCCeEEEeECCcCCc
Confidence 37889999999999 66666777333 36776643 5555669999999999999999
Q ss_pred CHHHHHHHHHhcCCC
Q 008087 522 NLKSLANMVENCDDE 536 (578)
Q Consensus 522 ~~~~l~~~l~~~~~~ 536 (578)
++++|++++++.++.
T Consensus 64 ~Ld~f~~vvk~ipd~ 78 (78)
T PF12812_consen 64 DLDDFIKVVKKIPDN 78 (78)
T ss_pred CHHHHHHHHHhCCCC
Confidence 999999999998863
No 29
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.51 E-value=4.7e-07 Score=97.88 Aligned_cols=151 Identities=16% Similarity=0.218 Sum_probs=98.0
Q ss_pred CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087 352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT 429 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 429 (578)
.+-++|+.|.+.+.|++ .|++||.|+.|||.+|.....-. .-.++........|.|+|.|.-..
T Consensus 673 ~qpi~iG~Iv~lGaAe~DGRL~~gDElv~iDG~pV~GksH~~--------vv~Lm~~AArnghV~LtVRRkv~~------ 738 (984)
T KOG3209|consen 673 GQPIYIGAIVPLGAAEEDGRLREGDELVCIDGIPVEGKSHSE--------VVDLMEAAARNGHVNLTVRRKVRT------ 738 (984)
T ss_pred CCeeEEeeeeecccccccCcccCCCeEEEecCeeccCccHHH--------HHHHHHHHHhcCceEEEEeeeeee------
Confidence 34588999999999998 59999999999999999876521 224454444556789999883110
Q ss_pred ecccccccCCCCCCCCCCcee------eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCC
Q 008087 430 LATHRRLIPSHNKGRPPSYYI------IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRS 502 (578)
Q Consensus 430 l~~~~~~~~~~~~~~~p~~~~------~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~ 502 (578)
.. ....+.......+.|-. -.||+|.-+ ..-+. +..-|.++.+++|+
T Consensus 739 -~~-~~rsp~~s~~~~~~yDV~lhR~ENeGFGFVi~sS~~kp------------------------~sgiGrIieGSPAd 792 (984)
T KOG3209|consen 739 -GP-ARRSPRNSAAPSGPYDVVLHRKENEGFGFVIMSSQNKP------------------------ESGIGRIIEGSPAD 792 (984)
T ss_pred -cc-ccCCcccccCCCCCeeeEEecccCCceeEEEEecccCC------------------------CCCccccccCChhH
Confidence 00 01111111111112211 236666543 11110 11125788999999
Q ss_pred cCC-CCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEe
Q 008087 503 PLC-LNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLE 543 (578)
Q Consensus 503 ~aG-l~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~ 543 (578)
+.| |+.||+|++|||+.|-++ .+.+++|+.+ +-.|+|++.
T Consensus 793 RCgkLkVGDrilAVNG~sI~~lsHadiv~LIKda-GlsVtLtIi 835 (984)
T KOG3209|consen 793 RCGKLKVGDRILAVNGQSILNLSHADIVSLIKDA-GLSVTLTII 835 (984)
T ss_pred hhccccccceEEEecCeeeeccCchhHHHHHHhc-CceEEEEEc
Confidence 998 589999999999999865 6789999874 667888874
No 30
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=98.46 E-value=3.2e-07 Score=75.60 Aligned_cols=57 Identities=18% Similarity=0.194 Sum_probs=48.1
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~ 549 (578)
++++.. ++|+++||++||+|++|||++|.++.+|.+++.+ .+++.+.|++.|+++..
T Consensus 15 g~~V~~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~ 76 (82)
T PF13180_consen 15 GVVVVSVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEEL 76 (82)
T ss_dssp SEEEEEESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEE
T ss_pred eEEEEEeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEE
Confidence 455444 5899999999999999999999999999999965 46889999999987764
No 31
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.44 E-value=8.6e-07 Score=72.50 Aligned_cols=58 Identities=10% Similarity=0.169 Sum_probs=49.3
Q ss_pred ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEecCeEEE
Q 008087 492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEYDQVVV 549 (578)
Q Consensus 492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R~~~~~ 549 (578)
.++++.. ++|+++||++||+|++|||++|.+|.+|.+.+... +++.+.+++.|+++..
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~ 72 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKL 72 (79)
T ss_pred CcEEEEEECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEE
Confidence 3565554 57999999999999999999999999999999986 4778999999987543
No 32
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.43 E-value=1.7e-06 Score=72.09 Aligned_cols=79 Identities=15% Similarity=0.137 Sum_probs=61.3
Q ss_pred eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHH
Q 008087 451 IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKS 525 (578)
Q Consensus 451 ~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~ 525 (578)
+.|+.++++ +.....++. ....++++..+ +|+++||++||+|++|||++|.++.+
T Consensus 2 ~~G~~~~~~~~~~~~~~~~--------------------~~~~g~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~ 61 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGL--------------------KDTKGVLVASVDPGSPAAKAGLKPGDVILAVNGKPVKSVAD 61 (90)
T ss_pred ccceEEeECCHHHHHHcCC--------------------CCCCEEEEEEECCCCHHHHcCCCcCCEEEEECCEECCCHHH
Confidence 568888888 665544331 01246666654 78889999999999999999999999
Q ss_pred HHHHHHhcC-CCeEEEEEecCeEEE
Q 008087 526 LANMVENCD-DEFLKFDLEYDQVVV 549 (578)
Q Consensus 526 l~~~l~~~~-~~~v~l~v~R~~~~~ 549 (578)
+.+++.... ++.+.|.+.|+++.+
T Consensus 62 ~~~~l~~~~~~~~i~l~v~r~g~~~ 86 (90)
T cd00987 62 LRRALAELKPGDKVTLTVLRGGKEL 86 (90)
T ss_pred HHHHHHhcCCCCEEEEEEEECCEEE
Confidence 999998764 778999999988654
No 33
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.34 E-value=1.6e-06 Score=71.32 Aligned_cols=59 Identities=27% Similarity=0.402 Sum_probs=47.2
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECC
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDS 421 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g 421 (578)
.|++|..|.++|||++ ||++||+|++|||+.+.++.+.. ........++.+.|++.|++
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~----------~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLE----------AVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHH----------HHHHHHhCCCeEEEEEEeCC
Confidence 5899999999999999 99999999999999999886632 22222234568899998864
No 34
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.30 E-value=1.5e-06 Score=93.86 Aligned_cols=69 Identities=26% Similarity=0.360 Sum_probs=60.2
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
.|++|..|.++|||++ |||+||+|++|||++|.+|.+ +...+.. ..++.+.+++.|+|+..++++++.
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 271 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKEGDYIQSINGEKLRSWTD----------FVSAVKE-NPGKSMDIKVERNGETLSISLTPE 271 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHh-CCCCceEEEEEECCEEEEEEEEEc
Confidence 4799999999999999 999999999999999999988 4456655 467889999999999988888875
Q ss_pred c
Q 008087 432 T 432 (578)
Q Consensus 432 ~ 432 (578)
.
T Consensus 272 ~ 272 (420)
T TIGR00054 272 A 272 (420)
T ss_pred C
Confidence 3
No 35
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=98.25 E-value=6.5e-06 Score=87.58 Aligned_cols=61 Identities=25% Similarity=0.401 Sum_probs=46.8
Q ss_pred CCCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087 352 QKGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK 422 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~ 422 (578)
.+-++|..|.||+||+..||.||.|+.|||....+.... | .+-.....|+...++|.|-.+
T Consensus 39 etSiViSDVlpGGPAeG~LQenDrvvMVNGvsMenv~ha---------F-AvQqLrksgK~A~ItvkRprk 99 (1027)
T KOG3580|consen 39 ETSIVISDVLPGGPAEGLLQENDRVVMVNGVSMENVLHA---------F-AVQQLRKSGKVAAITVKRPRK 99 (1027)
T ss_pred ceeEEEeeccCCCCcccccccCCeEEEEcCcchhhhHHH---------H-HHHHHHhhccceeEEecccce
Confidence 456899999999999999999999999999999887552 1 111123467788889887544
No 36
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.23 E-value=5e-05 Score=76.15 Aligned_cols=158 Identities=23% Similarity=0.256 Sum_probs=92.8
Q ss_pred CCCCccccCCCcc----eEEEEEEEcCCEEEecccccCCCc--eEEEEEcC--------CC---cEE-EEEEEEEcC---
Q 008087 136 NFSLPWQRKRQYS----SSSSGFAIGGRRVLTNAHSVEHYT--QVKLKKRG--------SD---TKY-LATVLAIGT--- 194 (578)
Q Consensus 136 ~~~~p~~~~~~~~----~~GsGfvI~~g~ILT~aHvV~~~~--~i~V~~~~--------~g---~~~-~a~vv~~d~--- 194 (578)
...+||+...... ..|.|.+|++.||||+|||+.+.. .+.|.+.. .+ ... ..+++ .++
T Consensus 21 ~~~~Pw~~~l~~~~~~~~~Cggsli~~~~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~ 99 (256)
T KOG3627|consen 21 PGSFPWQVSLQYGGNGRHLCGGSLISPRWVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYN 99 (256)
T ss_pred CCCCCCEEEEEECCCcceeeeeEEeeCCEEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCC
Confidence 3467787654433 278888888889999999999876 66666621 01 111 11233 221
Q ss_pred ----C-CCEEEEEEeeC-CCCCCeeeEEcCCCC----CCC-CcEEEEeecCCC----C-c---ceEEEeEEeceeeeeec
Q 008087 195 ----E-CDIAMLTVEDD-EFWEGVLPVEFGELP----ALQ-DAVTVVGYPIGG----D-T---ISVTSGVVSRIEILSYV 255 (578)
Q Consensus 195 ----~-~DlAlLkv~~~-~~~~~~~~l~l~~~~----~~g-~~V~~iG~p~g~----~-~---~sv~~G~Is~~~~~~~~ 255 (578)
. +|||||+++.+ .|...+.|+.|.... ..+ ..+++.||.... . . ..+...+++...+....
T Consensus 100 ~~~~~~nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~ 179 (256)
T KOG3627|consen 100 PRTLENNDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAY 179 (256)
T ss_pred CCCCCCCCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccc
Confidence 3 79999999975 455678888876322 223 778888865321 1 1 11122333332222111
Q ss_pred CCc--eeeeEEEEe-----cCCCCCCccceEEccC---CeEEEEEeccc
Q 008087 256 HGS--TELLGLQID-----AAINSGNSGGPAFNDK---GKCVGIAFQSL 294 (578)
Q Consensus 256 ~~~--~~~~~i~~d-----a~i~~G~SGGPlvn~~---G~vVGI~~~~~ 294 (578)
... .....++.. ...|.|+|||||+..+ ..++||++.+.
T Consensus 180 ~~~~~~~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~ 228 (256)
T KOG3627|consen 180 GGLGTITDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGS 228 (256)
T ss_pred cCccccCCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecC
Confidence 110 001124443 2468899999999875 69999998854
No 37
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.22 E-value=2.7e-06 Score=92.67 Aligned_cols=67 Identities=25% Similarity=0.360 Sum_probs=59.2
Q ss_pred CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
+++|..|.++|||++ ||++||+|++|||++|.+|.+ +...+.. ..++.+.++|.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~d----------l~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQAGDRIVKVDGQPLTQWQT----------FVTLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 588999999999999 999999999999999999988 4556655 467889999999999988888875
No 38
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.16 E-value=3.6e-06 Score=68.44 Aligned_cols=52 Identities=19% Similarity=0.245 Sum_probs=46.0
Q ss_pred EeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEE
Q 008087 497 LWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVV 548 (578)
Q Consensus 497 s~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~ 548 (578)
.+++|+++||++||+|++|||+++.+++++...+.+..++.+.+++.|+++.
T Consensus 21 ~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~~~~~~~l~v~r~~~~ 72 (79)
T cd00989 21 PGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQENPGKPLTLTVERNGET 72 (79)
T ss_pred CCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHCCCceEEEEEEECCEE
Confidence 3457888999999999999999999999999999987777899999888754
No 39
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.11 E-value=5.1e-06 Score=87.08 Aligned_cols=72 Identities=21% Similarity=0.276 Sum_probs=56.0
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
.+++|..|.++|||++ ||++||+|++|||++|.+|..-. +..++. ...|..+.++|.|+|+..++++++.
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~--------~~~~l~-~~~g~~v~l~v~R~g~~~~~~v~l~ 132 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDD--------AVALIR-GKKGTKVSLEILRAGKSKPLTFTLK 132 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHHH--------HHHhcc-CCCCCEEEEEEEeCCCCceEEEEEE
Confidence 4799999999999999 99999999999999999884100 222332 2468899999999987777666665
Q ss_pred cc
Q 008087 432 TH 433 (578)
Q Consensus 432 ~~ 433 (578)
..
T Consensus 133 ~~ 134 (334)
T TIGR00225 133 RD 134 (334)
T ss_pred EE
Confidence 43
No 40
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.10 E-value=4e-06 Score=68.77 Aligned_cols=72 Identities=24% Similarity=0.309 Sum_probs=51.9
Q ss_pred cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHh
Q 008087 327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVS 405 (578)
Q Consensus 327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~ 405 (578)
...||+.+.... .. ...+++|..|.++|+|+. ||+.||+|++|||+++.++.... ...++.
T Consensus 9 ~~~lG~~l~~~~-~~---------~~~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~--------~~~~l~ 70 (81)
T PF00595_consen 9 NGPLGFTLRGGS-DN---------DEKGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDE--------VVQLLK 70 (81)
T ss_dssp TSBSSEEEEEES-TS---------SSEEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHH--------HHHHHH
T ss_pred CCCcCEEEEecC-CC---------CcCCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHH--------HHHHHH
Confidence 456788877541 10 024899999999999999 99999999999999999986521 223333
Q ss_pred ccCCCCEEEEEEE
Q 008087 406 QKYTGDSAAVKVL 418 (578)
Q Consensus 406 ~~~~g~~v~l~V~ 418 (578)
. .+..++|+|+
T Consensus 71 ~--~~~~v~L~V~ 81 (81)
T PF00595_consen 71 S--ASNPVTLTVQ 81 (81)
T ss_dssp H--STSEEEEEEE
T ss_pred C--CCCcEEEEEC
Confidence 3 3447888764
No 41
>PRK10139 serine endoprotease; Provisional
Probab=98.09 E-value=6.3e-06 Score=89.71 Aligned_cols=64 Identities=17% Similarity=0.345 Sum_probs=55.8
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI 428 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v 428 (578)
.|++|..|.++|||++ ||++||+|++|||++|.+|.+ |...+... .+.+.|+|+|+|+.+.+.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~----------~~~~l~~~--~~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAE----------MRKVLAAK--PAIIALQIVRGNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhC--CCeEEEEEEECCEEEEEEe
Confidence 5899999999999999 999999999999999999988 55677653 3689999999999876654
No 42
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.07 E-value=1.7e-05 Score=64.66 Aligned_cols=56 Identities=13% Similarity=0.236 Sum_probs=46.1
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~ 549 (578)
|+++.. ++|+. ||++||+|++|||+++.+|++|.+++.. .++..+.|++.|+++..
T Consensus 9 Gv~V~~V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~ 69 (79)
T cd00986 9 GVYVTSVVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKEL 69 (79)
T ss_pred CEEEEEECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEE
Confidence 455554 45665 7999999999999999999999999986 46778999999987654
No 43
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.07 E-value=5.2e-06 Score=87.92 Aligned_cols=61 Identities=20% Similarity=0.317 Sum_probs=50.5
Q ss_pred EEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE-ECCEEEEEEEEec
Q 008087 357 IRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL-RDSKILNFNITLA 431 (578)
Q Consensus 357 V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~-R~g~~~~~~v~l~ 431 (578)
|..|.|+|+|++ ||++||+|++|||++|.+|.++ ...+ .++.+.++|. |+|+..++++...
T Consensus 2 I~~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~----------~~~l----~~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEPGDALVSINGVAPRDLIDY----------QFLC----ADEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred cCCcCCCCHHHHcCCCCCCEEEEECCEECCCHHHH----------HHHh----cCCcEEEEEEcCCCeEEEEEEecC
Confidence 677999999999 9999999999999999999884 3334 2467889997 8998888877654
No 44
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=98.06 E-value=2.9e-05 Score=84.46 Aligned_cols=53 Identities=26% Similarity=0.351 Sum_probs=43.3
Q ss_pred EEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087 357 IRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR 419 (578)
Q Consensus 357 V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 419 (578)
|++|.+||||++ .|+.||.|++|||+.|.+..... .-.++. ..|-+|+|+|.-
T Consensus 782 iGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHad--------iv~LIK--daGlsVtLtIip 836 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHAD--------IVSLIK--DAGLSVTLTIIP 836 (984)
T ss_pred ccccccCChhHhhccccccceEEEecCeeeeccCchh--------HHHHHH--hcCceEEEEEcC
Confidence 789999999999 59999999999999999887632 223443 468899999875
No 45
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.98 E-value=2.1e-05 Score=64.91 Aligned_cols=55 Identities=11% Similarity=0.181 Sum_probs=47.0
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecC-eE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYD-QV 547 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~-~~ 547 (578)
++++.. ++|+++||++||+|++|||+++.+| .++..+++...++.+.|++.|+ +.
T Consensus 14 ~~~V~~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~~~~~~i~l~v~r~~~~ 75 (85)
T cd00988 14 GLVITSVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRGKAGTKVRLTLKRGDGE 75 (85)
T ss_pred eEEEEEecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcCCCCCEEEEEEEcCCCC
Confidence 455544 4788999999999999999999999 9999999887788899999987 53
No 46
>PRK10942 serine endoprotease; Provisional
Probab=97.96 E-value=1.6e-05 Score=87.04 Aligned_cols=64 Identities=22% Similarity=0.375 Sum_probs=55.4
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEE
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNI 428 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v 428 (578)
.|++|..|.++|+|++ ||++||+|++|||++|.+|.+ |...+... ++.+.|+|+|+|+.+.+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~d----------l~~~l~~~--~~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKKGDVIIGANQQPVKNIAE----------LRKILDSK--PSVLALNIQRGDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCCCCEEEEECCEEcCCHHH----------HHHHHHhC--CCeEEEEEEECCEEEEEEe
Confidence 5899999999999999 999999999999999999988 55666652 3689999999998876654
No 47
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.94 E-value=1.9e-05 Score=64.58 Aligned_cols=49 Identities=24% Similarity=0.406 Sum_probs=39.7
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCC
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAND 387 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~ 387 (578)
..+|+.+... ... ..|++|..|.++|||++ ||++||+|++|||+++.++
T Consensus 12 ~~~G~~~~~~-~~~----------~~~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~ 61 (82)
T cd00992 12 GGLGFSLRGG-KDS----------GGGIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGL 61 (82)
T ss_pred CCcCEEEeCc-ccC----------CCCeEEEEECCCChHHhCCCCCCCEEEEECCEEcCcc
Confidence 4577777644 110 24899999999999999 9999999999999999943
No 48
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.94 E-value=2.4e-05 Score=83.66 Aligned_cols=68 Identities=19% Similarity=0.295 Sum_probs=54.1
Q ss_pred CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Q 008087 354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITL 430 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l 430 (578)
|++|..|.++|||++ ||++||+|++|||++|.++... .+..++. ...|..+.|+|.|+|+..+++++-
T Consensus 103 g~~V~~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~--------~~~~~l~-g~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 103 GLVVVAPAPGGPAARAGIRPGDVILAIDGTSTEGLSLY--------EAADRLQ-GPEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred cEEEEEeCCCChHHHcCCCCCCEEEEECCEECCCCCHH--------HHHHHHh-cCCCCEEEEEEEECCEEEEEEEEe
Confidence 799999999999999 9999999999999999876321 0223343 346889999999999877766543
No 49
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=97.93 E-value=3.5e-05 Score=81.29 Aligned_cols=79 Identities=8% Similarity=0.013 Sum_probs=62.2
Q ss_pred eccEEEeeh-HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHH
Q 008087 451 IAGFVFSRC-LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKS 525 (578)
Q Consensus 451 ~~Gl~~~~~-p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~ 525 (578)
|.|+.+.++ +.....++.+ ...++++.. ++|+++||++||+|++|||++|.++++
T Consensus 256 ~lGv~~~~~~~~~~~~lgl~--------------------~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~d 315 (351)
T TIGR02038 256 YIGVSGEDINSVVAQGLGLP--------------------DLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEE 315 (351)
T ss_pred EeeeEEEECCHHHHHhcCCC--------------------ccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHH
Confidence 567778777 6666666621 123566654 478999999999999999999999999
Q ss_pred HHHHHHh-cCCCeEEEEEecCeEEE
Q 008087 526 LANMVEN-CDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 526 l~~~l~~-~~~~~v~l~v~R~~~~~ 549 (578)
|.+.+++ .+++.+.|++.|+++..
T Consensus 316 l~~~l~~~~~g~~v~l~v~R~g~~~ 340 (351)
T TIGR02038 316 LMDRIAETRPGSKVMVTVLRQGKQL 340 (351)
T ss_pred HHHHHHhcCCCCEEEEEEEECCEEE
Confidence 9999987 46778999999987654
No 50
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.93 E-value=1.9e-05 Score=83.13 Aligned_cols=68 Identities=25% Similarity=0.374 Sum_probs=55.9
Q ss_pred CCCcEEEEeC--------CCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087 352 QKGVRIRRVD--------PTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK 422 (578)
Q Consensus 352 ~~Gv~V~~V~--------p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~ 422 (578)
..|++|.... .+|||++ |||+||+|++|||++|.+|.+ +.+++... .++.+.++|.|+|+
T Consensus 104 t~GVlVvg~~~v~~~~g~~~SPAa~AGLq~GDiIvsING~~V~s~~D----------L~~iL~~~-~g~~V~LtV~R~Ge 172 (402)
T TIGR02860 104 TKGVLVVGFSDIETEKGKIHSPGEEAGIQIGDRILKINGEKIKNMDD----------LANLINKA-GGEKLTLTIERGGK 172 (402)
T ss_pred cCEEEEEEEEcccccCCCCCCHHHHcCCCCCCEEEEECCEECCCHHH----------HHHHHHhC-CCCeEEEEEEECCE
Confidence 4588886542 2589998 999999999999999999998 55666665 47899999999999
Q ss_pred EEEEEEEe
Q 008087 423 ILNFNITL 430 (578)
Q Consensus 423 ~~~~~v~l 430 (578)
..++.+..
T Consensus 173 ~~tv~V~P 180 (402)
T TIGR02860 173 IIETVIKP 180 (402)
T ss_pred EEEEEEEE
Confidence 88888763
No 51
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.91 E-value=6.4e-05 Score=62.40 Aligned_cols=65 Identities=23% Similarity=0.352 Sum_probs=43.0
Q ss_pred CCCcEEEEeCCC--------CcccC-C--CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC
Q 008087 352 QKGVRIRRVDPT--------APESE-V--LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD 420 (578)
Q Consensus 352 ~~Gv~V~~V~p~--------spA~~-G--L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~ 420 (578)
..+..|.+|.++ ||..+ | +++||+|++|||+++....+ +..++.. ..|+.+.|+|.+.
T Consensus 11 ~~~y~I~~I~~gd~~~~~~~sPL~~pGv~v~~GD~I~aInG~~v~~~~~----------~~~lL~~-~agk~V~Ltv~~~ 79 (88)
T PF14685_consen 11 NGGYRIARIYPGDPWNPNARSPLAQPGVDVREGDYILAINGQPVTADAN----------PYRLLEG-KAGKQVLLTVNRK 79 (88)
T ss_dssp TTEEEEEEE-BS-TTSSS-B-GGGGGS----TT-EEEEETTEE-BTTB-----------HHHHHHT-TTTSEEEEEEE-S
T ss_pred CCEEEEEEEeCCCCCCccccCCccCCCCCCCCCCEEEEECCEECCCCCC----------HHHHhcc-cCCCEEEEEEecC
Confidence 357888888875 77777 6 56999999999999998777 3345544 4789999999986
Q ss_pred C-EEEEEE
Q 008087 421 S-KILNFN 427 (578)
Q Consensus 421 g-~~~~~~ 427 (578)
+ +.+++.
T Consensus 80 ~~~~R~v~ 87 (88)
T PF14685_consen 80 PGGARTVV 87 (88)
T ss_dssp TT-EEEEE
T ss_pred CCCceEEE
Confidence 5 455554
No 52
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=97.88 E-value=3.1e-05 Score=61.34 Aligned_cols=50 Identities=24% Similarity=0.259 Sum_probs=43.4
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v 542 (578)
++++.. ++|+.+||++||+|++|||+++.++ +++.++++...++.+.|++
T Consensus 14 ~~~V~~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 14 GVVVLSVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKEVGEKVTLTV 69 (70)
T ss_pred CEEEEEeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhCCCCeEEEEE
Confidence 455544 5788899999999999999999999 9999999998888888876
No 53
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=97.88 E-value=0.0011 Score=65.17 Aligned_cols=135 Identities=21% Similarity=0.290 Sum_probs=69.2
Q ss_pred CCEEEecccccCC-CceEEEEEcCCCcEEEEE-----EEEEcCCCCEEEEEEeeCCCCCCeeeEEcC---CCCCCCCcEE
Q 008087 158 GRRVLTNAHSVEH-YTQVKLKKRGSDTKYLAT-----VLAIGTECDIAMLTVEDDEFWEGVLPVEFG---ELPALQDAVT 228 (578)
Q Consensus 158 ~g~ILT~aHvV~~-~~~i~V~~~~~g~~~~a~-----vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~---~~~~~g~~V~ 228 (578)
..||+||+|.... ...+.|.. ..-.|... -+..-...||.++++..+ ++|.+-. ..+..++.|.
T Consensus 40 G~~iItn~HLf~~nng~L~i~s--~hG~f~v~nt~~lkv~~i~~~DiviirmPkD-----fpPf~~kl~FR~P~~~e~v~ 112 (235)
T PF00863_consen 40 GSYIITNAHLFKRNNGELTIKS--QHGEFTVPNTTQLKVHPIEGRDIVIIRMPKD-----FPPFPQKLKFRAPKEGERVC 112 (235)
T ss_dssp TTEEEEEGGGGSSTTCEEEEEE--TTEEEEECEGGGSEEEE-TCSSEEEEE--TT-----S----S---B----TT-EEE
T ss_pred CCEEEEChhhhccCCCeEEEEe--CceEEEcCCccccceEEeCCccEEEEeCCcc-----cCCcchhhhccCCCCCCEEE
Confidence 6799999999964 34567766 22233322 133446899999999875 4443221 2567799999
Q ss_pred EEeecCCCCcceEEEeEEeceeeeee-cCCceeeeEEEEecCCCCCCccceEEcc-CCeEEEEEeccccccccCCceeee
Q 008087 229 VVGYPIGGDTISVTSGVVSRIEILSY-VHGSTELLGLQIDAAINSGNSGGPAFND-KGKCVGIAFQSLKHEDVENIGYVI 306 (578)
Q Consensus 229 ~iG~p~g~~~~sv~~G~Is~~~~~~~-~~~~~~~~~i~~da~i~~G~SGGPlvn~-~G~vVGI~~~~~~~~~~~~~~~ai 306 (578)
.+|.-+.....+. .||....... ..+.....+|. ...|+=|.|+++. +|++|||++... .....+|+.
T Consensus 113 mVg~~fq~k~~~s---~vSesS~i~p~~~~~fWkHwIs----Tk~G~CG~PlVs~~Dg~IVGiHsl~~---~~~~~N~F~ 182 (235)
T PF00863_consen 113 MVGSNFQEKSISS---TVSESSWIYPEENSHFWKHWIS----TKDGDCGLPLVSTKDGKIVGIHSLTS---NTSSRNYFT 182 (235)
T ss_dssp EEEEECSSCCCEE---EEEEEEEEEEETTTTEEEE-C-------TT-TT-EEEETTT--EEEEEEEEE---TTTSSEEEE
T ss_pred EEEEEEEcCCeeE---EECCceEEeecCCCCeeEEEec----CCCCccCCcEEEcCCCcEEEEEcCcc---CCCCeEEEE
Confidence 9998665333222 2333222111 22222233444 4468889999986 899999999743 335567776
Q ss_pred cch
Q 008087 307 PTP 309 (578)
Q Consensus 307 Pi~ 309 (578)
|+.
T Consensus 183 ~f~ 185 (235)
T PF00863_consen 183 PFP 185 (235)
T ss_dssp E--
T ss_pred cCC
Confidence 664
No 54
>PRK10898 serine endoprotease; Provisional
Probab=97.80 E-value=9.9e-05 Score=77.86 Aligned_cols=58 Identities=9% Similarity=0.025 Sum_probs=50.0
Q ss_pred ceEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHh-cCCCeEEEEEecCeEEE
Q 008087 492 QMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVEN-CDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 492 ~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~-~~~~~v~l~v~R~~~~~ 549 (578)
.++++..+ +|+++||+.||+|++|||++|.++.+|.+.+.. .+++.+.|++.|+++.+
T Consensus 279 ~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~ 341 (353)
T PRK10898 279 QGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQL 341 (353)
T ss_pred CeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEE
Confidence 46766654 799999999999999999999999999999887 56778999999987654
No 55
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=97.79 E-value=5.3e-05 Score=61.73 Aligned_cols=55 Identities=13% Similarity=0.145 Sum_probs=42.0
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~ 549 (578)
++++.. ++|+++||++||+|++|||+++.+|.++.+.+ ..++.+.+++.|+++..
T Consensus 13 ~~~V~~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~~l~~~--~~~~~v~l~v~r~g~~~ 71 (80)
T cd00990 13 LGKVTFVRDDSPADKAGLVAGDELVAVNGWRVDALQDRLKEY--QAGDPVELTVFRDDRLI 71 (80)
T ss_pred cEEEEEECCCChHHHhCCCCCCEEEEECCEEhHHHHHHHHhc--CCCCEEEEEEEECCEEE
Confidence 455554 47899999999999999999999966654333 25678899999887653
No 56
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.73 E-value=0.00036 Score=68.40 Aligned_cols=113 Identities=19% Similarity=0.281 Sum_probs=62.7
Q ss_pred EEEEEEEc-CCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEE
Q 008087 150 SSSGFAIG-GRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVT 228 (578)
Q Consensus 150 ~GsGfvI~-~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~ 228 (578)
+|+.|-++ +-.|+|+.||+. .+...|.. .+.. +...++..-|+|.-.++. +....|.++++.. ..|..
T Consensus 115 sggvft~~~~~vvvTAtHVlg-~~~a~v~~--~g~~---~~~tF~~~GDfA~~~~~~--~~G~~P~~k~a~~-~~GrA-- 183 (297)
T PF05579_consen 115 SGGVFTIGGNTVVVTATHVLG-GNTARVSG--VGTR---RMLTFKKNGDFAEADITN--WPGAAPKYKFAQN-YTGRA-- 183 (297)
T ss_dssp EEEEEECTTEEEEEEEHHHCB-TTEEEEEE--TTEE---EEEEEEEETTEEEEEETT--S-S---B--B-TT--SEEE--
T ss_pred ccceEEECCeEEEEEEEEEcC-CCeEEEEe--cceE---EEEEEeccCcEEEEECCC--CCCCCCceeecCC-cccce--
Confidence 44444444 459999999998 55666665 3333 333456678999998843 2235677777622 12211
Q ss_pred EEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEecc
Q 008087 229 VVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQS 293 (578)
Q Consensus 229 ~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~ 293 (578)
+.+...-+..|.|..-.+ +.+ ..+|+||+|++..+|.+|||++++
T Consensus 184 -----yW~t~tGvE~G~ig~~~~------------~~f---T~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 184 -----YWLTSTGVEPGFIGGGGA------------VCF---TGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp -----EEEETTEEEEEEEETTEE------------EES---S-GGCTT-EEEETTC-EEEEEEEE
T ss_pred -----EEEcccCcccceecCceE------------EEE---cCCCCCCCccCcCCCCEEEEEecC
Confidence 111111355565544322 222 357999999999999999999984
No 57
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=97.73 E-value=7.2e-05 Score=75.02 Aligned_cols=67 Identities=25% Similarity=0.326 Sum_probs=54.3
Q ss_pred CCcEEE-EeCCCCc---ccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEE
Q 008087 353 KGVRIR-RVDPTAP---ESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFN 427 (578)
Q Consensus 353 ~Gv~V~-~V~p~sp---A~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~ 427 (578)
.| +++ .+.|+.. ..+ |||+||++++|||.++.+..+ ...++........++|+|+|+|+.+++.
T Consensus 204 ~G-l~GYrl~Pgkd~~lF~~~GLq~GDva~sING~dL~D~~q----------a~~l~~~L~~~tei~ltVeRdGq~~~i~ 272 (276)
T PRK09681 204 EG-IVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFTDPRA----------MIALMRQLPSMDSIQLTVLRKGARHDIS 272 (276)
T ss_pred CC-ceEEEECCCCcHHHHHHcCCCCCCEEEEeCCeeCCCHHH----------HHHHHHHhccCCeEEEEEEECCEEEEEE
Confidence 35 444 4667643 455 999999999999999998876 4577888888899999999999999988
Q ss_pred EEe
Q 008087 428 ITL 430 (578)
Q Consensus 428 v~l 430 (578)
+.+
T Consensus 273 i~l 275 (276)
T PRK09681 273 IAL 275 (276)
T ss_pred EEc
Confidence 765
No 58
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=97.68 E-value=9.5e-05 Score=74.36 Aligned_cols=58 Identities=16% Similarity=0.032 Sum_probs=50.6
Q ss_pred ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC-CCeEEEEEecCeEEE
Q 008087 492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD-DEFLKFDLEYDQVVV 549 (578)
Q Consensus 492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~-~~~v~l~v~R~~~~~ 549 (578)
.|+.+.. ++++++||+.||+|++|||+++.+++++.+++.+.+ ++.+.|++.|+++.+
T Consensus 191 ~G~~v~~v~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~ 253 (259)
T TIGR01713 191 EGYRLNPGKDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQRE 253 (259)
T ss_pred eEEEEEecCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEE
Confidence 5676664 489999999999999999999999999999999864 468999999998754
No 59
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67 E-value=0.00015 Score=77.63 Aligned_cols=81 Identities=25% Similarity=0.393 Sum_probs=59.5
Q ss_pred cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHH-
Q 008087 327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLV- 404 (578)
Q Consensus 327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~- 404 (578)
+..+|++++.. + ..++.|.++.+++||++ ||++||+|++|||+++....- ...+
T Consensus 99 ~~GiG~~i~~~-~------------~~~~~V~s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~-----------~~av~ 154 (406)
T COG0793 99 FGGIGIELQME-D------------IGGVKVVSPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSL-----------DEAVK 154 (406)
T ss_pred ccceeEEEEEe-c------------CCCcEEEecCCCChHHHcCCCCCCEEEEECCEEccCCCH-----------HHHHH
Confidence 56777777643 1 15899999999999999 999999999999999988641 1222
Q ss_pred -hccCCCCEEEEEEEECCEEEEEEEEec
Q 008087 405 -SQKYTGDSAAVKVLRDSKILNFNITLA 431 (578)
Q Consensus 405 -~~~~~g~~v~l~V~R~g~~~~~~v~l~ 431 (578)
.+-..|..|+|++.|.+....+.+++.
T Consensus 155 ~irG~~Gt~V~L~i~r~~~~k~~~v~l~ 182 (406)
T COG0793 155 LIRGKPGTKVTLTILRAGGGKPFTVTLT 182 (406)
T ss_pred HhCCCCCCeEEEEEEEcCCCceeEEEEE
Confidence 233478999999999744333333333
No 60
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=97.59 E-value=0.00059 Score=73.12 Aligned_cols=76 Identities=21% Similarity=0.274 Sum_probs=52.8
Q ss_pred hhhhhccccCCCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087 342 LRVAMSMKADQKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR 419 (578)
Q Consensus 342 ~~~~lgl~~~~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 419 (578)
..+.|||.- ..-+.|.++...+.|++ +|+.||+|++|||.-..+..- . +...++... . .+++|.|+|
T Consensus 209 ~nEEyGlrL-gSqIFvKeit~~gLAardgnlqEGDiiLkINGtvteNmSL-t-------Dar~LIEkS-~-GKL~lvVlR 277 (1027)
T KOG3580|consen 209 ANEEYGLRL-GSQIFVKEITRTGLAARDGNLQEGDIILKINGTVTENMSL-T-------DARKLIEKS-R-GKLQLVVLR 277 (1027)
T ss_pred cchhhcccc-cchhhhhhhcccchhhccCCcccccEEEEECcEeeccccc-h-------hHHHHHHhc-c-CceEEEEEe
Confidence 344566654 23478888988888887 899999999999998887632 1 133455443 3 468999999
Q ss_pred CCEEEEEEE
Q 008087 420 DSKILNFNI 428 (578)
Q Consensus 420 ~g~~~~~~v 428 (578)
+.+..-+.|
T Consensus 278 D~~qtLiNi 286 (1027)
T KOG3580|consen 278 DSQQTLINI 286 (1027)
T ss_pred cCCceeeec
Confidence 866554544
No 61
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.58 E-value=9.9e-05 Score=77.87 Aligned_cols=51 Identities=22% Similarity=0.399 Sum_probs=47.0
Q ss_pred ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087 499 CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 499 v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~ 549 (578)
++|+++||+.||+|++|||++|.+|+||.+++++..++.+.|++.|+++..
T Consensus 124 SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~g~~V~LtV~R~Ge~~ 174 (402)
T TIGR02860 124 SPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAGGEKLTLTIERGGKII 174 (402)
T ss_pred CHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCCCCeEEEEEEECCEEE
Confidence 478889999999999999999999999999999998899999999987653
No 62
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.50 E-value=0.00021 Score=71.70 Aligned_cols=71 Identities=25% Similarity=0.312 Sum_probs=63.8
Q ss_pred CCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE-CCEEEEEEEEec
Q 008087 353 KGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR-DSKILNFNITLA 431 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R-~g~~~~~~v~l~ 431 (578)
.|+++..|..++|+..-|+.||.|++|||+++.+..+ |...+.....|+++++++.| +++....++++.
T Consensus 130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e----------~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~ 199 (342)
T COG3480 130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDE----------LIDYVSSKKPGDEVTIDYERHNETPEIVTITLI 199 (342)
T ss_pred eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHH----------HHHHHhccCCCCeEEEEEEeccCCCceEEEEEE
Confidence 5899999999999988999999999999999999988 67888888999999999997 888888887777
Q ss_pred cc
Q 008087 432 TH 433 (578)
Q Consensus 432 ~~ 433 (578)
..
T Consensus 200 ~~ 201 (342)
T COG3480 200 KN 201 (342)
T ss_pred ee
Confidence 65
No 63
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.42 E-value=0.00047 Score=56.16 Aligned_cols=49 Identities=14% Similarity=0.128 Sum_probs=40.2
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v 542 (578)
++++.. ++|+++||++||+|++|||+++. +++++.++++.... .+.|.+
T Consensus 27 ~~~V~~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~~-~v~l~v 81 (82)
T cd00992 27 GIFVSRVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSGD-EVTLTV 81 (82)
T ss_pred CeEEEEECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCCC-eEEEEE
Confidence 455554 47889999999999999999999 99999999997654 566654
No 64
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.36 E-value=0.00033 Score=57.27 Aligned_cols=51 Identities=10% Similarity=0.224 Sum_probs=41.6
Q ss_pred ceEEEEee----cCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEe
Q 008087 492 QMSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLE 543 (578)
Q Consensus 492 ~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~ 543 (578)
.+++++.+ +|+++||+.||+|++|||+++.++ .+..++++.+.+ .++|+|+
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~~~-~v~L~V~ 81 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSASN-PVTLTVQ 81 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHSTS-EEEEEEE
T ss_pred CCEEEEEEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCCCC-cEEEEEC
Confidence 35666655 789999999999999999999955 777888888766 7888763
No 65
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=97.34 E-value=0.00045 Score=74.84 Aligned_cols=123 Identities=12% Similarity=0.109 Sum_probs=79.6
Q ss_pred EEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEecccc
Q 008087 357 IRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATHR 434 (578)
Q Consensus 357 V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~~ 434 (578)
|.....++||++ .|-.||.|++|||..+-..--. .++..+........|+++|.+=--..++.|. .
T Consensus 677 iAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLs--------tcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~--R-- 744 (829)
T KOG3605|consen 677 IANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLS--------TCQSIIKGLKNQTAVKLNIVSCPPVTTVLIR--R-- 744 (829)
T ss_pred HHhcccCChhhhcCCccccceeEeecCceeccccHH--------HHHHHHhcccccceEEEEEecCCCceEEEee--c--
Confidence 335566899998 5999999999999877653211 1456677666666788887763333233221 1
Q ss_pred cccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhhhhhhhccccccccccccCCCceEEEE---eecCCcCCCCCCCE
Q 008087 435 RLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLL---WCLRSPLCLNCFNK 511 (578)
Q Consensus 435 ~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs---~v~A~~aGl~~GD~ 511 (578)
|..+-.+|+.. +.|++++ +.-|++.|++.|-+
T Consensus 745 -------------------------Pd~kyQLGFSV--------------------QNGiICSLlRGGIAERGGVRVGHR 779 (829)
T KOG3605|consen 745 -------------------------PDLRYQLGFSV--------------------QNGIICSLLRGGIAERGGVRVGHR 779 (829)
T ss_pred -------------------------ccchhhcccee--------------------eCcEeehhhcccchhccCceeeee
Confidence 11111222111 2355554 66899999999999
Q ss_pred EEEeCCeecCC--HHHHHHHHHhcCCC
Q 008087 512 VLAFNGNPVKN--LKSLANMVENCDDE 536 (578)
Q Consensus 512 I~~VNG~~V~~--~~~l~~~l~~~~~~ 536 (578)
|++||||.|-- .+-.+++|..+-++
T Consensus 780 IIEINgQSVVA~pHekIV~lLs~aVGE 806 (829)
T KOG3605|consen 780 IIEINGQSVVATPHEKIVQLLSNAVGE 806 (829)
T ss_pred EEEECCceEEeccHHHHHHHHHHhhhh
Confidence 99999999863 35578888776554
No 66
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.28 E-value=0.00044 Score=73.95 Aligned_cols=85 Identities=21% Similarity=0.368 Sum_probs=64.3
Q ss_pred CCcccccccChhhhhhhcccc--CCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087 330 LGVEWQKMENPDLRVAMSMKA--DQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ 406 (578)
Q Consensus 330 lGi~~~~~~~~~~~~~lgl~~--~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~ 406 (578)
.|+.+..+ . ...-+||+.- +..+.+|..|.++|||++ ||.+||.|++|||. + + .+.+
T Consensus 439 ~gL~~~~~-~-~~~~~LGl~v~~~~g~~~i~~V~~~gPA~~AGl~~Gd~ivai~G~---s--~-------------~l~~ 498 (558)
T COG3975 439 FGLTFTPK-P-REAYYLGLKVKSEGGHEKITFVFPGGPAYKAGLSPGDKIVAINGI---S--D-------------QLDR 498 (558)
T ss_pred cceEEEec-C-CCCcccceEecccCCeeEEEecCCCChhHhccCCCccEEEEEcCc---c--c-------------cccc
Confidence 45666554 2 2244566543 345688999999999999 99999999999998 1 1 2445
Q ss_pred cCCCCEEEEEEEECCEEEEEEEEecccc
Q 008087 407 KYTGDSAAVKVLRDSKILNFNITLATHR 434 (578)
Q Consensus 407 ~~~g~~v~l~V~R~g~~~~~~v~l~~~~ 434 (578)
...++.+++++.|.|+.+++.+++....
T Consensus 499 ~~~~d~i~v~~~~~~~L~e~~v~~~~~~ 526 (558)
T COG3975 499 YKVNDKIQVHVFREGRLREFLVKLGGDP 526 (558)
T ss_pred cccccceEEEEccCCceEEeecccCCCc
Confidence 5688999999999999999988876543
No 67
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.19 E-value=0.0011 Score=74.94 Aligned_cols=71 Identities=17% Similarity=0.178 Sum_probs=47.9
Q ss_pred CCcEEEEeCCCCcccC--CCCCCCEEEEEC--CEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC---CEEEE
Q 008087 353 KGVRIRRVDPTAPESE--VLKPSDIILSFD--GIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD---SKILN 425 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~--GL~~GDiIl~In--G~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~---g~~~~ 425 (578)
.+++|..|.+||||++ ||++||+|++|| |.++.+.....+. .+..++. -..|.+|.|+|.|+ |+..+
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~-----~vv~lir-G~~Gt~V~LtV~r~~~~~~~~~ 328 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLD-----DVVALIK-GPKGSKVRLEILPAGKGTKTRI 328 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHH-----HHHHHhc-CCCCCEEEEEEEeCCCCCceEE
Confidence 3688999999999987 899999999999 5554432211100 0223333 34789999999983 45555
Q ss_pred EEEE
Q 008087 426 FNIT 429 (578)
Q Consensus 426 ~~v~ 429 (578)
+++.
T Consensus 329 vtl~ 332 (667)
T PRK11186 329 VTLT 332 (667)
T ss_pred EEEE
Confidence 5554
No 68
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.18 E-value=0.00082 Score=63.54 Aligned_cols=72 Identities=22% Similarity=0.271 Sum_probs=57.2
Q ss_pred cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEEeccc
Q 008087 355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNITLATH 433 (578)
Q Consensus 355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~l~~~ 433 (578)
++|..|.|+|||+. ||+.||.|+++....-.++..+. + + ..+.....++.+.++|.|.|+...+.++...|
T Consensus 141 a~V~sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq-----~--i-~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~W 212 (231)
T KOG3129|consen 141 AVVDSVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQ-----N--I-AAVVQSNEDQIVSVTVIREGQKVVLSLTPKKW 212 (231)
T ss_pred EEEeecCCCChhhhhCcccCceEEEecccccccchhHH-----H--H-HHHHHhccCcceeEEEecCCCEEEEEeCcccc
Confidence 67899999999999 99999999999887666665421 0 1 12334557888999999999999999988777
Q ss_pred c
Q 008087 434 R 434 (578)
Q Consensus 434 ~ 434 (578)
.
T Consensus 213 ~ 213 (231)
T KOG3129|consen 213 Q 213 (231)
T ss_pred c
Confidence 4
No 69
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.16 E-value=0.00038 Score=67.00 Aligned_cols=66 Identities=18% Similarity=0.229 Sum_probs=52.9
Q ss_pred CcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCEEEEEEEE
Q 008087 354 GVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSKILNFNIT 429 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~~~~~~v~ 429 (578)
|..+.-..+++..+. |||.||+.++||+..+++.++ +..++.....-..++++|+|+|+...+.+.
T Consensus 208 Gyr~~pgkd~slF~~sglq~GDIavaiNnldltdp~~----------m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 208 GYRFEPGKDGSLFYKSGLQRGDIAVAINNLDLTDPED----------MFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EEEecCCCCcchhhhhcCCCcceEEEecCcccCCHHH----------HHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 343443445566777 999999999999999999877 456777777778899999999999888764
No 70
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=97.03 E-value=0.00068 Score=72.14 Aligned_cols=51 Identities=16% Similarity=0.170 Sum_probs=43.4
Q ss_pred EEEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEe-cCeEE
Q 008087 495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLE-YDQVV 548 (578)
Q Consensus 495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~-R~~~~ 548 (578)
|.++++|+++||++||+|++|||++|.+|.++...+. ++.+.+++. |+++.
T Consensus 5 V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~---~e~l~L~V~~rdGe~ 56 (433)
T TIGR03279 5 VLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA---DEELELEVLDANGES 56 (433)
T ss_pred cCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc---CCcEEEEEEcCCCeE
Confidence 5567899999999999999999999999999988874 467888885 66644
No 71
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.00 E-value=0.0014 Score=59.34 Aligned_cols=86 Identities=23% Similarity=0.338 Sum_probs=54.6
Q ss_pred cccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCC-CCEEEEECCEEeCCCCCCccccCcchhHHHHH
Q 008087 327 FPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKP-SDIILSFDGIDIANDGTVPFRHGERIGFSYLV 404 (578)
Q Consensus 327 ~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~-GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~ 404 (578)
.+.||+.++-- ...- ....+..|.+|.|+|||++ ||++ .|.|+.+|+..+.+.++ |..++
T Consensus 25 ~g~LG~sv~~~-~~~~-------~~~~~~~Vl~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~----------l~~~v 86 (138)
T PF04495_consen 25 QGLLGISVRFE-SFEG-------AEEEGWHVLRVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDD----------LFELV 86 (138)
T ss_dssp SSSS-EEEEEE-E-TT-------GCCCEEEEEEE-TTSHHHHTT--TTTEEEEEETTCE--STCH----------HHHHH
T ss_pred CCCCcEEEEEe-cccc-------cccceEEEeEecCCCHHHHCCccccccEEEEccceecCCHHH----------HHHHH
Confidence 46788877643 1110 1245789999999999999 9999 69999999988887655 56666
Q ss_pred hccCCCCEEEEEEEEC--CEEEEEEEEec
Q 008087 405 SQKYTGDSAAVKVLRD--SKILNFNITLA 431 (578)
Q Consensus 405 ~~~~~g~~v~l~V~R~--g~~~~~~v~l~ 431 (578)
.. ..++.+.|.|... ...+++.+...
T Consensus 87 ~~-~~~~~l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 87 EA-NENKPLQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp HH-TTTS-EEEEEEETTTTCEEEEEE---
T ss_pred HH-cCCCcEEEEEEECCCCeEEEEEEEcC
Confidence 65 4678999999863 45556666554
No 72
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=96.99 E-value=0.0015 Score=53.30 Aligned_cols=53 Identities=13% Similarity=0.144 Sum_probs=39.6
Q ss_pred eEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHH--HHHHHhcCCCeEEEEEecCe
Q 008087 493 MSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSL--ANMVENCDDEFLKFDLEYDQ 546 (578)
Q Consensus 493 ~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l--~~~l~~~~~~~v~l~v~R~~ 546 (578)
++++.. ++|+++||++||+|++|||+++.++.+. ...+... ++.+.|.+.|++
T Consensus 27 ~~~i~~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~-~~~~~l~i~r~~ 85 (85)
T smart00228 27 GVVVSSVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKA-GGKVTLTVLRGG 85 (85)
T ss_pred CEEEEEECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhC-CCeEEEEEEeCC
Confidence 455554 4788999999999999999999977554 3444443 458888888763
No 73
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of the D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=96.99 E-value=0.0006 Score=71.54 Aligned_cols=50 Identities=8% Similarity=0.109 Sum_probs=43.7
Q ss_pred EeecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecCe
Q 008087 497 LWCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYDQ 546 (578)
Q Consensus 497 s~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~~ 546 (578)
.+++|+++||+.||+|++|||++|.+| .++...+....++.+.|++.|++
T Consensus 71 ~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~~~g~~v~l~v~R~g 122 (334)
T TIGR00225 71 EGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRGKKGTKVSLEILRAG 122 (334)
T ss_pred CCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccCCCCCEEEEEEEeCC
Confidence 456899999999999999999999986 67888887777888999998875
No 74
>PF12812 PDZ_1: PDZ-like domain
Probab=96.92 E-value=0.0012 Score=53.65 Aligned_cols=58 Identities=12% Similarity=0.073 Sum_probs=51.0
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~ 389 (578)
-|.|..++++ +...++.++++- |+++.....++++.. ++..|-+|.+|||+++.+.++
T Consensus 9 ~~~Ga~f~~L-s~q~aR~~~~~~---~gv~v~~~~g~~~~~~~i~~g~iI~~Vn~kpt~~Ld~ 67 (78)
T PF12812_consen 9 EVCGAVFHDL-SYQQARQYGIPV---GGVYVAVSGGSLAFAGGISKGFIITSVNGKPTPDLDD 67 (78)
T ss_pred EEcCeecccC-CHHHHHHhCCCC---CEEEEEecCCChhhhCCCCCCeEEEeECCcCCcCHHH
Confidence 4789999999 899999999977 466667788899888 699999999999999999877
No 75
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.89 E-value=0.03 Score=57.15 Aligned_cols=109 Identities=16% Similarity=0.205 Sum_probs=64.3
Q ss_pred CCCCEEEEEEeeCCCCCCeeeEEcCCCCC---CCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCC
Q 008087 194 TECDIAMLTVEDDEFWEGVLPVEFGELPA---LQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAI 270 (578)
Q Consensus 194 ~~~DlAlLkv~~~~~~~~~~~l~l~~~~~---~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i 270 (578)
...+++||+++.+ +.....++.|+++.. .++.+.+.|+... . .+....+.-..... ....+......
T Consensus 159 ~~~~~mIlEl~~~-~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~-~--~~~~~~~~i~~~~~------~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEED-FSKNVSPPCLADSSTNWEKGDEVDVYGFNST-G--KLKHRKLKITNCTK------CAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEccc-ccccCCCEEeCCCccccccCceEEEeecCCC-C--eEEEEEEEEEEeec------cceeEeccccc
Confidence 3569999999988 334788899987543 3788888888222 1 12222222111110 11235556677
Q ss_pred CCCCccceEEc---cCCeEEEEEeccccccccCCceeeecchhHHH
Q 008087 271 NSGNSGGPAFN---DKGKCVGIAFQSLKHEDVENIGYVIPTPVIMH 313 (578)
Q Consensus 271 ~~G~SGGPlvn---~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i~~ 313 (578)
+.|++||||+. ....||||.+..... ...+..+++.+...++
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~-~~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASGNYE-CNKNNSYFFNVSWYQD 273 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccCCCc-ccccccEEEEHHHhhh
Confidence 89999999983 334799998763211 1112456666655444
No 76
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=96.88 E-value=0.0033 Score=65.55 Aligned_cols=143 Identities=17% Similarity=0.238 Sum_probs=95.7
Q ss_pred CCCcEEEEeCCCCcccC-CCCC-CCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEEC--CEEEEEE
Q 008087 352 QKGVRIRRVDPTAPESE-VLKP-SDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRD--SKILNFN 427 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~-GL~~-GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~--g~~~~~~ 427 (578)
..|.-|-+|..+|+|++ ||++ -|.|++|||..+...++. |..+++... +.|+++|.-- -..+.+.
T Consensus 14 teg~hvlkVqedSpa~~aglepffdFIvSI~g~rL~~dnd~---------Lk~llk~~s--ekVkltv~n~kt~~~R~v~ 82 (462)
T KOG3834|consen 14 TEGYHVLKVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDT---------LKALLKANS--EKVKLTVYNSKTQEVRIVE 82 (462)
T ss_pred ceeEEEEEeecCChHHhcCcchhhhhhheeCcccccCchHH---------HHHHHHhcc--cceEEEEEecccceeEEEE
Confidence 45788899999999999 9988 499999999999988773 445555433 3389998742 2333444
Q ss_pred EEecccccccCCCCCCCCCCceeeccEEEeeh---HHHHHHhchhhhhhhhhccccccccccccCCCceEEEEeecCCcC
Q 008087 428 ITLATHRRLIPSHNKGRPPSYYIIAGFVFSRC---LYLISVLSMERIMNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPL 504 (578)
Q Consensus 428 v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~---p~~~~~~g~~~~~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~a 504 (578)
|+....-.. . +.|+.+.-. .+....|. -.-|-..++|+.|
T Consensus 83 I~ps~~wgg----------q---llGvsvrFcsf~~A~~~vwH------------------------vl~V~p~SPaalA 125 (462)
T KOG3834|consen 83 IVPSNNWGG----------Q---LLGVSVRFCSFDGAVESVWH------------------------VLSVEPNSPAALA 125 (462)
T ss_pred ecccccccc----------c---ccceEEEeccCccchhheee------------------------eeecCCCCHHHhc
Confidence 443322110 0 223333221 11111111 1234556799999
Q ss_pred CCC-CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEE
Q 008087 505 CLN-CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 505 Gl~-~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v 542 (578)
||. .+|.|+.+-..-....+||..+|+.+.++.+++-|
T Consensus 126 gl~~~~DYivG~~~~~~~~~eDl~~lIeshe~kpLklyV 164 (462)
T KOG3834|consen 126 GLRPYTDYIVGIWDAVMHEEEDLFTLIESHEGKPLKLYV 164 (462)
T ss_pred ccccccceEecchhhhccchHHHHHHHHhccCCCcceeE
Confidence 998 89999999666677899999999999888888766
No 77
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=96.86 E-value=0.0013 Score=70.49 Aligned_cols=51 Identities=12% Similarity=0.089 Sum_probs=44.0
Q ss_pred EeecCCcCCCCCCCEEEEeCCeecCC--HHHHHHHHHhcCCCeEEEEEecCeE
Q 008087 497 LWCLRSPLCLNCFNKVLAFNGNPVKN--LKSLANMVENCDDEFLKFDLEYDQV 547 (578)
Q Consensus 497 s~v~A~~aGl~~GD~I~~VNG~~V~~--~~~l~~~l~~~~~~~v~l~v~R~~~ 547 (578)
.+++|+++||++||+|++|||++|.+ +.++...++...+..+.|++.|++.
T Consensus 111 ~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g~~g~~v~ltv~r~g~ 163 (389)
T PLN00049 111 PGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQGPEGSSVELTLRRGPE 163 (389)
T ss_pred CCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhcCCCCEEEEEEEECCE
Confidence 35689999999999999999999985 4788888877778889999988765
No 78
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=96.78 E-value=0.02 Score=58.77 Aligned_cols=58 Identities=22% Similarity=0.290 Sum_probs=37.1
Q ss_pred EEEEEEEcCCEEEecccccCCCc-----eEEEEEc-CC---CcEEEEEEEEEc-------CCCCEEEEEEeeCC
Q 008087 150 SSSGFAIGGRRVLTNAHSVEHYT-----QVKLKKR-GS---DTKYLATVLAIG-------TECDIAMLTVEDDE 207 (578)
Q Consensus 150 ~GsGfvI~~g~ILT~aHvV~~~~-----~i~V~~~-~~---g~~~~a~vv~~d-------~~~DlAlLkv~~~~ 207 (578)
.|-|-+++..||||+|||+.... .+.|.+. +| ++....+.++.+ ...|+|++++....
T Consensus 62 fCGgs~l~~RYvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a 135 (413)
T COG5640 62 FCGGSKLGGRYVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAA 135 (413)
T ss_pred EeccceecceEEeeehhhccCCCCccccceEEEecccccccccCcceEEEeeecccccccccCcceeecccccc
Confidence 57788888889999999998654 2333331 12 222234444443 35799999998753
No 79
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=96.75 E-value=0.0028 Score=63.76 Aligned_cols=50 Identities=6% Similarity=0.074 Sum_probs=44.4
Q ss_pred cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCC-CeEEEEEecCeEEE
Q 008087 500 LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDD-EFLKFDLEYDQVVV 549 (578)
Q Consensus 500 ~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~-~~v~l~v~R~~~~~ 549 (578)
.-.++||++||++++|||.++.+.++..+++++..+ ..+.|+|+|||+.+
T Consensus 219 lF~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~ 269 (276)
T PRK09681 219 LFDASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARH 269 (276)
T ss_pred HHHHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEE
Confidence 356789999999999999999999999999998765 48999999998864
No 80
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.75 E-value=0.023 Score=53.61 Aligned_cols=138 Identities=17% Similarity=0.293 Sum_probs=80.3
Q ss_pred cceEEEEEEEcCCEEEecccccCCCceEEEEEcCCCcEEE--EEEEEEcC---CCCEEEEEEeeCCCCCCe-eeEEcCCC
Q 008087 147 YSSSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYL--ATVLAIGT---ECDIAMLTVEDDEFWEGV-LPVEFGEL 220 (578)
Q Consensus 147 ~~~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~--a~vv~~d~---~~DlAlLkv~~~~~~~~~-~~l~l~~~ 220 (578)
....++|+-|.+.++|.+.| ......+.+ ++..++ ..+...|. ..||++++++...-..++ +.+. ...
T Consensus 23 g~~t~l~~gi~~~~~lvp~H---~~~~~~i~i--~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIrk~~~-~~~ 96 (172)
T PF00548_consen 23 GEFTMLALGIYDRYFLVPTH---EEPEDTIYI--DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIRKFFP-ESI 96 (172)
T ss_dssp EEEEEEEEEEEBTEEEEEGG---GGGCSEEEE--TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GGGGSB-SSG
T ss_pred ceEEEecceEeeeEEEEECc---CCCcEEEEE--CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchhhhhc-ccc
Confidence 45678888999999999999 223334555 455543 23333454 459999999765321222 2222 112
Q ss_pred CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEcc---CCeEEEEEec
Q 008087 221 PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFND---KGKCVGIAFQ 292 (578)
Q Consensus 221 ~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~---~G~vVGI~~~ 292 (578)
....+...++-+. ......+..+.++..+.. ...+......+..+++...|+-||||+.. .++++||+.+
T Consensus 97 ~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHva 169 (172)
T PF00548_consen 97 PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVA 169 (172)
T ss_dssp GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEE
T ss_pred ccCCCcEEEEECC-CCccEEEEEEEEeecCcc-ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEec
Confidence 2233444444332 223224455555554443 22233333567778888899999999963 5799999987
No 81
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.59 E-value=0.019 Score=63.35 Aligned_cols=117 Identities=19% Similarity=0.283 Sum_probs=74.6
Q ss_pred CCCEEEEEEeeCC-----CCCCe------eeEEcCCC--------CCCCCcEEEEeecCCCCcceEEEeEEeceeeeeec
Q 008087 195 ECDIAMLTVEDDE-----FWEGV------LPVEFGEL--------PALQDAVTVVGYPIGGDTISVTSGVVSRIEILSYV 255 (578)
Q Consensus 195 ~~DlAlLkv~~~~-----~~~~~------~~l~l~~~--------~~~g~~V~~iG~p~g~~~~sv~~G~Is~~~~~~~~ 255 (578)
-.|+|||+++... +.+++ +.+.+.+. ...|..|+=+|+..| .|.|+|.++....+.
T Consensus 542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg-----yT~G~lNg~klvyw~ 616 (695)
T PF08192_consen 542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG-----YTTGILNGIKLVYWA 616 (695)
T ss_pred ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC-----ccceEecceEEEEec
Confidence 3599999998653 11122 22333332 123788999997766 356777777655444
Q ss_pred CCcee-eeEEEEe----cCCCCCCccceEEccCC------eEEEEEeccccccccCCceeeecchhHHHHHHHH
Q 008087 256 HGSTE-LLGLQID----AAINSGNSGGPAFNDKG------KCVGIAFQSLKHEDVENIGYVIPTPVIMHFIQDY 318 (578)
Q Consensus 256 ~~~~~-~~~i~~d----a~i~~G~SGGPlvn~~G------~vVGI~~~~~~~~~~~~~~~aiPi~~i~~~l~~l 318 (578)
.+... .+++..+ .=...|+||+=|++.-+ .|+||.++.-+ ....+|++.|+..|+.-|++.
T Consensus 617 dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg--e~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 617 DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG--EQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC--ccceeeccCcHHHHHHHHHHh
Confidence 44322 3444444 22347999999998644 49999987432 455788899999887766554
No 82
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=96.59 E-value=0.0033 Score=52.24 Aligned_cols=48 Identities=15% Similarity=0.127 Sum_probs=36.7
Q ss_pred ecCCcCCC--CCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCe
Q 008087 499 CLRSPLCL--NCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQ 546 (578)
Q Consensus 499 v~A~~aGl--~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~ 546 (578)
+|-.+.|+ +.||.|++|||+++..-.++.++|....++.+.|+|.+..
T Consensus 31 sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~~agk~V~Ltv~~~~ 80 (88)
T PF14685_consen 31 SPLAQPGVDVREGDYILAINGQPVTADANPYRLLEGKAGKQVLLTVNRKP 80 (88)
T ss_dssp -GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHTTTTSEEEEEEE-ST
T ss_pred CCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcccCCCEEEEEEecCC
Confidence 35556666 5999999999999999999999999999999999998753
No 83
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.28 E-value=0.0048 Score=70.13 Aligned_cols=20 Identities=35% Similarity=0.295 Sum_probs=15.3
Q ss_pred EEEEEEEc-CCEEEecccccC
Q 008087 150 SSSGFAIG-GRRVLTNAHSVE 169 (578)
Q Consensus 150 ~GsGfvI~-~g~ILT~aHvV~ 169 (578)
-|||.+|+ +|+||||.||+.
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred ceeEEEEcCCceEEecchhhh
Confidence 48888887 788888888864
No 84
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.23 E-value=0.0038 Score=51.94 Aligned_cols=34 Identities=32% Similarity=0.489 Sum_probs=31.3
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCC
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIAN 386 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~ 386 (578)
.|++|++|..+|||+. ||+.+|.|+.+||...+-
T Consensus 59 ~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTM 93 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTM 93 (124)
T ss_pred ccEEEEEeccCChhhhhcceecceEEEecCceeEE
Confidence 6999999999999999 999999999999976653
No 85
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=95.77 E-value=0.025 Score=59.60 Aligned_cols=59 Identities=17% Similarity=0.124 Sum_probs=49.7
Q ss_pred ceEEEEe----ecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC-CCeEEEEEecCeEEEE
Q 008087 492 QMSSLLW----CLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD-DEFLKFDLEYDQVVVL 550 (578)
Q Consensus 492 ~~vvvs~----v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~-~~~v~l~v~R~~~~~l 550 (578)
.|+++.+ ++|+++|++.||+|+++||+++.+..+|.+.+.... +..+.+.+.|+++..-
T Consensus 270 ~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~ 333 (347)
T COG0265 270 AGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERE 333 (347)
T ss_pred CceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEE
Confidence 4555554 489999999999999999999999999999988875 7799999999855443
No 86
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases; ndl, gd and snk process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=95.37 E-value=0.19 Score=49.25 Aligned_cols=99 Identities=20% Similarity=0.301 Sum_probs=70.7
Q ss_pred CCCCCccccCC--CcceEEEEEEEcCCEEEecccccCCC----ceEEEEEcCCCcEEE------EEEEEEc-----CCCC
Q 008087 135 PNFSLPWQRKR--QYSSSSSGFAIGGRRVLTNAHSVEHY----TQVKLKKRGSDTKYL------ATVLAIG-----TECD 197 (578)
Q Consensus 135 ~~~~~p~~~~~--~~~~~GsGfvI~~g~ILT~aHvV~~~----~~i~V~~~~~g~~~~------a~vv~~d-----~~~D 197 (578)
..+.+||-... .+...|+|++|+..|||++..|+.+- ..+.+.+ +.++.+. -++..+| +..+
T Consensus 12 e~y~WPWlA~IYvdG~~~CsgvLlD~~WlLvsssCl~~I~L~~~Yvsall-G~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~ 90 (267)
T PF09342_consen 12 EDYHWPWLADIYVDGRYWCSGVLLDPHWLLVSSSCLRGISLSHHYVSALL-GGGKTYLSVDGPHEQISRVDCFKDVPESN 90 (267)
T ss_pred ccccCcceeeEEEcCeEEEEEEEeccceEEEeccccCCcccccceEEEEe-cCcceecccCCChheEEEeeeeeeccccc
Confidence 46789997643 45668999999999999999999873 4577777 4665442 2444444 5789
Q ss_pred EEEEEEeeC-CCCCCeeeEEcCC---CCCCCCcEEEEeecC
Q 008087 198 IAMLTVEDD-EFWEGVLPVEFGE---LPALQDAVTVVGYPI 234 (578)
Q Consensus 198 lAlLkv~~~-~~~~~~~~l~l~~---~~~~g~~V~~iG~p~ 234 (578)
++||.++.+ .|...+.|+-+.+ .....+.++++|...
T Consensus 91 v~LLHL~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 91 VLLLHLEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeeeeecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 999999987 4445566666654 122346899999876
No 87
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Not withstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=95.36 E-value=0.012 Score=56.87 Aligned_cols=138 Identities=22% Similarity=0.268 Sum_probs=48.8
Q ss_pred CCEEEecccccCCCceEEEEEcCCCcEEE---EEEEEEcCCCCEEEEEEeeCCCC--CCeeeEEcCCCCCCC-CcEEEEe
Q 008087 158 GRRVLTNAHSVEHYTQVKLKKRGSDTKYL---ATVLAIGTECDIAMLTVEDDEFW--EGVLPVEFGELPALQ-DAVTVVG 231 (578)
Q Consensus 158 ~g~ILT~aHvV~~~~~i~V~~~~~g~~~~---a~vv~~d~~~DlAlLkv~~~~~~--~~~~~l~l~~~~~~g-~~V~~iG 231 (578)
...++|++||......+.... +|+.++ -+.+..+...|++||+.... ++ -.++.+.|.....+. ..+.+.+
T Consensus 41 ~~~L~ta~Hv~~~~~~~~~~k--~g~kipl~~f~~~~~~~~~D~~il~~P~n-~~s~Lg~k~~~~~~~~~~~~g~~~~y~ 117 (203)
T PF02122_consen 41 EDALLTARHVWSRPSKVTSLK--TGEKIPLAEFTDLLESRIADFVILRGPPN-WESKLGVKAAQLSQNSQLAKGPVSFYG 117 (203)
T ss_dssp -EEEEE-HHHHTSSS---EEE--TTEEEE--S-EEEEE-TTT-EEEEE--HH-HHHHHT-----B----SEEEEESSTTS
T ss_pred ccceecccccCCCccceeEcC--CCCcccchhChhhhCCCccCEEEEecCcC-HHHHhCcccccccchhhhCCCCeeeee
Confidence 459999999999865554443 454443 35566788999999999943 11 134444443322210 0010001
Q ss_pred ecCCCCcceEEEeEEeceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchhH
Q 008087 232 YPIGGDTISVTSGVVSRIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPVI 311 (578)
Q Consensus 232 ~p~g~~~~sv~~G~Is~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~i 311 (578)
...+ .. ......|. +... .++.+-+...+|.||.|+++.. ++||++.+.......++.++..|+.-+
T Consensus 118 ~~~~-~~-~~~sa~i~---------g~~~-~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~~~~~~~~n~n~~spip~~ 184 (203)
T PF02122_consen 118 FSSG-EW-PCSSAKIP---------GTEG-KFASVLSNTSPGWSGTPYYSGK-NVVGVHTGSPSGSNRENNNRMSPIPPI 184 (203)
T ss_dssp EEEE-EE-EEEE-S-------------ST-TEEEE-----TT-TT-EEE-SS--EEEEEEEE------------------
T ss_pred ecCC-Cc-eeccCccc---------cccC-cCCceEcCCCCCCCCCCeEECC-CceEeecCccccccccccccccccccc
Confidence 0000 00 11111111 1111 1455666788999999999987 999999985322244566665555443
No 88
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=95.31 E-value=0.016 Score=62.17 Aligned_cols=48 Identities=4% Similarity=0.068 Sum_probs=43.2
Q ss_pred eecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCCeEEEEEecC
Q 008087 498 WCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDEFLKFDLEYD 545 (578)
Q Consensus 498 ~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~~v~l~v~R~ 545 (578)
++||+++|+++||+|++|||+++.+. ++.++.|+-.+|..++|++.|.
T Consensus 122 ~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG~~Gt~V~L~i~r~ 171 (406)
T COG0793 122 GSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRGKPGTKVTLTILRA 171 (406)
T ss_pred CChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCCCCCCeEEEEEEEc
Confidence 56999999999999999999999965 6788888888888999999884
No 89
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=95.07 E-value=0.014 Score=52.87 Aligned_cols=49 Identities=14% Similarity=0.197 Sum_probs=39.5
Q ss_pred EEEeecCCcCCCCC-CCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEe
Q 008087 495 SLLWCLRSPLCLNC-FNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLE 543 (578)
Q Consensus 495 vvs~v~A~~aGl~~-GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~ 543 (578)
|..++||++|||++ .|.|+.+|+...++.++|.+.++++.++.+.|.|-
T Consensus 50 V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~~~~~~l~L~Vy 99 (138)
T PF04495_consen 50 VAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEANENKPLQLYVY 99 (138)
T ss_dssp E-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHHTTTS-EEEEEE
T ss_pred ecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHHcCCCcEEEEEE
Confidence 55678999999997 69999999999999999999999999999999883
No 90
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=94.94 E-value=0.06 Score=48.34 Aligned_cols=38 Identities=24% Similarity=0.410 Sum_probs=33.4
Q ss_pred CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087 352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~ 389 (578)
..-++|.++.|++.|.. ||+-||.+++|||..|.....
T Consensus 114 nspiyisriipggvadrhgglkrgdqllsvngvsvege~h 153 (207)
T KOG3550|consen 114 NSPIYISRIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHH 153 (207)
T ss_pred CCceEEEeecCCccccccCcccccceeEeecceeecchhh
Confidence 45689999999999988 899999999999999987543
No 91
>PRK11186 carboxy-terminal protease; Provisional
Probab=93.95 E-value=0.056 Score=61.41 Aligned_cols=49 Identities=10% Similarity=0.203 Sum_probs=40.4
Q ss_pred EEeecCCcC-CCCCCCEEEEeC--CeecC-----CHHHHHHHHHhcCCCeEEEEEec
Q 008087 496 LLWCLRSPL-CLNCFNKVLAFN--GNPVK-----NLKSLANMVENCDDEFLKFDLEY 544 (578)
Q Consensus 496 vs~v~A~~a-Gl~~GD~I~~VN--G~~V~-----~~~~l~~~l~~~~~~~v~l~v~R 544 (578)
+.++||+++ ||++||+|++|| |+++. .++++.++|+..+|..|+|++.|
T Consensus 263 ipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG~~Gt~V~LtV~r 319 (667)
T PRK11186 263 VAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKGPKGSKVRLEILP 319 (667)
T ss_pred cCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcCCCCCEEEEEEEe
Confidence 346799998 999999999999 55443 35688999998888999999987
No 92
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=93.68 E-value=0.079 Score=47.45 Aligned_cols=29 Identities=38% Similarity=0.672 Sum_probs=21.5
Q ss_pred EecCCCCCCccceEEccCCeEEEEEeccc
Q 008087 266 IDAAINSGNSGGPAFNDKGKCVGIAFQSL 294 (578)
Q Consensus 266 ~da~i~~G~SGGPlvn~~G~vVGI~~~~~ 294 (578)
.+....+|.||+|+||.+|++|||....+
T Consensus 90 ~~~d~~~GsSGSpi~n~~g~ivGlYg~g~ 118 (132)
T PF00949_consen 90 IDLDFPKGSSGSPIFNQNGEIVGLYGNGV 118 (132)
T ss_dssp E---S-TTGTT-EEEETTSCEEEEEEEEE
T ss_pred eecccCCCCCCCceEcCCCcEEEEEccce
Confidence 34447899999999999999999988755
No 93
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=93.22 E-value=0.094 Score=59.62 Aligned_cols=57 Identities=26% Similarity=0.327 Sum_probs=43.8
Q ss_pred CCcEEEEeCCCCcccCCCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087 353 KGVRIRRVDPTAPESEVLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR 419 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 419 (578)
.-|+|..|.+|+|+...|++||.|++|||++|.+.-.- | ..+++... .+.|.|+|.+
T Consensus 75 rPviVr~VT~GGps~GKL~PGDQIl~vN~Epv~dapre------r--vIdlvRac--e~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSIGKLQPGDQILAVNGEPVKDAPRE------R--VIDLVRAC--ESSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCccccccCCCeEEEecCcccccccHH------H--HHHHHHHH--hhhcceEEec
Confidence 35889999999999989999999999999999874321 1 22455443 4678888877
No 94
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=93.21 E-value=0.28 Score=43.20 Aligned_cols=57 Identities=19% Similarity=0.116 Sum_probs=37.2
Q ss_pred ceeeeeecCCceeeeEEEEecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchh
Q 008087 248 RIEILSYVHGSTELLGLQIDAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPV 310 (578)
Q Consensus 248 ~~~~~~~~~~~~~~~~i~~da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~ 310 (578)
.++.+.+.+..+...++....+..||+-||+|+..-| ||||+++ ++++.-.|..+..
T Consensus 65 ~i~~s~YYP~h~Q~~~l~g~Gp~~PGdCGg~L~C~HG-ViGi~Ta-----gg~g~VaF~dir~ 121 (127)
T PF00947_consen 65 WIEESEYYPKHYQYNLLIGEGPAEPGDCGGILRCKHG-VIGIVTA-----GGEGHVAFADIRD 121 (127)
T ss_dssp EE-SBTTB-SEEEECEEEEE-SSSTT-TCSEEEETTC-EEEEEEE-----EETTEEEEEECCC
T ss_pred EECCccCchhheecCceeecccCCCCCCCceeEeCCC-eEEEEEe-----CCCceEEEEechh
Confidence 3334444444555566777889999999999998655 9999998 5566655555543
No 95
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=92.79 E-value=0.28 Score=49.80 Aligned_cols=53 Identities=17% Similarity=0.157 Sum_probs=42.9
Q ss_pred ceEEEEee---cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhc-CCCeEEEEEec
Q 008087 492 QMSSLLWC---LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENC-DDEFLKFDLEY 544 (578)
Q Consensus 492 ~~vvvs~v---~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~-~~~~v~l~v~R 544 (578)
.||++..+ .....-|+.||.|++|||+++.+.++|.+++++. .++.+++++.|
T Consensus 130 ~gvyv~~v~~~~~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r 186 (342)
T COG3480 130 AGVYVLSVIDNSPFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYER 186 (342)
T ss_pred eeEEEEEccCCcchhceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEe
Confidence 35665555 2344457899999999999999999999999876 57899999986
No 96
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=92.69 E-value=0.12 Score=56.97 Aligned_cols=37 Identities=16% Similarity=0.498 Sum_probs=33.9
Q ss_pred CCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087 353 KGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~ 389 (578)
.-|.|..|.++++|.+ .|++||++++|||.+|.+..+
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q 435 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKPGDVLVAINNVPIRSERQ 435 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCCcceEEEecCccchhHHH
Confidence 3577888999999999 999999999999999999877
No 97
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=92.66 E-value=0.14 Score=56.24 Aligned_cols=81 Identities=16% Similarity=0.222 Sum_probs=59.9
Q ss_pred eecchhHHHHHHHHHHcCcee--ccccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECC
Q 008087 305 VIPTPVIMHFIQDYEKNGAYT--GFPLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDG 381 (578)
Q Consensus 305 aiPi~~i~~~l~~l~~~g~v~--~~~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG 381 (578)
.+|.+....+++.+++.-.|. --+.--++-..+.-++.+..||++- ++|+ |-....|+.|++ |++.|-+|++|||
T Consensus 708 GLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~kyQLGFSV-QNGi-ICSLlRGGIAERGGVRVGHRIIEINg 785 (829)
T KOG3605|consen 708 GLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDLRYQLGFSV-QNGI-ICSLLRGGIAERGGVRVGHRIIEING 785 (829)
T ss_pred cccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccchhhcccee-eCcE-eehhhcccchhccCceeeeeEEEECC
Confidence 589999999999887644442 0112222323333788888899887 6676 677889999999 9999999999999
Q ss_pred EEeCCC
Q 008087 382 IDIAND 387 (578)
Q Consensus 382 ~~V~~~ 387 (578)
+.|--.
T Consensus 786 QSVVA~ 791 (829)
T KOG3605|consen 786 QSVVAT 791 (829)
T ss_pred ceEEec
Confidence 988654
No 98
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=92.35 E-value=0.19 Score=48.88 Aligned_cols=50 Identities=10% Similarity=-0.003 Sum_probs=44.1
Q ss_pred cCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCC-eEEEEEecCeEEE
Q 008087 500 LRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDE-FLKFDLEYDQVVV 549 (578)
Q Consensus 500 ~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~-~v~l~v~R~~~~~ 549 (578)
+-...||+.||+.+++|+..+.+.++..++++...+. .+.|++.|+|+.-
T Consensus 219 lF~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rh 269 (275)
T COG3031 219 LFYKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRH 269 (275)
T ss_pred hhhhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccc
Confidence 5667899999999999999999999999999998655 7999999988653
No 99
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=91.97 E-value=0.17 Score=42.40 Aligned_cols=53 Identities=13% Similarity=0.103 Sum_probs=38.2
Q ss_pred CceEEEEee----cCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEecC
Q 008087 491 CQMSSLLWC----LRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEYD 545 (578)
Q Consensus 491 ~~~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R~ 545 (578)
+.|++++.+ ||+.|||+.+|.|+.|||-... +.+..++.|.+ ++.+.+.|.|.
T Consensus 58 D~GiYvT~V~eGsPA~~AGLrihDKIlQvNG~DfTMvTHd~Avk~i~k--~~vl~mLVaR~ 116 (124)
T KOG3553|consen 58 DKGIYVTRVSEGSPAEIAGLRIHDKILQVNGWDFTMVTHDQAVKRITK--EEVLRMLVARQ 116 (124)
T ss_pred CccEEEEEeccCChhhhhcceecceEEEecCceeEEEEhHHHHHHhhH--hHHHHHHHHhh
Confidence 457777754 8999999999999999997765 55666666665 34444444443
No 100
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=91.96 E-value=0.37 Score=53.25 Aligned_cols=48 Identities=8% Similarity=0.092 Sum_probs=40.8
Q ss_pred EEEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEE
Q 008087 495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v 542 (578)
|....+|+++++.+||++++|||.||.+..+..+.++...+....|..
T Consensus 405 v~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~~~~~~~l~~ 452 (1051)
T KOG3532|consen 405 VEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQSTTGDLTVLVE 452 (1051)
T ss_pred ecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhcccceEEEEe
Confidence 344568999999999999999999999999999999998776554443
No 101
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=91.49 E-value=0.32 Score=43.06 Aligned_cols=27 Identities=30% Similarity=0.629 Sum_probs=23.3
Q ss_pred ecCCCCCCccceEEccCCeEEEEEecc
Q 008087 267 DAAINSGNSGGPAFNDKGKCVGIAFQS 293 (578)
Q Consensus 267 da~i~~G~SGGPlvn~~G~vVGI~~~~ 293 (578)
...-.+|+||-|++|..|+||||+.++
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG 126 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGG 126 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecC
Confidence 456789999999999999999999884
No 102
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=91.28 E-value=0.14 Score=56.27 Aligned_cols=39 Identities=26% Similarity=0.326 Sum_probs=34.6
Q ss_pred cCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCC
Q 008087 350 ADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDG 388 (578)
Q Consensus 350 ~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~ 388 (578)
....|++|.+|.|++.|+. ||+-||.|++|||+...+..
T Consensus 559 EkGfgifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis 598 (1283)
T KOG3542|consen 559 EKGFGIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENIS 598 (1283)
T ss_pred cccceeEEeeecCCchHHHhhhhhhhhhhhccccchhhhh
Confidence 3357899999999999999 99999999999999887754
No 103
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=90.24 E-value=0.35 Score=46.15 Aligned_cols=56 Identities=18% Similarity=0.195 Sum_probs=43.7
Q ss_pred EEEEeecCCcCCCCCCCEEEEeCCeecCC---HHHHHHHHHhcCCCeEEEEEecCeEEE
Q 008087 494 SSLLWCLRSPLCLNCFNKVLAFNGNPVKN---LKSLANMVENCDDEFLKFDLEYDQVVV 549 (578)
Q Consensus 494 vvvs~v~A~~aGl~~GD~I~~VNG~~V~~---~~~l~~~l~~~~~~~v~l~v~R~~~~~ 549 (578)
.|+.++||+++||+.||.|+++....--+ +..+...+++..++.+.+++.|.++.+
T Consensus 145 sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v 203 (231)
T KOG3129|consen 145 SVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKV 203 (231)
T ss_pred ecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEE
Confidence 35557799999999999999987665544 555666777788899999998876544
No 104
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.64 E-value=0.46 Score=54.38 Aligned_cols=61 Identities=20% Similarity=0.306 Sum_probs=47.6
Q ss_pred CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEECCE
Q 008087 352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLRDSK 422 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R~g~ 422 (578)
.-|++|..|.+|++|+. .|+.||.+++|||+.+-...+-+ ..+++. ..|..|.|+|...|.
T Consensus 959 klGIYvKsVV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQEr--------AA~lmt--rtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 959 KLGIYVKSVVEGGAADHDGRLEAGDQLLSVDGHSLIGISQER--------AARLMT--RTGNVVHLEVAKQGA 1021 (1629)
T ss_pred ccceEEEEeccCCccccccccccCceeeeecCcccccccHHH--------HHHHHh--ccCCeEEEehhhhhh
Confidence 56899999999999988 59999999999999888776522 123333 368889999976554
No 105
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=87.61 E-value=0.37 Score=50.14 Aligned_cols=38 Identities=26% Similarity=0.378 Sum_probs=35.0
Q ss_pred CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087 352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~ 389 (578)
..|+.|.+|...||+.. ||++||+|+++||-+|.+.++
T Consensus 219 g~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~d 258 (484)
T KOG2921|consen 219 GEGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSD 258 (484)
T ss_pred CceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHH
Confidence 56899999999999987 999999999999999998766
No 106
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=86.98 E-value=0.32 Score=55.65 Aligned_cols=40 Identities=23% Similarity=0.187 Sum_probs=24.5
Q ss_pred CCEEEEEEeeCC------CC---CCee---eEEcCC-CCCCCCcEEEEeecCC
Q 008087 196 CDIAMLTVEDDE------FW---EGVL---PVEFGE-LPALQDAVTVVGYPIG 235 (578)
Q Consensus 196 ~DlAlLkv~~~~------~~---~~~~---~l~l~~-~~~~g~~V~~iG~p~g 235 (578)
.|++++|+=... +. .++. .+++.. ..+-|+-|+++|||..
T Consensus 200 gDfs~fRvY~~~dg~PA~Ys~dnvP~~p~~~l~is~~G~keGD~vmv~GyPG~ 252 (698)
T PF10459_consen 200 GDFSFFRVYADKDGKPADYSKDNVPYKPKHFLKISLKGVKEGDFVMVAGYPGR 252 (698)
T ss_pred CceEEEEEEeCCCCCccccCcCCCCCCCccccccCCCCCCCCCeEEEccCCCc
Confidence 599999995431 10 1222 233332 3356899999999965
No 107
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=86.31 E-value=4.8 Score=36.41 Aligned_cols=46 Identities=13% Similarity=0.164 Sum_probs=34.3
Q ss_pred EEeecCCcC-CCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087 496 LLWCLRSPL-CLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 496 vs~v~A~~a-Gl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v 542 (578)
+++..|++- ||+.||.+++|||..|. ..+..+++|++..+. ++|.+
T Consensus 123 ipggvadrhgglkrgdqllsvngvsvege~hekavellkaa~gs-vklvv 171 (207)
T KOG3550|consen 123 IPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAAVGS-VKLVV 171 (207)
T ss_pred cCCccccccCcccccceeEeecceeecchhhHHHHHHHHHhcCc-EEEEE
Confidence 334445554 68999999999999997 556688889887554 67766
No 108
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=85.82 E-value=0.72 Score=41.00 Aligned_cols=24 Identities=33% Similarity=0.609 Sum_probs=18.6
Q ss_pred CCCCccceEEccCCeEEEEEeccc
Q 008087 271 NSGNSGGPAFNDKGKCVGIAFQSL 294 (578)
Q Consensus 271 ~~G~SGGPlvn~~G~vVGI~~~~~ 294 (578)
-.|.||||++..+|.+|||-.+..
T Consensus 106 lkGSSGgPiLC~~GH~vG~f~aa~ 129 (148)
T PF02907_consen 106 LKGSSGGPILCPSGHAVGMFRAAV 129 (148)
T ss_dssp HTT-TT-EEEETTSEEEEEEEEEE
T ss_pred EecCCCCcccCCCCCEEEEEEEEE
Confidence 469999999999999999976543
No 109
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=85.64 E-value=0.52 Score=45.50 Aligned_cols=41 Identities=27% Similarity=0.375 Sum_probs=32.3
Q ss_pred ecCCCCCCccceEEccCCeEEEEEeccccccccCCceeeecchh
Q 008087 267 DAAINSGNSGGPAFNDKGKCVGIAFQSLKHEDVENIGYVIPTPV 310 (578)
Q Consensus 267 da~i~~G~SGGPlvn~~G~vVGI~~~~~~~~~~~~~~~aiPi~~ 310 (578)
...+..|+||+|++- +|++||-++..+. +....+|.++++.
T Consensus 174 TGGIvqGMSGSPI~q-dGKLiGAVthvf~--~dp~~Gygi~ie~ 214 (218)
T PF05580_consen 174 TGGIVQGMSGSPIIQ-DGKLIGAVTHVFV--NDPTKGYGIFIEW 214 (218)
T ss_pred hCCEEecccCCCEEE-CCEEEEEEEEEEe--cCCCceeeecHHH
Confidence 346778999999986 8999999887664 4566788888654
No 110
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=85.19 E-value=1.5 Score=46.45 Aligned_cols=57 Identities=28% Similarity=0.432 Sum_probs=43.5
Q ss_pred EEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE---EEEEEEE-CCEEE
Q 008087 357 IRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS---AAVKVLR-DSKIL 424 (578)
Q Consensus 357 V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~---v~l~V~R-~g~~~ 424 (578)
+.++..++++.. ++++||.|+++|++++.+|.++ ...+.. ..+.. +.+.+.| ++...
T Consensus 133 ~~~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~----------~~~~~~-~~~~~~~~~~i~~~~~~~~~~ 194 (375)
T COG0750 133 VGEVAPKSAAALAGLRPGDRIVAVDGEKVASWDDV----------RRLLVA-AAGDVFNLLTILVIRLDGEAH 194 (375)
T ss_pred eeecCCCCHHHHcCCCCCCEEEeECCEEccCHHHH----------HHHHHh-ccCCcccceEEEEEeccceee
Confidence 337889999999 9999999999999999999874 233333 23444 7888889 76664
No 111
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=83.29 E-value=1.8 Score=46.02 Aligned_cols=53 Identities=11% Similarity=0.102 Sum_probs=45.2
Q ss_pred EEeecCCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcCCCe---EEEEEec-CeEE
Q 008087 496 LLWCLRSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCDDEF---LKFDLEY-DQVV 548 (578)
Q Consensus 496 vs~v~A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~---v~l~v~R-~~~~ 548 (578)
...+++..++++.||.|+++|++++.++++..+.+....+.. +.+.+.| ++..
T Consensus 137 ~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~ 193 (375)
T COG0750 137 APKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAAAGDVFNLLTILVIRLDGEA 193 (375)
T ss_pred CCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhccCCcccceEEEEEecccee
Confidence 345689999999999999999999999999999998877766 7888888 5444
No 112
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=83.22 E-value=0.91 Score=49.30 Aligned_cols=21 Identities=24% Similarity=0.055 Sum_probs=19.7
Q ss_pred eecCCcCCCCCCCEEEEeCCe
Q 008087 498 WCLRSPLCLNCFNKVLAFNGN 518 (578)
Q Consensus 498 ~v~A~~aGl~~GD~I~~VNG~ 518 (578)
++||.+|||.+||.|++|||.
T Consensus 472 ~gPA~~AGl~~Gd~ivai~G~ 492 (558)
T COG3975 472 GGPAYKAGLSPGDKIVAINGI 492 (558)
T ss_pred CChhHhccCCCccEEEEEcCc
Confidence 458999999999999999999
No 113
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=80.10 E-value=2.3 Score=48.95 Aligned_cols=48 Identities=23% Similarity=0.283 Sum_probs=37.1
Q ss_pred EEeecCCcCCCCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEec
Q 008087 496 LLWCLRSPLCLNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEY 544 (578)
Q Consensus 496 vs~v~A~~aGl~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R 544 (578)
|+..-.....|++||+|++|||++|+ .|+..+++++++... |.|+|-+
T Consensus 82 VT~GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRace~s-v~ltV~q 131 (1298)
T KOG3552|consen 82 VTEGGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRACESS-VNLTVCQ 131 (1298)
T ss_pred ecCCCCccccccCCCeEEEecCcccccccHHHHHHHHHHHhhh-cceEEec
Confidence 33334555668999999999999998 679999999998654 5566644
No 114
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=78.96 E-value=1.8 Score=42.92 Aligned_cols=56 Identities=21% Similarity=0.294 Sum_probs=40.8
Q ss_pred CCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCC
Q 008087 330 LGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIAND 387 (578)
Q Consensus 330 lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~ 387 (578)
||+.+..- +.--....||.. ..|+.|.+..||+-|+. | |-..|.|++|||.+|...
T Consensus 173 LGFYIRDG-~SVRVtp~Glek-vpGIFISRlVpGGLAeSTGLLaVnDEVlEVNGIEVaGK 230 (358)
T KOG3606|consen 173 LGFYIRDG-TSVRVTPHGLEK-VPGIFISRLVPGGLAESTGLLAVNDEVLEVNGIEVAGK 230 (358)
T ss_pred ceEEEecC-ceEEeccccccc-cCceEEEeecCCccccccceeeecceeEEEcCEEeccc
Confidence 66665543 111112246655 67999999999999999 6 677999999999999864
No 115
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=78.79 E-value=3.4 Score=41.65 Aligned_cols=55 Identities=15% Similarity=0.244 Sum_probs=41.3
Q ss_pred CcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087 354 GVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL 418 (578)
Q Consensus 354 Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~ 418 (578)
-++|..|-.++||++ .++.||.|++|||..|.....+. ...++... -+.|++++.
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~GDEi~avNg~svKGktKve--------VAkmIQ~~--~~eV~IhyN 87 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVE--------VAKMIQVS--LNEVKIHYN 87 (429)
T ss_pred eEEEEEeccCCchhccCccccCCeeEEecceeecCccHHH--------HHHHHHHh--ccceEEEeh
Confidence 478889999999999 49999999999999999876543 23444442 245666664
No 116
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=76.90 E-value=9.2 Score=43.54 Aligned_cols=11 Identities=27% Similarity=0.087 Sum_probs=6.4
Q ss_pred CCCCCCccccc
Q 008087 12 PKIPDAEKTLD 22 (578)
Q Consensus 12 ~~~~~~~~~~~ 22 (578)
+||..-++|..
T Consensus 502 ~Ki~~l~ae~~ 512 (1102)
T KOG1924|consen 502 EKIKLLEAEKQ 512 (1102)
T ss_pred hhcccCchhhh
Confidence 66666555543
No 117
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=76.08 E-value=2.8 Score=45.15 Aligned_cols=38 Identities=16% Similarity=0.338 Sum_probs=31.9
Q ss_pred CCCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCC
Q 008087 352 QKGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~ 389 (578)
..|++|+.|.+++..+. .+.+||.||.||.....++..
T Consensus 276 DggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSN 315 (626)
T KOG3571|consen 276 DGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSN 315 (626)
T ss_pred CCceEEeeeccCceeeccCccCccceEEEeeecchhhcCc
Confidence 46899999999887666 599999999999987776543
No 118
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=75.32 E-value=12 Score=32.18 Aligned_cols=53 Identities=15% Similarity=0.271 Sum_probs=35.8
Q ss_pred EEEEcCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCC
Q 008087 153 GFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGE 219 (578)
Q Consensus 153 GfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~ 219 (578)
++-|++|..+|+.||.+.++.+ ++..+ +++. ...|+++++.... .++.+.+++
T Consensus 3 avHIGnG~~vt~tHva~~~~~v------~g~~f--~~~~--~~ge~~~v~~~~~----~~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIGNGRYVTVTHVAKSSDSV------DGQPF--KIVK--TDGELCWVQSPLV----HLPAAQIGT 55 (105)
T ss_pred eEEeCCCEEEEEEEEeccCceE------cCcCc--EEEE--eccCEEEEECCCC----CCCeeEecc
Confidence 5667899999999999877653 22222 2333 4569999998876 356666654
No 119
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=74.86 E-value=2.8 Score=43.74 Aligned_cols=55 Identities=22% Similarity=0.254 Sum_probs=40.7
Q ss_pred CCcEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc-cCCCCEEEEEEE
Q 008087 353 KGVRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ-KYTGDSAAVKVL 418 (578)
Q Consensus 353 ~Gv~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~-~~~g~~v~l~V~ 418 (578)
--++|..+-+|-.|.+ .|..||.|++|||..+.+... .+.+.. +..|+.|.++|.
T Consensus 110 MPIlISKIFkGlAADQt~aL~~gDaIlSVNG~dL~~AtH-----------deAVqaLKraGkeV~levK 167 (506)
T KOG3551|consen 110 MPILISKIFKGLAADQTGALFLGDAILSVNGEDLRDATH-----------DEAVQALKRAGKEVLLEVK 167 (506)
T ss_pred CceehhHhccccccccccceeeccEEEEecchhhhhcch-----------HHHHHHHHhhCceeeeeee
Confidence 3578889999888888 699999999999999887654 133332 346787766554
No 120
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=74.13 E-value=3 Score=49.22 Aligned_cols=35 Identities=20% Similarity=0.247 Sum_probs=31.3
Q ss_pred cEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCC
Q 008087 355 VRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGT 389 (578)
Q Consensus 355 v~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~ 389 (578)
-.|..|.++|||.. ||++||.|+.|||++|.....
T Consensus 660 h~v~sv~egsPA~~agls~~DlIthvnge~v~gl~H 695 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGLSAGDLITHVNGEPVHGLVH 695 (1205)
T ss_pred eeeeeecCCCCccccCCCccceeEeccCcccchhhH
Confidence 56889999999988 999999999999999987643
No 121
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=72.28 E-value=4.3 Score=44.30 Aligned_cols=49 Identities=16% Similarity=0.226 Sum_probs=40.5
Q ss_pred EEEEeecCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEe
Q 008087 494 SSLLWCLRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLE 543 (578)
Q Consensus 494 vvvs~v~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~ 543 (578)
-++.+..+++.|+ ..||.|.+|||..|. +..++.+++.+.. +.++|.+.
T Consensus 152 RI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G~itfkii 203 (542)
T KOG0609|consen 152 RIMHGGMADRQGLLHVGDEILEVNGISVANKSPEELQELLRNSR-GSITFKII 203 (542)
T ss_pred eeccCCcchhccceeeccchheecCeecccCCHHHHHHHHHhCC-CcEEEEEc
Confidence 3455667888887 699999999999998 5799999999987 67788774
No 122
>KOG1924 consensus RhoA GTPase effector DIA/Diaphanous [Signal transduction mechanisms; Cytoskeleton]
Probab=72.22 E-value=13 Score=42.52 Aligned_cols=10 Identities=20% Similarity=0.431 Sum_probs=5.2
Q ss_pred HHHHHHHHHh
Q 008087 523 LKSLANMVEN 532 (578)
Q Consensus 523 ~~~l~~~l~~ 532 (578)
++.|.+.|++
T Consensus 1043 mDslLeaLqs 1052 (1102)
T KOG1924|consen 1043 MDSLLEALQS 1052 (1102)
T ss_pred HHHHHHHHHh
Confidence 3445555554
No 123
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=72.21 E-value=6.5 Score=41.30 Aligned_cols=43 Identities=12% Similarity=0.131 Sum_probs=34.8
Q ss_pred ceEEEEeec-----CCcCCCCCCCEEEEeCCeecCCHHHHHHHHHhcC
Q 008087 492 QMSSLLWCL-----RSPLCLNCFNKVLAFNGNPVKNLKSLANMVENCD 534 (578)
Q Consensus 492 ~~vvvs~v~-----A~~aGl~~GD~I~~VNG~~V~~~~~l~~~l~~~~ 534 (578)
.+|.+.+++ -..-||.+||+|+++||-+|.+.+|+.+.++.+.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~vgdvitsldgcpV~~v~dW~ecl~tsl 267 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSVGDVITSLDGCPVHKVSDWLECLATSL 267 (484)
T ss_pred ceEEEEeccccCCCcCcccCCccceEEecCCcccCCHHHHHHHHHhhc
Confidence 467777763 2233899999999999999999999999998743
No 124
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=71.28 E-value=9.4 Score=38.05 Aligned_cols=46 Identities=22% Similarity=0.229 Sum_probs=35.9
Q ss_pred EEeecCCcCCCC-CCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087 496 LLWCLRSPLCLN-CFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 496 vs~v~A~~aGl~-~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v 542 (578)
+.+.+|+.-||- .+|.|++|||..|. ++++..++|-++.-. +-++|
T Consensus 202 VpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANshN-LIiTV 250 (358)
T KOG3606|consen 202 VPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHN-LIITV 250 (358)
T ss_pred cCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcccc-eEEEe
Confidence 345678888985 89999999999996 999999998876433 33444
No 125
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=70.53 E-value=15 Score=42.80 Aligned_cols=63 Identities=16% Similarity=0.248 Sum_probs=35.7
Q ss_pred EEEEEEcCCEEEecccccCCCceEEEEEcC-CCcEEEEEEEEEcC--CCCEEEEEEeeCCCCCCeeeEEcCC
Q 008087 151 SSGFAIGGRRVLTNAHSVEHYTQVKLKKRG-SDTKYLATVLAIGT--ECDIAMLTVEDDEFWEGVLPVEFGE 219 (578)
Q Consensus 151 GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~-~g~~~~a~vv~~d~--~~DlAlLkv~~~~~~~~~~~l~l~~ 219 (578)
|...+|++.||+|.+|+..+... |.|.. ....| +++..+. ..|+.+-|++.-. ..+.|+++..
T Consensus 67 G~aTLigpqYiVSV~HN~~gy~~--v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~V--TEvaP~~~t~ 132 (769)
T PF02395_consen 67 GVATLIGPQYIVSVKHNGKGYNS--VSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKFV--TEVAPAEMTT 132 (769)
T ss_dssp SS-EEEETTEEEBETTG-TSCCE--ECESCSSTCEE--EEEEEEBETTSTEBEEEESS-----SS----BBS
T ss_pred ceEEEecCCeEEEEEccCCCcCc--eeecccCCceE--EEEEccCCCCcccceeecCceE--EEEecccccc
Confidence 66889999999999999854443 45532 22333 4555543 3699999998642 2455655543
No 126
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=70.25 E-value=3.4 Score=44.03 Aligned_cols=24 Identities=33% Similarity=0.588 Sum_probs=21.4
Q ss_pred CCCCCCccceEEccCCeEEEEEec
Q 008087 269 AINSGNSGGPAFNDKGKCVGIAFQ 292 (578)
Q Consensus 269 ~i~~G~SGGPlvn~~G~vVGI~~~ 292 (578)
....|.||+.|+|.+|++|||.++
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~g 374 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYFG 374 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeCC
Confidence 566899999999999999999764
No 127
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=68.08 E-value=8.2 Score=39.77 Aligned_cols=49 Identities=12% Similarity=0.090 Sum_probs=37.9
Q ss_pred EEEEee----cCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEe
Q 008087 494 SSLLWC----LRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLE 543 (578)
Q Consensus 494 vvvs~v----~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~ 543 (578)
|+++.. .|+..|+ -.||-|++|||.-|. ..++.+++|++. ++.|+|+|.
T Consensus 82 vviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRNA-GdeVtlTV~ 137 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRNA-GDEVTLTVK 137 (505)
T ss_pred EEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHhc-CCEEEEEeH
Confidence 677765 4555565 499999999999997 568899999975 666777773
No 128
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=66.48 E-value=7 Score=40.26 Aligned_cols=55 Identities=20% Similarity=0.230 Sum_probs=41.3
Q ss_pred CcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087 354 GVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL 418 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~ 418 (578)
-++|..+..+-.|+. | |-.||-|+.|||..|..-..-. .-.++ .+.|+.|+|+|.
T Consensus 81 PvviSkI~kdQaAd~tG~LFvGDAilqvNGi~v~~c~Hee--------vV~iL--RNAGdeVtlTV~ 137 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQLFVGDAILQVNGIYVTACPHEE--------VVNIL--RNAGDEVTLTVK 137 (505)
T ss_pred cEEeehhhhhhhhhhcCceEeeeeeEEeccEEeecCChHH--------HHHHH--HhcCCEEEEEeH
Confidence 478888988888887 5 8899999999999998765411 11233 347899998885
No 129
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=65.76 E-value=9.2 Score=45.39 Aligned_cols=42 Identities=14% Similarity=0.139 Sum_probs=34.6
Q ss_pred EEEeecCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhcCCC
Q 008087 495 SLLWCLRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENCDDE 536 (578)
Q Consensus 495 vvs~v~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~~~~ 536 (578)
|..+++|+.+|++++|.|+.|||++|..+ .++.+++-+..++
T Consensus 665 v~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~~gn~ 708 (1205)
T KOG0606|consen 665 VEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLKSGNK 708 (1205)
T ss_pred ecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHhcCCe
Confidence 55567999999999999999999999955 6688888765444
No 130
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=65.74 E-value=8.3 Score=42.16 Aligned_cols=56 Identities=23% Similarity=0.307 Sum_probs=42.1
Q ss_pred CcEEEEeCCCCcccC-C-CCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEEE
Q 008087 354 GVRIRRVDPTAPESE-V-LKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVLR 419 (578)
Q Consensus 354 Gv~V~~V~p~spA~~-G-L~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~R 419 (578)
-++|..+..|+.+.+ | |+.||.|+.|||..|.+..- . .++.++.... ..++++|.-
T Consensus 147 ~~~vARI~~GG~~~r~glL~~GD~i~EvNGi~v~~~~~-~-------e~q~~l~~~~--G~itfkiiP 204 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHVGDEILEVNGISVANKSP-E-------ELQELLRNSR--GSITFKIIP 204 (542)
T ss_pred ccEEeeeccCCcchhccceeeccchheecCeecccCCH-H-------HHHHHHHhCC--CcEEEEEcc
Confidence 488999999998888 5 89999999999999998532 1 1455665544 457777753
No 131
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=63.67 E-value=11 Score=37.45 Aligned_cols=38 Identities=18% Similarity=0.325 Sum_probs=28.5
Q ss_pred CCCCCCEEEEeCCeecCCHH--HHHHHHHhcC-CCeEEEEE
Q 008087 505 CLNCFNKVLAFNGNPVKNLK--SLANMVENCD-DEFLKFDL 542 (578)
Q Consensus 505 Gl~~GD~I~~VNG~~V~~~~--~l~~~l~~~~-~~~v~l~v 542 (578)
.+..||.|.+|||+.|-.+. ++.++|+..+ ++..++.+
T Consensus 167 ~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrL 207 (334)
T KOG3938|consen 167 AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRL 207 (334)
T ss_pred heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEe
Confidence 35799999999999998664 4678888874 45555444
No 132
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=62.06 E-value=7.5 Score=43.42 Aligned_cols=51 Identities=14% Similarity=0.186 Sum_probs=37.5
Q ss_pred eEEEEee----cCCcCCCCCCCEEEEeCCeecCCHHH--HHHHHHhcCCCeEEEEEecC
Q 008087 493 MSSLLWC----LRSPLCLNCFNKVLAFNGNPVKNLKS--LANMVENCDDEFLKFDLEYD 545 (578)
Q Consensus 493 ~vvvs~v----~A~~aGl~~GD~I~~VNG~~V~~~~~--l~~~l~~~~~~~v~l~v~R~ 545 (578)
++++.++ -|++.||+.||.|++||||..+++.. ..++|.+ +-.+.|++.-+
T Consensus 563 gifV~~V~pgskAa~~GlKRgDqilEVNgQnfenis~~KA~eiLrn--nthLtltvKtN 619 (1283)
T KOG3542|consen 563 GIFVAEVFPGSKAAREGLKRGDQILEVNGQNFENISAKKAEEILRN--NTHLTLTVKTN 619 (1283)
T ss_pred eeEEeeecCCchHHHhhhhhhhhhhhccccchhhhhHHHHHHHhcC--CceEEEEEecc
Confidence 5676665 58899999999999999999997754 3445543 34667777544
No 133
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=57.62 E-value=6.4 Score=39.14 Aligned_cols=56 Identities=13% Similarity=0.237 Sum_probs=44.8
Q ss_pred cEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCEEEEEEE
Q 008087 355 VRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDSAAVKVL 418 (578)
Q Consensus 355 v~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~v~l~V~ 418 (578)
..|..+.++|.... -++.||.|-+|||+.|-.+.... ...++.....|++.+|.+.
T Consensus 151 AFIKrIkegsvidri~~i~VGd~IEaiNge~ivG~RHYe--------VArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIEAICVGDHIEAINGESIVGKRHYE--------VARMLKELPRGETFTLRLI 208 (334)
T ss_pred eeeEeecCCchhhhhhheeHHhHHHhhcCccccchhHHH--------HHHHHHhcccCCeeEEEee
Confidence 57888899998877 79999999999999999887643 3466777777887777665
No 134
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=56.36 E-value=11 Score=39.53 Aligned_cols=54 Identities=7% Similarity=0.121 Sum_probs=38.7
Q ss_pred ecCCcCC-CCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE--ecCeEEEEech
Q 008087 499 CLRSPLC-LNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL--EYDQVVVLRTK 553 (578)
Q Consensus 499 v~A~~aG-l~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v--~R~~~~~l~~~ 553 (578)
..|++.+ |..||.|++|||.... +.++.+++|+.. ++.|.++| .|+-..++.+.
T Consensus 121 lAADQt~aL~~gDaIlSVNG~dL~~AtHdeAVqaLKra-GkeV~levKy~REvtPy~kk~ 179 (506)
T KOG3551|consen 121 LAADQTGALFLGDAILSVNGEDLRDATHDEAVQALKRA-GKEVLLEVKYMREVTPYFKKE 179 (506)
T ss_pred cccccccceeeccEEEEecchhhhhcchHHHHHHHHhh-CceeeeeeeeehhcchhhccC
Confidence 3466655 4699999999999997 668889999875 56565555 56655566543
No 135
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=55.76 E-value=25 Score=35.65 Aligned_cols=47 Identities=21% Similarity=0.261 Sum_probs=35.6
Q ss_pred EEEeecCCcCC-CCCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEE
Q 008087 495 SLLWCLRSPLC-LNCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDL 542 (578)
Q Consensus 495 vvs~v~A~~aG-l~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v 542 (578)
|..+.||++-| ++.||.|++|||..|+ +--+..++|+...++ |++.+
T Consensus 37 vFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~~~e-V~Ihy 86 (429)
T KOG3651|consen 37 VFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVSLNE-VKIHY 86 (429)
T ss_pred eccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHhccc-eEEEe
Confidence 34455777666 5899999999999998 567788899887655 55554
No 136
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=54.89 E-value=11 Score=40.18 Aligned_cols=50 Identities=12% Similarity=0.180 Sum_probs=41.1
Q ss_pred EEEeecCCcCCCC-CCCEEEEeCCeecC-CHHHHHHHHHhcCCCeEEEEEecC
Q 008087 495 SLLWCLRSPLCLN-CFNKVLAFNGNPVK-NLKSLANMVENCDDEFLKFDLEYD 545 (578)
Q Consensus 495 vvs~v~A~~aGl~-~GD~I~~VNG~~V~-~~~~l~~~l~~~~~~~v~l~v~R~ 545 (578)
|..+++|+++||. --|.|++|||..++ +-+.|...++++..+ |++++-.-
T Consensus 22 VqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~sek-Vkltv~n~ 73 (462)
T KOG3834|consen 22 VQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSEK-VKLTVYNS 73 (462)
T ss_pred eecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhcccc-eEEEEEec
Confidence 5567899999997 68999999999998 556788888888777 88887443
No 137
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionary related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belong to the MEROPS peptidase family C37, (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=53.25 E-value=1.4e+02 Score=32.01 Aligned_cols=136 Identities=16% Similarity=0.214 Sum_probs=69.3
Q ss_pred eEEEEEEEcCCEEEecccccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEE
Q 008087 149 SSSSGFAIGGRRVLTNAHSVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVT 228 (578)
Q Consensus 149 ~~GsGfvI~~g~ILT~aHvV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~ 228 (578)
++|=||-|++.+.+|+-||+.....- ++ | .+-.-+.++..-+|.-+++..+- ..++.-+-|.+...-|.-+.
T Consensus 379 GsGWGfWVS~~lfITttHViP~g~~E---~F--G--v~i~~i~vh~sGeF~~~rFpk~i-RPDvtgmiLEeGapEGtV~s 450 (535)
T PF05416_consen 379 GSGWGFWVSPTLFITTTHVIPPGAKE---AF--G--VPISQIQVHKSGEFCRFRFPKPI-RPDVTGMILEEGAPEGTVCS 450 (535)
T ss_dssp TTEEEEESSSSEEEEEGGGS-STTSE---ET--T--EECGGEEEEEETTEEEEEESS-S-STTS---EE-SS--TT-EEE
T ss_pred CCceeeeecceEEEEeeeecCCcchh---hh--C--CChhHeEEeeccceEEEecCCCC-CCCccceeeccCCCCceEEE
Confidence 46889999999999999999854210 00 0 01111344555677777777642 12455555544333444333
Q ss_pred -EEeecCCCC-cceEEEeEEeceeeeee-cCCceeeeEEEE-------ecCCCCCCccceEEccCC---eEEEEEeccc
Q 008087 229 -VVGYPIGGD-TISVTSGVVSRIEILSY-VHGSTELLGLQI-------DAAINSGNSGGPAFNDKG---KCVGIAFQSL 294 (578)
Q Consensus 229 -~iG~p~g~~-~~sv~~G~Is~~~~~~~-~~~~~~~~~i~~-------da~i~~G~SGGPlvn~~G---~vVGI~~~~~ 294 (578)
.|=++.|.. .+.+..|.......... ..+ ...++.+ |-...||+-|-|-|-..| -|+|++.+..
T Consensus 451 iLiKR~sGEllpLAvRMgt~AsmkIqgr~v~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAt 527 (535)
T PF05416_consen 451 ILIKRPSGELLPLAVRMGTHASMKIQGRTVHG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAAT 527 (535)
T ss_dssp EEEE-TTSBEEEEEEEEEEEEEEEETTEEEEE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-
T ss_pred EEEEcCCccchhhhhhhccceeEEEcceeecc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhc
Confidence 345565522 23566776666543211 111 1123333 335668999999996654 6999998854
No 138
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=49.27 E-value=30 Score=37.63 Aligned_cols=53 Identities=8% Similarity=-0.026 Sum_probs=37.6
Q ss_pred CCceEEEEee-----cCCcCCCCCCCEEEEeCCeecCCH--HHHHHHHHhc--CCCeEEEEE
Q 008087 490 NCQMSSLLWC-----LRSPLCLNCFNKVLAFNGNPVKNL--KSLANMVENC--DDEFLKFDL 542 (578)
Q Consensus 490 ~~~~vvvs~v-----~A~~aGl~~GD~I~~VNG~~V~~~--~~l~~~l~~~--~~~~v~l~v 542 (578)
.+++++|..+ -|...-+.+||.|+.||.....++ ++.+++|+.. +...++|++
T Consensus 275 gDggIYVgsImkgGAVA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~gPi~ltv 336 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPGPIKLTV 336 (626)
T ss_pred CCCceEEeeeccCceeeccCccCccceEEEeeecchhhcCchHHHHHHHHHhccCCCeEEEE
Confidence 3466777654 255555789999999999999866 6777777764 334566665
No 139
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.82 E-value=46 Score=27.61 Aligned_cols=37 Identities=27% Similarity=0.442 Sum_probs=30.3
Q ss_pred ccCCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 167 SVEHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 167 vV~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
++.....+.|.+ .+++.+.+++.++|.+.+|.|=...
T Consensus 10 ~~~~~~~V~V~l-r~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINC-RNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEE-cCCCEEEEEEEEecCccEEEEcceE
Confidence 344567889999 5999999999999999999876553
No 140
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.88 E-value=75 Score=23.92 Aligned_cols=33 Identities=12% Similarity=0.200 Sum_probs=27.8
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
..+.|.+ .+|+.|.+.+..+|...++.|-....
T Consensus 7 ~~V~V~l-~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVEL-KDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEE-CCCcEEEEEEEEECCCCCEEECCEEE
Confidence 4688888 59999999999999999998776543
No 141
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.53 E-value=82 Score=24.49 Aligned_cols=32 Identities=22% Similarity=0.256 Sum_probs=27.1
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
..+.|.+ .+|+.|.+++..+|...+|-|=...
T Consensus 11 ~~V~V~L-k~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKL-NSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEE-CCCCEEEEEEEEEccceeeEEeeEE
Confidence 4688888 5999999999999999999886553
No 142
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=35.78 E-value=91 Score=24.67 Aligned_cols=33 Identities=6% Similarity=0.189 Sum_probs=28.0
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
..+.|.+ .+|+.|.+++.++|...++-|=....
T Consensus 15 k~V~V~l-k~g~~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 15 SPVLVRL-KGGREFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred CEEEEEE-CCCCEEEEEEEEEcccceeEEeeEEE
Confidence 4688888 59999999999999999998876543
No 143
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=35.33 E-value=82 Score=24.60 Aligned_cols=32 Identities=16% Similarity=0.266 Sum_probs=27.0
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
..+.|.+ .+|+.|.+++..+|...+|.|=...
T Consensus 12 ~~V~V~L-k~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKL-KWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEE-CCCcEEEEEEEEECCCEEEEEeeEE
Confidence 4688888 6999999999999999999875543
No 144
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=34.57 E-value=99 Score=24.03 Aligned_cols=33 Identities=9% Similarity=0.133 Sum_probs=28.4
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
..+.|.+ .+|+.+.+++..+|...+|.|-....
T Consensus 11 ~~V~V~l-~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKL-KGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEE-CCCCEEEEEEEEECCcceEEEeeEEE
Confidence 5688888 58999999999999999998877653
No 145
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=33.05 E-value=1e+02 Score=24.67 Aligned_cols=31 Identities=10% Similarity=0.338 Sum_probs=26.3
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV 203 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv 203 (578)
..+.|.+ .||+.+.+++.++|...+|.|=..
T Consensus 11 ~~v~V~l-~dgR~~~G~l~~~D~~~NivL~~~ 41 (75)
T cd06168 11 RTMRIHM-TDGRTLVGVFLCTDRDCNIILGSA 41 (75)
T ss_pred CeEEEEE-cCCeEEEEEEEEEcCCCcEEecCc
Confidence 4688888 699999999999999999876544
No 146
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=32.86 E-value=96 Score=24.99 Aligned_cols=32 Identities=9% Similarity=0.301 Sum_probs=26.8
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
..+.|.+ .+|+.+.+.+.++|...+|.|=...
T Consensus 11 ~~V~V~l-~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTL-QDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEE-CCCcEEEEEEEEEcCccCEEcCCEE
Confidence 4678888 6999999999999999999765443
No 147
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=32.70 E-value=1e+02 Score=25.03 Aligned_cols=48 Identities=23% Similarity=0.283 Sum_probs=32.1
Q ss_pred EEEEEEEEcCCCCEEEEEEeeCCCCCCeeeEEcCCCCCCCCcEEE-EeecC
Q 008087 185 YLATVLAIGTECDIAMLTVEDDEFWEGVLPVEFGELPALQDAVTV-VGYPI 234 (578)
Q Consensus 185 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~l~~~~~~g~~V~~-iG~p~ 234 (578)
++++++..|...++|++.+-.-. ..+.---+....++|++|.+ +||-.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvk--reV~l~Lv~~~v~~GdyVLVHvGfAi 53 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVK--REVNLDLVGEEVKVGDYVLVHVGFAM 53 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEe--EEEEeeeecCccccCCEEEEEeeEEE
Confidence 57899999988889999887532 11222222336788999876 67654
No 148
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=32.16 E-value=1.6e+02 Score=22.66 Aligned_cols=34 Identities=15% Similarity=0.190 Sum_probs=27.5
Q ss_pred CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
+..+.++. -.|..++++|+.+|....+.+|+-..
T Consensus 6 Gs~V~~kT-c~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRT-CFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEe-cCCceEEEEEEEecCCCcEEEEECcc
Confidence 44566666 37899999999999999999998554
No 149
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=32.04 E-value=89 Score=25.41 Aligned_cols=31 Identities=16% Similarity=0.164 Sum_probs=26.1
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV 203 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv 203 (578)
..+.|.+ .+|+.+.+++.++|.+.+|.|=..
T Consensus 12 k~V~V~l-~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKL-RGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEE-CCCCEEEEEEEEEccceEEeccce
Confidence 4688888 599999999999999999876443
No 150
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=30.53 E-value=52 Score=23.86 Aligned_cols=21 Identities=38% Similarity=0.555 Sum_probs=17.6
Q ss_pred CCCccceEEccCCeEEEEEec
Q 008087 272 SGNSGGPAFNDKGKCVGIAFQ 292 (578)
Q Consensus 272 ~G~SGGPlvn~~G~vVGI~~~ 292 (578)
.+.+.-|++|.+|+++|+.+.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 356678999999999999875
No 151
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.15 E-value=1.2e+02 Score=24.66 Aligned_cols=31 Identities=3% Similarity=0.070 Sum_probs=26.2
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV 203 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv 203 (578)
..+.|.+ .+|+.+.+++.++|...+|.|=..
T Consensus 13 k~V~V~l-~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKF-QGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEE-CCCcEEEEEEEEEcCcccEEecCE
Confidence 4678888 589999999999999999877544
No 152
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.02 E-value=1.1e+02 Score=24.65 Aligned_cols=31 Identities=16% Similarity=0.348 Sum_probs=26.4
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV 203 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv 203 (578)
..+.|.+ .+++.+.+++.++|...++.|=..
T Consensus 14 ~~V~V~l-~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVM-KSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEE-CCCeEEEEEEEEeccceEEEEccE
Confidence 5688888 699999999999999999987544
No 153
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=29.41 E-value=1.3e+02 Score=23.60 Aligned_cols=33 Identities=9% Similarity=0.113 Sum_probs=28.6
Q ss_pred CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
...+.|.+ .+|..|.+++..+|...++.|-...
T Consensus 10 g~~V~VeL-k~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 10 GHIVTVEL-KTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CCEEEEEE-CCCcEEEEEEEEEcCCceeEEEEEE
Confidence 45688888 5899999999999999999988774
No 154
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=28.72 E-value=29 Score=38.77 Aligned_cols=34 Identities=9% Similarity=0.085 Sum_probs=30.8
Q ss_pred cEEEEeCCCCcccC--CCCCCCEEEEECCEEeCCCC
Q 008087 355 VRIRRVDPTAPESE--VLKPSDIILSFDGIDIANDG 388 (578)
Q Consensus 355 v~V~~V~p~spA~~--GL~~GDiIl~InG~~V~~~~ 388 (578)
.+|.++.++|||.. .|..||.|+.||++.|-.|+
T Consensus 227 h~~s~~~e~Spad~~~kI~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 227 HVTSKIFEQSPADYRQKILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred eeccccccCChHHHhhcccCccceeeecccccccch
Confidence 56778899999988 79999999999999999996
No 155
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=28.29 E-value=1.4e+02 Score=23.61 Aligned_cols=31 Identities=10% Similarity=0.129 Sum_probs=26.1
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEE
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTV 203 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv 203 (578)
..+.|.+ .+|+.+.+++.++|...+|.|=..
T Consensus 11 k~V~V~L-~~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKL-NGNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEE-CCCeEEEEEEEEEcccccEEeccE
Confidence 4678888 599999999999999999877544
No 156
>KOG4371 consensus Membrane-associated protein tyrosine phosphatase PTP-BAS and related proteins, contain FERM domain [Signal transduction mechanisms]
Probab=28.10 E-value=1.9e+02 Score=34.70 Aligned_cols=155 Identities=16% Similarity=0.154 Sum_probs=0.0
Q ss_pred ccCCcccccccChhhhhhhccccCCCCcEEEEeCCCCcccC-CCCCCCEEEEECCEEeCCCCCCccccCcchhHHHHHhc
Q 008087 328 PLLGVEWQKMENPDLRVAMSMKADQKGVRIRRVDPTAPESE-VLKPSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQ 406 (578)
Q Consensus 328 ~~lGi~~~~~~~~~~~~~lgl~~~~~Gv~V~~V~p~spA~~-GL~~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~ 406 (578)
|.||++...+ ...+-+....-...-.. -|+.||+++.+||..+...-. ....-..
T Consensus 1158 ~~l~~~~a~~--------------~~~~~~~~~~~~~~~~~pd~~~g~~l~~~n~i~~~~~~~----------~~~~~~~ 1213 (1332)
T KOG4371|consen 1158 GSLGVQIASL--------------SGRVCIKQLTSEPAISHPDIRVGDVLLYVNGIAVEGKVH----------QEVVAML 1213 (1332)
T ss_pred CCCCceeccC--------------ccceehhhcccCCCCCCCCcchhhhhhhccceeeechhh----------HHHHHHH
Q ss_pred cCCCCEEEEEEEE---------------CCEEEEEEEEecccccccCCCCCCCCCCceeeccEEEeehHHHHHHhchhhh
Q 008087 407 KYTGDSAAVKVLR---------------DSKILNFNITLATHRRLIPSHNKGRPPSYYIIAGFVFSRCLYLISVLSMERI 471 (578)
Q Consensus 407 ~~~g~~v~l~V~R---------------~g~~~~~~v~l~~~~~~~~~~~~~~~p~~~~~~Gl~~~~~p~~~~~~g~~~~ 471 (578)
...|+.+.|-|+| +...+.-...+........-....+.|+--++.|
T Consensus 1214 ~~~~~~~~~~~~r~~~~~~d~~~~s~~~~~~~l~~~~~~~~p~~~~~~~~~~~~~s~~~~~~------------------ 1275 (1332)
T KOG4371|consen 1214 RGGGDRVVLGVQRPPPAYSDQHHASSTSASAPLISVMLLKKPMATLGLSLAKRTMSDGIFIR------------------ 1275 (1332)
T ss_pred hccCceEEEEeecCCcccccchhhhhhcccchhhhheeeecccccccccccccCcCCceeee------------------
Q ss_pred hhhhhccccccccccccCCCceEEEEeecCCcCCC-CCCCEEEEeCCeecC--CHHHHHHHHHhcCCCeEEEEEecCeE
Q 008087 472 MNMKLRSSFWTSSCIQCHNCQMSSLLWCLRSPLCL-NCFNKVLAFNGNPVK--NLKSLANMVENCDDEFLKFDLEYDQV 547 (578)
Q Consensus 472 ~~~~~l~~~~~~~~~~~~~~~~vvvs~v~A~~aGl-~~GD~I~~VNG~~V~--~~~~l~~~l~~~~~~~v~l~v~R~~~ 547 (578)
++.++..|.-.|- +.||.++..+|+++. ......+.++ .--+.+.+.+.|+++
T Consensus 1276 ----------------------~~~~~~~a~~~~~~r~g~~~~~~~~~~~~~~~p~~~l~~~~-~v~~p~~~~~~~~q~ 1331 (1332)
T KOG4371|consen 1276 ----------------------NIAQDSAASSEGTLRVGDRLVSLDGEPVDGFTPATILEKLK-LVQGPVQITVTREQT 1331 (1332)
T ss_pred ----------------------cccccccccccccccccceeeccCCccCCCCChHHHHHHhh-hccCchhheehhhhc
No 157
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=27.96 E-value=1.3e+02 Score=24.28 Aligned_cols=48 Identities=27% Similarity=0.386 Sum_probs=31.4
Q ss_pred CCCEEEEECCEEeCCCCCCccccCcchhHHHHHhccCCCCE----EEEEEEECCEEEEEEE
Q 008087 372 PSDIILSFDGIDIANDGTVPFRHGERIGFSYLVSQKYTGDS----AAVKVLRDSKILNFNI 428 (578)
Q Consensus 372 ~GDiIl~InG~~V~~~~~v~~~~~~~~~~~~~~~~~~~g~~----v~l~V~R~g~~~~~~v 428 (578)
|-|-.+.+||++..+.+..+- ..-..+..|.. +..++.|||+..+.+-
T Consensus 10 PadAkl~v~G~~t~~~G~~R~---------F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~ 61 (75)
T TIGR03000 10 PADAKLKVDGKETNGTGTVRT---------FTTPPLEAGKEYEYTVTAEYDRDGRILTRTR 61 (75)
T ss_pred CCCCEEEECCeEcccCccEEE---------EECCCCCCCCEEEEEEEEEEecCCcEEEEEE
Confidence 468889999999999887531 11223445554 5566678998766543
No 158
>COG1582 FlgEa Uncharacterized protein, possibly involved in motility [Cell motility and secretion]
Probab=27.32 E-value=1.9e+02 Score=22.48 Aligned_cols=52 Identities=13% Similarity=0.114 Sum_probs=39.5
Q ss_pred EEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHH
Q 008087 511 KVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILAT 565 (578)
Q Consensus 511 ~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~ 565 (578)
.++..||++.-=-.++.+.++..|+..++| -+|.-+...+.+++..++|..-
T Consensus 3 ~vtrlNG~~~~lN~~~IE~ie~~PDttItL---inGkkyvVkEsveEVi~kI~~y 54 (67)
T COG1582 3 KVTRLNGREFWLNAHHIETIEAFPDTTITL---INGKKYVVKESVEEVINKIIEY 54 (67)
T ss_pred EEEEecCcceeeCHHHhhhhhccCCcEEEE---EcCcEEEEcccHHHHHHHHHHH
Confidence 467899999876688899999999887654 3566666678888888877653
No 159
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=27.29 E-value=87 Score=30.53 Aligned_cols=54 Identities=26% Similarity=0.441 Sum_probs=37.9
Q ss_pred EEEEecCCCCCCccceEEccC----CeEEEEEeccccccccCCceeeecc--hhHHHHHHHHH
Q 008087 263 GLQIDAAINSGNSGGPAFNDK----GKCVGIAFQSLKHEDVENIGYVIPT--PVIMHFIQDYE 319 (578)
Q Consensus 263 ~i~~da~i~~G~SGGPlvn~~----G~vVGI~~~~~~~~~~~~~~~aiPi--~~i~~~l~~l~ 319 (578)
.+........|+=|||++-.+ -++|||+.++. ...+.+||-++ ..+++.+..|+
T Consensus 170 gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~---~~~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 170 GLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGS---ANHAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeEECCCcCCCccceeeEcchhhhhhhheeeeccc---ccccceehhhhhHHHHHHHHHhhc
Confidence 355667788899999998432 58999998844 33467787665 46666666554
No 160
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=26.58 E-value=1.9e+02 Score=27.57 Aligned_cols=19 Identities=0% Similarity=-0.178 Sum_probs=16.1
Q ss_pred eecCCcCCCCCCCEEEEeC
Q 008087 498 WCLRSPLCLNCFNKVLAFN 516 (578)
Q Consensus 498 ~v~A~~aGl~~GD~I~~VN 516 (578)
+++|+++|++-++.|++|-
T Consensus 132 gS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 132 GSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CCHHHHcCCCCCcEEEEEE
Confidence 4589999999999888773
No 161
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=25.86 E-value=1.7e+02 Score=22.33 Aligned_cols=33 Identities=15% Similarity=0.250 Sum_probs=27.2
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
..+.|.+ .+|+.+.+.+..+|...++-|=....
T Consensus 9 ~~V~V~l-~~g~~~~G~L~~~D~~~NlvL~~~~e 41 (67)
T smart00651 9 KRVLVEL-KNGREYRGTLKGFDQFMNLVLEDVEE 41 (67)
T ss_pred cEEEEEE-CCCcEEEEEEEEECccccEEEccEEE
Confidence 4678888 59999999999999999987765543
No 162
>PF12419 DUF3670: SNF2 Helicase protein ; InterPro: IPR022138 This domain family is found in bacteria, archaea and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with PF00271 from PFAM, PF00176 from PFAM. Most of the proteins in this family are annotated as SNF2 helicases but there is little accompanying literature to confirm this.
Probab=25.79 E-value=2.7e+02 Score=25.08 Aligned_cols=54 Identities=15% Similarity=0.135 Sum_probs=45.0
Q ss_pred CCCEEEEeCCeecCCHHHHHHHHHhcCCCeEEEEEecCeEEEEechhhhHHHHHHHHHc
Q 008087 508 CFNKVLAFNGNPVKNLKSLANMVENCDDEFLKFDLEYDQVVVLRTKTSKAATLDILATH 566 (578)
Q Consensus 508 ~GD~I~~VNG~~V~~~~~l~~~l~~~~~~~v~l~v~R~~~~~l~~~~~~~~~~~i~~~~ 566 (578)
..|+=++|+|+.+ +.++|.+++++. ...|+ .|+.-+.++.++++++...+.+..
T Consensus 72 ~f~W~lalGd~~L-s~eEf~~L~~~~-~~LV~---~rg~WV~ld~~~l~~~~~~~~~~~ 125 (141)
T PF12419_consen 72 DFDWELALGDEEL-SEEEFEQLVEQK-RPLVR---FRGRWVELDPEELRRALAFLEKAP 125 (141)
T ss_pred cceEEEEECCEEC-CHHHHHHHHHcC-CCeEE---ECCEEEEECHHHHHHHHHHHHhcc
Confidence 7788899999988 899999999863 34443 499999999999999999888753
No 163
>smart00384 AT_hook DNA binding domain with preference for A/T rich regions. Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y).
Probab=25.45 E-value=50 Score=20.79 Aligned_cols=14 Identities=57% Similarity=0.864 Sum_probs=11.3
Q ss_pred ccccCCCCCCCCCc
Q 008087 5 KRKRGRKPKIPDAE 18 (578)
Q Consensus 5 ~~~~~~~~~~~~~~ 18 (578)
++||||-+|.....
T Consensus 1 kRkRGRPrK~~~~~ 14 (26)
T smart00384 1 KRKRGRPRKAPKDX 14 (26)
T ss_pred CCCCCCCCCCCCcc
Confidence 68999999987744
No 164
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=25.43 E-value=59 Score=30.95 Aligned_cols=29 Identities=14% Similarity=0.081 Sum_probs=25.1
Q ss_pred CCCcEEEEeCCCCcccC-CCCCCCEEEEEC
Q 008087 352 QKGVRIRRVDPTAPESE-VLKPSDIILSFD 380 (578)
Q Consensus 352 ~~Gv~V~~V~p~spA~~-GL~~GDiIl~In 380 (578)
...+.|..|..+|||++ |+.-|+.|++|-
T Consensus 121 ~~~~~Vd~v~fgS~A~~~g~d~d~~I~~v~ 150 (183)
T PF11874_consen 121 GGKVIVDEVEFGSPAEKAGIDFDWEITEVE 150 (183)
T ss_pred CCEEEEEecCCCCHHHHcCCCCCcEEEEEE
Confidence 34588999999999999 999999888773
No 165
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.19 E-value=1.7e+02 Score=23.38 Aligned_cols=32 Identities=6% Similarity=0.096 Sum_probs=26.7
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
..+.|.+ .+|+.+.+.+.++|+..+|.|=...
T Consensus 13 k~v~V~l-~~gr~~~G~L~~fD~~~NlvL~d~~ 44 (74)
T cd01728 13 KKVVVLL-RDGRKLIGILRSFDQFANLVLQDTV 44 (74)
T ss_pred CEEEEEE-cCCeEEEEEEEEECCcccEEecceE
Confidence 4678888 5999999999999999999875543
No 166
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq forms an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=25.16 E-value=1.4e+02 Score=22.88 Aligned_cols=35 Identities=11% Similarity=0.247 Sum_probs=29.3
Q ss_pred CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeC
Q 008087 171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDD 206 (578)
Q Consensus 171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~ 206 (578)
...+.|.+ .+|+.+.+.+..+|...++-|-.....
T Consensus 8 g~~V~V~l-~~g~~~~G~L~~~D~~~Nl~L~~~~~~ 42 (67)
T PF01423_consen 8 GKRVRVEL-KNGRTYRGTLVSFDQFMNLVLSDVTET 42 (67)
T ss_dssp TSEEEEEE-TTSEEEEEEEEEEETTEEEEEEEEEEE
T ss_pred CcEEEEEE-eCCEEEEEEEEEeechheEEeeeEEEE
Confidence 35688888 599999999999999999988777653
No 167
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=24.34 E-value=1.5e+02 Score=23.77 Aligned_cols=33 Identities=18% Similarity=0.310 Sum_probs=28.0
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEee
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVED 205 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~ 205 (578)
..+.|.+ .+|+.|.+++..+|...++.|--+..
T Consensus 18 ~~V~V~l-k~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKL-KNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEE-CCCCEEEEEEEEEccceeEEEeceEE
Confidence 5788888 58999999999999999998775554
No 168
>PF07174 FAP: Fibronectin-attachment protein (FAP); InterPro: IPR010801 This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.; GO: 0050840 extracellular matrix binding, 0005576 extracellular region
Probab=24.10 E-value=6.4e+02 Score=25.57 Aligned_cols=17 Identities=12% Similarity=0.311 Sum_probs=9.4
Q ss_pred EEEEcCCEEEecccccC
Q 008087 153 GFAIGGRRVLTNAHSVE 169 (578)
Q Consensus 153 GfvI~~g~ILT~aHvV~ 169 (578)
-|+|=.||+.+.+.-+.
T Consensus 120 S~vvP~GW~~Sda~~L~ 136 (297)
T PF07174_consen 120 SYVVPAGWVESDASHLD 136 (297)
T ss_pred EEeccCCccccccceee
Confidence 34444777766654433
No 169
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.06 E-value=3.3e+02 Score=21.53 Aligned_cols=32 Identities=6% Similarity=0.066 Sum_probs=26.9
Q ss_pred ceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 172 TQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 172 ~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
..+.|.+ .+++.+.+++.++|...++.|=...
T Consensus 10 ~~V~V~l-~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVIT-VDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEE-CCCcEEEEEEEEEccccCEEccceE
Confidence 4677888 6999999999999999998876653
No 170
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=23.74 E-value=3e+02 Score=20.79 Aligned_cols=38 Identities=24% Similarity=0.211 Sum_probs=29.1
Q ss_pred CCCceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEeeC
Q 008087 169 EHYTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVEDD 206 (578)
Q Consensus 169 ~~~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~~~ 206 (578)
..+..+.++-+.+...|+|+++.+|...+++-++++.-
T Consensus 7 ~~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 7 AIGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp -SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred cCCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 34567888887777788999999999999999998764
No 171
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.85 E-value=2.5e+02 Score=22.42 Aligned_cols=33 Identities=9% Similarity=0.079 Sum_probs=28.3
Q ss_pred CceEEEEEcCCCcEEEEEEEEEcCCCCEEEEEEe
Q 008087 171 YTQVKLKKRGSDTKYLATVLAIGTECDIAMLTVE 204 (578)
Q Consensus 171 ~~~i~V~~~~~g~~~~a~vv~~d~~~DlAlLkv~ 204 (578)
...+.|.+ .+|..+.+++..+|...++.|-.+.
T Consensus 11 g~~V~VeL-kng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 11 NHPMLVEL-KNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCEEEEEE-CCCCEEEEEEEEEcCCCceEEEeEE
Confidence 45788888 5899999999999999999887654
Done!