Query 016647
Match_columns 385
No_of_seqs 349 out of 3407
Neff 7.8
Searched_HMMs 46136
Date Fri Mar 29 08:44:15 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/016647.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/016647hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3.8E-46 8.2E-51 376.9 35.8 262 117-382 41-362 (455)
2 PRK10898 serine endoprotease; 100.0 3.6E-45 7.8E-50 359.3 36.3 264 112-383 41-352 (353)
3 TIGR02038 protease_degS peripl 100.0 5.5E-45 1.2E-49 358.2 36.3 263 112-382 41-350 (351)
4 PRK10942 serine endoprotease; 100.0 1.1E-43 2.4E-48 360.5 34.5 261 117-381 39-382 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 4.3E-42 9.4E-47 346.8 32.5 261 118-382 3-329 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 2E-34 4.3E-39 283.6 28.8 262 116-380 33-340 (347)
7 KOG1320 Serine protease [Postt 99.9 2.4E-26 5.1E-31 227.9 19.9 267 116-382 128-470 (473)
8 KOG1421 Predicted signaling-as 99.8 1.3E-19 2.7E-24 182.2 19.3 254 117-381 53-372 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 2.7E-16 5.8E-21 130.0 12.9 109 154-293 1-120 (120)
10 PF00089 Trypsin: Trypsin; In 99.4 1.3E-11 2.8E-16 112.1 18.6 146 151-297 24-198 (220)
11 KOG1421 Predicted signaling-as 99.3 3.6E-11 7.9E-16 121.9 17.7 247 122-382 524-833 (955)
12 cd00190 Tryp_SPc Trypsin-like 99.3 8.2E-11 1.8E-15 107.6 16.2 147 151-298 24-208 (232)
13 PF13180 PDZ_2: PDZ domain; PD 99.3 1.1E-11 2.3E-16 96.4 8.4 63 316-378 20-82 (82)
14 KOG1320 Serine protease [Postt 99.3 8.8E-12 1.9E-16 124.5 7.7 236 123-369 57-351 (473)
15 smart00020 Tryp_SPc Trypsin-li 99.2 4.5E-10 9.8E-15 102.9 15.4 147 151-298 25-208 (229)
16 cd00991 PDZ_archaeal_metallopr 99.0 1E-09 2.2E-14 84.7 8.6 62 316-377 16-77 (79)
17 cd00986 PDZ_LON_protease PDZ d 99.0 1.7E-09 3.6E-14 83.3 9.2 65 317-382 15-79 (79)
18 COG3591 V8-like Glu-specific e 98.9 3.9E-08 8.4E-13 91.3 13.4 135 153-300 65-226 (251)
19 cd00989 PDZ_metalloprotease PD 98.8 2.1E-08 4.6E-13 76.8 7.9 61 316-377 18-78 (79)
20 cd00990 PDZ_glycyl_aminopeptid 98.7 3.7E-08 8E-13 75.7 7.3 61 316-379 18-78 (80)
21 cd00987 PDZ_serine_protease PD 98.7 3.9E-08 8.3E-13 77.2 7.5 60 316-375 30-89 (90)
22 PRK10779 zinc metallopeptidase 98.7 4.3E-08 9.3E-13 100.0 7.5 66 315-380 131-196 (449)
23 TIGR01713 typeII_sec_gspC gene 98.7 7.2E-08 1.6E-12 91.0 8.4 61 318-378 199-259 (259)
24 cd00988 PDZ_CTP_protease PDZ d 98.6 1.9E-07 4.1E-12 72.5 8.0 62 316-378 19-83 (85)
25 PRK10779 zinc metallopeptidase 98.4 9.7E-07 2.1E-11 90.1 9.0 64 316-380 227-290 (449)
26 TIGR00054 RIP metalloprotease 98.4 9E-07 1.9E-11 89.5 8.2 64 316-380 209-272 (420)
27 TIGR02860 spore_IV_B stage IV 98.3 1.8E-06 3.9E-11 85.6 8.9 60 319-379 122-181 (402)
28 PF00863 Peptidase_C4: Peptida 98.3 8.7E-05 1.9E-09 68.6 18.1 165 123-315 14-186 (235)
29 TIGR02037 degP_htrA_DO peripla 98.3 1.7E-06 3.7E-11 87.8 7.6 60 316-375 368-427 (428)
30 KOG3627 Trypsin [Amino acid tr 98.3 2.4E-05 5.2E-10 73.3 14.4 145 153-299 39-229 (256)
31 cd00136 PDZ PDZ domain, also c 98.2 2E-06 4.4E-11 64.1 5.0 50 316-366 19-70 (70)
32 PRK09681 putative type II secr 98.1 7.2E-06 1.6E-10 77.4 7.8 55 324-378 221-275 (276)
33 PRK10139 serine endoprotease; 98.1 7.4E-06 1.6E-10 83.7 7.9 59 316-376 396-454 (455)
34 TIGR00225 prc C-terminal pepti 98.1 7.9E-06 1.7E-10 80.2 7.2 64 316-380 68-133 (334)
35 TIGR03279 cyano_FeS_chp putati 98.0 1.5E-05 3.3E-10 79.6 7.8 61 316-380 4-65 (433)
36 PRK10942 serine endoprotease; 98.0 1.5E-05 3.2E-10 81.8 7.7 59 316-376 414-472 (473)
37 smart00228 PDZ Domain present 97.9 1.4E-05 3.1E-10 61.4 5.0 54 316-369 32-85 (85)
38 PLN00049 carboxyl-terminal pro 97.9 3.3E-05 7.2E-10 77.4 8.2 62 316-378 108-171 (389)
39 COG3480 SdrC Predicted secrete 97.8 5.1E-05 1.1E-09 72.0 7.8 59 323-381 142-201 (342)
40 TIGR00054 RIP metalloprotease 97.8 2.1E-05 4.7E-10 79.6 5.5 61 316-378 134-194 (420)
41 PF14685 Tricorn_PDZ: Tricorn 97.7 0.00012 2.6E-09 57.4 7.2 55 320-375 30-87 (88)
42 COG0793 Prc Periplasmic protea 97.7 0.00011 2.4E-09 73.9 7.5 62 316-378 118-183 (406)
43 cd00992 PDZ_signaling PDZ doma 97.7 9.2E-05 2E-09 56.7 5.4 48 316-365 32-81 (82)
44 PF05579 Peptidase_S32: Equine 97.6 0.00049 1.1E-08 63.9 10.4 116 151-297 111-228 (297)
45 PF00595 PDZ: PDZ domain (Also 97.5 7.8E-05 1.7E-09 57.3 3.1 49 316-366 31-81 (81)
46 COG3031 PulC Type II secretory 97.4 0.00025 5.4E-09 64.7 5.6 54 324-377 221-274 (275)
47 COG5640 Secreted trypsin-like 97.1 0.022 4.8E-07 55.3 15.2 56 154-210 63-135 (413)
48 PRK11186 carboxy-terminal prot 97.0 0.0014 3.1E-08 69.6 7.2 61 316-377 261-332 (667)
49 KOG3129 26S proteasome regulat 96.8 0.0024 5.3E-08 57.2 5.6 65 316-380 145-211 (231)
50 PF03761 DUF316: Domain of unk 96.7 0.11 2.4E-06 49.4 17.0 91 197-297 159-254 (282)
51 PF05580 Peptidase_S55: SpoIVB 96.7 0.023 5E-07 51.6 11.1 164 147-315 15-214 (218)
52 PF04495 GRASP55_65: GRASP55/6 96.6 0.0027 5.7E-08 54.3 4.3 62 316-378 49-113 (138)
53 COG3975 Predicted protease wit 96.2 0.0048 1E-07 62.6 4.1 59 316-382 468-526 (558)
54 PF00548 Peptidase_C3: 3C cyst 96.2 0.049 1.1E-06 48.3 10.1 137 151-297 24-170 (172)
55 KOG3209 WW domain-containing p 95.8 0.014 3E-07 61.0 5.3 64 303-368 763-837 (984)
56 PF09342 DUF1986: Domain of un 95.6 0.095 2.1E-06 48.5 9.5 96 141-237 17-131 (267)
57 PF00949 Peptidase_S7: Peptida 95.5 0.013 2.8E-07 49.5 3.4 32 268-299 88-119 (132)
58 PF10459 Peptidase_S46: Peptid 95.3 0.012 2.6E-07 62.9 3.3 21 153-173 48-68 (698)
59 PF08192 Peptidase_S64: Peptid 95.0 0.16 3.4E-06 53.3 10.1 109 198-315 542-680 (695)
60 KOG3580 Tight junction protein 94.1 0.074 1.6E-06 54.7 5.1 73 286-367 414-488 (1027)
61 PF02122 Peptidase_S39: Peptid 94.1 0.18 3.8E-06 45.9 7.0 144 152-313 30-181 (203)
62 TIGR02860 spore_IV_B stage IV 93.8 0.46 1E-05 47.6 10.0 41 271-315 354-394 (402)
63 KOG3580 Tight junction protein 93.0 0.16 3.5E-06 52.3 5.4 52 323-376 233-286 (1027)
64 PF10459 Peptidase_S46: Peptid 92.9 0.053 1.1E-06 58.1 1.8 28 268-295 624-651 (698)
65 PF12812 PDZ_1: PDZ-like domai 92.6 0.15 3.2E-06 39.1 3.5 33 324-356 44-76 (78)
66 COG0750 Predicted membrane-ass 92.4 0.39 8.6E-06 47.6 7.3 56 316-372 135-194 (375)
67 PF00944 Peptidase_S3: Alphavi 92.3 0.12 2.6E-06 43.4 2.8 28 271-298 100-127 (158)
68 KOG3209 WW domain-containing p 92.2 0.24 5.1E-06 52.2 5.4 68 303-370 493-566 (984)
69 KOG3605 Beta amyloid precursor 89.1 1.4 2.9E-05 46.3 7.5 79 286-371 654-737 (829)
70 KOG3532 Predicted protein kina 88.8 0.57 1.2E-05 49.2 4.7 47 316-363 404-450 (1051)
71 KOG3552 FERM domain protein FR 88.6 0.5 1.1E-05 51.2 4.2 45 321-367 85-131 (1298)
72 KOG3553 Tax interaction protei 83.7 0.62 1.4E-05 37.1 1.5 28 316-343 65-92 (124)
73 PF02907 Peptidase_S29: Hepati 80.5 0.86 1.9E-05 38.4 1.3 114 155-299 15-130 (148)
74 KOG3550 Receptor targeting pro 79.7 3.7 8E-05 35.3 4.8 48 316-365 121-171 (207)
75 PF03510 Peptidase_C24: 2C end 78.9 10 0.00022 30.7 6.9 52 156-219 3-54 (105)
76 PF01732 DUF31: Putative pepti 75.6 2.2 4.8E-05 42.5 2.8 24 272-295 350-373 (374)
77 KOG3549 Syntrophins (type gamm 75.6 9.1 0.0002 37.4 6.7 47 319-367 89-138 (505)
78 KOG3542 cAMP-regulated guanine 74.5 2.5 5.4E-05 44.6 2.8 50 316-366 568-617 (1283)
79 KOG3834 Golgi reassembly stack 71.6 11 0.00024 37.9 6.4 62 316-378 115-177 (462)
80 PF02395 Peptidase_S6: Immunog 69.9 15 0.00032 40.3 7.6 62 152-218 65-130 (769)
81 PF00947 Pico_P2A: Picornaviru 66.7 8.5 0.00018 32.2 3.9 32 265-297 78-109 (127)
82 KOG1892 Actin filament-binding 65.6 7.9 0.00017 42.6 4.3 52 317-370 967-1021(1629)
83 KOG0609 Calcium/calmodulin-dep 61.6 9.7 0.00021 39.3 4.0 40 326-367 163-204 (542)
84 cd01720 Sm_D2 The eukaryotic S 60.2 18 0.00039 28.3 4.4 37 171-207 10-46 (87)
85 cd01735 LSm12_N LSm12 belongs 58.8 35 0.00075 24.8 5.4 33 176-208 7-39 (61)
86 cd00600 Sm_like The eukaryotic 58.4 27 0.00058 24.8 4.9 33 176-208 7-39 (63)
87 KOG3651 Protein kinase C, alph 56.1 51 0.0011 31.8 7.4 49 316-366 36-87 (429)
88 cd01731 archaeal_Sm1 The archa 54.7 30 0.00065 25.4 4.7 33 176-208 11-43 (68)
89 PRK00737 small nuclear ribonuc 54.0 30 0.00066 25.7 4.7 32 176-207 15-46 (72)
90 cd01726 LSm6 The eukaryotic Sm 53.9 28 0.00062 25.5 4.5 32 176-207 11-42 (67)
91 cd01722 Sm_F The eukaryotic Sm 52.6 29 0.00063 25.5 4.3 32 176-207 12-43 (68)
92 KOG0606 Microtubule-associated 51.3 19 0.00042 40.4 4.4 48 316-365 664-713 (1205)
93 PF00571 CBS: CBS domain CBS d 51.3 15 0.00033 25.1 2.5 21 276-296 28-48 (57)
94 cd01730 LSm3 The eukaryotic Sm 50.6 28 0.00061 26.7 4.1 31 176-206 12-42 (82)
95 KOG3834 Golgi reassembly stack 50.5 25 0.00053 35.5 4.6 60 316-377 21-84 (462)
96 COG0298 HypC Hydrogenase matur 50.2 37 0.00079 26.1 4.5 46 188-236 5-52 (82)
97 PF05416 Peptidase_C37: Southa 49.5 41 0.00088 34.0 5.9 135 152-299 379-528 (535)
98 cd01717 Sm_B The eukaryotic Sm 49.2 36 0.00078 25.8 4.5 31 176-206 11-41 (79)
99 KOG2921 Intramembrane metallop 48.8 16 0.00035 36.4 3.0 29 326-354 237-265 (484)
100 cd06168 LSm9 The eukaryotic Sm 48.1 45 0.00097 25.2 4.8 31 176-206 11-41 (75)
101 cd01729 LSm7 The eukaryotic Sm 47.0 43 0.00094 25.6 4.6 31 176-206 13-43 (81)
102 cd01732 LSm5 The eukaryotic Sm 46.9 39 0.00084 25.6 4.3 31 176-206 14-44 (76)
103 KOG3938 RGS-GAIP interacting p 46.1 11 0.00024 35.5 1.4 41 327-367 167-209 (334)
104 TIGR03000 plancto_dom_1 Planct 44.8 36 0.00077 25.8 3.7 49 330-378 10-63 (75)
105 cd01719 Sm_G The eukaryotic Sm 44.3 53 0.0011 24.5 4.7 31 176-206 11-41 (72)
106 PF04225 OapA: Opacity-associa 43.4 14 0.0003 28.7 1.4 53 328-380 7-68 (85)
107 cd01728 LSm1 The eukaryotic Sm 43.2 54 0.0012 24.7 4.5 31 176-206 13-43 (74)
108 smart00651 Sm snRNP Sm protein 43.1 57 0.0012 23.4 4.7 32 176-207 9-40 (67)
109 cd01721 Sm_D3 The eukaryotic S 42.3 59 0.0013 24.1 4.6 32 176-207 11-42 (70)
110 cd01727 LSm8 The eukaryotic Sm 40.5 60 0.0013 24.2 4.5 31 176-206 10-40 (74)
111 COG1958 LSM1 Small nuclear rib 40.1 55 0.0012 24.7 4.3 33 176-208 18-50 (79)
112 KOG3551 Syntrophins (type beta 39.9 26 0.00057 34.9 2.9 42 326-369 127-172 (506)
113 PF01423 LSM: LSM domain ; In 37.8 66 0.0014 23.1 4.3 33 176-208 9-41 (67)
114 PF02601 Exonuc_VII_L: Exonucl 35.9 43 0.00093 32.4 3.9 34 153-186 281-314 (319)
115 KOG3606 Cell polarity protein 35.2 93 0.002 29.6 5.6 40 317-356 201-243 (358)
116 KOG3571 Dishevelled 3 and rela 35.0 38 0.00082 34.9 3.3 43 322-367 290-338 (626)
117 PF01455 HupF_HypC: HupF/HypC 34.6 1.2E+02 0.0026 22.4 5.2 43 188-233 5-47 (68)
118 KOG3605 Beta amyloid precursor 34.1 28 0.0006 37.0 2.2 42 317-359 763-806 (829)
119 PF05578 Peptidase_S31: Pestiv 33.6 1E+02 0.0022 26.7 5.2 73 224-297 109-182 (211)
120 PF09122 DUF1930: Domain of un 31.3 1.9E+02 0.0041 21.2 5.4 45 331-376 19-64 (68)
121 PF14827 Cache_3: Sensory doma 30.4 39 0.00084 27.4 2.2 17 281-297 94-110 (116)
122 cd01723 LSm4 The eukaryotic Sm 30.3 1.6E+02 0.0035 22.0 5.4 32 176-207 12-43 (76)
123 PF14275 DUF4362: Domain of un 29.9 1.2E+02 0.0026 24.2 4.8 25 329-355 1-25 (98)
124 COG4956 Integral membrane prot 29.4 48 0.0011 32.1 2.8 38 335-372 270-308 (356)
125 cd04627 CBS_pair_14 The CBS do 26.7 50 0.0011 26.2 2.2 22 276-297 97-118 (123)
126 cd01725 LSm2 The eukaryotic Sm 25.7 1.6E+02 0.0034 22.5 4.7 32 176-207 12-43 (81)
127 cd04603 CBS_pair_KefB_assoc Th 24.5 59 0.0013 25.4 2.2 20 277-296 86-105 (111)
128 PF11948 DUF3465: Protein of u 24.5 4.3E+02 0.0094 22.3 10.1 12 223-234 85-96 (131)
129 PF08669 GCV_T_C: Glycine clea 24.4 58 0.0013 25.2 2.1 20 278-297 34-53 (95)
130 cd04620 CBS_pair_7 The CBS dom 23.8 62 0.0013 25.2 2.2 20 277-296 90-109 (115)
131 cd01733 LSm10 The eukaryotic S 23.4 3.3E+02 0.0072 20.6 6.6 32 176-207 20-51 (78)
132 PF10049 DUF2283: Protein of u 23.1 56 0.0012 22.4 1.6 10 285-294 36-45 (50)
133 KOG1379 Serine/threonine prote 22.4 98 0.0021 30.1 3.5 71 275-354 187-271 (330)
134 PRK13835 conjugal transfer pro 21.9 73 0.0016 27.4 2.3 40 118-158 47-86 (145)
135 PRK14864 putative biofilm stre 21.6 1.7E+02 0.0037 23.6 4.3 26 108-133 32-57 (104)
136 PF14438 SM-ATX: Ataxin 2 SM d 21.1 2.2E+02 0.0047 21.2 4.7 30 176-206 13-45 (77)
137 cd04597 CBS_pair_DRTGG_assoc2 20.8 84 0.0018 24.9 2.4 21 276-296 87-107 (113)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3.8e-46 Score=376.87 Aligned_cols=262 Identities=39% Similarity=0.609 Sum_probs=223.9
Q ss_pred hHHHHHHHhCCceEEEEEeeeccC------ccc----c---cc-ccCCCeEEEEEEEcC-CCEEEecccccCCCCeEEEE
Q 016647 117 ATVRLFQENTPSVVNITNLAARQD------AFT----L---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT 181 (385)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~------~~~----~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~ 181 (385)
++.++++++.||||.|.+...... .|. . +. .....+.||||||++ +||||||+|||++++.+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 578999999999999987653221 110 0 00 112247899999985 79999999999999999999
Q ss_pred ecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC
Q 016647 182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG 261 (385)
Q Consensus 182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~ 261 (385)
+.|+++++|++++.|+.+||||||++.+. .+++++|+++..+++||+|+++|||++...+++.|+|++..+.... .
T Consensus 121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~~-~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~---~ 196 (455)
T PRK10139 121 LNDGREFDAKLIGSDDQSDIALLQIQNPS-KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLN---L 196 (455)
T ss_pred ECCCCEEEEEEEEEcCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccC---C
Confidence 99999999999999999999999998643 6899999998899999999999999999999999999987764221 1
Q ss_pred CCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc--------------------------
Q 016647 262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT-------------------------- 315 (385)
Q Consensus 262 ~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~-------------------------- 315 (385)
..+.+++++|+.+++|+|||||||.+||||||+++.+.++++..|+||+||++.
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l 276 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM 276 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence 224578999999999999999999999999999998877777789999999986
Q ss_pred -------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647 316 -------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV 376 (385)
Q Consensus 316 -------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v 376 (385)
.+.++++++++|||+||+|++|||++|.++.|+.+.+...++|++++++|+|+|+.+++++
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V~R~G~~~~l~v 356 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLLRNGKPLEVEV 356 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCCCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEE
Confidence 2235667889999999999999999999999999999888899999999999999999999
Q ss_pred EeecCC
Q 016647 377 KLEPKP 382 (385)
Q Consensus 377 ~~~~~~ 382 (385)
++.+.+
T Consensus 357 ~~~~~~ 362 (455)
T PRK10139 357 TLDTST 362 (455)
T ss_pred EECCCC
Confidence 885543
No 2
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=3.6e-45 Score=359.33 Aligned_cols=264 Identities=33% Similarity=0.492 Sum_probs=223.1
Q ss_pred CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeE
Q 016647 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (385)
Q Consensus 112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (385)
...+.++.++++++.||||.|......... .......+.||||+|+++|+||||+||+.+++.+.|++.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~ 117 (353)
T PRK10898 41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL 117 (353)
T ss_pred ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence 334457889999999999999885532211 01112347899999999999999999999999999999999999999
Q ss_pred EEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEc
Q 016647 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (385)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d 271 (385)
++++|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|++..+.... .....+++++|
T Consensus 118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtd 192 (353)
T PRK10898 118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTD 192 (353)
T ss_pred EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEec
Confidence 999999999999999873 5788999888889999999999999998889999999987664321 11234789999
Q ss_pred cccCCCCCCceeeCCCccEEEEeecccCCCC---CCCCceeeecccc---------------------------------
Q 016647 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDT--------------------------------- 315 (385)
Q Consensus 272 ~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~---~~~~~g~aIP~~~--------------------------------- 315 (385)
+.+++|+|||||+|.+||||||+++.+...+ ...++||+||++.
T Consensus 193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~ 272 (353)
T PRK10898 193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG 272 (353)
T ss_pred cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence 9999999999999999999999998765432 2368999999987
Q ss_pred ------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCCC
Q 016647 316 ------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKPD 383 (385)
Q Consensus 316 ------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~~ 383 (385)
.+.++++++++||++||+|++|||++|.++.++.+.+...++|++++++|+|+|+.+++++++.+.+.
T Consensus 273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~p~ 352 (353)
T PRK10898 273 GGIDQLQGIVVNEVSPDGPAAKAGIQVNDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVTIQEYPA 352 (353)
T ss_pred cCCCCCCeEEEEEECCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEeccCCC
Confidence 12345677888999999999999999999999999998888999999999999999999999987764
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=5.5e-45 Score=358.17 Aligned_cols=263 Identities=37% Similarity=0.605 Sum_probs=223.5
Q ss_pred CccchhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeE
Q 016647 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (385)
Q Consensus 112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (385)
...+.++.++++++.||||.|.......+.+ ......+.||||+|+++||||||+||+.+++.+.|.+.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~---~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~ 117 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSL---NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE 117 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEecccccc---ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence 4455578899999999999998865433211 1123357899999999999999999999999999999999999999
Q ss_pred EEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEc
Q 016647 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (385)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d 271 (385)
+++.|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|+...+.... .....+++++|
T Consensus 118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd 192 (351)
T TIGR02038 118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD 192 (351)
T ss_pred EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence 999999999999999874 4788999888889999999999999998899999999987764321 11235789999
Q ss_pred cccCCCCCCceeeCCCccEEEEeecccCCC--CCCCCceeeecccc----------------------------------
Q 016647 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDT---------------------------------- 315 (385)
Q Consensus 272 ~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~--~~~~~~g~aIP~~~---------------------------------- 315 (385)
+.+++|+|||||+|.+|+||||+++.+... +...+++|+||++.
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l 272 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL 272 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence 999999999999999999999998766432 22468999999986
Q ss_pred -----------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647 316 -----------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP 382 (385)
Q Consensus 316 -----------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~ 382 (385)
.+.++++++++||++||+|++|||++|.++.|+.+.+...++|++++++|+|+|+.+++++++.+.|
T Consensus 273 gl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~p 350 (351)
T TIGR02038 273 GLPDLRGIVITGVDPNGPAARAGILVRDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLELPVTIDEKP 350 (351)
T ss_pred CCCccccceEeecCCCChHHHCCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEEecCCC
Confidence 1234567788899999999999999999999999999988899999999999999999999987765
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.1e-43 Score=360.52 Aligned_cols=261 Identities=38% Similarity=0.595 Sum_probs=222.3
Q ss_pred hHHHHHHHhCCceEEEEEeeeccC---c--------cccc--c------------------------ccCCCeEEEEEEE
Q 016647 117 ATVRLFQENTPSVVNITNLAARQD---A--------FTLD--V------------------------LEVPQGSGSGFVW 159 (385)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~---~--------~~~~--~------------------------~~~~~~~GSGfiI 159 (385)
.+.++++++.||||.|.+...... + |..+ . ....++.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 588999999999999987653211 1 1000 0 0012468999999
Q ss_pred cC-CCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCC
Q 016647 160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG 238 (385)
Q Consensus 160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g 238 (385)
++ +||||||+||+.+++.+.|++.|+++++|++++.|+.+||||||++... .+++++|+++..+++|++|+++|+|++
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~~-~l~~~~lg~s~~l~~G~~V~aiG~P~g 197 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNPK-NLTAIKMADSDALRVGDYTVAIGNPYG 197 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEcCCCC
Confidence 86 5999999999999999999999999999999999999999999997543 689999999889999999999999999
Q ss_pred CCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---
Q 016647 239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT--- 315 (385)
Q Consensus 239 ~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--- 315 (385)
...+++.|+|+...+.... ...+..++++|+.+++|+|||||+|.+|+||||+++.+.++++..++||+||++.
T Consensus 198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~ 274 (473)
T PRK10942 198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN 274 (473)
T ss_pred CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence 9999999999987764211 1223578999999999999999999999999999998887777789999999975
Q ss_pred ------------------------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHh
Q 016647 316 ------------------------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 353 (385)
Q Consensus 316 ------------------------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~ 353 (385)
.+.++++++++|||+||+|++|||++|.++.++.+.+.
T Consensus 275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~GDvIl~InG~~V~s~~dl~~~l~ 354 (473)
T PRK10942 275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVG 354 (473)
T ss_pred HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCCCCEEEEECCEECCCHHHHHHHHH
Confidence 22345677889999999999999999999999999999
Q ss_pred cCCCCCEEEEEEEECCEEEEEEEEeecC
Q 016647 354 QCKVGDEVIVEVLRGDQKEKIPVKLEPK 381 (385)
Q Consensus 354 ~~~~g~~v~l~v~R~g~~~~~~v~~~~~ 381 (385)
..++|++++++|+|+|+.+++++++.+.
T Consensus 355 ~~~~g~~v~l~v~R~G~~~~v~v~l~~~ 382 (473)
T PRK10942 355 TMPVGSKLTLGLLRDGKPVNVNVELQQS 382 (473)
T ss_pred hcCCCCEEEEEEEECCeEEEEEEEeCcC
Confidence 8889999999999999999999988654
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=4.3e-42 Score=346.81 Aligned_cols=261 Identities=43% Similarity=0.658 Sum_probs=222.7
Q ss_pred HHHHHHHhCCceEEEEEeeeccC---------c----ccc---c-----cccCCCeEEEEEEEcCCCEEEecccccCCCC
Q 016647 118 TVRLFQENTPSVVNITNLAARQD---------A----FTL---D-----VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS 176 (385)
Q Consensus 118 ~~~~~~~~~~SVV~I~~~~~~~~---------~----~~~---~-----~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~ 176 (385)
+.++++++.||||.|.+...... + |.. . ......+.||||+|+++|+||||+||+.++.
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~ 82 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD 82 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence 56899999999999988652211 1 100 0 0112357899999999999999999999999
Q ss_pred eEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeec
Q 016647 177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS 256 (385)
Q Consensus 177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~ 256 (385)
.+.|++.|++.++|++++.|+.+||||||++.+ ..++++.|+++..+++|++|+++|||++...+++.|+|+...+...
T Consensus 83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL 161 (428)
T ss_pred eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence 999999999999999999999999999999875 3689999998888999999999999999999999999998766521
Q ss_pred cCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---------------------
Q 016647 257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT--------------------- 315 (385)
Q Consensus 257 ~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--------------------- 315 (385)
....+..++++|+.+++|+|||||+|.+|+||||+++.+...++..+++|+||++.
T Consensus 162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi 238 (428)
T TIGR02037 162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV 238 (428)
T ss_pred ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence 11234568999999999999999999999999999988777666789999999876
Q ss_pred ------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEE
Q 016647 316 ------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQK 371 (385)
Q Consensus 316 ------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~ 371 (385)
.+.++++++++||++||+|++|||++|.++.++.+++...++|++++++|+|+|+.
T Consensus 239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~v~l~v~R~g~~ 318 (428)
T TIGR02037 239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKAGDVILSVNGKPISSFADLRRAIGTLKPGKKVTLGILRKGKE 318 (428)
T ss_pred EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEE
Confidence 22356778889999999999999999999999999999888999999999999999
Q ss_pred EEEEEEeecCC
Q 016647 372 EKIPVKLEPKP 382 (385)
Q Consensus 372 ~~~~v~~~~~~ 382 (385)
+++++++...+
T Consensus 319 ~~~~v~l~~~~ 329 (428)
T TIGR02037 319 KTITVTLGASP 329 (428)
T ss_pred EEEEEEECcCC
Confidence 99999876544
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=2e-34 Score=283.60 Aligned_cols=262 Identities=45% Similarity=0.665 Sum_probs=225.0
Q ss_pred hhHHHHHHHhCCceEEEEEeeeccC-ccccccc-cC-CCeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEE
Q 016647 116 LATVRLFQENTPSVVNITNLAARQD-AFTLDVL-EV-PQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI 192 (385)
Q Consensus 116 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~~~~~-~~-~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v 192 (385)
..+..+++++.|+||.|........ .|..... .. ..+.||||+++++|+|+|+.||+.+++.+.+.+.||+++++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~ 112 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL 112 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence 5778999999999999988654321 1110000 00 1479999999999999999999999999999999999999999
Q ss_pred EEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 016647 193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA 272 (385)
Q Consensus 193 v~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~ 272 (385)
++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. . ........+++++|+
T Consensus 113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~-v~~~~~~~~~IqtdA 189 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G-VGSAGGYVNFIQTDA 189 (347)
T ss_pred EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c-ccCcccccchhhccc
Confidence 9999999999999998543 788899999999999999999999999999999999988885 1 111122568899999
Q ss_pred ccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc------------------------------c------
Q 016647 273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT------------------------------G------ 316 (385)
Q Consensus 273 ~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~------------------------------~------ 316 (385)
.+++|+||||++|.+|++|||++..+.+.++..+++|+||++. .
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~ 269 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA 269 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence 9999999999999999999999999887766678999999986 1
Q ss_pred -------ccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 317 -------LLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 317 -------v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
+.+.++++++|++.||+|+++||+++.+..++.+.+....+|+.+.++++|+|+.+++.+++.+
T Consensus 270 ~G~~V~~v~~~spa~~agi~~Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l~~ 340 (347)
T COG0265 270 AGAVVLGVLPGSPAAKAGIKAGDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTLGD 340 (347)
T ss_pred CceEEEecCCCChHHHcCCCCCCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEecC
Confidence 2345667888999999999999999999999999999999999999999999999999999987
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=2.4e-26 Score=227.90 Aligned_cols=267 Identities=37% Similarity=0.532 Sum_probs=219.9
Q ss_pred hhHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCCCC-----------eEEEEecC
Q 016647 116 LATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFAD 184 (385)
Q Consensus 116 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~d 184 (385)
.....++++...++|.|+...-..........+.+...|||||++.+|.++|++||+.... .+.|...+
T Consensus 128 ~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~ 207 (473)
T KOG1320|consen 128 AFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI 207 (473)
T ss_pred hhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee
Confidence 3456789999999999998544333322233445678899999999999999999986432 37777776
Q ss_pred C--CeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC-
Q 016647 185 Q--SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG- 261 (385)
Q Consensus 185 g--~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~- 261 (385)
+ ..+++.+++.|+..|+|+++++.+..-.++++++-...+..|+++..+|.|++..+..+.|+++...|........
T Consensus 208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~ 287 (473)
T KOG1320|consen 208 GPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLET 287 (473)
T ss_pred cCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCccc
Confidence 6 8899999999999999999997654337788888888899999999999999999999999999888775543333
Q ss_pred -CCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc-------------------------
Q 016647 262 -RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT------------------------- 315 (385)
Q Consensus 262 -~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~------------------------- 315 (385)
....+++++|+.++.|+||+|++|.+|++||+++......+-..+++|++|.+.
T Consensus 288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~ 367 (473)
T KOG1320|consen 288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVH 367 (473)
T ss_pred ceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCccccc
Confidence 445688999999999999999999999999999877654444568899999886
Q ss_pred ------------------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 016647 316 ------------------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD 359 (385)
Q Consensus 316 ------------------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~ 359 (385)
.+.+++++...++++||+|.+|||++|.|..++.++++++.+++
T Consensus 368 ~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~g~~V~~vng~~V~n~~~l~~~i~~~~~~~ 447 (473)
T KOG1320|consen 368 QYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKPGDQVVKVNGKPVKNLKHLYELIEECSTED 447 (473)
T ss_pred ccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccCCCEEEEECCEEeechHHHHHHHHhcCcCc
Confidence 23346666777899999999999999999999999999999999
Q ss_pred EEEEEEEECCEEEEEEEEeecCC
Q 016647 360 EVIVEVLRGDQKEKIPVKLEPKP 382 (385)
Q Consensus 360 ~v~l~v~R~g~~~~~~v~~~~~~ 382 (385)
+|.+..+|..|..++.+..++..
T Consensus 448 ~v~vl~~~~~e~~tl~Il~~~~~ 470 (473)
T KOG1320|consen 448 KVAVLDRRSAEDATLEILPEHKI 470 (473)
T ss_pred eEEEEEecCccceeEEecccccC
Confidence 99999999999999988766543
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.84 E-value=1.3e-19 Score=182.16 Aligned_cols=254 Identities=26% Similarity=0.352 Sum_probs=197.6
Q ss_pred hHHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCCeEeeEEEE
Q 016647 117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVG 194 (385)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~ 194 (385)
.+...+..+-++||.|...... .| +....+.+.+|||++++. |++|||+|++. +.-.-.+.+.+..+.+.-.++
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~--~f--dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy 128 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVR--AF--DTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY 128 (955)
T ss_pred hhhhhhhhhcccEEEEEehhee--ec--ccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence 5667888999999999875432 11 222335678999999977 89999999996 444567778777788888889
Q ss_pred ECCCCCeEEEEEcCCC---CCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCC---CCcccEE
Q 016647 195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG---RPIQDVI 268 (385)
Q Consensus 195 ~d~~~DlAlLkv~~~~---~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i 268 (385)
.|+.+|+.+++.+... ..+..+.++. +..++|.++.++|+..+.-.++..|.++.+++....+... .....++
T Consensus 129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~ 207 (955)
T KOG1421|consen 129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI 207 (955)
T ss_pred CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence 9999999999998643 2344455543 3468899999999987777788889999888876655322 1123456
Q ss_pred EEccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc---------------------------------
Q 016647 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT--------------------------------- 315 (385)
Q Consensus 269 ~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--------------------------------- 315 (385)
|.-+....|.||+|++|.+|..|.++..+.. ..+.+|++|.+.
T Consensus 208 QaasstsggssgspVv~i~gyAVAl~agg~~----ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr 283 (955)
T KOG1421|consen 208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSI----SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR 283 (955)
T ss_pred eehhcCCCCCCCCceecccceEEeeecCCcc----cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence 7777788999999999999999999987654 335678888875
Q ss_pred -------------------------cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCE
Q 016647 316 -------------------------GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQ 370 (385)
Q Consensus 316 -------------------------~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~ 370 (385)
++.+++++.+ .|++||+++++|+.-+.++.++.++|++. .|+.++|+|+|+|+
T Consensus 284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k-~Le~GDillavN~t~l~df~~l~~iLDeg-vgk~l~LtI~Rggq 361 (955)
T KOG1421|consen 284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEK-KLEPGDILLAVNSTCLNDFEALEQILDEG-VGKNLELTIQRGGQ 361 (955)
T ss_pred cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhh-ccCCCcEEEEEcceehHHHHHHHHHHhhc-cCceEEEEEEeCCE
Confidence 2234444444 49999999999999999999999999985 89999999999999
Q ss_pred EEEEEEEeecC
Q 016647 371 KEKIPVKLEPK 381 (385)
Q Consensus 371 ~~~~~v~~~~~ 381 (385)
+.++.++.+..
T Consensus 362 elel~vtvqdl 372 (955)
T KOG1421|consen 362 ELELTVTVQDL 372 (955)
T ss_pred EEEEEEEeccc
Confidence 99999887665
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.70 E-value=2.7e-16 Score=129.96 Aligned_cols=109 Identities=38% Similarity=0.599 Sum_probs=74.4
Q ss_pred EEEEEEcCCCEEEecccccC--------CCCeEEEEecCCCeEe--eEEEEECCC-CCeEEEEEcCCCCCCcceecCCCC
Q 016647 154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA 222 (385)
Q Consensus 154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~l~l~~~~ 222 (385)
||||+|+++|+||||+||+. ....+.+...++.... +++++.|+. .|+|||+++. .
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------~ 67 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------W 67 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------E
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------c
Confidence 89999999999999999998 4566888888888888 999999999 9999999970 0
Q ss_pred CCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEE
Q 016647 223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI 293 (385)
Q Consensus 223 ~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI 293 (385)
...+......+ ........... ......+ +++.+.+|+||||+||.+|+||||
T Consensus 68 -~~~~~~~~~~~------------~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 68 -TGVGGGVRVPG------------STSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -EEEEEEEEEEE------------EEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -cceeeeeEeee------------eccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00000000000 00000000000 0001113 799999999999999999999997
No 10
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.42 E-value=1.3e-11 Score=112.09 Aligned_cols=146 Identities=25% Similarity=0.401 Sum_probs=98.0
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCCeEEEEecC-------C--CeEeeEEEEE----CC---CCCeEEEEEcCC---CC
Q 016647 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD-------Q--SAYDAKIVGF----DQ---DKDVAVLRIDAP---KD 211 (385)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d-------g--~~~~a~vv~~----d~---~~DlAlLkv~~~---~~ 211 (385)
...|+|++|+++ +|||++||+.+..++.+.+.. + ..+..+-+.. +. .+|+|||+++.+ ..
T Consensus 24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~ 102 (220)
T PF00089_consen 24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD 102 (220)
T ss_dssp EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 458999999987 999999999996667775542 2 2344443333 22 569999999987 24
Q ss_pred CCcceecCCC-CCCCCCCEEEEEeCCCCCCC----ceEEeEEeeeeeeeccCC-CCCCcccEEEEcc----ccCCCCCCc
Q 016647 212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSAA-TGRPIQDVIQTDA----AINPGNSGG 281 (385)
Q Consensus 212 ~~~~l~l~~~-~~~~~G~~V~~iG~p~g~~~----~~~~G~Vs~~~~~~~~~~-~~~~~~~~i~~d~----~i~~G~SGG 281 (385)
.+.++.+... ..+..|+.+.++||+..... ......+.......+... ........+.... ..+.|+|||
T Consensus 103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~ 182 (220)
T PF00089_consen 103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG 182 (220)
T ss_dssp SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 5677777652 23588999999999875332 344444443333222211 1112245566655 788999999
Q ss_pred eeeCCCccEEEEeecc
Q 016647 282 PLLDSSGSLIGINTAI 297 (385)
Q Consensus 282 Plvn~~G~VVGI~s~~ 297 (385)
|+++.++.|+||++..
T Consensus 183 pl~~~~~~lvGI~s~~ 198 (220)
T PF00089_consen 183 PLICNNNYLVGIVSFG 198 (220)
T ss_dssp EEEETTEEEEEEEEEE
T ss_pred ccccceeeecceeeec
Confidence 9998776799999987
No 11
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.35 E-value=3.6e-11 Score=121.90 Aligned_cols=247 Identities=21% Similarity=0.242 Sum_probs=161.4
Q ss_pred HHHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCC-CEEEecccccC-CCCeEEEEecCCCeEeeEEEEECCCC
Q 016647 122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDK 199 (385)
Q Consensus 122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~ 199 (385)
.+++..+.|.+.... +++.+........|||.|++.+ |++++...++. +..+.+|.+.|...++|.+.+.++..
T Consensus 524 ~~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~ 599 (955)
T KOG1421|consen 524 SADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTE 599 (955)
T ss_pred hhHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCcc
Confidence 355666777776632 3444444444567999999866 89999999996 66788999999889999999999999
Q ss_pred CeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeecc----CCCCCCcccEEEEccccC
Q 016647 200 DVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISS----AATGRPIQDVIQTDAAIN 275 (385)
Q Consensus 200 DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~----~~~~~~~~~~i~~d~~i~ 275 (385)
++|.+|.+... ...++|.+ ..+..||++...|+......-.....|..+...... ........+.|..++.+.
T Consensus 600 n~a~~kydp~~--~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nls 676 (955)
T KOG1421|consen 600 NVASFKYDPAL--EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLS 676 (955)
T ss_pred ceeEeccChhH--hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecccc
Confidence 99999998743 34555643 447899999999987543322122222211111111 111122345666666665
Q ss_pred CCCCCceeeCCCccEEEEeecccCCC--CCC--CCceeeecccc--------------------------------ccc-
Q 016647 276 PGNSGGPLLDSSGSLIGINTAIYSPS--GAS--SGVGFSIPVDT--------------------------------GLL- 318 (385)
Q Consensus 276 ~G~SGGPlvn~~G~VVGI~s~~~~~~--~~~--~~~g~aIP~~~--------------------------------~v~- 318 (385)
-+.--|-+.|.+|+|+|++-..+... +.. .-+|.+++.-. |+.
T Consensus 677 T~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~ 756 (955)
T KOG1421|consen 677 TSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPS 756 (955)
T ss_pred ccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCH
Confidence 55555668899999999976554431 111 12244433322 000
Q ss_pred --------------------ccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 319 --------------------STKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 319 --------------------~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.-.+-...-|..||||+++|||.|+.+.||.++. .+.++|+|+|.++++++.+
T Consensus 757 e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt 829 (955)
T KOG1421|consen 757 EFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPT 829 (955)
T ss_pred HHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEecc
Confidence 0001111236789999999999999999999733 4789999999999999887
Q ss_pred ecCC
Q 016647 379 EPKP 382 (385)
Q Consensus 379 ~~~~ 382 (385)
-+..
T Consensus 830 ~p~~ 833 (955)
T KOG1421|consen 830 YPEY 833 (955)
T ss_pred cccc
Confidence 6554
No 12
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.31 E-value=8.2e-11 Score=107.55 Aligned_cols=147 Identities=22% Similarity=0.297 Sum_probs=91.6
Q ss_pred CeEEEEEEEcCCCEEEecccccCCC--CeEEEEecCC---------CeEeeEEEEEC-------CCCCeEEEEEcCCC--
Q 016647 151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFADQ---------SAYDAKIVGFD-------QDKDVAVLRIDAPK-- 210 (385)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~dg---------~~~~a~vv~~d-------~~~DlAlLkv~~~~-- 210 (385)
...|+|++|+++ +|||+|||+.+. ..+.|.+... ..+..+-+..+ ..+|||||+++.+.
T Consensus 24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~ 102 (232)
T cd00190 24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL 102 (232)
T ss_pred cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence 458999999987 999999999875 5666766432 22333333333 35799999998754
Q ss_pred -CCCcceecCCCC-CCCCCCEEEEEeCCCCCCC-----ceEEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 016647 211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (385)
Q Consensus 211 -~~~~~l~l~~~~-~~~~G~~V~~iG~p~g~~~-----~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----d~~i~ 275 (385)
..+.|+.|.... .+..|+.+.++||...... ......+..+....+..... ......+.. ....|
T Consensus 103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 182 (232)
T cd00190 103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC 182 (232)
T ss_pred CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence 236778886553 5778999999999754332 12222222222211111100 011122222 34567
Q ss_pred CCCCCceeeCCC---ccEEEEeeccc
Q 016647 276 PGNSGGPLLDSS---GSLIGINTAIY 298 (385)
Q Consensus 276 ~G~SGGPlvn~~---G~VVGI~s~~~ 298 (385)
.|+||||++... +.++||.+...
T Consensus 183 ~gdsGgpl~~~~~~~~~lvGI~s~g~ 208 (232)
T cd00190 183 QGDSGGPLVCNDNGRGVLVGIVSWGS 208 (232)
T ss_pred cCCCCCcEEEEeCCEEEEEEEEehhh
Confidence 899999999653 78999998764
No 13
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.30 E-value=1.1e-11 Score=96.42 Aligned_cols=63 Identities=37% Similarity=0.527 Sum_probs=58.0
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.+.+++|++.+||++||+|++|||++|++..++.+++...++|++++|+|+|+|+.+++++++
T Consensus 20 ~V~~~spA~~aGl~~GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 20 SVIPGSPAAKAGLQPGDIILAINGKPVNSSEDLVNILSKGKPGDTVTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp EESTTSHHHHTTS-TTEEEEEETTEESSSHHHHHHHHHCSSTTSEEEEEEEETTEEEEEEEE-
T ss_pred EeCCCCcHHHCCCCCCcEEEEECCEEcCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEC
Confidence 367889999999999999999999999999999999998999999999999999999999875
No 14
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.26 E-value=8.8e-12 Score=124.52 Aligned_cols=236 Identities=21% Similarity=0.277 Sum_probs=161.7
Q ss_pred HHhCCceEEEEEeeeccCcccccccc-CCCeEEEEEEEcCCCEEEecccccC---CCCeEEEEe-cCCCeEeeEEEEECC
Q 016647 123 QENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFDQ 197 (385)
Q Consensus 123 ~~~~~SVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d~ 197 (385)
+....+++.+........+...|... .....|+||.+... .++|++|++. +...+.+.- +.-+.|.+++...-.
T Consensus 57 ~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~~ 135 (473)
T KOG1320|consen 57 DLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVFE 135 (473)
T ss_pred cccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhhh
Confidence 34456788887766555433333332 23457999999754 8999999998 555566652 233568888888889
Q ss_pred CCCeEEEEEcCCC--CCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 016647 198 DKDVAVLRIDAPK--DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN 275 (385)
Q Consensus 198 ~~DlAlLkv~~~~--~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~ 275 (385)
+.|+|++.++... ....++.+ .+-+...+.++++| |....++.|.|....... +..+......+++++.++
T Consensus 136 ~cd~Avv~Ie~~~f~~~~~~~e~--~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~--y~~~~~~l~~vqi~aa~~ 208 (473)
T KOG1320|consen 136 ECDLAVVYIESEEFWKGMNPFEL--GDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRI--YAHSSTVLLRVQIDAAIG 208 (473)
T ss_pred cccceEEEEeeccccCCCccccc--CCCcccCccEEEEc---CCcEEEEeeEEEEEEecc--ccCCCcceeeEEEEEeec
Confidence 9999999998643 12222333 34456678899998 667789999998765542 222333445789999999
Q ss_pred CCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc-------------------------ccc------------
Q 016647 276 PGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT-------------------------GLL------------ 318 (385)
Q Consensus 276 ~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~-------------------------~v~------------ 318 (385)
+|+||+|.+...+++.|+........+ .+++.||... ++.
T Consensus 209 ~~~s~ep~i~g~d~~~gvA~l~ik~~~---~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~ 285 (473)
T KOG1320|consen 209 PGNSGEPVIVGVDKVAGVAFLKIKTPE---NILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGL 285 (473)
T ss_pred CCccCCCeEEccccccceEEEEEecCC---cccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCc
Confidence 999999999877899999987764221 4566666543 000
Q ss_pred ---------ccccccccCCCCCcEEEEECCEEeCC-HH-----HHHHHHhcCCCCCEEEEEEEECC
Q 016647 319 ---------STKRDAYGRLILGDIITSVNGKKVSN-GS-----DLYRILDQCKVGDEVIVEVLRGD 369 (385)
Q Consensus 319 ---------~~~~a~~~gl~~GDiI~~ing~~i~s-~~-----~l~~~l~~~~~g~~v~l~v~R~g 369 (385)
....++..-++.||+|+.+||+.|-- .. .+...+....++|+|.+.+.|.+
T Consensus 286 ~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 286 ETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred ccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 00112333478999999999999941 11 24456677889999999999987
No 15
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.20 E-value=4.5e-10 Score=102.90 Aligned_cols=147 Identities=23% Similarity=0.329 Sum_probs=90.8
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCC--eEEEEecCC--------CeEeeEEEEEC-------CCCCeEEEEEcCCC---
Q 016647 151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK--- 210 (385)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~--- 210 (385)
...|+|++|+++ +|||+|||+.+.. .+.|.+... ..+.+.-+..+ ..+|||||+++.+.
T Consensus 25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~ 103 (229)
T smart00020 25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS 103 (229)
T ss_pred CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence 458999999987 9999999998753 677777543 22334433322 45799999998763
Q ss_pred CCCcceecCCC-CCCCCCCEEEEEeCCCCCC------CceEEeEEeeeeeeeccCCCC---CCcccEEEE-----ccccC
Q 016647 211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (385)
Q Consensus 211 ~~~~~l~l~~~-~~~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----d~~i~ 275 (385)
..+.++.+... ..+..++.+.+.||+.... .......+..+....+..... ......+.. ....+
T Consensus 104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 183 (229)
T smart00020 104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC 183 (229)
T ss_pred CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence 24667777653 2467789999999986542 112222222222211111000 001111221 35578
Q ss_pred CCCCCceeeCCCc--cEEEEeeccc
Q 016647 276 PGNSGGPLLDSSG--SLIGINTAIY 298 (385)
Q Consensus 276 ~G~SGGPlvn~~G--~VVGI~s~~~ 298 (385)
+|+||||++...+ .++||.+...
T Consensus 184 ~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 184 QGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCCCCeeEEECCCEEEEEEEEECC
Confidence 8999999996443 8999998764
No 16
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.04 E-value=1e-09 Score=84.67 Aligned_cols=62 Identities=26% Similarity=0.349 Sum_probs=56.0
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 377 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 377 (385)
.+.++++++++||++||+|++|||++|.++.++.+++...++|+.+.+++.|+|+..+++++
T Consensus 16 ~V~~~spa~~aGL~~GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 16 GVIVGSPAENAVLHTGDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred EECCCChHHhcCCCCCCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 34577888899999999999999999999999999998877899999999999998887765
No 17
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.03 E-value=1.7e-09 Score=83.35 Aligned_cols=65 Identities=26% Similarity=0.351 Sum_probs=57.4
Q ss_pred ccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647 317 LLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP 382 (385)
Q Consensus 317 v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~ 382 (385)
+.++++++. +|++||+|++|||++|.+++++.+++...++|+.+.+++.|+|+.+++++++.+.+
T Consensus 15 V~~~s~A~~-gL~~GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l~~~~ 79 (79)
T cd00986 15 VVEGMPAAG-KLKAGDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLILKTFP 79 (79)
T ss_pred ECCCCchhh-CCCCCCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEEeccC
Confidence 345666665 79999999999999999999999999877789999999999999999999987653
No 18
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.87 E-value=3.9e-08 Score=91.34 Aligned_cols=135 Identities=20% Similarity=0.258 Sum_probs=81.2
Q ss_pred EEEEEEEcCCCEEEecccccCCCC----eEEEEe----cCCC-eEeeE--EEEEC-C---CCCeEEEEEcCCC-------
Q 016647 153 SGSGFVWDSKGHVVTNYHVIRGAS----DIRVTF----ADQS-AYDAK--IVGFD-Q---DKDVAVLRIDAPK------- 210 (385)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~----~dg~-~~~a~--vv~~d-~---~~DlAlLkv~~~~------- 210 (385)
.+++|+|+++ .+||++||+.... ++.+.. .++. .+..+ ..... . +.|.+...+..-.
T Consensus 65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~ 143 (251)
T COG3591 65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI 143 (251)
T ss_pred eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence 4466999998 9999999986433 222211 1111 11111 11112 2 3466666654211
Q ss_pred -CCCcceecCCCCCCCCCCEEEEEeCCCCCCCce----EEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC
Q 016647 211 -DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTL----TTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD 285 (385)
Q Consensus 211 -~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~----~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn 285 (385)
.......+......+.++.+.++|||.+..... ..+.+..+ ....+.+++.+.+|+||+|+++
T Consensus 144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~~ 211 (251)
T COG3591 144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVLI 211 (251)
T ss_pred cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceEe
Confidence 112222333345578899999999997755322 22322211 1236889999999999999999
Q ss_pred CCccEEEEeecccCC
Q 016647 286 SSGSLIGINTAIYSP 300 (385)
Q Consensus 286 ~~G~VVGI~s~~~~~ 300 (385)
.+.+|||+++.+...
T Consensus 212 ~~~~vigv~~~g~~~ 226 (251)
T COG3591 212 SKDEVIGVHYNGPGA 226 (251)
T ss_pred cCceEEEEEecCCCc
Confidence 888999999877553
No 19
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.80 E-value=2.1e-08 Score=76.80 Aligned_cols=61 Identities=21% Similarity=0.326 Sum_probs=53.7
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 377 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 377 (385)
.+..+++++++||++||+|++|||+++.++.++...+... .|+.+.+++.|+|+..++.++
T Consensus 18 ~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 18 EVVPGSPAAKAGLKAGDRILAINGQKIKSWEDLVDAVQEN-PGKPLTLTVERNGETITLTLT 78 (79)
T ss_pred eECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHHC-CCceEEEEEEECCEEEEEEec
Confidence 5567788888999999999999999999999999998875 488999999999988777664
No 20
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73 E-value=3.7e-08 Score=75.73 Aligned_cols=61 Identities=28% Similarity=0.405 Sum_probs=51.6
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE 379 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~ 379 (385)
.+.++++++.+||++||+|++|||++|.++.+ ++...+.|+.+.++++|+|+..++.+++.
T Consensus 18 ~V~~~s~a~~aGl~~GD~I~~Ing~~v~~~~~---~l~~~~~~~~v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 18 FVRDDSPADKAGLVAGDELVAVNGWRVDALQD---RLKEYQAGDPVELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred EECCCChHHHhCCCCCCEEEEECCEEhHHHHH---HHHhcCCCCEEEEEEEECCEEEEEEEEec
Confidence 45678889999999999999999999998554 45455688999999999999998888765
No 21
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.73 E-value=3.9e-08 Score=77.16 Aligned_cols=60 Identities=37% Similarity=0.488 Sum_probs=53.2
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIP 375 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~ 375 (385)
.+.++++++.+||++||+|++|||++|.++.++.+++.....|+.+.+++.|+|+.+++.
T Consensus 30 ~v~~~s~a~~~gl~~GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 30 SVDPGSPAAKAGLKPGDVILAVNGKPVKSVADLRRALAELKPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred EECCCCHHHHcCCCcCCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEee
Confidence 445678888899999999999999999999999999988777999999999999876654
No 22
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.66 E-value=4.3e-08 Score=99.97 Aligned_cols=66 Identities=11% Similarity=0.078 Sum_probs=61.2
Q ss_pred ccccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 315 TGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 315 ~~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
.++.+++|++++|||+||+|++|||++|++++|+...+....+|++++++|.|+|+.++.++++..
T Consensus 131 ~~V~~~SpA~kAGLk~GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~ 196 (449)
T PRK10779 131 GEIAPNSIAAQAQIAPGTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLDL 196 (449)
T ss_pred cccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEecc
Confidence 378899999999999999999999999999999999998888999999999999999888888753
No 23
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.66 E-value=7.2e-08 Score=91.02 Aligned_cols=61 Identities=21% Similarity=0.272 Sum_probs=56.4
Q ss_pred cccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 318 LSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 318 ~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.++++++.+|||+||+|++|||+++.++.++.+++.+.++++.++++|.|+|+.+++.+.+
T Consensus 199 ~~~s~a~~aGLr~GDvIv~ING~~i~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 199 KDPSLFYKSGLQDGDIAVALNGLDLRDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred CCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 4567889999999999999999999999999999999999999999999999998888764
No 24
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.60 E-value=1.9e-07 Score=72.53 Aligned_cols=62 Identities=24% Similarity=0.436 Sum_probs=53.6
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEEC-CEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRG-DQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~-g~~~~~~v~~ 378 (385)
.+.++++++.+||++||+|++|||+++.++ .++.+++.. ..|+.+.+++.|+ |+..+++++.
T Consensus 19 ~v~~~s~a~~~gl~~GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~~i~l~v~r~~~~~~~~~~~~ 83 (85)
T cd00988 19 SVLPGSPAAKAGIKAGDIIVAIDGEPVDGLSLEDVVKLLRG-KAGTKVRLTLKRGDGEPREVTLTR 83 (85)
T ss_pred EecCCCCHHHcCCCCCCEEEEECCEEcCCCCHHHHHHHhcC-CCCCEEEEEEEcCCCCEEEEEEEE
Confidence 456778888899999999999999999998 899888876 4689999999999 8887777653
No 25
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.39 E-value=9.7e-07 Score=90.08 Aligned_cols=64 Identities=14% Similarity=0.261 Sum_probs=58.3
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
.+.++++++++||++||+|++|||++|++++|+.+.+.. .+|+.+.+++.|+|+..++++++..
T Consensus 227 ~V~~~SpA~~AGL~~GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~~ 290 (449)
T PRK10779 227 EVQPNSAASKAGLQAGDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPDS 290 (449)
T ss_pred eeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEeee
Confidence 567889999999999999999999999999999999877 5789999999999999999888753
No 26
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.38 E-value=9e-07 Score=89.53 Aligned_cols=64 Identities=22% Similarity=0.321 Sum_probs=58.0
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
.+.++++++++|||+||+|++|||++|.+++|+.+.+.. .+|+++.+++.|+|+..++++++..
T Consensus 209 ~V~~~SpA~~aGL~~GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~~ 272 (420)
T TIGR00054 209 DVTPNSPAEKAGLKEGDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTPEA 272 (420)
T ss_pred EECCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEEcC
Confidence 566888999999999999999999999999999999987 5789999999999999998888743
No 27
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.33 E-value=1.8e-06 Score=85.58 Aligned_cols=60 Identities=22% Similarity=0.387 Sum_probs=54.0
Q ss_pred ccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEee
Q 016647 319 STKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLE 379 (385)
Q Consensus 319 ~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~ 379 (385)
..++++.+|||+||+|++|||++|.+++|+.+++...+ |+.+.++|.|+|+..+++++..
T Consensus 122 ~~SPAa~AGLq~GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV~R~Ge~~tv~V~Pv 181 (402)
T TIGR02860 122 IHSPGEEAGIQIGDRILKINGEKIKNMDDLANLINKAG-GEKLTLTIERGGKIIETVIKPV 181 (402)
T ss_pred CCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEEEECCEEEEEEEEEe
Confidence 35678889999999999999999999999999998764 8999999999999998888754
No 28
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.28 E-value=8.7e-05 Score=68.58 Aligned_cols=165 Identities=16% Similarity=0.270 Sum_probs=86.4
Q ss_pred HHhCCceEEEEEeeeccCccccccccCCCeEEEEEEEcCCCEEEecccccCC-CCeEEEEecCCCeEeeE-----EEEEC
Q 016647 123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGFD 196 (385)
Q Consensus 123 ~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~d 196 (385)
..+...|++|....... ...=-|+... + +|||++|.+.. ...++|...-|.- ... -+..-
T Consensus 14 n~Ia~~ic~l~n~s~~~-----------~~~l~gigyG-~-~iItn~HLf~~nng~L~i~s~hG~f-~v~nt~~lkv~~i 79 (235)
T PF00863_consen 14 NPIASNICRLTNESDGG-----------TRSLYGIGYG-S-YIITNAHLFKRNNGELTIKSQHGEF-TVPNTTQLKVHPI 79 (235)
T ss_dssp HHHHTTEEEEEEEETTE-----------EEEEEEEEET-T-EEEEEGGGGSSTTCEEEEEETTEEE-EECEGGGSEEEE-
T ss_pred chhhheEEEEEEEeCCC-----------eEEEEEEeEC-C-EEEEChhhhccCCCeEEEEeCceEE-EcCCccccceEEe
Confidence 34556788887643211 1233477775 3 89999999954 4567887766632 211 22334
Q ss_pred CCCCeEEEEEcCCCCCCcceecC-CCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccC
Q 016647 197 QDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAIN 275 (385)
Q Consensus 197 ~~~DlAlLkv~~~~~~~~~l~l~-~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~ 275 (385)
+..||.++|+.. ++||.+-. ....++.+|.|+++|.-+.... ....||........ .. ..+..+-....
T Consensus 80 ~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~-~~----~~fWkHwIsTk 149 (235)
T PF00863_consen 80 EGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE-EN----SHFWKHWISTK 149 (235)
T ss_dssp TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE-TT----TTEEEE-C---
T ss_pred CCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec-CC----CCeeEEEecCC
Confidence 688999999965 45554422 2245889999999997544322 22233322221111 11 23445555566
Q ss_pred CCCCCceeeC-CCccEEEEeecccCCCCCCCCceeeecccc
Q 016647 276 PGNSGGPLLD-SSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (385)
Q Consensus 276 ~G~SGGPlvn-~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~ 315 (385)
.|+=|.|+++ .||.+|||++.... ....+|..|+..
T Consensus 150 ~G~CG~PlVs~~Dg~IVGiHsl~~~----~~~~N~F~~f~~ 186 (235)
T PF00863_consen 150 DGDCGLPLVSTKDGKIVGIHSLTSN----TSSRNYFTPFPD 186 (235)
T ss_dssp TT-TT-EEEETTT--EEEEEEEEET----TTSSEEEEE--T
T ss_pred CCccCCcEEEcCCCcEEEEEcCccC----CCCeEEEEcCCH
Confidence 8999999998 79999999997643 235678777766
No 29
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.28 E-value=1.7e-06 Score=87.79 Aligned_cols=60 Identities=33% Similarity=0.447 Sum_probs=54.8
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIP 375 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~ 375 (385)
.+.++++++++||++||+|++|||++|.++.++.+++...+.|+.++++|+|+|+..++.
T Consensus 368 ~V~~~SpA~~aGL~~GDvI~~Ing~~V~s~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 368 KVVSGSPAARAGLQPGDVILSVNQQPVSSVAELRKVLDRAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred EeCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEE
Confidence 456788899999999999999999999999999999998888999999999999987664
No 30
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.26 E-value=2.4e-05 Score=73.31 Aligned_cols=145 Identities=25% Similarity=0.306 Sum_probs=83.3
Q ss_pred EEEEEEEcCCCEEEecccccCCCC--eEEEEecCC------------CeEee-EEEEEC-------CC-CCeEEEEEcCC
Q 016647 153 SGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ------------SAYDA-KIVGFD-------QD-KDVAVLRIDAP 209 (385)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg------------~~~~a-~vv~~d-------~~-~DlAlLkv~~~ 209 (385)
.+-|.+|+++ ||+|++||+.+.. .+.|.+... ..... +++ .+ .. +|||||+++.+
T Consensus 39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~ 116 (256)
T KOG3627|consen 39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP 116 (256)
T ss_pred eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence 6677788665 9999999999875 666766321 11111 222 22 13 79999999875
Q ss_pred C---CCCcceecCCCCC---CCCCCEEEEEeCCCCCC------CceEEeEEeeeeeeeccCCCCC---CcccEEEEc---
Q 016647 210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQTD--- 271 (385)
Q Consensus 210 ~---~~~~~l~l~~~~~---~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~~d--- 271 (385)
. ..+.++.|..... ...+..+++.||+.... .......+.-+....+...... .....+...
T Consensus 117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 196 (256)
T KOG3627|consen 117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE 196 (256)
T ss_pred cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence 3 4566777753332 34458888899864321 1222222222222212111110 011224332
Q ss_pred --cccCCCCCCceeeCCC---ccEEEEeecccC
Q 016647 272 --AAINPGNSGGPLLDSS---GSLIGINTAIYS 299 (385)
Q Consensus 272 --~~i~~G~SGGPlvn~~---G~VVGI~s~~~~ 299 (385)
...|.|+|||||+-.+ ..++||++++..
T Consensus 197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred CCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 2368899999998654 699999998754
No 31
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.22 E-value=2e-06 Score=64.11 Aligned_cols=50 Identities=30% Similarity=0.415 Sum_probs=43.7
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL 366 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~ 366 (385)
.+..+++++.+||++||+|++|||+++.++ .++.+++... .|+.++|+++
T Consensus 19 ~v~~~s~a~~~gl~~GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~~v~l~v~ 70 (70)
T cd00136 19 SVEPGSPAERAGLQAGDVILAVNGTDVKNLTLEDVAELLKKE-VGEKVTLTVR 70 (70)
T ss_pred EeCCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhhC-CCCeEEEEEC
Confidence 456778888899999999999999999999 8999999875 4899998863
No 32
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.13 E-value=7.2e-06 Score=77.42 Aligned_cols=55 Identities=20% Similarity=0.326 Sum_probs=51.2
Q ss_pred cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
...|||+||++++|||..+++.++..+++.+.+....++|+|+|||+..++.+.+
T Consensus 221 ~~~GLq~GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 221 DASGFKEGDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred HHcCCCCCCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence 4679999999999999999999999999999999999999999999999888765
No 33
>PRK10139 serine endoprotease; Provisional
Probab=98.10 E-value=7.4e-06 Score=83.65 Aligned_cols=59 Identities=17% Similarity=0.299 Sum_probs=52.7
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV 376 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v 376 (385)
.+.++++++++|||+||+|++|||++|.+++++.+++.+. + +++.|+|+|+|+.+++.+
T Consensus 396 ~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 396 EVVKGSPAAQAGLQKDDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL 454 (455)
T ss_pred EeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence 5567889999999999999999999999999999999874 3 789999999999887765
No 34
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.07 E-value=7.9e-06 Score=80.23 Aligned_cols=64 Identities=23% Similarity=0.364 Sum_probs=53.3
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
.+.++++++.+||++||+|++|||++|.++ .++...+.. ++|+.+.+++.|+|+..++++++..
T Consensus 68 ~V~~~spA~~aGL~~GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g~~v~l~v~R~g~~~~~~v~l~~ 133 (334)
T TIGR00225 68 SPFEGSPAEKAGIKPGDKIIKINGKSVAGMSLDDAVALIRG-KKGTKVSLEILRAGKSKPLTFTLKR 133 (334)
T ss_pred EeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHhccC-CCCCEEEEEEEeCCCCceEEEEEEE
Confidence 567889999999999999999999999985 567666654 5799999999999877776666543
No 35
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.01 E-value=1.5e-05 Score=79.59 Aligned_cols=61 Identities=20% Similarity=0.229 Sum_probs=51.8
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE-ECCEEEEEEEEeec
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL-RGDQKEKIPVKLEP 380 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~~~~ 380 (385)
.+.++++|+.+||++||+|++|||++|.++.|+...+. ++.+.++|. |+|+..++++...+
T Consensus 4 ~V~pgSpAe~AGLe~GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~~ 65 (433)
T TIGR03279 4 AVLPGSIAEELGFEPGDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKDL 65 (433)
T ss_pred CcCCCCHHHHcCCCCCCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecCC
Confidence 45688999999999999999999999999999887773 467899997 89988888776543
No 36
>PRK10942 serine endoprotease; Provisional
Probab=97.99 E-value=1.5e-05 Score=81.85 Aligned_cols=59 Identities=27% Similarity=0.332 Sum_probs=52.6
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPV 376 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v 376 (385)
.+.++++++.+||++||+|++|||++|.+++++.+++.. ++ +.+.|+|+|+|+.+++.+
T Consensus 414 ~V~~~S~A~~aGL~~GDvIv~VNg~~V~s~~dl~~~l~~-~~-~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 414 NVKPGTPAAQIGLKKGDVIIGANQQPVKNIAELRKILDS-KP-SVLALNIQRGDSSIYLLM 472 (473)
T ss_pred EeCCCChHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHh-CC-CeEEEEEEECCEEEEEEe
Confidence 556788999999999999999999999999999999987 33 799999999999887765
No 37
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=97.94 E-value=1.4e-05 Score=61.44 Aligned_cols=54 Identities=31% Similarity=0.399 Sum_probs=42.5
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD 369 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g 369 (385)
.+.++++++.+||++||+|++|||+++.++.+..........++.+.+++.|++
T Consensus 32 ~v~~~s~a~~~gl~~GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 32 SVVPGSPAAKAGLKVGDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred EECCCCHHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEeCC
Confidence 455778889999999999999999999987665544433345679999999975
No 38
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.90 E-value=3.3e-05 Score=77.40 Aligned_cols=62 Identities=16% Similarity=0.205 Sum_probs=52.2
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.+.+++|++++||++||+|++|||++|.++ .++...+.. ..|+.|.++|.|+|+..+++++-
T Consensus 108 ~V~~~SPA~~aGl~~GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~ltv~r~g~~~~~~l~r 171 (389)
T PLN00049 108 APAPGGPAARAGIRPGDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELTLRRGPETRLVTLTR 171 (389)
T ss_pred EeCCCChHHHcCCCCCCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEEEEECCEEEEEEEEe
Confidence 566889999999999999999999999864 677777754 57999999999999887776653
No 39
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.84 E-value=5.1e-05 Score=72.02 Aligned_cols=59 Identities=27% Similarity=0.442 Sum_probs=54.3
Q ss_pred ccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEE-CCEEEEEEEEeecC
Q 016647 323 DAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLR-GDQKEKIPVKLEPK 381 (385)
Q Consensus 323 a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~~~~~ 381 (385)
.+.+.|+.||.|+++||+++.+.+|+.+.+...++||+|++++.| +++...+++++.+.
T Consensus 142 ~~~gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~tl~~~ 201 (342)
T COG3480 142 PFKGKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTITLIKN 201 (342)
T ss_pred chhceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEEEEee
Confidence 455669999999999999999999999999999999999999997 88888899988877
No 40
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.83 E-value=2.1e-05 Score=79.56 Aligned_cols=61 Identities=25% Similarity=0.214 Sum_probs=53.3
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.+.++++++++|||+||+|+++||+++.++.++.+.+.... +++.+++.|+++...+++++
T Consensus 134 ~V~~~SpA~~AGL~~GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~l 194 (420)
T TIGR00054 134 LLDKNSIALEAGIEPGDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVMK 194 (420)
T ss_pred ccCCCCHHHHcCCCCCCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEecccc
Confidence 66789999999999999999999999999999999888765 68899999998877655443
No 41
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.74 E-value=0.00012 Score=57.44 Aligned_cols=55 Identities=27% Similarity=0.419 Sum_probs=38.1
Q ss_pred cccccccC--CCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC-EEEEEE
Q 016647 320 TKRDAYGR--LILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD-QKEKIP 375 (385)
Q Consensus 320 ~~~a~~~g--l~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g-~~~~~~ 375 (385)
.+|-...| +++||+|++|||+++....++..+|.. +.|+.|.|+|.+++ +.+++.
T Consensus 30 ~sPL~~pGv~v~~GD~I~aInG~~v~~~~~~~~lL~~-~agk~V~Ltv~~~~~~~R~v~ 87 (88)
T PF14685_consen 30 RSPLAQPGVDVREGDYILAINGQPVTADANPYRLLEG-KAGKQVLLTVNRKPGGARTVV 87 (88)
T ss_dssp B-GGGGGS----TT-EEEEETTEE-BTTB-HHHHHHT-TTTSEEEEEEE-STT-EEEEE
T ss_pred cCCccCCCCCCCCCCEEEEECCEECCCCCCHHHHhcc-cCCCEEEEEEecCCCCceEEE
Confidence 34444444 579999999999999999999999987 68999999999976 455554
No 42
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67 E-value=0.00011 Score=73.93 Aligned_cols=62 Identities=19% Similarity=0.353 Sum_probs=50.1
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEEEC--CEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVLRG--DQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~R~--g~~~~~~v~~ 378 (385)
....++|++++||++||+|++|||+++..+ ++..+.+.. ++|..|+|+|.|. ++.+.++++-
T Consensus 118 s~~~~~PA~kagi~~GD~I~~IdG~~~~~~~~~~av~~irG-~~Gt~V~L~i~r~~~~k~~~v~l~R 183 (406)
T COG0793 118 SPIDGSPAAKAGIKPGDVIIKIDGKSVGGVSLDEAVKLIRG-KPGTKVTLTILRAGGGKPFTVTLTR 183 (406)
T ss_pred ecCCCChHHHcCCCCCCEEEEECCEEccCCCHHHHHHHhCC-CCCCeEEEEEEEcCCCceeEEEEEE
Confidence 446789999999999999999999999976 456666665 6899999999997 4555655543
No 43
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=97.66 E-value=9.2e-05 Score=56.67 Aligned_cols=48 Identities=29% Similarity=0.370 Sum_probs=39.7
Q ss_pred cccccccccccCCCCCcEEEEECCEEeC--CHHHHHHHHhcCCCCCEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKVGDEVIVEV 365 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~--s~~~l~~~l~~~~~g~~v~l~v 365 (385)
.+..+++++.++|++||+|++|||+++. +..++.+++.... ..+.+++
T Consensus 32 ~v~~~s~a~~~gl~~GD~I~~ing~~i~~~~~~~~~~~l~~~~--~~v~l~v 81 (82)
T cd00992 32 RVEPGGPAERGGLRVGDRILEVNGVSVEGLTHEEAVELLKNSG--DEVTLTV 81 (82)
T ss_pred EECCCChHHhCCCCCCCEEEEECCEEcCccCHHHHHHHHHhCC--CeEEEEE
Confidence 4567788888999999999999999999 8899999887632 3666655
No 44
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.62 E-value=0.00049 Score=63.88 Aligned_cols=116 Identities=26% Similarity=0.373 Sum_probs=61.8
Q ss_pred CeEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCC
Q 016647 151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ 228 (385)
Q Consensus 151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~ 228 (385)
...|||=+...+| .|+|+.||+. .+...|... +..... -++..-|+|.-.++.-.-.+|.++++.. ..|-
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~~-g~~~~~---tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr 182 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSGV-GTRRML---TFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR 182 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEET-TEEEEE---EEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred ecccccceEEECCeEEEEEEEEEcC-CCeEEEEec-ceEEEE---EEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence 3456665555444 5999999998 444444443 323222 3445669999999543335666666521 2232
Q ss_pred EEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecc
Q 016647 229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (385)
Q Consensus 229 ~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~ 297 (385)
--+.- ..-+..|.|... ..+. -..+|+||+|++..+|.+|||++..
T Consensus 183 AyW~t------~tGvE~G~ig~~--------------~~~~---fT~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 183 AYWLT------STGVEPGFIGGG--------------GAVC---FTGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp EEEEE------TTEEEEEEEETT--------------EEEE---SS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred eEEEc------ccCcccceecCc--------------eEEE---EcCCCCCCCccCcCCCCEEEEEecC
Confidence 22211 123455555421 1122 2347999999999999999999975
No 45
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=97.50 E-value=7.8e-05 Score=57.34 Aligned_cols=49 Identities=20% Similarity=0.358 Sum_probs=39.5
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIVEVL 366 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l~v~ 366 (385)
.+.++++++.+||++||.|++|||+.+.++ .+..+++... +++++|+|+
T Consensus 31 ~v~~~~~a~~~gl~~GD~Il~INg~~v~~~~~~~~~~~l~~~--~~~v~L~V~ 81 (81)
T PF00595_consen 31 SVVPGSPAERAGLKVGDRILEINGQSVRGMSHDEVVQLLKSA--SNPVTLTVQ 81 (81)
T ss_dssp EECTTSHHHHHTSSTTEEEEEETTEESTTSBHHHHHHHHHHS--TSEEEEEEE
T ss_pred EEeCCChHHhcccchhhhhheeCCEeCCCCCHHHHHHHHHCC--CCcEEEEEC
Confidence 556788888889999999999999999976 4666777664 348888874
No 46
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.42 E-value=0.00025 Score=64.74 Aligned_cols=54 Identities=22% Similarity=0.404 Sum_probs=49.5
Q ss_pred cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Q 016647 324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVK 377 (385)
Q Consensus 324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 377 (385)
...|||.|||.++||+..+++.+++..++.....-+.++++|.|+|+..++.|.
T Consensus 221 ~~sglq~GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 221 YKSGLQRGDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred hhhcCCCcceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 456899999999999999999999999999888888999999999999988875
No 47
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.09 E-value=0.022 Score=55.28 Aligned_cols=56 Identities=21% Similarity=0.254 Sum_probs=34.8
Q ss_pred EEEEEEcCCCEEEecccccCCCCe-----EEE--EecC---CCeEeeEEEEEC-------CCCCeEEEEEcCCC
Q 016647 154 GSGFVWDSKGHVVTNYHVIRGASD-----IRV--TFAD---QSAYDAKIVGFD-------QDKDVAVLRIDAPK 210 (385)
Q Consensus 154 GSGfiI~~~G~ILT~aHvv~~~~~-----i~V--~~~d---g~~~~a~vv~~d-------~~~DlAlLkv~~~~ 210 (385)
|=|-+++.+ ||||+|||+.+..- +.| .+.| ++...++.+..+ ...|+|++++....
T Consensus 63 CGgs~l~~R-YvLTAAHC~~~~s~is~d~~~vv~~l~d~Sq~~rg~vr~i~~~efY~~~n~~ND~Av~~l~~~a 135 (413)
T COG5640 63 CGGSKLGGR-YVLTAAHCADASSPISSDVNRVVVDLNDSSQAERGHVRTIYVHEFYSPGNLGNDIAVLELARAA 135 (413)
T ss_pred eccceecce-EEeeehhhccCCCCccccceEEEecccccccccCcceEEEeeecccccccccCcceeecccccc
Confidence 446777777 99999999986552 222 2233 233334433332 45699999998753
No 48
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.03 E-value=0.0014 Score=69.65 Aligned_cols=61 Identities=23% Similarity=0.396 Sum_probs=47.3
Q ss_pred ccccccccccc-CCCCCcEEEEEC--CEEeCC-----HHHHHHHHhcCCCCCEEEEEEEEC---CEEEEEEEE
Q 016647 316 GLLSTKRDAYG-RLILGDIITSVN--GKKVSN-----GSDLYRILDQCKVGDEVIVEVLRG---DQKEKIPVK 377 (385)
Q Consensus 316 ~v~~~~~a~~~-gl~~GDiI~~in--g~~i~s-----~~~l~~~l~~~~~g~~v~l~v~R~---g~~~~~~v~ 377 (385)
.+.+++||+++ ||++||+|++|| |+++.+ .+++.+++.. +.|.+|.|+|.|+ ++..+++++
T Consensus 261 ~vipGsPA~ka~gLk~GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG-~~Gt~V~LtV~r~~~~~~~~~vtl~ 332 (667)
T PRK11186 261 SLVAGGPAAKSKKLSVGDKIVGVGQDGKPIVDVIGWRLDDVVALIKG-PKGSKVRLEILPAGKGTKTRIVTLT 332 (667)
T ss_pred EccCCChHHHhCCCCCCCEEEEECCCCCcccccccCCHHHHHHHhcC-CCCCEEEEEEEeCCCCCceEEEEEE
Confidence 56789999987 999999999998 565543 3477777765 6899999999993 455666655
No 49
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.79 E-value=0.0024 Score=57.24 Aligned_cols=65 Identities=20% Similarity=0.163 Sum_probs=53.2
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHH--HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYR--ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~--~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
.|.+++|++.+||+.||.|++++...--+...|++ .+.+...++.+.+++.|.|+...+.++...
T Consensus 145 sV~~~SPA~~aGl~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~ltP~~ 211 (231)
T KOG3129|consen 145 SVVPGSPADEAGLCVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLTPKK 211 (231)
T ss_pred ecCCCChhhhhCcccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeCccc
Confidence 67799999999999999999998877666665554 333456889999999999999999887643
No 50
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.72 E-value=0.11 Score=49.45 Aligned_cols=91 Identities=19% Similarity=0.227 Sum_probs=57.3
Q ss_pred CCCCeEEEEEcCC-CCCCcceecCCCC-CCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccc
Q 016647 197 QDKDVAVLRIDAP-KDKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (385)
Q Consensus 197 ~~~DlAlLkv~~~-~~~~~~l~l~~~~-~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i 274 (385)
...+++||.++.+ .....++.|+++. ....|+.+.+.|+. .........+.-.... .....+......
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~--~~~~~~~~~~~i~~~~--------~~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFN--STGKLKHRKLKITNCT--------KCAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecC--CCCeEEEEEEEEEEee--------ccceeEeccccc
Confidence 4569999999886 2367888887654 36788999999882 1122222222211110 012345555667
Q ss_pred CCCCCCceee---CCCccEEEEeecc
Q 016647 275 NPGNSGGPLL---DSSGSLIGINTAI 297 (385)
Q Consensus 275 ~~G~SGGPlv---n~~G~VVGI~s~~ 297 (385)
+.|++|||++ |.+..||||.+..
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccC
Confidence 7999999997 4445699998754
No 51
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=96.68 E-value=0.023 Score=51.60 Aligned_cols=164 Identities=16% Similarity=0.254 Sum_probs=84.0
Q ss_pred ccCCCeEEEEEEEcCC-CEEEecccccCCCCe-EEEEecCCCeEeeEEEEECC----------------CCCeEEEEEc-
Q 016647 147 LEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQ----------------DKDVAVLRID- 207 (385)
Q Consensus 147 ~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~----------------~~DlAlLkv~- 207 (385)
.....+.||=.+++++ +..--=.|.+.+.+. -.+.+.+|+.+++++..+.+ ..-+.-+.-+
T Consensus 15 RD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt 94 (218)
T PF05580_consen 15 RDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNT 94 (218)
T ss_pred EeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEecc
Confidence 3445678888889874 555555888876654 45667788888877665432 1112222221
Q ss_pred ---------CC----CCCCcceecCCCCCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCC----CcccEEEE
Q 016647 208 ---------AP----KDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGR----PIQDVIQT 270 (385)
Q Consensus 208 ---------~~----~~~~~~l~l~~~~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~----~~~~~i~~ 270 (385)
.. ....++++++...++++|..-+..-........+.--++. +.+......... ....++..
T Consensus 95 ~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~~G~~ie~f~ieI~~-v~~~~~~~~k~~vi~vtd~~Ll~~ 173 (218)
T PF05580_consen 95 QFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVIDGTKIEEFDIEIEK-VLPQSSPSGKGMVIKVTDPRLLEK 173 (218)
T ss_pred ccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEEcCCeEEEeEEEEEE-EccCCCCCCCcEEEEECCcchhhh
Confidence 11 0123445555555666675432211110001111111111 111100000000 00122233
Q ss_pred ccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc
Q 016647 271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (385)
Q Consensus 271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~ 315 (385)
...+..||||+|++ .+|++||-++..+- .....||.+++.+
T Consensus 174 TGGIvqGMSGSPI~-qdGKLiGAVthvf~---~dp~~Gygi~ie~ 214 (218)
T PF05580_consen 174 TGGIVQGMSGSPII-QDGKLIGAVTHVFV---NDPTKGYGIFIEW 214 (218)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEe---cCCCceeeecHHH
Confidence 34567899999999 69999998887664 2346788887753
No 52
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=96.60 E-value=0.0027 Score=54.27 Aligned_cols=62 Identities=13% Similarity=0.228 Sum_probs=45.8
Q ss_pred cccccccccccCCCC-CcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECC--EEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLIL-GDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGD--QKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~-GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g--~~~~~~v~~ 378 (385)
.|.+++||+.+||++ .|.|+.+|+..+.+.++|.+.+.. ..+..+.|.|++.. +.+++++++
T Consensus 49 ~V~p~SPA~~AGL~p~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~l~L~Vyns~~~~vR~V~i~P 113 (138)
T PF04495_consen 49 RVAPNSPAAKAGLEPFFDYIIGIDGGLLDDEDDLFELVEA-NENKPLQLYVYNSKTDSVREVTITP 113 (138)
T ss_dssp EE-TTSHHHHTT--TTTEEEEEETTCE--STCHHHHHHHH-TTTS-EEEEEEETTTTCEEEEEE--
T ss_pred EecCCCHHHHCCccccccEEEEccceecCCHHHHHHHHHH-cCCCcEEEEEEECCCCeEEEEEEEc
Confidence 678999999999999 699999999999999999999987 47899999999844 445555544
No 53
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.19 E-value=0.0048 Score=62.60 Aligned_cols=59 Identities=31% Similarity=0.370 Sum_probs=51.5
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEeecCC
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKLEPKP 382 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~~~ 382 (385)
.|.+++|+..+||.+||.|++|||. .+.+..+++++.+++++.|.|..+++.|++...+
T Consensus 468 ~V~~~gPA~~AGl~~Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~~~~~ 526 (558)
T COG3975 468 FVFPGGPAYKAGLSPGDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKLGGDP 526 (558)
T ss_pred ecCCCChhHhccCCCccEEEEEcCc--------cccccccccccceEEEEccCCceEEeecccCCCc
Confidence 4667899999999999999999999 4566778999999999999999999988876543
No 54
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.17 E-value=0.049 Score=48.32 Aligned_cols=137 Identities=18% Similarity=0.262 Sum_probs=76.7
Q ss_pred CeEEEEEEEcCCCEEEecccccCCCCeEEEEecCCCeEee--EEEEECC---CCCeEEEEEcCCCCCCccee-cCCCCCC
Q 016647 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDA--KIVGFDQ---DKDVAVLRIDAPKDKLRPIP-IGVSADL 224 (385)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a--~vv~~d~---~~DlAlLkv~~~~~~~~~l~-l~~~~~~ 224 (385)
...++++.|-++ ++|...| -.... ++.+ +|+.++. .+...+. ..||++++++... +++-+. +-.....
T Consensus 24 ~~t~l~~gi~~~-~~lvp~H-~~~~~--~i~i-~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~~~ 97 (172)
T PF00548_consen 24 EFTMLALGIYDR-YFLVPTH-EEPED--TIYI-DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPESIP 97 (172)
T ss_dssp EEEEEEEEEEBT-EEEEEGG-GGGCS--EEEE-TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSSGG
T ss_pred eEEEecceEeee-EEEEECc-CCCcE--EEEE-CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccccc
Confidence 457888888766 9999999 22222 3333 3444433 2223443 4599999997643 232221 1001112
Q ss_pred CCCCEEEEEeCCCCCCC-ceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC---CCccEEEEeecc
Q 016647 225 LVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD---SSGSLIGINTAI 297 (385)
Q Consensus 225 ~~G~~V~~iG~p~g~~~-~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn---~~G~VVGI~s~~ 297 (385)
...+...++=.. .... ....+.+...... .. .+......+.++++..+|+-||||+. ..++++||+.++
T Consensus 98 ~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 98 EYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCCcEEEEECC-CCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 334444444332 3332 2344444433332 11 22334577888999999999999984 367999999875
No 55
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.79 E-value=0.014 Score=60.99 Aligned_cols=64 Identities=25% Similarity=0.424 Sum_probs=51.7
Q ss_pred CCCCceeeecccc--------ccccccccccc-CCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEEC
Q 016647 303 ASSGVGFSIPVDT--------GLLSTKRDAYG-RLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRG 368 (385)
Q Consensus 303 ~~~~~g~aIP~~~--------~v~~~~~a~~~-gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~ 368 (385)
++.||||.|=.+. .+..++||+.. .|+.||-|+++||+.|.++. |+..+++. .|-+|+|+|.-.
T Consensus 763 ENeGFGFVi~sS~~kp~sgiGrIieGSPAdRCgkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~ 837 (984)
T KOG3209|consen 763 ENEGFGFVIMSSQNKPESGIGRIIEGSPADRCGKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPP 837 (984)
T ss_pred cCCceeEEEEecccCCCCCccccccCChhHhhccccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcCh
Confidence 4678999886655 46778888775 59999999999999999875 67777765 699999998754
No 56
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases (ndl, gd and snk) process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment, which induces dorsoventral polarity of the Drosophila embryo.
Probab=95.60 E-value=0.095 Score=48.51 Aligned_cols=96 Identities=17% Similarity=0.249 Sum_probs=65.6
Q ss_pred ccccccccCCCeEEEEEEEcCCCEEEecccccCCC----CeEEEEecCCCeEe------eEEEEEC-----CCCCeEEEE
Q 016647 141 AFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGA----SDIRVTFADQSAYD------AKIVGFD-----QDKDVAVLR 205 (385)
Q Consensus 141 ~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~----~~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLk 205 (385)
||..+...++...++|++||++ |||++-.|+.+- .-+.+.++.++.+. -++..+| ++.++.+|.
T Consensus 17 PWlA~IYvdG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLH 95 (267)
T PF09342_consen 17 PWLADIYVDGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLH 95 (267)
T ss_pred cceeeEEEcCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeee
Confidence 3444555567889999999987 999999999863 34777788777543 1344444 678999999
Q ss_pred EcCCC---CCCcceecCC-CCCCCCCCEEEEEeCCC
Q 016647 206 IDAPK---DKLRPIPIGV-SADLLVGQKVYAIGNPF 237 (385)
Q Consensus 206 v~~~~---~~~~~l~l~~-~~~~~~G~~V~~iG~p~ 237 (385)
++.+. ..+.|.-+.. .......+..+++|.-.
T Consensus 96 L~~~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 96 LEQPANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred ecCcccceeeecccccccccCCCCCCCceEEEEccc
Confidence 98764 2334444433 23345556888998653
No 57
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.54 E-value=0.013 Score=49.54 Aligned_cols=32 Identities=22% Similarity=0.456 Sum_probs=22.4
Q ss_pred EEEccccCCCCCCceeeCCCccEEEEeecccC
Q 016647 268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (385)
Q Consensus 268 i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~ 299 (385)
...+....+|.||+|+||.+|++|||......
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE 119 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence 34444567999999999999999999876543
No 58
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.35 E-value=0.012 Score=62.91 Aligned_cols=21 Identities=33% Similarity=0.272 Sum_probs=18.9
Q ss_pred EEEEEEEcCCCEEEecccccC
Q 016647 153 SGSGFVWDSKGHVVTNYHVIR 173 (385)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~ 173 (385)
-|||-+|+++|+|+||.||..
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred ceeEEEEcCCceEEecchhhh
Confidence 599999999999999999964
No 59
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.04 E-value=0.16 Score=53.30 Aligned_cols=109 Identities=21% Similarity=0.428 Sum_probs=65.9
Q ss_pred CCCeEEEEEcCCC-------CCC------cceecCCC------CCCCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccC
Q 016647 198 DKDVAVLRIDAPK-------DKL------RPIPIGVS------ADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSA 258 (385)
Q Consensus 198 ~~DlAlLkv~~~~-------~~~------~~l~l~~~------~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~ 258 (385)
-.|+|||+++... +.+ |.+.+.+. ..+..|..|+-+|.-.+ .+.|.+.++.-.. +
T Consensus 542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--w 615 (695)
T PF08192_consen 542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--W 615 (695)
T ss_pred ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--e
Confidence 4599999997532 111 22333221 23567999999987644 4677777654322 1
Q ss_pred CCCCC-cccEEEEc----cccCCCCCCceeeCCCcc------EEEEeecccCCCCCCCCceeeecccc
Q 016647 259 ATGRP-IQDVIQTD----AAINPGNSGGPLLDSSGS------LIGINTAIYSPSGASSGVGFSIPVDT 315 (385)
Q Consensus 259 ~~~~~-~~~~i~~d----~~i~~G~SGGPlvn~~G~------VVGI~s~~~~~~~~~~~~g~aIP~~~ 315 (385)
..+.. ..+++... .-...|+||+-|++.-+. |+||.++.- |+...+|...|+..
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsyd---ge~kqfglftPi~~ 680 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYD---GEQKQFGLFTPINE 680 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecC---CccceeeccCcHHH
Confidence 12211 12333333 234589999999985444 999998753 34557888888764
No 60
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.13 E-value=0.074 Score=54.72 Aligned_cols=73 Identities=23% Similarity=0.234 Sum_probs=53.2
Q ss_pred CCccEEEEeecccCCCCCCCCceeeecccccccccccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEE
Q 016647 286 SSGSLIGINTAIYSPSGASSGVGFSIPVDTGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIV 363 (385)
Q Consensus 286 ~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l 363 (385)
.+|.-||+--++-. -.|+.+ .||+.+++++..||+.||.|+++|..+..+.- +...+|-...+|+.|+|
T Consensus 414 ~KGdSvGLRLAGGN------DVGIFV---aGvqegspA~~eGlqEGDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevti 484 (1027)
T KOG3580|consen 414 KKGDSVGLRLAGGN------DVGIFV---AGVQEGSPAEQEGLQEGDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTI 484 (1027)
T ss_pred ecCCeeeeEeccCC------ceeEEE---eecccCCchhhccccccceeEEeccccchhhhHHHHHHHHhcCCCCcEEee
Confidence 46777776544311 223322 27889999999999999999999999999854 44455667899999999
Q ss_pred EEEE
Q 016647 364 EVLR 367 (385)
Q Consensus 364 ~v~R 367 (385)
.-++
T Consensus 485 laQ~ 488 (1027)
T KOG3580|consen 485 LAQS 488 (1027)
T ss_pred hhhh
Confidence 6544
No 61
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.06 E-value=0.18 Score=45.94 Aligned_cols=144 Identities=15% Similarity=0.170 Sum_probs=50.1
Q ss_pred eEEEEEEEcCCC--EEEecccccCCCCeEEEEecCCCeEee---EEEEECCCCCeEEEEEcCCC---CCCcceecCCCCC
Q 016647 152 GSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDA---KIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSAD 223 (385)
Q Consensus 152 ~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a---~vv~~d~~~DlAlLkv~~~~---~~~~~l~l~~~~~ 223 (385)
+.++.+-. .+| .++|+.||..+...+. ...+|+.++- +.+..+...|++||++.... ...+.+.+.....
T Consensus 30 Gya~cv~l-~~g~~~L~ta~Hv~~~~~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~ 107 (203)
T PF02122_consen 30 GYATCVRL-FDGEDALLTARHVWSRPSKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQ 107 (203)
T ss_dssp ----EEEE-----EEEEE-HHHHTSSS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----S
T ss_pred ccceEEEC-cCCccceecccccCCCcccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhh
Confidence 44455432 233 6999999998855543 3344544432 34456788999999997321 1223333322111
Q ss_pred CCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeCCCccEEEEeecccCCCCC
Q 016647 224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGA 303 (385)
Q Consensus 224 ~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~ 303 (385)
+..| .+ ..+....+........+.. ....+...-+...+|.||.|+++.+ ++||++... .....
T Consensus 108 ~~~g----~~-----~~y~~~~~~~~~~sa~i~g-----~~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~ 171 (203)
T PF02122_consen 108 LAKG----PV-----SFYGFSSGEWPCSSAKIPG-----TEGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSN 171 (203)
T ss_dssp EEEE----ES-----STTSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE------
T ss_pred hCCC----Ce-----eeeeecCCCceeccCcccc-----ccCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-ccccc
Confidence 1100 01 1112222222211111111 1123556667788999999999987 999999875 22233
Q ss_pred CCCceeeecc
Q 016647 304 SSGVGFSIPV 313 (385)
Q Consensus 304 ~~~~g~aIP~ 313 (385)
....++..|+
T Consensus 172 ~~n~n~~spi 181 (203)
T PF02122_consen 172 RENNNRMSPI 181 (203)
T ss_dssp ----------
T ss_pred cccccccccc
Confidence 4455555444
No 62
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=93.80 E-value=0.46 Score=47.63 Aligned_cols=41 Identities=24% Similarity=0.552 Sum_probs=29.1
Q ss_pred ccccCCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccc
Q 016647 271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (385)
Q Consensus 271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~ 315 (385)
...+..||||+|++ .+|++||=++-.+-. ....||.|-+.+
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~ie~ 394 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYIEW 394 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehHHH
Confidence 34566899999999 799999977655432 335677765543
No 63
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=93.02 E-value=0.16 Score=52.32 Aligned_cols=52 Identities=31% Similarity=0.453 Sum_probs=40.1
Q ss_pred ccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647 323 DAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQKEKIPV 376 (385)
Q Consensus 323 a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v 376 (385)
+..++|+.||+|++|||.--.|+. |..+++... -.++.|.|+||.+..-+++
T Consensus 233 ardgnlqEGDiiLkINGtvteNmSLtDar~LIEkS--~GKL~lvVlRD~~qtLiNi 286 (1027)
T KOG3580|consen 233 ARDGNLQEGDIILKINGTVTENMSLTDARKLIEKS--RGKLQLVVLRDSQQTLINI 286 (1027)
T ss_pred hccCCcccccEEEEECcEeeccccchhHHHHHHhc--cCceEEEEEecCCceeeec
Confidence 456789999999999998888754 777887753 3467999999976655554
No 64
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=92.86 E-value=0.053 Score=58.13 Aligned_cols=28 Identities=36% Similarity=0.719 Sum_probs=15.0
Q ss_pred EEEccccCCCCCCceeeCCCccEEEEee
Q 016647 268 IQTDAAINPGNSGGPLLDSSGSLIGINT 295 (385)
Q Consensus 268 i~~d~~i~~G~SGGPlvn~~G~VVGI~s 295 (385)
+.++..+..||||+|++|.+|||||+++
T Consensus 624 FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 624 FLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred EEeccCcCCCCCCCccCCCCceEEEEee
Confidence 3444455555555555555555555554
No 65
>PF12812 PDZ_1: PDZ-like domain
Probab=92.56 E-value=0.15 Score=39.09 Aligned_cols=33 Identities=33% Similarity=0.443 Sum_probs=28.7
Q ss_pred cccCCCCCcEEEEECCEEeCCHHHHHHHHhcCC
Q 016647 324 AYGRLILGDIITSVNGKKVSNGSDLYRILDQCK 356 (385)
Q Consensus 324 ~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~ 356 (385)
...++..|-+|++|||+++.+.+++.+.+++.+
T Consensus 44 ~~~~i~~g~iI~~Vn~kpt~~Ld~f~~vvk~ip 76 (78)
T PF12812_consen 44 FAGGISKGFIITSVNGKPTPDLDDFIKVVKKIP 76 (78)
T ss_pred hhCCCCCCeEEEeECCcCCcCHHHHHHHHHhCC
Confidence 334499999999999999999999999998753
No 66
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=92.41 E-value=0.39 Score=47.60 Aligned_cols=56 Identities=29% Similarity=0.421 Sum_probs=47.9
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE---EEEEEEE-CCEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE---VIVEVLR-GDQKE 372 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~---v~l~v~R-~g~~~ 372 (385)
++...++++.+++++||.|+++|++++.++.+..+.+... .|.. +.+.+.| +++..
T Consensus 135 ~v~~~s~a~~a~l~~Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~~~~~~i~~~~~~~~~~ 194 (375)
T COG0750 135 EVAPKSAAALAGLRPGDRIVAVDGEKVASWDDVRRLLVAA-AGDVFNLLTILVIRLDGEAH 194 (375)
T ss_pred ecCCCCHHHHcCCCCCCEEEeECCEEccCHHHHHHHHHhc-cCCcccceEEEEEeccceee
Confidence 5678888999999999999999999999999998888764 5565 8999999 77663
No 67
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=92.33 E-value=0.12 Score=43.38 Aligned_cols=28 Identities=32% Similarity=0.565 Sum_probs=23.2
Q ss_pred ccccCCCCCCceeeCCCccEEEEeeccc
Q 016647 271 DAAINPGNSGGPLLDSSGSLIGINTAIY 298 (385)
Q Consensus 271 d~~i~~G~SGGPlvn~~G~VVGI~s~~~ 298 (385)
...-.+|+||-|++|..|+||||+-.+.
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 3445689999999999999999998764
No 68
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=92.20 E-value=0.24 Score=52.17 Aligned_cols=68 Identities=25% Similarity=0.417 Sum_probs=51.2
Q ss_pred CCCCceeeecccccccc----cccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEECCE
Q 016647 303 ASSGVGFSIPVDTGLLS----TKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLRGDQ 370 (385)
Q Consensus 303 ~~~~~g~aIP~~~~v~~----~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R~g~ 370 (385)
+.+++||+|--...... -.+..--||++||.|++||++.+.... ++-+++.+...|..+.|.|.|+|-
T Consensus 493 gpegfgftiADsPtgqrvK~ilDp~~c~gl~eGd~IVei~~rnvr~L~h~qvvdmlke~piG~r~~Llv~RGgp 566 (984)
T KOG3209|consen 493 GPEGFGFTIADSPTGQRVKQILDPQDCPGLSEGDLIVEINERNVRALTHTQVVDMLKECPIGSRVHLLVKRGGP 566 (984)
T ss_pred CCCCCCceeccCCCCCceeeecCcccCCCCCCCCeEEecccccccccchHHHHHHHHhccCCcceeEEEecCCC
Confidence 35678888744332211 122334589999999999999999765 677899999999999999999874
No 69
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=89.06 E-value=1.4 Score=46.28 Aligned_cols=79 Identities=23% Similarity=0.333 Sum_probs=57.4
Q ss_pred CCccEEEEeecccCCCCCCCCceeeecccc--ccccccccccc-CCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCE
Q 016647 286 SSGSLIGINTAIYSPSGASSGVGFSIPVDT--GLLSTKRDAYG-RLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDE 360 (385)
Q Consensus 286 ~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~--~v~~~~~a~~~-gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~ 360 (385)
.+||.+||+-- ..|.|=.+|.-. .+..++++++. .|..||.|++|||...-- ...-+.+++..|.-..
T Consensus 654 ~kGEiLGVViV-------ESGWGSmLPTVViAnmm~~GpAarsgkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~ 726 (829)
T KOG3605|consen 654 HKGEILGVVIV-------ESGWGSILPTVVIANMMHGGPAARSGKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTA 726 (829)
T ss_pred ccCceeeEEEE-------ecCccccchHHHHHhcccCChhhhcCCccccceeEeecCceeccccHHHHHHHHhcccccce
Confidence 57899998753 346666677644 56677887765 599999999999988764 3466677888777777
Q ss_pred EEEEEEECCEE
Q 016647 361 VIVEVLRGDQK 371 (385)
Q Consensus 361 v~l~v~R~g~~ 371 (385)
|+++|.+=--.
T Consensus 727 VkltiV~cpPV 737 (829)
T KOG3605|consen 727 VKLNIVSCPPV 737 (829)
T ss_pred EEEEEecCCCc
Confidence 88877764333
No 70
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=88.77 E-value=0.57 Score=49.17 Aligned_cols=47 Identities=15% Similarity=0.280 Sum_probs=39.6
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIV 363 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l 363 (385)
.|.+.+++.++.+++||++++|||.+|++..+..+.+... .|+...+
T Consensus 404 tv~~ns~a~k~~~~~gdvlvai~~~pi~s~~q~~~~~~s~-~~~~~~l 450 (1051)
T KOG3532|consen 404 TVEDNSLADKAAFKPGDVLVAINNVPIRSERQATRFLQST-TGDLTVL 450 (1051)
T ss_pred EecCCChhhHhcCCCcceEEEecCccchhHHHHHHHHHhc-ccceEEE
Confidence 4567889999999999999999999999999999999876 3544333
No 71
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=88.61 E-value=0.5 Score=51.22 Aligned_cols=45 Identities=27% Similarity=0.452 Sum_probs=35.7
Q ss_pred ccccccCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647 321 KRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR 367 (385)
Q Consensus 321 ~~a~~~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R 367 (385)
+....+.|++||.|+.|||++|.. ++.+.++++.+ .+.|.|+|.+
T Consensus 85 GGps~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e~sv~ltV~q 131 (1298)
T KOG3552|consen 85 GGPSIGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--ESSVNLTVCQ 131 (1298)
T ss_pred CCCccccccCCCeEEEecCcccccccHHHHHHHHHHH--hhhcceEEec
Confidence 335567799999999999999996 55677888764 4678888877
No 72
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=83.67 E-value=0.62 Score=37.09 Aligned_cols=28 Identities=21% Similarity=0.197 Sum_probs=25.0
Q ss_pred cccccccccccCCCCCcEEEEECCEEeC
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVS 343 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~ 343 (385)
.+..++|++.+||+.+|.|+.+||-..+
T Consensus 65 ~V~eGsPA~~AGLrihDKIlQvNG~DfT 92 (124)
T KOG3553|consen 65 RVSEGSPAEIAGLRIHDKILQVNGWDFT 92 (124)
T ss_pred EeccCChhhhhcceecceEEEecCceeE
Confidence 5667899999999999999999997766
No 73
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=80.55 E-value=0.86 Score=38.39 Aligned_cols=114 Identities=21% Similarity=0.271 Sum_probs=55.3
Q ss_pred EEEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEEe
Q 016647 155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIG 234 (385)
Q Consensus 155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~iG 234 (385)
-|+.|+ |..-|.+|--... ++--+. -+..-.+.+...|+..-....-...+.|-.-+ -+.++++-
T Consensus 15 mgt~vn--GV~wT~~HGagsr---tlAgp~---Gpv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg-------~~dlylVt 79 (148)
T PF02907_consen 15 MGTCVN--GVMWTVYHGAGSR---TLAGPK---GPVNQMYTSVDDDLVGWPAPPGARSLTPCTCG-------SSDLYLVT 79 (148)
T ss_dssp EEEEET--TEEEEEHHHHTTS---EEEBTT---SEB-ESEEETTTTEEEEE-STTB--BBB-SSS-------SSEEEEE-
T ss_pred ehhEEc--cEEEEEEecCCcc---cccCCC---CcceEeEEcCCCCCcccccccccccCCccccC-------CccEEEEe
Confidence 367775 7888888864321 111111 12334456777898887775433334433332 24566664
Q ss_pred CCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEE--ccccCCCCCCceeeCCCccEEEEeecccC
Q 016647 235 NPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQT--DAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (385)
Q Consensus 235 ~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~--d~~i~~G~SGGPlvn~~G~VVGI~s~~~~ 299 (385)
+- ..+-.+ ++. +.. ...+.. -.....|.||||++-.+|.+|||..+...
T Consensus 80 r~----~~v~p~-----rr~------gd~-~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~ 130 (148)
T PF02907_consen 80 RD----ADVIPV-----RRR------GDS-RASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC 130 (148)
T ss_dssp TT----S-EEEE-----EEE------STT-EEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred cc----CcEeee-----EEc------CCC-ceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence 32 111111 111 000 001111 11234799999999999999999776544
No 74
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=79.65 E-value=3.7 Score=35.26 Aligned_cols=48 Identities=27% Similarity=0.320 Sum_probs=31.1
Q ss_pred cccccccc-cccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 016647 316 GLLSTKRD-AYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV 365 (385)
Q Consensus 316 ~v~~~~~a-~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v 365 (385)
.+.+++.+ -.+|||.||.++++||..|.--. -..++|... . ..|++.|
T Consensus 121 riipggvadrhgglkrgdqllsvngvsvege~hekavellkaa-~-gsvklvv 171 (207)
T KOG3550|consen 121 RIIPGGVADRHGGLKRGDQLLSVNGVSVEGEHHEKAVELLKAA-V-GSVKLVV 171 (207)
T ss_pred eecCCccccccCcccccceeEeecceeecchhhHHHHHHHHHh-c-CcEEEEE
Confidence 34566554 56799999999999999997432 233344432 2 3467655
No 75
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=78.92 E-value=10 Score=30.69 Aligned_cols=52 Identities=23% Similarity=0.346 Sum_probs=34.0
Q ss_pred EEEEcCCCEEEecccccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCCCCCcceecC
Q 016647 156 GFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIG 219 (385)
Q Consensus 156 GfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~ 219 (385)
++-|. +|.++|+.||.+..+.+. |..+ +++. ...|+++++.+.. .++..+++
T Consensus 3 avHIG-nG~~vt~tHva~~~~~v~-----g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig 54 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSSDSVD-----GQPF--KIVK--TDGELCWVQSPLV--HLPAAQIG 54 (105)
T ss_pred eEEeC-CCEEEEEEEEeccCceEc-----CcCc--EEEE--eccCEEEEECCCC--CCCeeEec
Confidence 55675 689999999998876542 2222 2323 3459999999763 35556664
No 76
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=75.58 E-value=2.2 Score=42.53 Aligned_cols=24 Identities=25% Similarity=0.504 Sum_probs=21.3
Q ss_pred cccCCCCCCceeeCCCccEEEEee
Q 016647 272 AAINPGNSGGPLLDSSGSLIGINT 295 (385)
Q Consensus 272 ~~i~~G~SGGPlvn~~G~VVGI~s 295 (385)
..+..|.||+.++|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 356689999999999999999975
No 77
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=75.58 E-value=9.1 Score=37.41 Aligned_cols=47 Identities=32% Similarity=0.457 Sum_probs=35.6
Q ss_pred ccccccccC-CCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647 319 STKRDAYGR-LILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR 367 (385)
Q Consensus 319 ~~~~a~~~g-l~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R 367 (385)
.+..++..| |=.||-|++|||..|+. -+|+-.+|++ .||.|+++|..
T Consensus 89 kdQaAd~tG~LFvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~~ 138 (505)
T KOG3549|consen 89 KDQAADITGQLFVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVKH 138 (505)
T ss_pred hhhhhhhcCceEeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeHh
Confidence 344444444 67999999999999996 4578888875 79999998853
No 78
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=74.54 E-value=2.5 Score=44.62 Aligned_cols=50 Identities=22% Similarity=0.235 Sum_probs=34.9
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVL 366 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~ 366 (385)
+|.+++.++..|||.||.|+++||+...+... .++..-...+.-+.+++.
T Consensus 568 ~V~pgskAa~~GlKRgDqilEVNgQnfenis~-~KA~eiLrnnthLtltvK 617 (1283)
T KOG3542|consen 568 EVFPGSKAAREGLKRGDQILEVNGQNFENISA-KKAEEILRNNTHLTLTVK 617 (1283)
T ss_pred eecCCchHHHhhhhhhhhhhhccccchhhhhH-HHHHHHhcCCceEEEEEe
Confidence 67788899999999999999999999887543 222222223344555543
No 79
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=71.63 E-value=11 Score=37.88 Aligned_cols=62 Identities=16% Similarity=0.227 Sum_probs=47.7
Q ss_pred cccccccccccCCC-CCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEEe
Q 016647 316 GLLSTKRDAYGRLI-LGDIITSVNGKKVSNGSDLYRILDQCKVGDEVIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 316 ~v~~~~~a~~~gl~-~GDiI~~ing~~i~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~~ 378 (385)
.|.+.++++.+||+ -+|.|+-+-+......+||..++..+ .++.+++-|+.-....--+|++
T Consensus 115 ~V~p~SPaalAgl~~~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVYN~D~d~~ReVti 177 (462)
T KOG3834|consen 115 SVEPNSPAALAGLRPYTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVYNHDTDSCREVTI 177 (462)
T ss_pred ecCCCCHHHhcccccccceEecchhhhccchHHHHHHHHhc-cCCCcceeEeecCCCccceEEe
Confidence 56788999999999 68999999555566778999988875 6899999998855544334433
No 80
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=69.91 E-value=15 Score=40.29 Aligned_cols=62 Identities=23% Similarity=0.375 Sum_probs=35.4
Q ss_pred eEEEEEEEcCCCEEEecccccCCCCeEEEEecC--CCeEeeEEEEEC--CCCCeEEEEEcCCCCCCcceec
Q 016647 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFAD--QSAYDAKIVGFD--QDKDVAVLRIDAPKDKLRPIPI 218 (385)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~d--g~~~~a~vv~~d--~~~DlAlLkv~~~~~~~~~l~l 218 (385)
..|...+|++. ||+|.+|+..+... |.|.+ +..| +++... +..|+.+-|++.-.....|+..
T Consensus 65 ~~G~aTLigpq-YiVSV~HN~~gy~~--v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~ 130 (769)
T PF02395_consen 65 NKGVATLIGPQ-YIVSVKHNGKGYNS--VSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKFVTEVAPAEM 130 (769)
T ss_dssp TTSS-EEEETT-EEEBETTG-TSCCE--ECESCSSTCEE--EEEEEEBETTSTEBEEEESS---SS----B
T ss_pred CCceEEEecCC-eEEEEEccCCCcCc--eeecccCCceE--EEEEccCCCCcccceeecCceEEEEecccc
Confidence 34789999986 99999999855444 45544 3334 444443 4469999999764333344443
No 81
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=66.70 E-value=8.5 Score=32.18 Aligned_cols=32 Identities=31% Similarity=0.471 Sum_probs=24.1
Q ss_pred ccEEEEccccCCCCCCceeeCCCccEEEEeecc
Q 016647 265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (385)
Q Consensus 265 ~~~i~~d~~i~~G~SGGPlvn~~G~VVGI~s~~ 297 (385)
.+++....+..||+-||+|+- +--||||++++
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tag 109 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRC-KHGVIGIVTAG 109 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEE-TTCEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeEe-CCCeEEEEEeC
Confidence 345666778899999999995 55699999986
No 82
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=65.59 E-value=7.9 Score=42.63 Aligned_cols=52 Identities=29% Similarity=0.417 Sum_probs=36.9
Q ss_pred cccccc-ccccCCCCCcEEEEECCEEeCCHHH--HHHHHhcCCCCCEEEEEEEECCE
Q 016647 317 LLSTKR-DAYGRLILGDIITSVNGKKVSNGSD--LYRILDQCKVGDEVIVEVLRGDQ 370 (385)
Q Consensus 317 v~~~~~-a~~~gl~~GDiI~~ing~~i~s~~~--l~~~l~~~~~g~~v~l~v~R~g~ 370 (385)
|..+++ +..+.|+.||.++++||+..--..+ ..+++ .+.|..|.++|...|.
T Consensus 967 VV~GgaAd~DGRL~aGDQLLsVdG~SLiGisQErAA~lm--trtg~vV~leVaKqgA 1021 (1629)
T KOG1892|consen 967 VVEGGAADHDGRLEAGDQLLSVDGHSLIGISQERAARLM--TRTGNVVHLEVAKQGA 1021 (1629)
T ss_pred eccCCccccccccccCceeeeecCcccccccHHHHHHHH--hccCCeEEEehhhhhh
Confidence 344554 4566799999999999998875443 33444 4579999999866553
No 83
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=61.65 E-value=9.7 Score=39.33 Aligned_cols=40 Identities=33% Similarity=0.577 Sum_probs=32.2
Q ss_pred cCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 016647 326 GRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVLR 367 (385)
Q Consensus 326 ~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~R 367 (385)
+-|+.||.|.+|||..+.+ ..++++++.+.+ | .++++|.-
T Consensus 163 glL~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G-~itfkiiP 204 (542)
T KOG0609|consen 163 GLLHVGDEILEVNGISVANKSPEELQELLRNSR-G-SITFKIIP 204 (542)
T ss_pred cceeeccchheecCeecccCCHHHHHHHHHhCC-C-cEEEEEcc
Confidence 3478999999999999986 579999999876 4 55776653
No 84
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=60.18 E-value=18 Score=28.28 Aligned_cols=37 Identities=5% Similarity=0.297 Sum_probs=30.8
Q ss_pred ccCCCCeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
++.....+.|.+.+++.+.+++.++|.+.++.+=...
T Consensus 10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 3445568999999999999999999999998876553
No 85
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=58.80 E-value=35 Score=24.82 Aligned_cols=33 Identities=15% Similarity=0.303 Sum_probs=28.6
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (385)
..+.+....|..++++++.+|....+.+|+-..
T Consensus 7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 457778888999999999999999999998654
No 86
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.35 E-value=27 Score=24.80 Aligned_cols=33 Identities=18% Similarity=0.415 Sum_probs=28.4
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (385)
..+.|.+.||+.+.+.+.++|...++.+-....
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence 468899999999999999999998988766643
No 87
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=56.07 E-value=51 Score=31.80 Aligned_cols=49 Identities=24% Similarity=0.355 Sum_probs=33.6
Q ss_pred cccccccccc-cCCCCCcEEEEECCEEeCC--HHHHHHHHhcCCCCCEEEEEEE
Q 016647 316 GLLSTKRDAY-GRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDEVIVEVL 366 (385)
Q Consensus 316 ~v~~~~~a~~-~gl~~GDiI~~ing~~i~s--~~~l~~~l~~~~~g~~v~l~v~ 366 (385)
.+-...|++. +.++.||.|++|||..|.- .-++.+++... -+.|++.|.
T Consensus 36 QvFD~tPAa~dG~i~~GDEi~avNg~svKGktKveVAkmIQ~~--~~eV~IhyN 87 (429)
T KOG3651|consen 36 QVFDKTPAAKDGRIRCGDEIVAVNGISVKGKTKVEVAKMIQVS--LNEVKIHYN 87 (429)
T ss_pred EeccCCchhccCccccCCeeEEecceeecCccHHHHHHHHHHh--ccceEEEeh
Confidence 3445566654 5699999999999999985 44666666543 245677664
No 88
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=54.67 E-value=30 Score=25.37 Aligned_cols=33 Identities=9% Similarity=0.245 Sum_probs=29.0
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (385)
..+.|.+.+|+.+.+++.++|...++.+-....
T Consensus 11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 468999999999999999999999998877643
No 89
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=54.02 E-value=30 Score=25.74 Aligned_cols=32 Identities=13% Similarity=0.330 Sum_probs=28.3
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|+.+.+++.++|...++.+=...
T Consensus 15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 46899999999999999999999999877764
No 90
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=53.91 E-value=28 Score=25.45 Aligned_cols=32 Identities=13% Similarity=0.233 Sum_probs=27.9
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|+.|.+++.++|...++.+=...
T Consensus 11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 46899999999999999999999998776653
No 91
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=52.55 E-value=29 Score=25.51 Aligned_cols=32 Identities=13% Similarity=0.217 Sum_probs=27.7
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|+.+.+++.++|...++.+=.+.
T Consensus 12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 46889999999999999999999988775553
No 92
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=51.34 E-value=19 Score=40.44 Aligned_cols=48 Identities=25% Similarity=0.341 Sum_probs=37.1
Q ss_pred cccccccccccCCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEE
Q 016647 316 GLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEV 365 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v 365 (385)
.|..++++..+|+++||.|+.+||+++.... ++.+++.. .|..+.+.+
T Consensus 664 sv~egsPA~~agls~~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~t 713 (1205)
T KOG0606|consen 664 SVEEGSPAFEAGLSAGDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLRT 713 (1205)
T ss_pred eecCCCCccccCCCccceeEeccCcccchhhHHHHHHHHHh--cCCeeEEEe
Confidence 4567888999999999999999999998644 66666653 466666644
No 93
>PF00571 CBS: CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=51.29 E-value=15 Score=25.15 Aligned_cols=21 Identities=38% Similarity=0.616 Sum_probs=17.8
Q ss_pred CCCCCceeeCCCccEEEEeec
Q 016647 276 PGNSGGPLLDSSGSLIGINTA 296 (385)
Q Consensus 276 ~G~SGGPlvn~~G~VVGI~s~ 296 (385)
.+.+.-|++|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 356778999999999999874
No 94
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.63 E-value=28 Score=26.65 Aligned_cols=31 Identities=10% Similarity=0.271 Sum_probs=26.8
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 4689999999999999999999998876443
No 95
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=50.47 E-value=25 Score=35.50 Aligned_cols=60 Identities=18% Similarity=0.222 Sum_probs=42.5
Q ss_pred cccccccccccCCCC-CcEEEEECCEEeCCHHH-HHHHHhcCCCCCEEEEEEEECC--EEEEEEEE
Q 016647 316 GLLSTKRDAYGRLIL-GDIITSVNGKKVSNGSD-LYRILDQCKVGDEVIVEVLRGD--QKEKIPVK 377 (385)
Q Consensus 316 ~v~~~~~a~~~gl~~-GDiI~~ing~~i~s~~~-l~~~l~~~~~g~~v~l~v~R~g--~~~~~~v~ 377 (385)
.+..++++.++||.+ =|-|++|||..++...| |++.|+.... .|+++++.-. ..+.++|+
T Consensus 21 kVqedSpa~~aglepffdFIvSI~g~rL~~dnd~Lk~llk~~se--kVkltv~n~kt~~~R~v~I~ 84 (462)
T KOG3834|consen 21 KVQEDSPAHKAGLEPFFDFIVSINGIRLNKDNDTLKALLKANSE--KVKLTVYNSKTQEVRIVEIV 84 (462)
T ss_pred EeecCChHHhcCcchhhhhhheeCcccccCchHHHHHHHHhccc--ceEEEEEecccceeEEEEec
Confidence 567888999999876 58999999999996554 6666665433 3899887643 23344444
No 96
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=50.23 E-value=37 Score=26.05 Aligned_cols=46 Identities=22% Similarity=0.456 Sum_probs=31.6
Q ss_pred EeeEEEEECCCCCeEEEEEcCCCCCCcceecCCC-CCCCCCCEEEE-EeCC
Q 016647 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVS-ADLLVGQKVYA-IGNP 236 (385)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~-~~~~~G~~V~~-iG~p 236 (385)
++++++..|...++|++.+-.-. +.+.+.-- ..++.|+.|.+ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvk---reV~l~Lv~~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVK---REVNLDLVGEEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEe---EEEEeeeecCccccCCEEEEEeeEE
Confidence 57899999988889999996532 22222211 26889999877 6764
No 97
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=49.48 E-value=41 Score=34.01 Aligned_cols=135 Identities=20% Similarity=0.299 Sum_probs=62.9
Q ss_pred eEEEEEEEcCCCEEEecccccCCCC-eEEEEecCCCeEeeEEEEECCCCCeEEEEEcCCC-CCCcceecCCCCCCCCCCE
Q 016647 152 GSGSGFVWDSKGHVVTNYHVIRGAS-DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVGQK 229 (385)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~-~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~l~l~~~~~~~~G~~ 229 (385)
+.|=||-|+++ .++|+-||+.... ++ | | .+..-+.++..-++.-+++..+. .+++-+.|.. -...|.-
T Consensus 379 GsGWGfWVS~~-lfITttHViP~g~~E~---F--G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLEe--GapEGtV 448 (535)
T PF05416_consen 379 GSGWGFWVSPT-LFITTTHVIPPGAKEA---F--G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILEE--GAPEGTV 448 (535)
T ss_dssp TTEEEEESSSS-EEEEEGGGS-STTSEE---T--T--EECGGEEEEEETTEEEEEESS-SSTTS---EE-S--S--TT-E
T ss_pred CCceeeeecce-EEEEeeeecCCcchhh---h--C--CChhHeEEeeccceEEEecCCCCCCCccceeecc--CCCCceE
Confidence 46889999998 9999999997432 21 0 0 01111233444577777776543 2455555532 2334554
Q ss_pred EEE-EeCCCCCC--CceEEeEEeeeeeeeccCCCCCCcccEEEE-------ccccCCCCCCceeeCCCc---cEEEEeec
Q 016647 230 VYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSG---SLIGINTA 296 (385)
Q Consensus 230 V~~-iG~p~g~~--~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~-------d~~i~~G~SGGPlvn~~G---~VVGI~s~ 296 (385)
+.+ |-.+.|.- ..+.-|....+.-.-.. . .....++.+ |-...||+-|.|-+-..| -|+|++.+
T Consensus 449 ~siLiKR~sGEllpLAvRMgt~AsmkIqgr~--v-~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~A 525 (535)
T PF05416_consen 449 CSILIKRPSGELLPLAVRMGTHASMKIQGRT--V-HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAA 525 (535)
T ss_dssp EEEEEE-TTSBEEEEEEEEEEEEEEEETTEE--E-EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEE
T ss_pred EEEEEEcCCccchhhhhhhccceeEEEccee--e-cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEeh
Confidence 433 44554432 12333333322111000 0 000122222 334569999999996555 48999987
Q ss_pred ccC
Q 016647 297 IYS 299 (385)
Q Consensus 297 ~~~ 299 (385)
...
T Consensus 526 Atr 528 (535)
T PF05416_consen 526 ATR 528 (535)
T ss_dssp E-S
T ss_pred hcc
Confidence 643
No 98
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=49.16 E-value=36 Score=25.81 Aligned_cols=31 Identities=19% Similarity=0.414 Sum_probs=27.1
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.||+.+.+.+.++|...+|.|=..
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~ 41 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDC 41 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCE
Confidence 4688999999999999999999999876554
No 99
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=48.82 E-value=16 Score=36.43 Aligned_cols=29 Identities=38% Similarity=0.486 Sum_probs=25.2
Q ss_pred cCCCCCcEEEEECCEEeCCHHHHHHHHhc
Q 016647 326 GRLILGDIITSVNGKKVSNGSDLYRILDQ 354 (385)
Q Consensus 326 ~gl~~GDiI~~ing~~i~s~~~l~~~l~~ 354 (385)
-||.+||+|+++||-+|.+.+|..+-++.
T Consensus 237 rGL~vgdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 237 RGLSVGDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ccCCccceEEecCCcccCCHHHHHHHHHh
Confidence 38999999999999999999997776653
No 100
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.07 E-value=45 Score=25.21 Aligned_cols=31 Identities=13% Similarity=0.213 Sum_probs=26.9
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.||+.+.+...++|...+|.+=..
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~ 41 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSA 41 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCc
Confidence 4689999999999999999999999876544
No 101
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.00 E-value=43 Score=25.62 Aligned_cols=31 Identities=23% Similarity=0.334 Sum_probs=26.7
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.||+.+.+++.++|...+|.+=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 4688999999999999999999998876544
No 102
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=46.91 E-value=39 Score=25.60 Aligned_cols=31 Identities=16% Similarity=0.409 Sum_probs=27.1
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 5789999999999999999999998876544
No 103
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=46.08 E-value=11 Score=35.47 Aligned_cols=41 Identities=20% Similarity=0.493 Sum_probs=35.3
Q ss_pred CCCCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEEE
Q 016647 327 RLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVIVEVLR 367 (385)
Q Consensus 327 gl~~GDiI~~ing~~i~s~~--~l~~~l~~~~~g~~v~l~v~R 367 (385)
-++.||.|-+|||+.|--.. ++.++|.+.+.|++.+|.+..
T Consensus 167 ~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLie 209 (334)
T KOG3938|consen 167 AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLIE 209 (334)
T ss_pred heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEeec
Confidence 47899999999999998765 677899999999999997764
No 104
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=44.81 E-value=36 Score=25.83 Aligned_cols=49 Identities=18% Similarity=0.182 Sum_probs=34.8
Q ss_pred CCcEEEEECCEEeCCHHHHHHHHh-cCCCCCE----EEEEEEECCEEEEEEEEe
Q 016647 330 LGDIITSVNGKKVSNGSDLYRILD-QCKVGDE----VIVEVLRGDQKEKIPVKL 378 (385)
Q Consensus 330 ~GDiI~~ing~~i~s~~~l~~~l~-~~~~g~~----v~l~v~R~g~~~~~~v~~ 378 (385)
|-|-.+.+||++.++......... .+++|.. |+.++.|||+....+-++
T Consensus 10 PadAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~V 63 (75)
T TIGR03000 10 PADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRTV 63 (75)
T ss_pred CCCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEEE
Confidence 468889999999998877555443 3566664 677888999876554443
No 105
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.31 E-value=53 Score=24.52 Aligned_cols=31 Identities=10% Similarity=0.202 Sum_probs=26.9
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccE
Confidence 4688999999999999999999888877554
No 106
>PF04225 OapA: Opacity-associated protein A LysM-like domain; InterPro: IPR007340 This entry includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation.; PDB: 2GU1_A.
Probab=43.43 E-value=14 Score=28.67 Aligned_cols=53 Identities=17% Similarity=0.246 Sum_probs=26.4
Q ss_pred CCCCcEEEEE---CCEEeCCHHHHHH------HHhcCCCCCEEEEEEEECCEEEEEEEEeec
Q 016647 328 LILGDIITSV---NGKKVSNGSDLYR------ILDQCKVGDEVIVEVLRGDQKEKIPVKLEP 380 (385)
Q Consensus 328 l~~GDiI~~i---ng~~i~s~~~l~~------~l~~~~~g~~v~l~v~R~g~~~~~~v~~~~ 380 (385)
++.||-+-.| .|-+..++..+.+ .|...+||+++.+.+-.+|+...+.+...+
T Consensus 7 V~~GDtLs~iF~~~gls~~dl~~v~~~~~~~k~L~~L~pGq~l~f~~d~~g~L~~L~~~~~~ 68 (85)
T PF04225_consen 7 VKSGDTLSTIFRRAGLSASDLYAVLEADGEAKPLTRLKPGQTLEFQLDEDGQLTALRYERSP 68 (85)
T ss_dssp --TT--HHHHHHHTT--HHHHHHHHHHGGGT--GGG--TT-EEEEEE-TTS-EEEEEEEEET
T ss_pred ECCCCcHHHHHHHcCCCHHHHHHHHhccCccchHhhCCCCCEEEEEECCCCCEEEEEEEcCC
Confidence 4556654444 4555444444333 556678999999999888988777766543
No 107
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=43.17 E-value=54 Score=24.69 Aligned_cols=31 Identities=16% Similarity=0.185 Sum_probs=27.0
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus 13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 4688999999999999999999988877554
No 108
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=43.10 E-value=57 Score=23.43 Aligned_cols=32 Identities=19% Similarity=0.413 Sum_probs=27.5
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.||+.+.+.+.++|...++-+=...
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 36889999999999999999999988776554
No 109
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=42.29 E-value=59 Score=24.06 Aligned_cols=32 Identities=9% Similarity=0.272 Sum_probs=28.8
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus 11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 46889999999999999999999999888774
No 110
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.51 E-value=60 Score=24.24 Aligned_cols=31 Identities=19% Similarity=0.272 Sum_probs=27.1
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (385)
.++.|.+.||+.+.++..++|...++.+=..
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~ 40 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDS 40 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccce
Confidence 4688999999999999999999988877665
No 111
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=40.14 E-value=55 Score=24.70 Aligned_cols=33 Identities=21% Similarity=0.459 Sum_probs=28.5
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (385)
..+.|.+.+|+.+.+++.++|...++.+--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 578999999999999999999999887765543
No 112
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=39.89 E-value=26 Score=34.85 Aligned_cols=42 Identities=36% Similarity=0.649 Sum_probs=29.4
Q ss_pred cCCCCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEE--EEEECC
Q 016647 326 GRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVIV--EVLRGD 369 (385)
Q Consensus 326 ~gl~~GDiI~~ing~~i~s~--~~l~~~l~~~~~g~~v~l--~v~R~g 369 (385)
..|..||.|+++||+...+. ++...+|+ +.|+.|.+ +++|+-
T Consensus 127 ~aL~~gDaIlSVNG~dL~~AtHdeAVqaLK--raGkeV~levKy~REv 172 (506)
T KOG3551|consen 127 GALFLGDAILSVNGEDLRDATHDEAVQALK--RAGKEVLLEVKYMREV 172 (506)
T ss_pred cceeeccEEEEecchhhhhcchHHHHHHHH--hhCceeeeeeeeehhc
Confidence 45889999999999999864 34445554 46777655 556654
No 113
>PF01423 LSM: LSM domain; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=37.83 E-value=66 Score=23.13 Aligned_cols=33 Identities=21% Similarity=0.444 Sum_probs=29.1
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEcC
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (385)
..+.|.+.+|+.+.+.+..+|...++.+-....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 468999999999999999999999998877754
No 114
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro IPR004365) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal.; GO: 0008855 exodeoxyribonuclease VII activity
Probab=35.89 E-value=43 Score=32.40 Aligned_cols=34 Identities=29% Similarity=0.524 Sum_probs=30.5
Q ss_pred EEEEEEEcCCCEEEecccccCCCCeEEEEecCCC
Q 016647 153 SGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS 186 (385)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~ 186 (385)
.|-.++.+++|.+||+..-+...+.+++.+.||.
T Consensus 281 RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~ 314 (319)
T PF02601_consen 281 RGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS 314 (319)
T ss_pred CceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence 5667888888999999999999999999999995
No 115
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=35.18 E-value=93 Score=29.61 Aligned_cols=40 Identities=18% Similarity=0.184 Sum_probs=29.7
Q ss_pred ccccccccccC-CCCCcEEEEECCEEeC--CHHHHHHHHhcCC
Q 016647 317 LLSTKRDAYGR-LILGDIITSVNGKKVS--NGSDLYRILDQCK 356 (385)
Q Consensus 317 v~~~~~a~~~g-l~~GDiI~~ing~~i~--s~~~l~~~l~~~~ 356 (385)
+.+++-++..| |..+|.|+++||.+|. +.+++.++|-...
T Consensus 201 lVpGGLAeSTGLLaVnDEVlEVNGIEVaGKTLDQVTDMMvANs 243 (358)
T KOG3606|consen 201 LVPGGLAESTGLLAVNDEVLEVNGIEVAGKTLDQVTDMMVANS 243 (358)
T ss_pred ecCCccccccceeeecceeEEEcCEEeccccHHHHHHHHhhcc
Confidence 34566666666 4689999999999997 6778877765543
No 116
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=34.98 E-value=38 Score=34.92 Aligned_cols=43 Identities=28% Similarity=0.362 Sum_probs=29.1
Q ss_pred cccccCCCCCcEEEEECCEEeCCHH--H----HHHHHhcCCCCCEEEEEEEE
Q 016647 322 RDAYGRLILGDIITSVNGKKVSNGS--D----LYRILDQCKVGDEVIVEVLR 367 (385)
Q Consensus 322 ~a~~~gl~~GDiI~~ing~~i~s~~--~----l~~~l~~~~~g~~v~l~v~R 367 (385)
.+..+.|.+||.|+.||.....++. | |.+++. ++|- ++++|-.
T Consensus 290 VA~DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~--~~gP-i~ltvAk 338 (626)
T KOG3571|consen 290 VALDGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVS--RPGP-IKLTVAK 338 (626)
T ss_pred eeccCccCccceEEEeeecchhhcCchHHHHHHHHHhc--cCCC-eEEEEee
Confidence 4667789999999999998887654 3 344443 3442 5666544
No 117
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=34.59 E-value=1.2e+02 Score=22.38 Aligned_cols=43 Identities=23% Similarity=0.396 Sum_probs=30.3
Q ss_pred EeeEEEEECCCCCeEEEEEcCCCCCCcceecCCCCCCCCCCEEEEE
Q 016647 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI 233 (385)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~~~~~~~G~~V~~i 233 (385)
++++++..+.....|++.... ....+.+.--.++++||+|.+-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence 688999998888999998875 3345555444558999999764
No 118
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=34.11 E-value=28 Score=36.98 Aligned_cols=42 Identities=26% Similarity=0.445 Sum_probs=30.5
Q ss_pred ccccccccccCCCCCcEEEEECCEEeCCH-H-HHHHHHhcCCCCC
Q 016647 317 LLSTKRDAYGRLILGDIITSVNGKKVSNG-S-DLYRILDQCKVGD 359 (385)
Q Consensus 317 v~~~~~a~~~gl~~GDiI~~ing~~i~s~-~-~l~~~l~~~~~g~ 359 (385)
++-++.++++|++.|.-|++|||+.|--. . -+.++|.. ..|+
T Consensus 763 LlRGGIAERGGVRVGHRIIEINgQSVVA~pHekIV~lLs~-aVGE 806 (829)
T KOG3605|consen 763 LLRGGIAERGGVRVGHRIIEINGQSVVATPHEKIVQLLSN-AVGE 806 (829)
T ss_pred hhcccchhccCceeeeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence 34677799999999999999999988643 2 35555554 3554
No 119
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate; these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin. An HDS catalytic triad has been identified.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=33.57 E-value=1e+02 Score=26.72 Aligned_cols=73 Identities=22% Similarity=0.219 Sum_probs=38.5
Q ss_pred CCCCCEEEEEeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEccccCCCCCCceeeC-CCccEEEEeecc
Q 016647 224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD-SSGSLIGINTAI 297 (385)
Q Consensus 224 ~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvn-~~G~VVGI~s~~ 297 (385)
...|...+++ +|...+-+-+.|.+-...+.-..+..-.....--.+|..-..|.||-|+|. ..|++||=.-.+
T Consensus 109 cp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gtpaf~~~knlkg~s~~pifeassgr~vgr~k~g 182 (211)
T PF05578_consen 109 CPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEASSGRVVGRVKVG 182 (211)
T ss_pred CCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCCcceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence 4457778777 565544455555554333221100000000001123334457999999997 689999976543
No 120
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=31.31 E-value=1.9e+02 Score=21.16 Aligned_cols=45 Identities=22% Similarity=0.304 Sum_probs=27.7
Q ss_pred CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEEEECCEEEEEEE
Q 016647 331 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVIVEVLRGDQKEKIPV 376 (385)
Q Consensus 331 GDiI~~ing~~i~s~~-~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v 376 (385)
.-..+.+||+.+.+.+ ++..++.-...|+..++.+ ..+....+++
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~F-kS~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFF-KSQRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEE-TTS-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEE-ecCcEEEEEc
Confidence 4567889999999987 6888888778999988865 3344444443
No 121
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=30.42 E-value=39 Score=27.39 Aligned_cols=17 Identities=35% Similarity=0.675 Sum_probs=12.6
Q ss_pred ceeeCCCccEEEEeecc
Q 016647 281 GPLLDSSGSLIGINTAI 297 (385)
Q Consensus 281 GPlvn~~G~VVGI~s~~ 297 (385)
.|++|.+|++||++..+
T Consensus 94 ~PV~d~~g~viG~V~VG 110 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVG 110 (116)
T ss_dssp EEEE-TTS-EEEEEEEE
T ss_pred EeeECCCCcEEEEEEEE
Confidence 68889999999998754
No 122
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.26 E-value=1.6e+02 Score=22.00 Aligned_cols=32 Identities=13% Similarity=0.270 Sum_probs=28.5
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus 12 ~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 12 HPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 46899999999999999999999999887663
No 123
>PF14275 DUF4362: Domain of unknown function (DUF4362)
Probab=29.95 E-value=1.2e+02 Score=24.23 Aligned_cols=25 Identities=20% Similarity=0.405 Sum_probs=17.8
Q ss_pred CCCcEEEEECCEEeCCHHHHHHHHhcC
Q 016647 329 ILGDIITSVNGKKVSNGSDLYRILDQC 355 (385)
Q Consensus 329 ~~GDiI~~ing~~i~s~~~l~~~l~~~ 355 (385)
|.||||.+ .|+ |.|.+.|...+...
T Consensus 1 ~~~DVi~~-~~~-i~Nl~kl~~Fi~nv 25 (98)
T PF14275_consen 1 KNNDVINK-HGE-IENLDKLDQFIENV 25 (98)
T ss_pred CCCCEEEe-CCe-EEeHHHHHHHHHHH
Confidence 56999999 444 77777777766653
No 124
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=29.36 E-value=48 Score=32.12 Aligned_cols=38 Identities=21% Similarity=0.390 Sum_probs=32.0
Q ss_pred EEECCEEeCCHHHHHHHHh-cCCCCCEEEEEEEECCEEE
Q 016647 335 TSVNGKKVSNGSDLYRILD-QCKVGDEVIVEVLRGDQKE 372 (385)
Q Consensus 335 ~~ing~~i~s~~~l~~~l~-~~~~g~~v~l~v~R~g~~~ 372 (385)
.++.|.+|-|..|+.++++ ..-|||.+++++.++||+.
T Consensus 270 ae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~ 308 (356)
T COG4956 270 AELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEP 308 (356)
T ss_pred HhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCccc
Confidence 4567788889999999988 4679999999999999864
No 125
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.65 E-value=50 Score=26.23 Aligned_cols=22 Identities=32% Similarity=0.453 Sum_probs=17.3
Q ss_pred CCCCCceeeCCCccEEEEeecc
Q 016647 276 PGNSGGPLLDSSGSLIGINTAI 297 (385)
Q Consensus 276 ~G~SGGPlvn~~G~VVGI~s~~ 297 (385)
.+.+.-|++|.+|+++|+++..
T Consensus 97 ~~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 97 EGISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred cCCceEEEECCCCcEEEEEeHH
Confidence 3445679999899999999753
No 126
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.68 E-value=1.6e+02 Score=22.47 Aligned_cols=32 Identities=13% Similarity=0.276 Sum_probs=28.4
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|..+.+++..+|...++-+-.+.
T Consensus 12 ~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~ 43 (81)
T cd01725 12 KEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS 43 (81)
T ss_pred CEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence 46899999999999999999999999887764
No 127
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=24.55 E-value=59 Score=25.37 Aligned_cols=20 Identities=20% Similarity=0.270 Sum_probs=16.0
Q ss_pred CCCCceeeCCCccEEEEeec
Q 016647 277 GNSGGPLLDSSGSLIGINTA 296 (385)
Q Consensus 277 G~SGGPlvn~~G~VVGI~s~ 296 (385)
+.+--|++|.+|+++|+++.
T Consensus 86 ~~~~lpVvd~~~~~~Giit~ 105 (111)
T cd04603 86 EPPVVAVVDKEGKLVGTIYE 105 (111)
T ss_pred CCCeEEEEcCCCeEEEEEEh
Confidence 44456899988999999875
No 128
>PF11948 DUF3465: Protein of unknown function (DUF3465); InterPro: IPR021856 This family of proteins are functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 131 to 151 amino acids in length. This protein has a conserved HWTH sequence motif.
Probab=24.53 E-value=4.3e+02 Score=22.30 Aligned_cols=12 Identities=33% Similarity=0.246 Sum_probs=10.4
Q ss_pred CCCCCCEEEEEe
Q 016647 223 DLLVGQKVYAIG 234 (385)
Q Consensus 223 ~~~~G~~V~~iG 234 (385)
.++.||.|.+.|
T Consensus 85 ~l~~GD~V~f~G 96 (131)
T PF11948_consen 85 WLQKGDQVEFYG 96 (131)
T ss_pred CcCCCCEEEEEE
Confidence 478899999988
No 129
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=24.36 E-value=58 Score=25.20 Aligned_cols=20 Identities=30% Similarity=0.499 Sum_probs=16.5
Q ss_pred CCCceeeCCCccEEEEeecc
Q 016647 278 NSGGPLLDSSGSLIGINTAI 297 (385)
Q Consensus 278 ~SGGPlvn~~G~VVGI~s~~ 297 (385)
..|.|+++.+|+.||.+++.
T Consensus 34 ~~g~~v~~~~g~~vG~vTS~ 53 (95)
T PF08669_consen 34 RGGEPVYDEDGKPVGRVTSG 53 (95)
T ss_dssp STTCEEEETTTEEEEEEEEE
T ss_pred CCCCEEEECCCcEEeEEEEE
Confidence 45899998799999988754
No 130
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=23.76 E-value=62 Score=25.17 Aligned_cols=20 Identities=45% Similarity=0.599 Sum_probs=16.2
Q ss_pred CCCCceeeCCCccEEEEeec
Q 016647 277 GNSGGPLLDSSGSLIGINTA 296 (385)
Q Consensus 277 G~SGGPlvn~~G~VVGI~s~ 296 (385)
+...-|++|.+|+++|+++.
T Consensus 90 ~~~~~pVvd~~~~~~Gvit~ 109 (115)
T cd04620 90 QIRHLPVLDDQGQLIGLVTA 109 (115)
T ss_pred CCceEEEEcCCCCEEEEEEh
Confidence 34467899988999999875
No 131
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=23.44 E-value=3.3e+02 Score=20.56 Aligned_cols=32 Identities=9% Similarity=0.285 Sum_probs=28.3
Q ss_pred CeEEEEecCCCeEeeEEEEECCCCCeEEEEEc
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (385)
..+.|.+.+|..|.+++..+|...++-+-.+.
T Consensus 20 ~~V~VeLKng~~~~G~L~~vD~~MNl~L~~~~ 51 (78)
T cd01733 20 KVVTVELRNETTVTGRIASVDAFMNIRLAKVT 51 (78)
T ss_pred CEEEEEECCCCEEEEEEEEEcCCceeEEEEEE
Confidence 46899999999999999999999998887764
No 132
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=23.14 E-value=56 Score=22.44 Aligned_cols=10 Identities=40% Similarity=0.973 Sum_probs=8.0
Q ss_pred CCCccEEEEe
Q 016647 285 DSSGSLIGIN 294 (385)
Q Consensus 285 n~~G~VVGI~ 294 (385)
|.+|++|||-
T Consensus 36 d~~G~ivGIE 45 (50)
T PF10049_consen 36 DEDGRIVGIE 45 (50)
T ss_pred CCCCCEEEEE
Confidence 5788999974
No 133
>KOG1379 consensus Serine/threonine protein phosphatase [Signal transduction mechanisms]
Probab=22.44 E-value=98 Score=30.14 Aligned_cols=71 Identities=20% Similarity=0.212 Sum_probs=41.6
Q ss_pred CCCCCCceeeCCCccEEEEeecccCCCCCCCCceeeecccccccc--------cccc----cccCCCCCcEEEEECCEEe
Q 016647 275 NPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTGLLS--------TKRD----AYGRLILGDIITSVNGKKV 342 (385)
Q Consensus 275 ~~G~SGGPlvn~~G~VVGI~s~~~~~~~~~~~~g~aIP~~~~v~~--------~~~a----~~~gl~~GDiI~~ing~~i 342 (385)
+-|+||=-++ .+|+||= .. ..+...|-.|+...+.+ +.|. ..-.+|.||||+.--+=-.
T Consensus 187 NLGDSGF~Vv-R~G~vv~------~S--~~Q~H~FN~PyQLs~~p~~~~~~~~d~p~~ad~~~~~v~~GDvIilATDGlf 257 (330)
T KOG1379|consen 187 NLGDSGFLVV-REGKVVF------RS--PEQQHYFNTPYQLSSPPEGYSSYISDVPDSADVTSFDVQKGDVIILATDGLF 257 (330)
T ss_pred eccCcceEEE-ECCEEEE------cC--chheeccCCceeeccCCccccccccCCccccceEEEeccCCCEEEEeccccc
Confidence 4699998888 7999872 11 23445666676653332 2221 1235899999887655455
Q ss_pred CCHHH--HHHHHhc
Q 016647 343 SNGSD--LYRILDQ 354 (385)
Q Consensus 343 ~s~~~--l~~~l~~ 354 (385)
+|+.+ +..+|..
T Consensus 258 DNl~e~~Il~il~~ 271 (330)
T KOG1379|consen 258 DNLPEKEILSILKG 271 (330)
T ss_pred ccccHHHHHHHHHH
Confidence 55543 4444443
No 134
>PRK13835 conjugal transfer protein TrbH; Provisional
Probab=21.92 E-value=73 Score=27.38 Aligned_cols=40 Identities=20% Similarity=0.223 Sum_probs=26.0
Q ss_pred HHHHHHHhCCceEEEEEeeeccCccccccccCCCeEEEEEE
Q 016647 118 TVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFV 158 (385)
Q Consensus 118 ~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfi 158 (385)
..++.+.+.|+--+|.-.... ++|....+..-+++|=+++
T Consensus 47 vsqLae~~pPa~tt~~l~q~~-d~Fg~aL~~aLr~~GYaVv 86 (145)
T PRK13835 47 VSRLAEQIGPGTTTIKLKKDT-SPFGQALEAALKGWGYAVV 86 (145)
T ss_pred HHHHHHhcCCCceEEEEeecC-cHHHHHHHHHHHhcCeEEe
Confidence 457888889988777765554 6776555544455555555
No 135
>PRK14864 putative biofilm stress and motility protein A; Provisional
Probab=21.59 E-value=1.7e+02 Score=23.65 Aligned_cols=26 Identities=8% Similarity=-0.027 Sum_probs=10.8
Q ss_pred ccccCccchhHHHHHHHhCCceEEEE
Q 016647 108 QRKLQTDELATVRLFQENTPSVVNIT 133 (385)
Q Consensus 108 ~~~~~~~~~~~~~~~~~~~~SVV~I~ 133 (385)
++...+++.+..+....-+=.+|.|.
T Consensus 32 ~~~~~A~eI~~~qa~~lq~iGtVSvs 57 (104)
T PRK14864 32 PPADHAQEIRRAQTQGLQKMGTVSAL 57 (104)
T ss_pred CccccceecCHHHhhCCceeeEEEEe
Confidence 33444455554433222222355554
No 136
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=21.15 E-value=2.2e+02 Score=21.17 Aligned_cols=30 Identities=20% Similarity=0.350 Sum_probs=21.3
Q ss_pred CeEEEEecCCCeEeeEEEEECC---CCCeEEEEE
Q 016647 176 SDIRVTFADQSAYDAKIVGFDQ---DKDVAVLRI 206 (385)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~---~~DlAlLkv 206 (385)
..++|++.||..|++-....++ +.|++| +.
T Consensus 13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL-k~ 45 (77)
T PF14438_consen 13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL-KM 45 (77)
T ss_dssp SEEEEEETTS-EEEEEEEEE-T---T--EEE-EE
T ss_pred CEEEEEECCCCEEEEEEEeCCCcccceeEEE-Ee
Confidence 4689999999999999999887 667755 44
No 137
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=20.79 E-value=84 Score=24.94 Aligned_cols=21 Identities=29% Similarity=0.347 Sum_probs=17.4
Q ss_pred CCCCCceeeCCCccEEEEeec
Q 016647 276 PGNSGGPLLDSSGSLIGINTA 296 (385)
Q Consensus 276 ~G~SGGPlvn~~G~VVGI~s~ 296 (385)
.+...-|++|.+|+++||++.
T Consensus 87 ~~~~~lpVvd~~~~l~Givt~ 107 (113)
T cd04597 87 HNIRTLPVVDDDGTPAGIITL 107 (113)
T ss_pred cCCCEEEEECCCCeEEEEEEH
Confidence 455678999999999999874
Done!