Query 015960
Match_columns 397
No_of_seqs 467 out of 3248
Neff 8.3
Searched_HMMs 46136
Date Fri Mar 29 02:30:48 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/015960.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/015960hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10898 serine endoprotease; 100.0 3.6E-54 7.9E-59 420.3 40.4 302 78-394 42-352 (353)
2 PRK10139 serine endoprotease; 100.0 1.9E-54 4.1E-59 434.5 39.1 299 81-393 40-362 (455)
3 TIGR02038 protease_degS peripl 100.0 2.7E-54 5.9E-59 421.5 39.1 302 77-393 41-350 (351)
4 PRK10942 serine endoprotease; 100.0 1.5E-51 3.3E-56 415.4 37.1 297 82-392 39-382 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 3.6E-50 7.9E-55 403.7 36.8 298 82-393 2-329 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 1.9E-41 4.1E-46 332.0 30.5 298 81-391 33-340 (347)
7 KOG1320 Serine protease [Postt 100.0 9E-29 1.9E-33 242.8 20.9 302 80-392 127-469 (473)
8 KOG1421 Predicted signaling-as 99.9 6E-23 1.3E-27 203.7 21.7 292 82-392 53-372 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 5.1E-17 1.1E-21 133.9 10.4 109 116-254 1-120 (120)
10 PF13180 PDZ_2: PDZ domain; PD 99.6 5.5E-15 1.2E-19 114.1 7.1 82 293-389 1-82 (82)
11 PF00089 Trypsin: Trypsin; In 99.5 4.8E-13 1E-17 121.5 19.3 178 92-281 13-220 (220)
12 KOG1421 Predicted signaling-as 99.5 5.1E-13 1.1E-17 133.8 15.0 255 114-392 550-832 (955)
13 KOG1320 Serine protease [Postt 99.4 1.3E-13 2.8E-18 136.2 7.0 275 86-380 55-351 (473)
14 cd00190 Tryp_SPc Trypsin-like 99.4 1.3E-11 2.9E-16 112.8 18.4 180 91-282 12-230 (232)
15 cd00991 PDZ_archaeal_metallopr 99.3 1.2E-11 2.6E-16 94.7 9.9 68 310-388 10-77 (79)
16 cd00986 PDZ_LON_protease PDZ d 99.3 1.8E-11 3.9E-16 93.7 10.6 72 310-393 8-79 (79)
17 smart00020 Tryp_SPc Trypsin-li 99.3 7.4E-11 1.6E-15 108.0 16.6 161 113-275 25-223 (229)
18 TIGR01713 typeII_sec_gspC gene 99.3 5.5E-12 1.2E-16 117.9 7.3 101 275-389 159-259 (259)
19 cd00987 PDZ_serine_protease PD 99.2 1E-10 2.2E-15 91.5 9.8 66 310-386 24-89 (90)
20 cd00990 PDZ_glycyl_aminopeptid 99.1 1.8E-10 3.9E-15 88.1 8.4 78 293-390 1-78 (80)
21 cd00989 PDZ_metalloprotease PD 99.0 1.4E-09 3E-14 82.9 9.1 66 311-388 13-78 (79)
22 COG3591 V8-like Glu-specific e 99.0 1.2E-08 2.7E-13 93.6 15.6 158 115-285 65-250 (251)
23 cd00988 PDZ_CTP_protease PDZ d 98.9 4.9E-09 1.1E-13 81.1 8.8 67 310-388 13-82 (85)
24 PRK10779 zinc metallopeptidase 98.9 3.5E-09 7.7E-14 107.3 7.4 68 313-391 129-196 (449)
25 TIGR02037 degP_htrA_DO peripla 98.8 1.6E-08 3.4E-13 102.2 10.2 84 292-386 337-427 (428)
26 TIGR00054 RIP metalloprotease 98.7 3.9E-08 8.5E-13 98.8 9.5 69 310-390 203-271 (420)
27 cd00136 PDZ PDZ domain, also c 98.7 3.6E-08 7.8E-13 73.2 6.4 56 310-377 13-70 (70)
28 PF00863 Peptidase_C4: Peptida 98.7 1.8E-06 3.8E-11 78.9 18.2 165 87-275 13-185 (235)
29 PRK10779 zinc metallopeptidase 98.7 8E-08 1.7E-12 97.5 9.9 69 311-391 222-290 (449)
30 TIGR02860 spore_IV_B stage IV 98.5 4.1E-07 8.9E-12 89.3 10.4 69 310-390 105-181 (402)
31 KOG3627 Trypsin [Amino acid tr 98.5 3.4E-06 7.4E-11 78.9 16.3 168 115-284 39-253 (256)
32 PRK10139 serine endoprotease; 98.5 3.1E-07 6.7E-12 93.1 9.1 65 310-387 390-454 (455)
33 TIGR00225 prc C-terminal pepti 98.5 3.2E-07 6.9E-12 89.6 8.2 70 310-391 62-133 (334)
34 smart00228 PDZ Domain present 98.5 3.2E-07 7E-12 70.4 6.5 60 310-380 26-85 (85)
35 PLN00049 carboxyl-terminal pro 98.4 7.6E-07 1.6E-11 88.7 9.3 66 311-388 103-170 (389)
36 PRK10942 serine endoprotease; 98.4 8.2E-07 1.8E-11 90.5 9.1 65 310-387 408-472 (473)
37 TIGR03279 cyano_FeS_chp putati 98.4 6.7E-07 1.4E-11 88.4 6.8 61 314-389 2-63 (433)
38 cd00992 PDZ_signaling PDZ doma 98.3 1.5E-06 3.3E-11 66.4 6.8 54 310-376 26-81 (82)
39 TIGR00054 RIP metalloprotease 98.3 1.2E-06 2.5E-11 88.2 6.3 67 310-389 128-194 (420)
40 PF00595 PDZ: PDZ domain (Also 98.2 1.2E-06 2.6E-11 67.1 4.3 55 310-377 25-81 (81)
41 COG3480 SdrC Predicted secrete 98.2 8.9E-06 1.9E-10 76.2 9.1 71 310-392 130-201 (342)
42 COG0793 Prc Periplasmic protea 98.2 4.9E-06 1.1E-10 83.0 7.9 68 310-389 112-181 (406)
43 PF14685 Tricorn_PDZ: Tricorn 98.1 2.4E-05 5.1E-10 60.6 8.4 68 310-387 12-88 (88)
44 PRK09681 putative type II secr 98.1 1.1E-05 2.5E-10 75.4 7.6 63 316-389 210-275 (276)
45 PF04495 GRASP55_65: GRASP55/6 97.9 9.1E-06 2E-10 68.6 3.7 87 293-390 26-114 (138)
46 KOG3129 26S proteasome regulat 97.9 3.5E-05 7.5E-10 68.0 6.6 72 311-393 140-213 (231)
47 COG5640 Secreted trypsin-like 97.8 0.00027 5.9E-09 67.4 11.7 53 233-285 223-278 (413)
48 PF05579 Peptidase_S32: Equine 97.8 0.0005 1.1E-08 63.1 12.9 116 114-259 112-229 (297)
49 COG3975 Predicted protease wit 97.8 2.6E-05 5.6E-10 77.8 5.0 65 310-393 462-526 (558)
50 PRK11186 carboxy-terminal prot 97.7 8.6E-05 1.9E-09 78.2 7.8 67 310-388 255-332 (667)
51 COG3031 PulC Type II secretory 97.5 0.0002 4.4E-09 64.5 5.4 65 313-388 210-274 (275)
52 KOG3553 Tax interaction protei 97.4 0.00014 3.1E-09 56.4 2.7 34 310-354 59-92 (124)
53 PF10459 Peptidase_S46: Peptid 97.1 0.0016 3.5E-08 69.0 8.5 22 114-135 47-68 (698)
54 PF00548 Peptidase_C3: 3C cyst 97.0 0.012 2.7E-07 51.7 11.9 138 113-258 24-170 (172)
55 PF03761 DUF316: Domain of unk 96.9 0.026 5.6E-07 53.6 14.3 92 159-259 159-255 (282)
56 PF05580 Peptidase_S55: SpoIVB 96.9 0.02 4.3E-07 51.4 12.4 160 111-277 17-215 (218)
57 KOG3580 Tight junction protein 96.3 0.0038 8.3E-08 63.1 4.2 58 310-378 429-488 (1027)
58 PF12812 PDZ_1: PDZ-like domai 96.1 0.017 3.7E-07 43.8 6.0 46 310-366 30-75 (78)
59 PF10459 Peptidase_S46: Peptid 95.7 0.012 2.6E-07 62.5 4.7 58 228-285 623-687 (698)
60 PF00949 Peptidase_S7: Peptida 95.7 0.018 4E-07 48.0 4.7 32 230-261 89-120 (132)
61 TIGR02860 spore_IV_B stage IV 95.2 0.23 5E-06 49.3 11.5 43 232-278 354-396 (402)
62 KOG3209 WW domain-containing p 94.8 0.031 6.8E-07 57.9 4.3 55 314-380 782-838 (984)
63 KOG3532 Predicted protein kina 94.6 0.058 1.3E-06 55.7 5.7 47 310-367 398-444 (1051)
64 PF02122 Peptidase_S39: Peptid 94.5 0.15 3.3E-06 45.9 7.5 118 126-258 43-166 (203)
65 KOG3552 FERM domain protein FR 94.4 0.052 1.1E-06 57.8 4.8 55 310-378 75-131 (1298)
66 KOG3834 Golgi reassembly stack 94.3 0.092 2E-06 51.6 5.9 70 308-388 13-84 (462)
67 PF08192 Peptidase_S64: Peptid 94.2 0.24 5.1E-06 51.6 9.0 117 159-283 541-687 (695)
68 COG0750 Predicted membrane-ass 93.8 0.23 5E-06 49.1 8.0 58 314-383 133-194 (375)
69 KOG3542 cAMP-regulated guanine 93.4 0.059 1.3E-06 55.6 2.9 57 310-378 562-618 (1283)
70 KOG1892 Actin filament-binding 93.1 0.12 2.6E-06 55.3 4.8 59 310-380 960-1020(1629)
71 PF09342 DUF1986: Domain of un 92.6 1.5 3.2E-05 40.4 10.4 87 112-199 26-131 (267)
72 PF00944 Peptidase_S3: Alphavi 92.5 0.22 4.7E-06 41.3 4.5 30 232-261 100-129 (158)
73 KOG3606 Cell polarity protein 91.8 0.44 9.5E-06 44.2 6.0 59 309-379 193-253 (358)
74 KOG3209 WW domain-containing p 91.6 0.29 6.4E-06 51.0 5.2 56 311-380 924-982 (984)
75 KOG3550 Receptor targeting pro 91.4 0.41 8.8E-06 40.4 5.0 55 310-377 115-172 (207)
76 KOG3580 Tight junction protein 90.5 0.4 8.7E-06 49.0 5.0 61 309-381 39-99 (1027)
77 PF02395 Peptidase_S6: Immunog 89.6 1.7 3.7E-05 47.0 9.2 54 115-171 66-121 (769)
78 KOG3549 Syntrophins (type gamm 89.3 0.58 1.3E-05 44.9 4.7 54 312-377 82-137 (505)
79 PF02907 Peptidase_S29: Hepati 88.1 0.45 9.8E-06 39.5 2.8 40 235-275 105-144 (148)
80 PF00947 Pico_P2A: Picornaviru 87.7 1.5 3.2E-05 36.1 5.5 33 226-259 78-110 (127)
81 PF03510 Peptidase_C24: 2C end 87.1 2.8 6.1E-05 33.5 6.6 53 118-182 3-55 (105)
82 KOG0606 Microtubule-associated 87.1 1 2.2E-05 49.6 5.5 51 313-376 661-713 (1205)
83 KOG3571 Dishevelled 3 and rela 86.5 0.98 2.1E-05 45.5 4.6 60 308-378 275-338 (626)
84 KOG3651 Protein kinase C, alph 86.3 1.6 3.6E-05 41.1 5.7 54 311-377 31-87 (429)
85 KOG2921 Intramembrane metallop 82.7 2.1 4.5E-05 42.0 4.8 45 310-365 220-265 (484)
86 KOG0609 Calcium/calmodulin-dep 81.2 2.6 5.6E-05 43.0 5.1 56 311-379 147-205 (542)
87 KOG3605 Beta amyloid precursor 80.5 2.6 5.7E-05 43.8 4.9 55 315-379 678-734 (829)
88 KOG3834 Golgi reassembly stack 80.3 2.4 5.1E-05 42.1 4.4 57 314-381 113-169 (462)
89 KOG3551 Syntrophins (type beta 78.9 1.5 3.2E-05 42.8 2.4 55 312-379 112-171 (506)
90 KOG3605 Beta amyloid precursor 78.7 1.7 3.6E-05 45.2 2.9 113 236-370 678-806 (829)
91 PF05416 Peptidase_C37: Southa 75.0 9.7 0.00021 37.9 6.8 137 114-260 379-528 (535)
92 PF01732 DUF31: Putative pepti 74.8 2.3 5E-05 42.2 2.7 23 234-256 351-373 (374)
93 KOG3938 RGS-GAIP interacting p 72.6 2.7 5.8E-05 39.1 2.3 57 310-377 149-208 (334)
94 cd01720 Sm_D2 The eukaryotic S 63.0 15 0.00033 28.3 4.4 37 133-169 10-46 (87)
95 PF12381 Peptidase_C3G: Tungro 56.3 10 0.00022 34.3 2.7 45 226-274 168-216 (231)
96 cd00600 Sm_like The eukaryotic 55.7 36 0.00078 23.9 5.2 33 138-170 7-39 (63)
97 cd01735 LSm12_N LSm12 belongs 55.4 49 0.0011 23.7 5.6 34 137-170 6-39 (61)
98 cd01731 archaeal_Sm1 The archa 52.8 38 0.00083 24.5 5.0 33 138-170 11-43 (68)
99 cd01722 Sm_F The eukaryotic Sm 52.6 33 0.00072 24.9 4.6 32 138-169 12-43 (68)
100 cd01726 LSm6 The eukaryotic Sm 52.5 36 0.00077 24.6 4.7 32 138-169 11-42 (67)
101 PRK00737 small nuclear ribonuc 52.5 38 0.00083 24.9 5.0 33 138-170 15-47 (72)
102 cd01730 LSm3 The eukaryotic Sm 50.1 34 0.00073 25.9 4.4 31 138-168 12-42 (82)
103 cd01717 Sm_B The eukaryotic Sm 48.8 42 0.00092 25.1 4.8 32 138-169 11-42 (79)
104 cd06168 LSm9 The eukaryotic Sm 48.7 48 0.001 24.7 5.0 32 138-169 11-42 (75)
105 cd01732 LSm5 The eukaryotic Sm 47.8 42 0.00092 25.1 4.6 31 138-168 14-44 (76)
106 PF01455 HupF_HypC: HupF/HypC 47.7 78 0.0017 23.1 5.8 43 150-195 5-47 (68)
107 cd01729 LSm7 The eukaryotic Sm 47.3 49 0.0011 25.0 4.9 31 138-168 13-43 (81)
108 cd01719 Sm_G The eukaryotic Sm 45.8 57 0.0012 24.1 5.0 31 138-168 11-41 (72)
109 PF00571 CBS: CBS domain CBS d 45.3 21 0.00045 24.2 2.5 20 238-257 29-48 (57)
110 cd01728 LSm1 The eukaryotic Sm 44.8 59 0.0013 24.2 4.9 31 138-168 13-43 (74)
111 cd01721 Sm_D3 The eukaryotic S 41.4 72 0.0016 23.3 4.9 33 137-169 10-42 (70)
112 smart00651 Sm snRNP Sm protein 41.2 73 0.0016 22.6 4.9 33 138-170 9-41 (67)
113 PF01423 LSM: LSM domain ; In 40.5 59 0.0013 23.1 4.4 34 137-170 8-41 (67)
114 PF11874 DUF3394: Domain of un 39.6 1.1E+02 0.0025 27.0 6.7 28 310-348 122-149 (183)
115 cd01727 LSm8 The eukaryotic Sm 39.1 75 0.0016 23.4 4.8 32 138-169 10-41 (74)
116 COG1958 LSM1 Small nuclear rib 38.7 66 0.0014 24.0 4.5 33 138-170 18-50 (79)
117 COG0298 HypC Hydrogenase matur 37.2 78 0.0017 23.9 4.4 47 150-198 5-52 (82)
118 COG4956 Integral membrane prot 34.5 34 0.00074 32.8 2.7 40 345-384 269-309 (356)
119 PF14827 Cache_3: Sensory doma 33.0 40 0.00086 27.1 2.6 19 242-260 94-112 (116)
120 TIGR03000 plancto_dom_1 Planct 31.7 85 0.0019 23.5 3.9 49 342-390 11-64 (75)
121 cd01723 LSm4 The eukaryotic Sm 30.4 1.4E+02 0.0031 22.1 5.1 33 137-169 11-43 (76)
122 PF08669 GCV_T_C: Glycine clea 25.8 89 0.0019 23.9 3.4 21 239-259 34-54 (95)
123 cd04627 CBS_pair_14 The CBS do 25.8 48 0.001 26.2 1.9 21 238-258 98-118 (123)
124 PRK13922 rod shape-determining 24.2 5.9E+02 0.013 23.7 10.5 73 140-215 172-244 (276)
125 PF02743 Cache_1: Cache domain 23.9 47 0.001 24.5 1.5 17 242-258 19-35 (81)
126 cd01724 Sm_D1 The eukaryotic S 23.9 1.8E+02 0.004 22.4 4.8 34 137-170 11-44 (90)
127 PF04225 OapA: Opacity-associa 23.4 1.2E+02 0.0027 23.0 3.7 31 361-391 38-68 (85)
128 PF02601 Exonuc_VII_L: Exonucl 22.8 76 0.0016 30.5 3.0 34 115-148 281-314 (319)
129 PF14438 SM-ATX: Ataxin 2 SM d 21.9 2.3E+02 0.0049 20.8 4.9 28 138-165 13-43 (77)
130 TIGR00074 hypC_hupF hydrogenas 21.5 2.1E+02 0.0046 21.4 4.5 41 150-195 5-45 (76)
131 cd04603 CBS_pair_KefB_assoc Th 21.2 71 0.0015 24.7 2.1 18 240-257 88-105 (111)
132 cd04620 CBS_pair_7 The CBS dom 21.1 73 0.0016 24.6 2.1 20 239-258 91-110 (115)
133 COG5233 GRH1 Peripheral Golgi 21.0 52 0.0011 31.6 1.3 30 314-354 67-96 (417)
134 PF10049 DUF2283: Protein of u 20.9 67 0.0015 21.8 1.6 11 246-256 36-46 (50)
135 PF11325 DUF3127: Domain of un 20.8 2.8E+02 0.006 21.3 5.0 64 310-384 3-71 (84)
136 PF09122 DUF1930: Domain of un 20.6 3.3E+02 0.0073 19.6 5.3 45 342-387 19-64 (68)
137 PF09465 LBR_tudor: Lamin-B re 20.2 3.2E+02 0.0069 19.1 5.7 36 136-171 8-44 (55)
No 1
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=3.6e-54 Score=420.31 Aligned_cols=302 Identities=34% Similarity=0.504 Sum_probs=260.3
Q ss_pred cccchHHHHHHHcCCceEEEEeeeccccccccCCCCceeEEEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEEEEEEE
Q 015960 78 TDEVETAGIFEENLPSVVHITNFGMNTFTLTMEYPQATGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDAKVVGH 157 (397)
Q Consensus 78 ~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~~ 157 (397)
..+.++.++++++.||||.|................+.||||+|+++||||||+|||.++..+.|++.||+.++|+++++
T Consensus 42 ~~~~~~~~~~~~~~psvV~v~~~~~~~~~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~ 121 (353)
T PRK10898 42 ETPASYNQAVRRAAPAVVNVYNRSLNSTSHNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEALLVGS 121 (353)
T ss_pred cccchHHHHHHHhCCcEEEEEeEeccccCcccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEEEEEE
Confidence 34457899999999999999886543322122233478999999999999999999999999999999999999999999
Q ss_pred cCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCC
Q 015960 158 DQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINR 237 (397)
Q Consensus 158 d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~ 237 (397)
|+.+||||||++.. .+++++++++..+++||+|+++|||++...+++.|+|++..+......+ ..+++|+|+++++
T Consensus 122 d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~~--~~~~iqtda~i~~ 197 (353)
T PRK10898 122 DSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPTG--RQNFLQTDASINH 197 (353)
T ss_pred cCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCcc--ccceEEeccccCC
Confidence 99999999999873 6889999988889999999999999998889999999988775432222 2578999999999
Q ss_pred CCcccceecCCccEEEEEeeeeccCC---CccCceeeEeccchhHHHHHHHhccccccccCCCchh---HHHHHHcCC--
Q 015960 238 GNSGGPLLDSSGSLIGVNTSIITRTD---AFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGIAHD---QLLEKLMGI-- 309 (397)
Q Consensus 238 G~SGGPlvn~~G~vVGI~s~~~~~~~---~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi~~~---~~~~~~~~~-- 309 (397)
|||||||+|.+|+||||+++.+.... ...+++|+||++.+++++++|+++|++.++|||+..+ +..+...+.
T Consensus 198 GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~~~~~~ 277 (353)
T PRK10898 198 GNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGGGIDQ 277 (353)
T ss_pred CCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHhcCCCC
Confidence 99999999999999999998775432 2367999999999999999999999999999999532 222333333
Q ss_pred -CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 310 -SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 310 -~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
.|++|.++.++|||+++||++ ||+|++|||++|.++.++.+.+...++|+++.++|+|+|+.+++.++
T Consensus 278 ~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 346 (353)
T PRK10898 278 LQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVT 346 (353)
T ss_pred CCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 799999999999999999999 99999999999999999999998878999999999999999999999
Q ss_pred eeeCCC
Q 015960 389 LEVEPD 394 (397)
Q Consensus 389 l~~~~~ 394 (397)
+.++|.
T Consensus 347 l~~~p~ 352 (353)
T PRK10898 347 IQEYPA 352 (353)
T ss_pred eccCCC
Confidence 987763
No 2
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=1.9e-54 Score=434.49 Aligned_cols=299 Identities=34% Similarity=0.572 Sum_probs=261.1
Q ss_pred chHHHHHHHcCCceEEEEeeecc--------ccc--cc-------cCCCCceeEEEEEcC-CCEEEEcccccCCCCeEEE
Q 015960 81 VETAGIFEENLPSVVHITNFGMN--------TFT--LT-------MEYPQATGTGFIWDE-DGHIVTNHHVIEGASSVKV 142 (397)
Q Consensus 81 ~~~~~~~~~~~~sVV~I~~~~~~--------~~~--~~-------~~~~~~~GSG~ii~~-~G~ILT~aHvv~~~~~i~V 142 (397)
.++.++++++.||||.|.+.... .|. +. .....+.||||+|++ +||||||+|||.++..+.|
T Consensus 40 ~~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V 119 (455)
T PRK10139 40 PSLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISI 119 (455)
T ss_pred ccHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEE
Confidence 36999999999999999875321 010 00 012246899999985 7999999999999999999
Q ss_pred EecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCC
Q 015960 143 TLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATG 222 (397)
Q Consensus 143 ~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~ 222 (397)
++.|++.++|++++.|+.+||||||++.+ ..+++++++++..+++||+|+++|+|++...+++.|+|+++.+......
T Consensus 120 ~~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~~- 197 (455)
T PRK10139 120 QLNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNLE- 197 (455)
T ss_pred EECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCCC-
Confidence 99999999999999999999999999864 3789999999999999999999999999999999999999887533221
Q ss_pred CCcccEEEEccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhccccccccCCCc---h
Q 015960 223 PPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGIA---H 299 (397)
Q Consensus 223 ~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi~---~ 299 (397)
.+.+++|+|+++++|||||||+|.+|+||||+++.+...+...+++||||++.+++++++|+++|++.|+|||+. +
T Consensus 198 -~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l 276 (455)
T PRK10139 198 -GLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM 276 (455)
T ss_pred -CcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence 235789999999999999999999999999999988766667899999999999999999999999999999994 3
Q ss_pred hHHHHHHcCC---CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEE
Q 015960 300 DQLLEKLMGI---SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRI 376 (397)
Q Consensus 300 ~~~~~~~~~~---~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v 376 (397)
++..++.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.++.|+.+.+...++|+++.++|
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V 345 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL 345 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 4555666665 599999999999999999999 99999999999999999999998878899999999
Q ss_pred EECCeEEEEEEEeeeCC
Q 015960 377 LRGTQLEEILIILEVEP 393 (397)
Q Consensus 377 ~R~g~~~~~~v~l~~~~ 393 (397)
+|+|+++++.+++...+
T Consensus 346 ~R~G~~~~l~v~~~~~~ 362 (455)
T PRK10139 346 LRNGKPLEVEVTLDTST 362 (455)
T ss_pred EECCEEEEEEEEECCCC
Confidence 99999999999886443
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.7e-54 Score=421.49 Aligned_cols=302 Identities=34% Similarity=0.550 Sum_probs=261.9
Q ss_pred CcccchHHHHHHHcCCceEEEEeeeccccccccCCCCceeEEEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEEEEEE
Q 015960 77 KTDEVETAGIFEENLPSVVHITNFGMNTFTLTMEYPQATGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDAKVVG 156 (397)
Q Consensus 77 ~~~~~~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~ 156 (397)
.+.+.++.++++++.||||.|++..............+.||||+|+++||||||+|||.+++.+.|++.||+.++|++++
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~vv~ 120 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSLNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAELVG 120 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEeccccccccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEEEEE
Confidence 34455799999999999999987643221111223456899999999999999999999999999999999999999999
Q ss_pred EcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccC
Q 015960 157 HDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAIN 236 (397)
Q Consensus 157 ~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~ 236 (397)
.|+.+||||||++.. .+++++++++..+++||+|+++|||++...+++.|+|+.+.+...... ...+++|+|+.++
T Consensus 121 ~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~~--~~~~~iqtda~i~ 196 (351)
T TIGR02038 121 SDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSSV--GRQNFIQTDAAIN 196 (351)
T ss_pred ecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCCC--CcceEEEECCccC
Confidence 999999999999874 588899988888999999999999999989999999999887543222 2257899999999
Q ss_pred CCCcccceecCCccEEEEEeeeeccC--CCccCceeeEeccchhHHHHHHHhccccccccCCCch---hHHHHHHcCC--
Q 015960 237 RGNSGGPLLDSSGSLIGVNTSIITRT--DAFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGIAH---DQLLEKLMGI-- 309 (397)
Q Consensus 237 ~G~SGGPlvn~~G~vVGI~s~~~~~~--~~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi~~---~~~~~~~~~~-- 309 (397)
+|||||||+|.+|+||||+++.+... +...+++|+||++.+++++++++++|++.|+|||+.. ++..++.+++
T Consensus 197 ~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~lgl~~ 276 (351)
T TIGR02038 197 AGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGLGLPD 276 (351)
T ss_pred CCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhcCCCc
Confidence 99999999999999999999876432 2346899999999999999999999999999999943 3445555565
Q ss_pred -CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 310 -SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 310 -~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
.|++|..+.++|||+++||++ ||+|++|||++|.++.|+.+.+...++|+++.++|+|+|+.+++.++
T Consensus 277 ~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 345 (351)
T TIGR02038 277 LRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLELPVT 345 (351)
T ss_pred cccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEEEEE
Confidence 599999999999999999999 99999999999999999999998878999999999999999999999
Q ss_pred eeeCC
Q 015960 389 LEVEP 393 (397)
Q Consensus 389 l~~~~ 393 (397)
+.++|
T Consensus 346 l~~~p 350 (351)
T TIGR02038 346 IDEKP 350 (351)
T ss_pred ecCCC
Confidence 98765
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.5e-51 Score=415.36 Aligned_cols=297 Identities=35% Similarity=0.579 Sum_probs=258.7
Q ss_pred hHHHHHHHcCCceEEEEeeecc---------cccccc-------------------------------CCCCceeEEEEE
Q 015960 82 ETAGIFEENLPSVVHITNFGMN---------TFTLTM-------------------------------EYPQATGTGFIW 121 (397)
Q Consensus 82 ~~~~~~~~~~~sVV~I~~~~~~---------~~~~~~-------------------------------~~~~~~GSG~ii 121 (397)
++.++++++.||||+|++.... .|...+ ....+.||||+|
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 6999999999999999875421 010000 011357999999
Q ss_pred cC-CCEEEEcccccCCCCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCC
Q 015960 122 DE-DGHIVTNHHVIEGASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLG 200 (397)
Q Consensus 122 ~~-~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g 200 (397)
++ +||||||+|||.+++++.|++.|++.++|++++.|+.+||||||++.+ ..+++++++++..+++||+|+++|+|++
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g 197 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG 197 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence 86 599999999999999999999999999999999999999999999864 3689999999999999999999999999
Q ss_pred CCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHH
Q 015960 201 RKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGI 280 (397)
Q Consensus 201 ~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~ 280 (397)
...+++.|+|++..+..... ..+.+++++|+++++|||||||+|.+|+||||+++.+...++..+++|+||++.++++
T Consensus 198 ~~~tvt~GiVs~~~r~~~~~--~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~v 275 (473)
T PRK10942 198 LGETVTSGIVSALGRSGLNV--ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNL 275 (473)
T ss_pred CCcceeEEEEEEeecccCCc--ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHHH
Confidence 99999999999988753221 1235789999999999999999999999999999988776666789999999999999
Q ss_pred HHHHHhccccccccCCCch---hHHHHHHcCC---CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC
Q 015960 281 VDQLVKFGKIIRPYLGIAH---DQLLEKLMGI---SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS 354 (397)
Q Consensus 281 ~~~l~~~g~~~~~~lGi~~---~~~~~~~~~~---~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~ 354 (397)
+++|+++|++.|+|||+.. ++..++.+++ .|++|..|.++|||+++||++ ||+|++|||++|.
T Consensus 276 ~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~ 344 (473)
T PRK10942 276 TSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPIS 344 (473)
T ss_pred HHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEECC
Confidence 9999999999999999943 4445555665 599999999999999999999 9999999999999
Q ss_pred CHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEeeeC
Q 015960 355 NANDLHNILDQCKVGDEVIVRILRGTQLEEILIILEVE 392 (397)
Q Consensus 355 ~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~ 392 (397)
++.++...+....+|+++.++|+|+|+.+++.+++...
T Consensus 345 s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~ 382 (473)
T PRK10942 345 SFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQS 382 (473)
T ss_pred CHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcC
Confidence 99999999988889999999999999999999887653
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=3.6e-50 Score=403.72 Aligned_cols=298 Identities=39% Similarity=0.609 Sum_probs=260.5
Q ss_pred hHHHHHHHcCCceEEEEeeecc-----------ccc--cc-----------cCCCCceeEEEEEcCCCEEEEcccccCCC
Q 015960 82 ETAGIFEENLPSVVHITNFGMN-----------TFT--LT-----------MEYPQATGTGFIWDEDGHIVTNHHVIEGA 137 (397)
Q Consensus 82 ~~~~~~~~~~~sVV~I~~~~~~-----------~~~--~~-----------~~~~~~~GSG~ii~~~G~ILT~aHvv~~~ 137 (397)
++.++++++.||||.|.+.... .+. +. .....+.||||+|+++||||||+||+.++
T Consensus 2 ~~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~ 81 (428)
T TIGR02037 2 SFAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGA 81 (428)
T ss_pred cHHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCC
Confidence 3778999999999999875420 000 00 01134689999999999999999999999
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEP 217 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~ 217 (397)
..+.|++.|++.++|++++.|+.+||||||++.+ ..+++++++++..+++|++|+++|||++...+++.|+|+...+..
T Consensus 82 ~~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~ 160 (428)
T TIGR02037 82 DEITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSG 160 (428)
T ss_pred CeEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCc
Confidence 9999999999999999999999999999999875 378999999888899999999999999999999999999987653
Q ss_pred cCCCCCCcccEEEEccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhccccccccCCC
Q 015960 218 ITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGI 297 (397)
Q Consensus 218 ~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi 297 (397)
. ....+..++++|+++++|+|||||+|.+|+||||+++.+...++..+++||||++.+++++++|+++|++.++|||+
T Consensus 161 ~--~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi 238 (428)
T TIGR02037 161 L--GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV 238 (428)
T ss_pred c--CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence 2 12233568999999999999999999999999999998776655678999999999999999999999999999999
Q ss_pred ch---hHHHHHHcCC---CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCE
Q 015960 298 AH---DQLLEKLMGI---SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDE 371 (397)
Q Consensus 298 ~~---~~~~~~~~~~---~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~ 371 (397)
.. ++..++.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.++.++.+.+....+|++
T Consensus 239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~ 307 (428)
T TIGR02037 239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK 307 (428)
T ss_pred EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence 44 3555666676 699999999999999999999 999999999999999999999988888999
Q ss_pred EEEEEEECCeEEEEEEEeeeCC
Q 015960 372 VIVRILRGTQLEEILIILEVEP 393 (397)
Q Consensus 372 v~l~v~R~g~~~~~~v~l~~~~ 393 (397)
++++|+|+|+.+++++++..++
T Consensus 308 v~l~v~R~g~~~~~~v~l~~~~ 329 (428)
T TIGR02037 308 VTLGILRKGKEKTITVTLGASP 329 (428)
T ss_pred EEEEEEECCEEEEEEEEECcCC
Confidence 9999999999999999887655
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.9e-41 Score=332.00 Aligned_cols=298 Identities=40% Similarity=0.627 Sum_probs=259.1
Q ss_pred chHHHHHHHcCCceEEEEeeecccc----ccccC-C-CCceeEEEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEEEE
Q 015960 81 VETAGIFEENLPSVVHITNFGMNTF----TLTME-Y-PQATGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDAKV 154 (397)
Q Consensus 81 ~~~~~~~~~~~~sVV~I~~~~~~~~----~~~~~-~-~~~~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~v 154 (397)
.++..+++++.|+||.+........ ..... . ..+.||||+++++|||+||.|++.++..+.+.+.||+.+++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~ 112 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL 112 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence 5789999999999999987543111 00000 0 1479999999999999999999999999999999999999999
Q ss_pred EEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccc
Q 015960 155 VGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAA 234 (397)
Q Consensus 155 v~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~ 234 (397)
++.|+..|+|+||++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. .........+++|+|++
T Consensus 113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~v~~~~~~~~~IqtdAa 190 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-GVGSAGGYVNFIQTDAA 190 (347)
T ss_pred EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-cccCcccccchhhcccc
Confidence 9999999999999998643 888899999999999999999999999999999999999986 22211124688999999
Q ss_pred cCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhccccccccCCCchhHHHHHH-cC---CC
Q 015960 235 INRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGIAHDQLLEKL-MG---IS 310 (397)
Q Consensus 235 i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi~~~~~~~~~-~~---~~ 310 (397)
+++|+||||++|.+|++|||++..+...++..+++|+||++.++.+++++++.|++.++|+|+...+..... ++ ..
T Consensus 191 in~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~~ 270 (347)
T COG0265 191 INPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVAA 270 (347)
T ss_pred cCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCCC
Confidence 999999999999999999999999887765677999999999999999999988999999999543221111 22 36
Q ss_pred CcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEee
Q 015960 311 GGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIILE 390 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~ 390 (397)
|++|..+.+++||+++|++. ||+|+++||+++.+..++...+....+|+++.++++|+|+++++.+++.
T Consensus 271 G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l~ 339 (347)
T COG0265 271 GAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTLG 339 (347)
T ss_pred ceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEec
Confidence 89999999999999999999 9999999999999999999999988899999999999999999999998
Q ss_pred e
Q 015960 391 V 391 (397)
Q Consensus 391 ~ 391 (397)
+
T Consensus 340 ~ 340 (347)
T COG0265 340 D 340 (347)
T ss_pred C
Confidence 7
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=9e-29 Score=242.76 Aligned_cols=302 Identities=33% Similarity=0.459 Sum_probs=242.1
Q ss_pred cchHHHHHHHcCCceEEEEeeec--ccccc-ccCCCCceeEEEEEcCCCEEEEcccccCCCC-----------eEEEEec
Q 015960 80 EVETAGIFEENLPSVVHITNFGM--NTFTL-TMEYPQATGTGFIWDEDGHIVTNHHVIEGAS-----------SVKVTLF 145 (397)
Q Consensus 80 ~~~~~~~~~~~~~sVV~I~~~~~--~~~~~-~~~~~~~~GSG~ii~~~G~ILT~aHvv~~~~-----------~i~V~~~ 145 (397)
.....++.++..+|+|.|+.... ....+ ..+-+...||||+++.+|+++||+||+.... .+.+...
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa 206 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA 206 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence 34567789999999999987322 11111 1234567899999999999999999997432 3677777
Q ss_pred CC--cEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCC---
Q 015960 146 DK--TTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITA--- 220 (397)
Q Consensus 146 ~g--~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~--- 220 (397)
+| ..+++.+++.|+..|+|+++++.+....++++++.+..+..|+++.++|.|++...+.+.|.++...|.....
T Consensus 207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 66 8999999999999999999998765347899999899999999999999999999999999999988865432
Q ss_pred CCCCcccEEEEccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhcc---cc------c
Q 015960 221 TGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFG---KI------I 291 (397)
Q Consensus 221 ~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g---~~------~ 291 (397)
++....+++|+|+++++|+||||++|.+|++||+++......+-..+++|++|.+.++.++.+..+.. +. .
T Consensus 287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~ 366 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPV 366 (473)
T ss_pred cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccc
Confidence 22456788999999999999999999999999999988776555678999999999988887764332 22 2
Q ss_pred cccCCCch---------hHHHHHH----cCCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHH
Q 015960 292 RPYLGIAH---------DQLLEKL----MGISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNAND 358 (397)
Q Consensus 292 ~~~lGi~~---------~~~~~~~----~~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d 358 (397)
+.|+|+.. ++..+.. ....++++..|.+++++...++.+ ||+|++|||++|.+..+
T Consensus 367 ~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~ 435 (473)
T KOG1320|consen 367 HQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKH 435 (473)
T ss_pred cccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHH
Confidence 34777611 1110100 011478999999999999999999 99999999999999999
Q ss_pred HHHHHhcCCCCCEEEEEEEECCeEEEEEEEeeeC
Q 015960 359 LHNILDQCKVGDEVIVRILRGTQLEEILIILEVE 392 (397)
Q Consensus 359 ~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~ 392 (397)
++++++.+.++++|.+..+|..+..++.+..++.
T Consensus 436 l~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~~~~ 469 (473)
T KOG1320|consen 436 LYELIEECSTEDKVAVLDRRSAEDATLEILPEHK 469 (473)
T ss_pred HHHHHHhcCcCceEEEEEecCccceeEEeccccc
Confidence 9999999888899999999998988888876543
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.91 E-value=6e-23 Score=203.73 Aligned_cols=292 Identities=24% Similarity=0.333 Sum_probs=228.2
Q ss_pred hHHHHHHHcCCceEEEEeeeccccccccCCCCceeEEEEEcC-CCEEEEcccccCCC-CeEEEEecCCcEEEEEEEEEcC
Q 015960 82 ETAGIFEENLPSVVHITNFGMNTFTLTMEYPQATGTGFIWDE-DGHIVTNHHVIEGA-SSVKVTLFDKTTLDAKVVGHDQ 159 (397)
Q Consensus 82 ~~~~~~~~~~~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~-~G~ILT~aHvv~~~-~~i~V~~~~g~~~~a~vv~~d~ 159 (397)
++...+..+.+|||.|.....-.|... ......||||++++ .||+|||+|++... -.-.+.+.+-+..+.-.++.|+
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~~fdte-sag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvyrDp 131 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVRAFDTE-SAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVYRDP 131 (955)
T ss_pred hhhhhhhhhcccEEEEEehheeecccc-cccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCcccccCCc
Confidence 788899999999999988665444322 23456799999994 58999999999754 3456777787888888899999
Q ss_pred CCCeEEEEEcCCCC---CcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCc----ccEEEEc
Q 015960 160 GTDLAVLHIDAPNH---KLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPI----QGLIQID 232 (397)
Q Consensus 160 ~~DlAlL~v~~~~~---~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~----~~~i~~~ 232 (397)
.+|+.++|.+.... .+.-+++.. +..++|.+++++|+..+...++..|.++.+++......+..+ ..++|.-
T Consensus 132 VhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~Qaa 210 (955)
T KOG1421|consen 132 VHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYIQAA 210 (955)
T ss_pred hhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceeeeeh
Confidence 99999999987532 233444432 335899999999999999999999999999887644333222 2357777
Q ss_pred cccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhccccccccCCC---------------
Q 015960 233 AAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGI--------------- 297 (397)
Q Consensus 233 ~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi--------------- 297 (397)
+....|.||+|++|.+|..|.++..+... .+.+|++|++.+++-+..++++.-++|+.|-+
T Consensus 211 sstsggssgspVv~i~gyAVAl~agg~~s----sas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rrlGL 286 (955)
T KOG1421|consen 211 SSTSGGSSGSPVVDIPGYAVALNAGGSIS----SASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRRLGL 286 (955)
T ss_pred hcCCCCCCCCceecccceEEeeecCCccc----ccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHhcCC
Confidence 77889999999999999999998776543 34689999999999999999888887776544
Q ss_pred chh--HHHH-HHcCCCCc-EEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEE
Q 015960 298 AHD--QLLE-KLMGISGG-VIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVI 373 (397)
Q Consensus 298 ~~~--~~~~-~~~~~~g~-~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~ 373 (397)
.-+ +..+ +.....|+ +|..+.++|||++. |++ ||++++||++-+.++.++.+.|++ ..|+.+.
T Consensus 287 ~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~-Le~-----------GDillavN~t~l~df~~l~~iLDe-gvgk~l~ 353 (955)
T KOG1421|consen 287 SSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK-LEP-----------GDILLAVNSTCLNDFEALEQILDE-GVGKNLE 353 (955)
T ss_pred cHHHHHHHHhcCcccceeEEEEEeccCCchhhc-cCC-----------CcEEEEEcceehHHHHHHHHHHhh-ccCceEE
Confidence 211 1111 12233565 56788999999888 777 999999999999999999999987 4899999
Q ss_pred EEEEECCeEEEEEEEeeeC
Q 015960 374 VRILRGTQLEEILIILEVE 392 (397)
Q Consensus 374 l~v~R~g~~~~~~v~l~~~ 392 (397)
|+|+|+|++.++.+++..+
T Consensus 354 LtI~Rggqelel~vtvqdl 372 (955)
T KOG1421|consen 354 LTIQRGGQELELTVTVQDL 372 (955)
T ss_pred EEEEeCCEEEEEEEEeccc
Confidence 9999999999999887644
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.71 E-value=5.1e-17 Score=133.88 Aligned_cols=109 Identities=38% Similarity=0.532 Sum_probs=74.8
Q ss_pred eEEEEEcCCCEEEEcccccC--------CCCeEEEEecCCcEEE--EEEEEEcCC-CCeEEEEEcCCCCCcceeecCCCC
Q 015960 116 GTGFIWDEDGHIVTNHHVIE--------GASSVKVTLFDKTTLD--AKVVGHDQG-TDLAVLHIDAPNHKLRSIPVGVSA 184 (397)
Q Consensus 116 GSG~ii~~~G~ILT~aHvv~--------~~~~i~V~~~~g~~~~--a~vv~~d~~-~DlAlL~v~~~~~~~~~~~l~~s~ 184 (397)
||||+|+++|+||||+||+. ....+.+...++.... ++++..++. .|+|||+++. .
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------~ 67 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------W 67 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------E
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------c
Confidence 79999999999999999998 4567888888888888 999999999 9999999980 0
Q ss_pred CCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecCCccEEEE
Q 015960 185 NLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGV 254 (397)
Q Consensus 185 ~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI 254 (397)
...+...... ............. .....+ +++.+.+|+||||+||.+|+||||
T Consensus 68 -~~~~~~~~~~------------~~~~~~~~~~~~~---~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 68 -TGVGGGVRVP------------GSTSGVSPTSTND---NRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -EEEEEEEEEE------------EEEEEEEEEEEEE---TEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -cceeeeeEee------------eeccccccccCcc---cceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000000 0000000000000 001114 799999999999999999999997
No 10
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.57 E-value=5.5e-15 Score=114.12 Aligned_cols=82 Identities=38% Similarity=0.566 Sum_probs=69.5
Q ss_pred ccCCCchhHHHHHHcCCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEE
Q 015960 293 PYLGIAHDQLLEKLMGISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEV 372 (397)
Q Consensus 293 ~~lGi~~~~~~~~~~~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v 372 (397)
||||+...... ...|++|.++.++|||+++||++ ||+|++|||++|+++.++..++....+|+++
T Consensus 1 ~~lGv~~~~~~----~~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~v 65 (82)
T PF13180_consen 1 GGLGVTVQNLS----DTGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDTV 65 (82)
T ss_dssp -E-SEEEEECS----CSSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSEE
T ss_pred CEECeEEEEcc----CCCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCEE
Confidence 46777543211 13589999999999999999999 9999999999999999999999888999999
Q ss_pred EEEEEECCeEEEEEEEe
Q 015960 373 IVRILRGTQLEEILIIL 389 (397)
Q Consensus 373 ~l~v~R~g~~~~~~v~l 389 (397)
+++|+|+|+.++++++|
T Consensus 66 ~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 66 TLTVLRDGEELTVEVTL 82 (82)
T ss_dssp EEEEEETTEEEEEEEE-
T ss_pred EEEEEECCEEEEEEEEC
Confidence 99999999999999875
No 11
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.55 E-value=4.8e-13 Score=121.49 Aligned_cols=178 Identities=19% Similarity=0.249 Sum_probs=117.4
Q ss_pred CceEEEEeeeccccccccCCCCceeEEEEEcCCCEEEEcccccCCCCeEEEEec-------CC--cEEEEEEEEEc----
Q 015960 92 PSVVHITNFGMNTFTLTMEYPQATGTGFIWDEDGHIVTNHHVIEGASSVKVTLF-------DK--TTLDAKVVGHD---- 158 (397)
Q Consensus 92 ~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~-------~g--~~~~a~vv~~d---- 158 (397)
|-+|.|.... ....|+|++|+++ +|||+|||+.+...+.+.+. ++ ..+..+-+..+
T Consensus 13 p~~v~i~~~~----------~~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~ 81 (220)
T PF00089_consen 13 PWVVSIRYSN----------GRFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYD 81 (220)
T ss_dssp TTEEEEEETT----------TEEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSB
T ss_pred CeEEEEeeCC----------CCeeEeEEecccc-cccccccccccccccccccccccccccccccccccccccccccccc
Confidence 6677776421 1458999999987 99999999999666666543 22 34444433332
Q ss_pred C---CCCeEEEEEcCC---CCCcceeecCCC-CCCCCCCeEEEEecCCCCCC----ceeeeEEeeccccccC--CCCCCc
Q 015960 159 Q---GTDLAVLHIDAP---NHKLRSIPVGVS-ANLRIGQKVYAIGHPLGRKF----TCTAGIISAFGLEPIT--ATGPPI 225 (397)
Q Consensus 159 ~---~~DlAlL~v~~~---~~~~~~~~l~~s-~~~~~G~~V~~iG~p~g~~~----~~~~G~vs~~~~~~~~--~~~~~~ 225 (397)
+ ..|+|||+++.+ ...+.++.+... ..+..|+.+.++||+..... ......+....+.... ......
T Consensus 82 ~~~~~~DiAll~L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~ 161 (220)
T PF00089_consen 82 PSTYDNDIALLKLDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLT 161 (220)
T ss_dssp TTTTTTSEEEEEESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTST
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 2 579999999987 446777777652 33588999999999975322 3333333332221111 111112
Q ss_pred ccEEEEcc----ccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHH
Q 015960 226 QGLIQIDA----AINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIV 281 (397)
Q Consensus 226 ~~~i~~~~----~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~ 281 (397)
...+.... ..+.|+|||||++.++.|+||++... ..+.....+++.++....+|+
T Consensus 162 ~~~~c~~~~~~~~~~~g~sG~pl~~~~~~lvGI~s~~~-~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 162 PNMICAGSSGSGDACQGDSGGPLICNNNYLVGIVSFGE-NCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp TTEEEEETTSSSBGGTTTTTSEEEETTEEEEEEEEEES-SSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccccccccccccccceeeecceeeecC-CCCCCCcCEEEEEHHHhhccC
Confidence 34566655 78999999999988777999999883 333333368889998888775
No 12
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.49 E-value=5.1e-13 Score=133.78 Aligned_cols=255 Identities=18% Similarity=0.161 Sum_probs=170.2
Q ss_pred ceeEEEEEc-CCCEEEEcccccC-CCCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCe
Q 015960 114 ATGTGFIWD-EDGHIVTNHHVIE-GASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQK 191 (397)
Q Consensus 114 ~~GSG~ii~-~~G~ILT~aHvv~-~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~ 191 (397)
..|||.|++ ..|++++...++. ++.+.+|++.|...++|.+.+.++..++|.+|.+.. ....+.+.+ ..+..||+
T Consensus 550 ~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~n~a~~kydp~--~~~~~kl~~-~~v~~gD~ 626 (955)
T KOG1421|consen 550 YKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTENVASFKYDPA--LEVQLKLTD-TTVLRGDE 626 (955)
T ss_pred hcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCccceeEeccChh--Hhhhhccce-eeEecCCc
Confidence 469999998 4589999999997 667899999999999999999999999999999874 234444533 44789999
Q ss_pred EEEEecCCCCCCceeeeEEeec-----cccccCCCCCCcccEEEEccccCCCCcccceecCCccEEEEEeeeeccC--CC
Q 015960 192 VYAIGHPLGRKFTCTAGIISAF-----GLEPITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRT--DA 264 (397)
Q Consensus 192 V~~iG~p~g~~~~~~~G~vs~~-----~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~--~~ 264 (397)
+...|+-...........|... .+...........+.|..++.+.-+.--|-+.|.+|+|+|++-..+... +.
T Consensus 627 ~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~k 706 (955)
T KOG1421|consen 627 CTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGK 706 (955)
T ss_pred eeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCc
Confidence 9999987654322222222111 1111111111224556665555444445688899999999986555432 22
Q ss_pred ccCceeeEeccchhHHHHHHHhccccccccCCCc---hhHHHHHHcCCC----------------CcEEEEecccCcccc
Q 015960 265 FCGMACSIPIDTVSGIVDQLVKFGKIIRPYLGIA---HDQLLEKLMGIS----------------GGVIFIAVEEGPAGK 325 (397)
Q Consensus 265 ~~~~~~aip~~~i~~~~~~l~~~g~~~~~~lGi~---~~~~~~~~~~~~----------------g~~V~~v~~~spa~~ 325 (397)
..-.-|.+.+..++++++.|+.++......+|+. .+...++.+|+. =.+|.++.+.-
T Consensus 707 d~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~e~imk~e~es~~~~ql~~ishv~~~~---- 782 (955)
T KOG1421|consen 707 DYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPSEFIMKSEEESTIPRQLYVISHVRPLL---- 782 (955)
T ss_pred eeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCHHHHhhhhhcCCCcceEEEEEeeccCc----
Confidence 3334566778889999999998877543344441 111112222222 23444544332
Q ss_pred cCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEeeeC
Q 015960 326 AGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIILEVE 392 (397)
Q Consensus 326 aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~ 392 (397)
++.|..||+|+++||+.|+...|+.+.. .+..+|+|+|.++++++++-+.
T Consensus 783 ----------~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt~p~ 832 (955)
T KOG1421|consen 783 ----------HKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPTYPE 832 (955)
T ss_pred ----------ccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEeccccc
Confidence 3334459999999999999999999733 4778999999999999987544
No 13
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.44 E-value=1.3e-13 Score=136.24 Aligned_cols=275 Identities=21% Similarity=0.212 Sum_probs=190.0
Q ss_pred HHHHcCCceEEEEeeecccc----ccccCCCCceeEEEEEcCCCEEEEcccccC---CCCeEEEEec-CCcEEEEEEEEE
Q 015960 86 IFEENLPSVVHITNFGMNTF----TLTMEYPQATGTGFIWDEDGHIVTNHHVIE---GASSVKVTLF-DKTTLDAKVVGH 157 (397)
Q Consensus 86 ~~~~~~~sVV~I~~~~~~~~----~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~---~~~~i~V~~~-~g~~~~a~vv~~ 157 (397)
..+....|++.+.....+.. |.........|+||.+.-. .++|++|++. ++..+.+.-+ .-+.+.+++...
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~ 133 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAV 133 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHh
Confidence 34566678888876554222 1122244567999999754 8999999999 5566666522 336778888888
Q ss_pred cCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCC
Q 015960 158 DQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINR 237 (397)
Q Consensus 158 d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~ 237 (397)
-.+.|+|++.++..+......++...+-+...+.++++| +....+|.|.|+..........+ .....+++|+++++
T Consensus 134 ~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~-~~l~~vqi~aa~~~ 209 (473)
T KOG1320|consen 134 FEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSS-TVLLRVQIDAAIGP 209 (473)
T ss_pred hhcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCC-cceeeEEEEEeecC
Confidence 899999999999743332232333334466778899998 67789999999887665433322 23457999999999
Q ss_pred CCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhccccc-cccCCCchh----HHHHHHc--CC-
Q 015960 238 GNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKII-RPYLGIAHD----QLLEKLM--GI- 309 (397)
Q Consensus 238 G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~-~~~lGi~~~----~~~~~~~--~~- 309 (397)
|+||+|.+...+++.|+........+ .+++.||.-.+.++.......+... .++++...+ ...++.+ +.
T Consensus 210 ~~s~ep~i~g~d~~~gvA~l~ik~~~---~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 210 GNSGEPVIVGVDKVAGVAFLKIKTPE---NILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred CccCCCeEEccccccceEEEEEecCC---cccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 99999999877899999998875332 5789999999988887766655532 233333111 1112222 22
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC------CHHHHHHHHhcCCCCCEEEEEEEECC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS------NANDLHNILDQCKVGDEVIVRILRGT 380 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~------~~~d~~~~l~~~~~g~~v~l~v~R~g 380 (397)
.|+.+.++.+-+.|-+. ++. ||.|+++||..|. .+..+...+..+.|+|++.+.+.|.+
T Consensus 287 ~g~~i~~~~qtd~ai~~-~ns-----------g~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINP-GNS-----------GGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred cceeeeeecccchhhhc-ccC-----------CCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 46788888777766555 344 9999999999884 12234456667789999999999987
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.42 E-value=1.3e-11 Score=112.79 Aligned_cols=180 Identities=17% Similarity=0.168 Sum_probs=107.1
Q ss_pred CCceEEEEeeeccccccccCCCCceeEEEEEcCCCEEEEcccccCCC--CeEEEEecC---------CcEEEEEEEEEc-
Q 015960 91 LPSVVHITNFGMNTFTLTMEYPQATGTGFIWDEDGHIVTNHHVIEGA--SSVKVTLFD---------KTTLDAKVVGHD- 158 (397)
Q Consensus 91 ~~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~~~--~~i~V~~~~---------g~~~~a~vv~~d- 158 (397)
-|-+|.|... .....|+|++|+++ +|||+|||+.+. ..+.|.+.. ...+..+-+..+
T Consensus 12 ~Pw~v~i~~~----------~~~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp 80 (232)
T cd00190 12 FPWQVSLQYT----------GGRHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHP 80 (232)
T ss_pred CCCEEEEEcc----------CCcEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECC
Confidence 4566777542 13458999999987 999999999875 556666532 122333333333
Q ss_pred ------CCCCeEEEEEcCCC---CCcceeecCCCC-CCCCCCeEEEEecCCCCCC-----ceeeeEEeeccccc---cCC
Q 015960 159 ------QGTDLAVLHIDAPN---HKLRSIPVGVSA-NLRIGQKVYAIGHPLGRKF-----TCTAGIISAFGLEP---ITA 220 (397)
Q Consensus 159 ------~~~DlAlL~v~~~~---~~~~~~~l~~s~-~~~~G~~V~~iG~p~g~~~-----~~~~G~vs~~~~~~---~~~ 220 (397)
...|||||+++.+- ..+.++.|.... .+..|+.+.+.||+..... ......+..+.... ...
T Consensus 81 ~y~~~~~~~DiAll~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~ 160 (232)
T cd00190 81 NYNPSTYDNDIALLKLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYS 160 (232)
T ss_pred CCCCCCCcCCEEEEEECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhcc
Confidence 35899999998752 246788886543 5778999999999864322 12222222211111 100
Q ss_pred C-CCCcccEEEE-----ccccCCCCcccceecCC---ccEEEEEeeeeccCCCccCceeeEeccchhHHHH
Q 015960 221 T-GPPIQGLIQI-----DAAINRGNSGGPLLDSS---GSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVD 282 (397)
Q Consensus 221 ~-~~~~~~~i~~-----~~~i~~G~SGGPlvn~~---G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~ 282 (397)
. .......+.. +...|+|+|||||+... +.++||.+++.. ++.....+....+...++|++
T Consensus 161 ~~~~~~~~~~C~~~~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 161 YGGTITDNMLCAGGLEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred CcccCCCceEeeCCCCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhh
Confidence 0 0000122222 34578899999999764 789999998764 221122344455555566654
No 15
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.32 E-value=1.2e-11 Score=94.67 Aligned_cols=68 Identities=28% Similarity=0.408 Sum_probs=63.3
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
.|++|.++.++|||+++||++ ||+|++|||+++.++.++.+++....+|+++.+++.|+|+.++++++
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 689999999999999999999 99999999999999999999998777799999999999998877664
No 16
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.32 E-value=1.8e-11 Score=93.68 Aligned_cols=72 Identities=31% Similarity=0.395 Sum_probs=66.5
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
.|++|..+.++|||++ ||++ ||+|++|||+++.+++++..++....+|+.+.+++.|+|+.+++++++
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l 75 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLIL 75 (79)
T ss_pred cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEE
Confidence 6899999999999987 8999 999999999999999999999987678999999999999999999999
Q ss_pred eeCC
Q 015960 390 EVEP 393 (397)
Q Consensus 390 ~~~~ 393 (397)
.++|
T Consensus 76 ~~~~ 79 (79)
T cd00986 76 KTFP 79 (79)
T ss_pred eccC
Confidence 8764
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.32 E-value=7.4e-11 Score=108.04 Aligned_cols=161 Identities=17% Similarity=0.218 Sum_probs=96.3
Q ss_pred CceeEEEEEcCCCEEEEcccccCCCC--eEEEEecCC--------cEEEEEEEEEc-------CCCCeEEEEEcCC---C
Q 015960 113 QATGTGFIWDEDGHIVTNHHVIEGAS--SVKVTLFDK--------TTLDAKVVGHD-------QGTDLAVLHIDAP---N 172 (397)
Q Consensus 113 ~~~GSG~ii~~~G~ILT~aHvv~~~~--~i~V~~~~g--------~~~~a~vv~~d-------~~~DlAlL~v~~~---~ 172 (397)
...|+|++|+++ +|||+|||+.+.. .+.|.+... ..+.+.-+..+ ...|||||+++.+ .
T Consensus 25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~ 103 (229)
T smart00020 25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS 103 (229)
T ss_pred CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence 458999999987 9999999998753 677776532 22333333322 4689999999876 2
Q ss_pred CCcceeecCCC-CCCCCCCeEEEEecCCCCC------CceeeeEEeeccccccCC---CC-CCcccEEEE-----ccccC
Q 015960 173 HKLRSIPVGVS-ANLRIGQKVYAIGHPLGRK------FTCTAGIISAFGLEPITA---TG-PPIQGLIQI-----DAAIN 236 (397)
Q Consensus 173 ~~~~~~~l~~s-~~~~~G~~V~~iG~p~g~~------~~~~~G~vs~~~~~~~~~---~~-~~~~~~i~~-----~~~i~ 236 (397)
..+.++.+... ..+..++.+.+.||+.... .......+..+....... .. ......+.. ....|
T Consensus 104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 183 (229)
T smart00020 104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC 183 (229)
T ss_pred CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence 34677777653 3467889999999986542 112222222222111110 00 000111211 35578
Q ss_pred CCCcccceecCCc--cEEEEEeeeeccCCCccCceeeEecc
Q 015960 237 RGNSGGPLLDSSG--SLIGVNTSIITRTDAFCGMACSIPID 275 (397)
Q Consensus 237 ~G~SGGPlvn~~G--~vVGI~s~~~~~~~~~~~~~~aip~~ 275 (397)
+|+||||++...+ .++||.+.+. .+........+..+.
T Consensus 184 ~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~ 223 (229)
T smart00020 184 QGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVS 223 (229)
T ss_pred CCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEec
Confidence 9999999996543 8999999876 332222334444444
No 18
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.29 E-value=5.5e-12 Score=117.90 Aligned_cols=101 Identities=20% Similarity=0.241 Sum_probs=89.1
Q ss_pred cchhHHHHHHHhccccccccCCCchhHHHHHHcCCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC
Q 015960 275 DTVSGIVDQLVKFGKIIRPYLGIAHDQLLEKLMGISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS 354 (397)
Q Consensus 275 ~~i~~~~~~l~~~g~~~~~~lGi~~~~~~~~~~~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~ 354 (397)
..++++++++++++++.+.|+|+...... -...|++|..+.+++||+++||++ ||+|++|||+++.
T Consensus 159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~---g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i~ 224 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN---DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDLR 224 (259)
T ss_pred hhHHHHHHHHHHCHHhhhheEeEEEEEeC---CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEcC
Confidence 46788999999999999999999653211 123799999999999999999999 9999999999999
Q ss_pred CHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 355 NANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 355 ~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
++.++.+++.+.+++++++++|.|+|+.+++.+.+
T Consensus 225 ~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 225 DPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred CHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 99999999998889999999999999998888764
No 19
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.21 E-value=1e-10 Score=91.50 Aligned_cols=66 Identities=38% Similarity=0.573 Sum_probs=60.6
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEIL 386 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~ 386 (397)
.|++|.++.+++||+++||++ ||+|++|||+++.++.++.+++.....++.+.+++.|+|+..++.
T Consensus 24 ~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 24 KGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAELKPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred CEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEEEECCEEEEee
Confidence 589999999999999999999 999999999999999999999987667899999999999876654
No 20
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.15 E-value=1.8e-10 Score=88.15 Aligned_cols=78 Identities=36% Similarity=0.569 Sum_probs=63.7
Q ss_pred ccCCCchhHHHHHHcCCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEE
Q 015960 293 PYLGIAHDQLLEKLMGISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEV 372 (397)
Q Consensus 293 ~~lGi~~~~~~~~~~~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v 372 (397)
||+|+.... ...|++|.++.++|||+++||++ ||+|++|||+++.++.++ +...++++.+
T Consensus 1 ~~~G~~~~~------~~~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~v 60 (80)
T cd00990 1 PYLGLTLDK------EEGLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDPV 60 (80)
T ss_pred CcccEEEEc------cCCcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCEE
Confidence 567775532 12568999999999999999999 999999999999886654 4444578899
Q ss_pred EEEEEECCeEEEEEEEee
Q 015960 373 IVRILRGTQLEEILIILE 390 (397)
Q Consensus 373 ~l~v~R~g~~~~~~v~l~ 390 (397)
.+++.|+|+..++.+++.
T Consensus 61 ~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 61 ELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred EEEEEECCEEEEEEEEec
Confidence 999999999888888764
No 21
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.04 E-value=1.4e-09 Score=82.89 Aligned_cols=66 Identities=29% Similarity=0.427 Sum_probs=59.2
Q ss_pred CcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 311 GGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
.++|..+.++|||+++||++ ||+|++|||+++.++.++...+... .++.+.+++.|+++..++.++
T Consensus 13 ~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 13 EPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPLTLTVERNGETITLTLT 78 (79)
T ss_pred CcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceEEEEEEECCEEEEEEec
Confidence 47899999999999999999 9999999999999999999998774 478999999999987776653
No 22
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.01 E-value=1.2e-08 Score=93.61 Aligned_cols=158 Identities=14% Similarity=0.174 Sum_probs=94.3
Q ss_pred eeEEEEEcCCCEEEEcccccCCCCe----EEEEe----cCCc-EEEEEE--EEEc-C---CCCeEEEEEcCC--------
Q 015960 115 TGTGFIWDEDGHIVTNHHVIEGASS----VKVTL----FDKT-TLDAKV--VGHD-Q---GTDLAVLHIDAP-------- 171 (397)
Q Consensus 115 ~GSG~ii~~~G~ILT~aHvv~~~~~----i~V~~----~~g~-~~~a~v--v~~d-~---~~DlAlL~v~~~-------- 171 (397)
.+++|+|+++ .+|||+||+..... +.+.. .++. .+..+. ..+. . ..|.+...+...
T Consensus 65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~ 143 (251)
T COG3591 65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI 143 (251)
T ss_pred eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence 4566999988 99999999964431 11111 1111 111111 1111 2 456666655431
Q ss_pred CCCcceeecCCCCCCCCCCeEEEEecCCCCCCce----eeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecC
Q 015960 172 NHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTC----TAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDS 247 (397)
Q Consensus 172 ~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~----~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~ 247 (397)
...............+.++.+.++|||.+..... ..+.+.... ...+++++.+++|+||+|+++.
T Consensus 144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~-----------~~~l~y~~dT~pG~SGSpv~~~ 212 (251)
T COG3591 144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK-----------GNKLFYDADTLPGSSGSPVLIS 212 (251)
T ss_pred cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEe-----------cceEEEEecccCCCCCCceEec
Confidence 1122223344445678899999999997654222 223332211 2368889999999999999999
Q ss_pred CccEEEEEeeeeccCCCccCceeeEec-cchhHHHHHHH
Q 015960 248 SGSLIGVNTSIITRTDAFCGMACSIPI-DTVSGIVDQLV 285 (397)
Q Consensus 248 ~G~vVGI~s~~~~~~~~~~~~~~aip~-~~i~~~~~~l~ 285 (397)
+.+|+|+++.+....+. ...++++-+ ..+++++++++
T Consensus 213 ~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 213 KDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred CceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence 88999999988765443 234444433 45666766654
No 23
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.94 E-value=4.9e-09 Score=81.08 Aligned_cols=67 Identities=36% Similarity=0.587 Sum_probs=60.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEEC-CeEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRG-TQLEEIL 386 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~-g~~~~~~ 386 (397)
.+++|..+.+++||+++||++ ||+|++|||+++.++ .++..++.. .+|+.+.+++.|+ ++..++.
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~~i~l~v~r~~~~~~~~~ 80 (85)
T cd00988 13 GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAGTKVRLTLKRGDGEPREVT 80 (85)
T ss_pred CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCCCEEEEEEEcCCCCEEEEE
Confidence 678999999999999999999 999999999999999 899988866 4688999999998 8877776
Q ss_pred EE
Q 015960 387 II 388 (397)
Q Consensus 387 v~ 388 (397)
++
T Consensus 81 ~~ 82 (85)
T cd00988 81 LT 82 (85)
T ss_pred EE
Confidence 64
No 24
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.87 E-value=3.5e-09 Score=107.31 Aligned_cols=68 Identities=15% Similarity=0.097 Sum_probs=63.6
Q ss_pred EEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEeee
Q 015960 313 VIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIILEV 391 (397)
Q Consensus 313 ~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~ 391 (397)
+|.+|.++|||++|||++ ||+|++|||++|.+++|+...+....+|++++++|.|+|+++++++++..
T Consensus 129 lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~~ 196 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLDL 196 (449)
T ss_pred cccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEecc
Confidence 689999999999999999 99999999999999999999998888899999999999999888888753
No 25
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.83 E-value=1.6e-08 Score=102.19 Aligned_cols=84 Identities=35% Similarity=0.566 Sum_probs=70.2
Q ss_pred cccCCCchh---HHHHHHcCC----CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHh
Q 015960 292 RPYLGIAHD---QLLEKLMGI----SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILD 364 (397)
Q Consensus 292 ~~~lGi~~~---~~~~~~~~~----~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~ 364 (397)
+.++|+... ....+.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.++.|+.+++.
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD 405 (428)
T ss_pred ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence 356787432 333343343 589999999999999999999 99999999999999999999998
Q ss_pred cCCCCCEEEEEEEECCeEEEEE
Q 015960 365 QCKVGDEVIVRILRGTQLEEIL 386 (397)
Q Consensus 365 ~~~~g~~v~l~v~R~g~~~~~~ 386 (397)
..++|+++.++|+|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 8778999999999999977654
No 26
>TIGR00054 RIP metalloprotease RseP. This model detects fragments as well, and matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.73 E-value=3.9e-08 Score=98.81 Aligned_cols=69 Identities=26% Similarity=0.412 Sum_probs=63.5
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
.+++|.+|.++|||+++||++ ||+|++|||++|.+++|+.+.+.. .+++++.+++.|+|+..++++++
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~ 270 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTP 270 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEE
Confidence 378999999999999999999 999999999999999999999977 57888999999999998888877
Q ss_pred e
Q 015960 390 E 390 (397)
Q Consensus 390 ~ 390 (397)
+
T Consensus 271 ~ 271 (420)
T TIGR00054 271 E 271 (420)
T ss_pred c
Confidence 4
No 27
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.71 E-value=3.6e-08 Score=73.19 Aligned_cols=56 Identities=39% Similarity=0.660 Sum_probs=50.8
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~ 377 (397)
.+++|..+.+++||+++||++ ||+|++|||+++.++ .++.+.+... +|++++|+++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~~v~l~v~ 70 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKE-VGEKVTLTVR 70 (70)
T ss_pred CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhC-CCCeEEEEEC
Confidence 489999999999999999999 999999999999999 9999999875 4888888763
No 28
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.70 E-value=1.8e-06 Score=78.89 Aligned_cols=165 Identities=14% Similarity=0.217 Sum_probs=89.1
Q ss_pred HHHcCCceEEEEeeeccccccccCCCCceeEEEEEcCCCEEEEcccccC-CCCeEEEEecCCcEEEEE-----EEEEcCC
Q 015960 87 FEENLPSVVHITNFGMNTFTLTMEYPQATGTGFIWDEDGHIVTNHHVIE-GASSVKVTLFDKTTLDAK-----VVGHDQG 160 (397)
Q Consensus 87 ~~~~~~sVV~I~~~~~~~~~~~~~~~~~~GSG~ii~~~G~ILT~aHvv~-~~~~i~V~~~~g~~~~a~-----vv~~d~~ 160 (397)
+..+...|++|.+.+. .....--|+.+. .+|+|++|..+ +...+.|...-|.. ... -+..-+.
T Consensus 13 yn~Ia~~ic~l~n~s~--------~~~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~f-~v~nt~~lkv~~i~~ 81 (235)
T PF00863_consen 13 YNPIASNICRLTNESD--------GGTRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGEF-TVPNTTQLKVHPIEG 81 (235)
T ss_dssp -HHHHTTEEEEEEEET--------TEEEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEEE-EECEGGGSEEEE-TC
T ss_pred cchhhheEEEEEEEeC--------CCeEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceEE-EcCCccccceEEeCC
Confidence 4566678889986432 112233566665 38999999995 45678888776642 211 1233468
Q ss_pred CCeEEEEEcCCCCCcceeecCC-CCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCC
Q 015960 161 TDLAVLHIDAPNHKLRSIPVGV-SANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGN 239 (397)
Q Consensus 161 ~DlAlL~v~~~~~~~~~~~l~~-s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~ 239 (397)
.||.++|++. ++||.+-.. -..++.++.|.++|.-+..... .-.||......... ...+..+-.....|+
T Consensus 82 ~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~--~s~vSesS~i~p~~----~~~fWkHwIsTk~G~ 152 (235)
T PF00863_consen 82 RDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSI--SSTVSESSWIYPEE----NSHFWKHWISTKDGD 152 (235)
T ss_dssp SSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCCC--EEEEEEEEEEEEET----TTTEEEE-C---TT-
T ss_pred ccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCCe--eEEECCceEEeecC----CCCeeEEEecCCCCc
Confidence 9999999987 455544322 2458999999999976543321 22233322211111 145677777778999
Q ss_pred cccceecC-CccEEEEEeeeeccCCCccCceeeEecc
Q 015960 240 SGGPLLDS-SGSLIGVNTSIITRTDAFCGMACSIPID 275 (397)
Q Consensus 240 SGGPlvn~-~G~vVGI~s~~~~~~~~~~~~~~aip~~ 275 (397)
-|.|+++. +|.+|||++...... ..+|+.|+.
T Consensus 153 CG~PlVs~~Dg~IVGiHsl~~~~~----~~N~F~~f~ 185 (235)
T PF00863_consen 153 CGLPLVSTKDGKIVGIHSLTSNTS----SRNYFTPFP 185 (235)
T ss_dssp TT-EEEETTT--EEEEEEEEETTT----SSEEEEE--
T ss_pred cCCcEEEcCCCcEEEEEcCccCCC----CeEEEEcCC
Confidence 99999986 999999999875433 257777774
No 29
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.68 E-value=8e-08 Score=97.47 Aligned_cols=69 Identities=20% Similarity=0.301 Sum_probs=63.5
Q ss_pred CcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEee
Q 015960 311 GGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIILE 390 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~ 390 (397)
+.+|.+|.++|||+++||++ ||+|++|||++|.+++|+.+.+.. .+|+.+.+++.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 57899999999999999999 999999999999999999999877 578899999999999988888875
Q ss_pred e
Q 015960 391 V 391 (397)
Q Consensus 391 ~ 391 (397)
.
T Consensus 290 ~ 290 (449)
T PRK10779 290 S 290 (449)
T ss_pred e
Confidence 3
No 30
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.55 E-value=4.1e-07 Score=89.28 Aligned_cols=69 Identities=29% Similarity=0.495 Sum_probs=59.6
Q ss_pred CCcEEEEec--------ccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCe
Q 015960 310 SGGVIFIAV--------EEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQ 381 (397)
Q Consensus 310 ~g~~V~~v~--------~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~ 381 (397)
.|++|.... .+|||+++||++ ||+|++|||++|.+++|+.+++... .++.+.++|.|+|+
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~~~-~g~~V~LtV~R~Ge 172 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLINKA-GGEKLTLTIERGGK 172 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHhC-CCCeEEEEEEECCE
Confidence 688876553 258999999999 9999999999999999999999875 48899999999999
Q ss_pred EEEEEEEee
Q 015960 382 LEEILIILE 390 (397)
Q Consensus 382 ~~~~~v~l~ 390 (397)
..++.++..
T Consensus 173 ~~tv~V~Pv 181 (402)
T TIGR02860 173 IIETVIKPV 181 (402)
T ss_pred EEEEEEEEe
Confidence 888888743
No 31
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.55 E-value=3.4e-06 Score=78.87 Aligned_cols=168 Identities=17% Similarity=0.150 Sum_probs=93.2
Q ss_pred eeEEEEEcCCCEEEEcccccCCCC--eEEEEecC---------C---cEE-EEEEEEEcC-------C-CCeEEEEEcCC
Q 015960 115 TGTGFIWDEDGHIVTNHHVIEGAS--SVKVTLFD---------K---TTL-DAKVVGHDQ-------G-TDLAVLHIDAP 171 (397)
Q Consensus 115 ~GSG~ii~~~G~ILT~aHvv~~~~--~i~V~~~~---------g---~~~-~a~vv~~d~-------~-~DlAlL~v~~~ 171 (397)
.|.|.+|+++ ||||+|||+.+.. ...|.+.. + ... -.+++ .++ . +|||||+++.+
T Consensus 39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~ 116 (256)
T KOG3627|consen 39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP 116 (256)
T ss_pred eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence 6788788766 9999999999865 66666531 1 111 11222 221 3 89999999975
Q ss_pred ---CCCcceeecCCCCC---CCCCCeEEEEecCCCCC------CceeeeEEeeccccccCCCCC---Cc-ccEEEEc---
Q 015960 172 ---NHKLRSIPVGVSAN---LRIGQKVYAIGHPLGRK------FTCTAGIISAFGLEPITATGP---PI-QGLIQID--- 232 (397)
Q Consensus 172 ---~~~~~~~~l~~s~~---~~~G~~V~~iG~p~g~~------~~~~~G~vs~~~~~~~~~~~~---~~-~~~i~~~--- 232 (397)
...+.++.|..... ...+..+++.||+.... .......+.-+.......... .. ...+...
T Consensus 117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 196 (256)
T KOG3627|consen 117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE 196 (256)
T ss_pred cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence 34567777753332 34448888899875321 122222222222111111000 00 1123332
Q ss_pred --cccCCCCcccceecCC---ccEEEEEeeeeccCCCccCceeeEeccchhHHHHHH
Q 015960 233 --AAINRGNSGGPLLDSS---GSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQL 284 (397)
Q Consensus 233 --~~i~~G~SGGPlvn~~---G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l 284 (397)
...|.|+|||||+-.+ ..++||++++.........-+....+....+++++.
T Consensus 197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~ 253 (256)
T KOG3627|consen 197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN 253 (256)
T ss_pred CCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence 3368999999999654 699999999876332221123344455555555543
No 32
>PRK10139 serine endoprotease; Provisional
Probab=98.52 E-value=3.1e-07 Score=93.11 Aligned_cols=65 Identities=28% Similarity=0.432 Sum_probs=59.3
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILI 387 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 387 (397)
.|++|..+.++|||+++||++ ||+|++|||++|.+|+++.+++... + +++.++|+|+|+...+.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence 589999999999999999999 9999999999999999999999763 3 689999999999877665
No 33
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.50 E-value=3.2e-07 Score=89.64 Aligned_cols=70 Identities=31% Similarity=0.477 Sum_probs=59.7
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRGTQLEEILI 387 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 387 (397)
.+++|..|.++|||+++||++ ||+|++|||++|.++ .++...+.. .+|+++.+++.|+|+..++++
T Consensus 62 ~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g~~v~l~v~R~g~~~~~~v 129 (334)
T TIGR00225 62 GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKGTKVSLEILRAGKSKPLTF 129 (334)
T ss_pred CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCCCEEEEEEEeCCCCceEEE
Confidence 468899999999999999999 999999999999986 466666654 578999999999988777777
Q ss_pred Eeee
Q 015960 388 ILEV 391 (397)
Q Consensus 388 ~l~~ 391 (397)
++..
T Consensus 130 ~l~~ 133 (334)
T TIGR00225 130 TLKR 133 (334)
T ss_pred EEEE
Confidence 6654
No 34
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.49 E-value=3.2e-07 Score=70.39 Aligned_cols=60 Identities=38% Similarity=0.548 Sum_probs=50.1
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGT 380 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g 380 (397)
.|++|..+.+++||+++||++ ||+|++|||+++.++.+..........++.+.+++.|++
T Consensus 26 ~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCeEEEEEEeCC
Confidence 589999999999999999999 999999999999987665554433334668999999864
No 35
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.44 E-value=7.6e-07 Score=88.67 Aligned_cols=66 Identities=27% Similarity=0.497 Sum_probs=57.3
Q ss_pred CcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 311 GGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
|++|..+.++|||+++||++ ||+|++|||++|.++ .++...+.. ..|+++.++|.|+|+..+++++
T Consensus 103 g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~ltv~r~g~~~~~~l~ 170 (389)
T PLN00049 103 GLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELTLRRGPETRLVTLT 170 (389)
T ss_pred cEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEEEEECCEEEEEEEE
Confidence 78999999999999999999 999999999999864 677777754 5789999999999987776654
No 36
>PRK10942 serine endoprotease; Provisional
Probab=98.42 E-value=8.2e-07 Score=90.47 Aligned_cols=65 Identities=32% Similarity=0.462 Sum_probs=59.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILI 387 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 387 (397)
.|++|.++.++|||+++||++ ||+|++|||++|.+++++.+++.. .+ +.+.++|+|+|+.+.+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~-~~-~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDS-KP-SVLALNIQRGDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CC-CeEEEEEEECCEEEEEEe
Confidence 479999999999999999999 999999999999999999999976 33 689999999998877654
No 37
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.37 E-value=6.7e-07 Score=88.39 Aligned_cols=61 Identities=20% Similarity=0.272 Sum_probs=52.7
Q ss_pred EEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEE-ECCeEEEEEEEe
Q 015960 314 IFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRIL-RGTQLEEILIIL 389 (397)
Q Consensus 314 V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~l 389 (397)
|..|.|+|||+++||++ ||+|++|||++|.+|.|+...+. ++.+.++|. |+|+..++++..
T Consensus 2 I~~V~pgSpAe~AGLe~-----------GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~ 63 (433)
T TIGR03279 2 ISAVLPGSIAEELGFEP-----------GDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEK 63 (433)
T ss_pred cCCcCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEec
Confidence 66789999999999999 99999999999999999988874 356899997 788777776654
No 38
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.34 E-value=1.5e-06 Score=66.37 Aligned_cols=54 Identities=33% Similarity=0.503 Sum_probs=47.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC--CHHHHHHHHhcCCCCCEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS--NANDLHNILDQCKVGDEVIVRI 376 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~--~~~d~~~~l~~~~~g~~v~l~v 376 (397)
.|++|.++.++|||+++||++ ||+|++|||+++. ++.++.+++.... ..+.+++
T Consensus 26 ~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~~--~~v~l~v 81 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNSG--DEVTLTV 81 (82)
T ss_pred CCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhCC--CeEEEEE
Confidence 589999999999999999999 9999999999999 8999999887632 2666654
No 39
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The matched region appears not to overlap the active site domain.
Probab=98.28 E-value=1.2e-06 Score=88.23 Aligned_cols=67 Identities=25% Similarity=0.327 Sum_probs=59.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
.|.+|.++.++|||+++||++ ||+|+++||+++.++.|+.+.+.... +++.+++.|+++..++.+++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~l 194 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVMK 194 (420)
T ss_pred CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEecccc
Confidence 577899999999999999999 99999999999999999999988755 67899999988877655443
No 40
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.24 E-value=1.2e-06 Score=67.08 Aligned_cols=55 Identities=27% Similarity=0.498 Sum_probs=46.2
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~ 377 (397)
.+++|.++.++|||+++||+. ||.|++|||+++.++ .++..++... +.+++|+|+
T Consensus 25 ~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~--~~~v~L~V~ 81 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA--SNPVTLTVQ 81 (81)
T ss_dssp EEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS--TSEEEEEEE
T ss_pred CCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC--CCcEEEEEC
Confidence 589999999999999999999 999999999999976 5566666653 347888774
No 41
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=98.17 E-value=8.9e-06 Score=76.18 Aligned_cols=71 Identities=28% Similarity=0.430 Sum_probs=64.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEE-CCeEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILR-GTQLEEILII 388 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~ 388 (397)
.|+++..+..++|+... |+. ||.|++|||+++.+.+++...+...++|++|++++.| +++...++++
T Consensus 130 ~gvyv~~v~~~~~~~gk-l~~-----------gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~t 197 (342)
T COG3480 130 AGVYVLSVIDNSPFKGK-LEA-----------GDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTIT 197 (342)
T ss_pred eeEEEEEccCCcchhce-ecc-----------CCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEE
Confidence 79999999999998644 555 9999999999999999999999999999999999997 8888888888
Q ss_pred eeeC
Q 015960 389 LEVE 392 (397)
Q Consensus 389 l~~~ 392 (397)
+.+.
T Consensus 198 l~~~ 201 (342)
T COG3480 198 LIKN 201 (342)
T ss_pred EEee
Confidence 8776
No 42
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.17 E-value=4.9e-06 Score=83.01 Aligned_cols=68 Identities=26% Similarity=0.518 Sum_probs=56.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRGTQLEEILI 387 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 387 (397)
.++.|.++.+++||+++||++ ||+|++|||+++... ++....+.. ++|+.|+|++.|.+..+.+.+
T Consensus 112 ~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~irG-~~Gt~V~L~i~r~~~~k~~~v 179 (406)
T COG0793 112 GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIRG-KPGTKVTLTILRAGGGKPFTV 179 (406)
T ss_pred CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhCC-CCCCeEEEEEEEcCCCceeEE
Confidence 678899999999999999999 999999999999976 456666665 789999999999754444444
Q ss_pred Ee
Q 015960 388 IL 389 (397)
Q Consensus 388 ~l 389 (397)
++
T Consensus 180 ~l 181 (406)
T COG0793 180 TL 181 (406)
T ss_pred EE
Confidence 43
No 43
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.08 E-value=2.4e-05 Score=60.65 Aligned_cols=68 Identities=24% Similarity=0.388 Sum_probs=46.6
Q ss_pred CCcEEEEeccc--------CcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECC-
Q 015960 310 SGGVIFIAVEE--------GPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGT- 380 (397)
Q Consensus 310 ~g~~V~~v~~~--------spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g- 380 (397)
.+..|.++.++ ||-.+.|+. +++||+|++|||+++....++..+|.. +.|+.|.|+|.+.+
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL~~-~agk~V~Ltv~~~~~ 81 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLLEG-KAGKQVLLTVNRKPG 81 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHHHT-TTTSEEEEEEE-STT
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHhcc-cCCCEEEEEEecCCC
Confidence 56778888876 666666664 234999999999999999999999987 67999999999965
Q ss_pred eEEEEEE
Q 015960 381 QLEEILI 387 (397)
Q Consensus 381 ~~~~~~v 387 (397)
+.+++.|
T Consensus 82 ~~R~v~V 88 (88)
T PF14685_consen 82 GARTVVV 88 (88)
T ss_dssp -EEEEEE
T ss_pred CceEEEC
Confidence 5555543
No 44
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.06 E-value=1.1e-05 Score=75.38 Aligned_cols=63 Identities=22% Similarity=0.336 Sum_probs=55.1
Q ss_pred EecccC---cccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 316 IAVEEG---PAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 316 ~v~~~s---pa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
.+.|+. ..+++||++ ||++++|||.++++.++..+++.+.+..+.++|+|+|+|+.+++.+.+
T Consensus 210 rl~Pgkd~~lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 210 AVKPGADRSLFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred EECCCCcHHHHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence 344553 357899999 999999999999999999999988888899999999999999988865
No 45
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.91 E-value=9.1e-06 Score=68.63 Aligned_cols=87 Identities=20% Similarity=0.256 Sum_probs=57.1
Q ss_pred ccCCCchhHHHHHHcCCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEE
Q 015960 293 PYLGIAHDQLLEKLMGISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEV 372 (397)
Q Consensus 293 ~~lGi~~~~~~~~~~~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v 372 (397)
+.||++............+.-|.+|.|+|||++|||++. .|.|+.+++..+.+.++|.+.+.. ..+..+
T Consensus 26 g~LG~sv~~~~~~~~~~~~~~Vl~V~p~SPA~~AGL~p~----------~DyIig~~~~~l~~~~~l~~~v~~-~~~~~l 94 (138)
T PF04495_consen 26 GLLGISVRFESFEGAEEEGWHVLRVAPNSPAAKAGLEPF----------FDYIIGIDGGLLDDEDDLFELVEA-NENKPL 94 (138)
T ss_dssp SSS-EEEEEEE-TTGCCCEEEEEEE-TTSHHHHTT--TT----------TEEEEEETTCE--STCHHHHHHHH-TTTS-E
T ss_pred CCCcEEEEEecccccccceEEEeEecCCCHHHHCCcccc----------ccEEEEccceecCCHHHHHHHHHH-cCCCcE
Confidence 567774432111111236788999999999999999982 599999999999999999999987 467899
Q ss_pred EEEEEEC--CeEEEEEEEee
Q 015960 373 IVRILRG--TQLEEILIILE 390 (397)
Q Consensus 373 ~l~v~R~--g~~~~~~v~l~ 390 (397)
.|.|+.. ...+++.++..
T Consensus 95 ~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 95 QLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp EEEEEETTTTCEEEEEE---
T ss_pred EEEEEECCCCeEEEEEEEcC
Confidence 9999974 34455655543
No 46
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.87 E-value=3.5e-05 Score=67.96 Aligned_cols=72 Identities=26% Similarity=0.204 Sum_probs=60.6
Q ss_pred CcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHH--HhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 311 GGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNI--LDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~--l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
-++|.+|.|+|||+++||+. ||.|+.+....-.++..+..+ +.+...++.+.++|.|.|+...+.++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~lt 208 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLT 208 (231)
T ss_pred eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeC
Confidence 67899999999999999999 999999988777776655543 33446788999999999999999998
Q ss_pred eeeCC
Q 015960 389 LEVEP 393 (397)
Q Consensus 389 l~~~~ 393 (397)
...|.
T Consensus 209 P~~W~ 213 (231)
T KOG3129|consen 209 PKKWQ 213 (231)
T ss_pred ccccc
Confidence 88775
No 47
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.79 E-value=0.00027 Score=67.39 Aligned_cols=53 Identities=15% Similarity=0.172 Sum_probs=35.2
Q ss_pred cccCCCCcccceecC--Ccc-EEEEEeeeeccCCCccCceeeEeccchhHHHHHHH
Q 015960 233 AAINRGNSGGPLLDS--SGS-LIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLV 285 (397)
Q Consensus 233 ~~i~~G~SGGPlvn~--~G~-vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~ 285 (397)
...|.|+||||++-. +|. -+||++|+...++...-.+..--++....|++...
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~ 278 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMT 278 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHh
Confidence 356899999999943 465 48999999876543222233334566677777643
No 48
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.78 E-value=0.0005 Score=63.08 Aligned_cols=116 Identities=22% Similarity=0.308 Sum_probs=62.4
Q ss_pred ceeEEEEEcCCC--EEEEcccccCCCCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCe
Q 015960 114 ATGTGFIWDEDG--HIVTNHHVIEGASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQK 191 (397)
Q Consensus 114 ~~GSG~ii~~~G--~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~ 191 (397)
..|||-++..+| .|+|+.||+. ...-.|.. .+.... .-++..-|+|.-.++.-....|.+++.. ...|.-
T Consensus 112 s~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~-~g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~---~~~GrA 183 (297)
T PF05579_consen 112 SVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG-VGTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQ---NYTGRA 183 (297)
T ss_dssp SEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE-TTEEEE---EEEEEETTEEEEEETTS-S---B--B-T---T-SEEE
T ss_pred cccccceEEECCeEEEEEEEEEcC-CCeEEEEe-cceEEE---EEEeccCcEEEEECCCCCCCCCceeecC---Ccccce
Confidence 456666665544 5999999998 45555554 333322 2345678999999944323566665541 112211
Q ss_pred EEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecCCccEEEEEeeee
Q 015960 192 VYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSII 259 (397)
Q Consensus 192 V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~ 259 (397)
-.. -...+..|.|..-. .+ +-..+||||+|++..+|.+|||++..-
T Consensus 184 yW~------t~tGvE~G~ig~~~-------------~~---~fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 184 YWL------TSTGVEPGFIGGGG-------------AV---CFTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp EEE------ETTEEEEEEEETTE-------------EE---ESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred EEE------cccCcccceecCce-------------EE---EEcCCCCCCCccCcCCCCEEEEEecCC
Confidence 111 12233445442211 11 234679999999999999999999764
No 49
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.78 E-value=2.6e-05 Score=77.78 Aligned_cols=65 Identities=40% Similarity=0.582 Sum_probs=56.5
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILIIL 389 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 389 (397)
.+.+|..|.++|||++|||.+ ||.|++|||. ...+...++++++++++.|.++.+++.+++
T Consensus 462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~ 522 (558)
T COG3975 462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKL 522 (558)
T ss_pred CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEccCCceEEeeccc
Confidence 457899999999999999999 9999999999 445566789999999999999999998876
Q ss_pred eeCC
Q 015960 390 EVEP 393 (397)
Q Consensus 390 ~~~~ 393 (397)
...+
T Consensus 523 ~~~~ 526 (558)
T COG3975 523 GGDP 526 (558)
T ss_pred CCCc
Confidence 5443
No 50
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.71 E-value=8.6e-05 Score=78.17 Aligned_cols=67 Identities=28% Similarity=0.405 Sum_probs=52.1
Q ss_pred CCcEEEEecccCccccc-CccccccCCCCcccCCcEEEEEC--CEecCC-----HHHHHHHHhcCCCCCEEEEEEEEC--
Q 015960 310 SGGVIFIAVEEGPAGKA-GLRSTKFGANGKFILGDIIKAVN--GEDVSN-----ANDLHNILDQCKVGDEVIVRILRG-- 379 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~a-Gl~~~~~~~~~~l~~GDiI~~vn--g~~v~~-----~~d~~~~l~~~~~g~~v~l~v~R~-- 379 (397)
.+++|.++.|+|||+++ ||++ ||+|++|| |+++.+ .+++...+.. ++|.+|.|+|.|+
T Consensus 255 ~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~lirG-~~Gt~V~LtV~r~~~ 322 (667)
T PRK11186 255 DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVALIKG-PKGSKVRLEILPAGK 322 (667)
T ss_pred CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHHhcC-CCCCEEEEEEEeCCC
Confidence 45889999999999998 9999 99999999 554433 3477777766 6799999999994
Q ss_pred -CeEEEEEEE
Q 015960 380 -TQLEEILII 388 (397)
Q Consensus 380 -g~~~~~~v~ 388 (397)
++.++++++
T Consensus 323 ~~~~~~vtl~ 332 (667)
T PRK11186 323 GTKTRIVTLT 332 (667)
T ss_pred CCceEEEEEE
Confidence 344555443
No 51
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.45 E-value=0.0002 Score=64.49 Aligned_cols=65 Identities=20% Similarity=0.288 Sum_probs=55.5
Q ss_pred EEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 015960 313 VIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQLEEILII 388 (397)
Q Consensus 313 ~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 388 (397)
.+.-..+.+..++.|||. ||+.+++|+..+++.+++..++.+...-+.+.++|+|+|+.+.+.+.
T Consensus 210 r~~pgkd~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 210 RFEPGKDGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred EecCCCCcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 333334456778899999 99999999999999999999998877778899999999999998875
No 52
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.35 E-value=0.00014 Score=56.39 Aligned_cols=34 Identities=41% Similarity=0.429 Sum_probs=32.0
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS 354 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~ 354 (397)
.|++|++|.++|||+.|||+. +|.|+.+||.+.+
T Consensus 59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~DfT 92 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWDFT 92 (124)
T ss_pred ccEEEEEeccCChhhhhccee-----------cceEEEecCceeE
Confidence 799999999999999999999 9999999997654
No 53
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=97.11 E-value=0.0016 Score=69.03 Aligned_cols=22 Identities=32% Similarity=0.403 Sum_probs=19.8
Q ss_pred ceeEEEEEcCCCEEEEcccccC
Q 015960 114 ATGTGFIWDEDGHIVTNHHVIE 135 (397)
Q Consensus 114 ~~GSG~ii~~~G~ILT~aHvv~ 135 (397)
+.|||.+|+++|.||||.||..
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred CceeEEEEcCCceEEecchhhh
Confidence 3599999999999999999963
No 54
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.01 E-value=0.012 Score=51.74 Aligned_cols=138 Identities=19% Similarity=0.288 Sum_probs=77.7
Q ss_pred CceeEEEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEE--EEEEEcC---CCCeEEEEEcCCCCCcceeecCCCCCC-
Q 015960 113 QATGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDA--KVVGHDQ---GTDLAVLHIDAPNHKLRSIPVGVSANL- 186 (397)
Q Consensus 113 ~~~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a--~vv~~d~---~~DlAlL~v~~~~~~~~~~~l~~s~~~- 186 (397)
...++++.|-++ ++|...| -.... .+.+ +|..++. .+...+. ..||++++++... +++-+.--..+..
T Consensus 24 ~~t~l~~gi~~~-~~lvp~H-~~~~~--~i~i-~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~~~ 97 (172)
T PF00548_consen 24 EFTMLALGIYDR-YFLVPTH-EEPED--TIYI-DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPESIP 97 (172)
T ss_dssp EEEEEEEEEEBT-EEEEEGG-GGGCS--EEEE-TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSSGG
T ss_pred eEEEecceEeee-EEEEECc-CCCcE--EEEE-CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccccc
Confidence 446788888766 9999999 22223 3333 3554433 2223343 4699999997743 3332221111222
Q ss_pred CCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecC---CccEEEEEeee
Q 015960 187 RIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDS---SGSLIGVNTSI 258 (397)
Q Consensus 187 ~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~---~G~vVGI~s~~ 258 (397)
...+...++=.+.........+.++..+.- ...+......+.++++..+|+.||||+.. .++++||+.++
T Consensus 98 ~~~~~~l~v~~~~~~~~~~~v~~v~~~~~i--~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 98 EYPECVLLVNSTKFPRMIVEVGFVTNFGFI--NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp TEEEEEEEEESSSSTCEEEEEEEEEEEEEE--EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred cCCCcEEEEECCCCccEEEEEEEEeecCcc--ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 334444444222221223344444443332 22233446788999999999999999952 57899999876
No 55
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.93 E-value=0.026 Score=53.64 Aligned_cols=92 Identities=20% Similarity=0.208 Sum_probs=56.3
Q ss_pred CCCCeEEEEEcCC-CCCcceeecCCCC-CCCCCCeEEEEecCCCCCCceeeeEEeeccccccCCCCCCcccEEEEccccC
Q 015960 159 QGTDLAVLHIDAP-NHKLRSIPVGVSA-NLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAIN 236 (397)
Q Consensus 159 ~~~DlAlL~v~~~-~~~~~~~~l~~s~-~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~ 236 (397)
...+++||.++.+ .....++.|+++. ....++.+.+.|+... .......+.-..... ....+..+...+
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~~-------~~~~~~~~~~~~ 229 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCTK-------CAYSICTKQYSC 229 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEeec-------cceeEecccccC
Confidence 4579999999886 3366777776543 3678999998888221 112222221111100 123455556678
Q ss_pred CCCcccceec-CCc--cEEEEEeeee
Q 015960 237 RGNSGGPLLD-SSG--SLIGVNTSII 259 (397)
Q Consensus 237 ~G~SGGPlvn-~~G--~vVGI~s~~~ 259 (397)
.|++|||++. .+| .||||.+...
T Consensus 230 ~~d~Gg~lv~~~~gr~tlIGv~~~~~ 255 (282)
T PF03761_consen 230 KGDRGGPLVKNINGRWTLIGVGASGN 255 (282)
T ss_pred CCCccCeEEEEECCCEEEEEEEccCC
Confidence 9999999983 244 5899987553
No 56
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.92 E-value=0.02 Score=51.43 Aligned_cols=160 Identities=18% Similarity=0.227 Sum_probs=86.2
Q ss_pred CCCceeEEEEEcCC-CEEEEcccccCCCCe-EEEEecCCcEEEEEEEEEcC----------------CCCeEEEEEcCC-
Q 015960 111 YPQATGTGFIWDED-GHIVTNHHVIEGASS-VKVTLFDKTTLDAKVVGHDQ----------------GTDLAVLHIDAP- 171 (397)
Q Consensus 111 ~~~~~GSG~ii~~~-G~ILT~aHvv~~~~~-i~V~~~~g~~~~a~vv~~d~----------------~~DlAlL~v~~~- 171 (397)
...+.||=.+++++ +..--=.|.|.+.+. ..+.+.+|+.+++++..+.+ ..-+.-+.-+..
T Consensus 17 ~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~Nt~~ 96 (218)
T PF05580_consen 17 STAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEKNTQF 96 (218)
T ss_pred CCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEecccc
Confidence 45678898999864 555555788876543 45666788888888776543 111222222211
Q ss_pred -------------CCCcceeecCCCCCCCCCCeEEEEecCCCCC-CceeeeEEeeccccccCCCCCCc------ccEEEE
Q 015960 172 -------------NHKLRSIPVGVSANLRIGQKVYAIGHPLGRK-FTCTAGIISAFGLEPITATGPPI------QGLIQI 231 (397)
Q Consensus 172 -------------~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~-~~~~~G~vs~~~~~~~~~~~~~~------~~~i~~ 231 (397)
....++++++...++++|.--+..=. .|.. .....-++ .+.+.... .+..+ ..++..
T Consensus 97 GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~-~v~~~~~~-~~k~~vi~vtd~~Ll~~ 173 (218)
T PF05580_consen 97 GIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIE-KVLPQSSP-SGKGMVIKVTDPRLLEK 173 (218)
T ss_pred ceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEE-EEccCCCC-CCCcEEEEECCcchhhh
Confidence 11335555655566777754321111 1111 11111111 11111110 00000 122333
Q ss_pred ccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccch
Q 015960 232 DAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTV 277 (397)
Q Consensus 232 ~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i 277 (397)
...+..||||+|++ .+|++||=++..+.+. ...||.++++..
T Consensus 174 TGGIvqGMSGSPI~-qdGKLiGAVthvf~~d---p~~Gygi~ie~M 215 (218)
T PF05580_consen 174 TGGIVQGMSGSPII-QDGKLIGAVTHVFVND---PTKGYGIFIEWM 215 (218)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEecC---CCceeeecHHHH
Confidence 44577899999999 7999999998887542 347888887654
No 57
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=96.32 E-value=0.0038 Score=63.09 Aligned_cols=58 Identities=29% Similarity=0.317 Sum_probs=49.4
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILR 378 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R 378 (397)
-|++|..|.+++||++.||+. ||.|+.||..+..+. ++....|....+|+.++|.-.+
T Consensus 429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~ 488 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS 488 (1027)
T ss_pred eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence 489999999999999999999 999999999998874 5555666677899998886543
No 58
>PF12812 PDZ_1: PDZ-like domain
Probab=96.15 E-value=0.017 Score=43.76 Aligned_cols=46 Identities=26% Similarity=0.412 Sum_probs=39.5
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQC 366 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~ 366 (397)
-|+++.....++++..-|+.. |-+|++|||+++.+.++|.+++++.
T Consensus 30 ~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~i 75 (78)
T PF12812_consen 30 VGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKI 75 (78)
T ss_pred CCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhC
Confidence 456666777888888777998 9999999999999999999999764
No 59
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.69 E-value=0.012 Score=62.51 Aligned_cols=58 Identities=22% Similarity=0.265 Sum_probs=37.4
Q ss_pred EEEEccccCCCCcccceecCCccEEEEEeeeeccC-----CCccCc--eeeEeccchhHHHHHHH
Q 015960 228 LIQIDAAINRGNSGGPLLDSSGSLIGVNTSIITRT-----DAFCGM--ACSIPIDTVSGIVDQLV 285 (397)
Q Consensus 228 ~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~-----~~~~~~--~~aip~~~i~~~~~~l~ 285 (397)
.+.++..+..||||+|++|.+|+|||+++-+--.+ .-.... +..|-+..+..+|+++-
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv~ 687 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKVY 687 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHHh
Confidence 35667788999999999999999999996432111 001223 33444455666666543
No 60
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.66 E-value=0.018 Score=48.00 Aligned_cols=32 Identities=25% Similarity=0.607 Sum_probs=22.5
Q ss_pred EEccccCCCCcccceecCCccEEEEEeeeecc
Q 015960 230 QIDAAINRGNSGGPLLDSSGSLIGVNTSIITR 261 (397)
Q Consensus 230 ~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~ 261 (397)
-.+....+|.||+|+||.+|++|||.......
T Consensus 89 ~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 89 AIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred eeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 33445678999999999999999998877654
No 61
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.22 E-value=0.23 Score=49.32 Aligned_cols=43 Identities=23% Similarity=0.434 Sum_probs=33.0
Q ss_pred ccccCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEeccchh
Q 015960 232 DAAINRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPIDTVS 278 (397)
Q Consensus 232 ~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~ 278 (397)
...+..||||+|++ .+|++||=++..+-+.. ..||+|-+++..
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvndp---t~GYGi~ie~Ml 396 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVNDP---TSGYGVYIEWML 396 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEecCC---CcceeehHHHHH
Confidence 34677899999999 89999999888776532 367887776553
No 62
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=94.81 E-value=0.031 Score=57.86 Aligned_cols=55 Identities=33% Similarity=0.511 Sum_probs=43.1
Q ss_pred EEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEECC
Q 015960 314 IFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRGT 380 (397)
Q Consensus 314 V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~g 380 (397)
|..+.++|||++.| +|+.||.|++|||+.|.+. .|+..+++. .|-+|+|+|.-..
T Consensus 782 iGrIieGSPAdRCg----------kLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~e 838 (984)
T KOG3209|consen 782 IGRIIEGSPADRCG----------KLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPPE 838 (984)
T ss_pred ccccccCChhHhhc----------cccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcChh
Confidence 66788899998875 3344999999999999865 577777764 6888999987643
No 63
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.65 E-value=0.058 Score=55.71 Aligned_cols=47 Identities=17% Similarity=0.304 Sum_probs=42.4
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCK 367 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~ 367 (397)
.-+.|..|.+++||.++.+.+ ||++++|||.||++..+..+.+....
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~~ 444 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQSTT 444 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhcc
Confidence 346788999999999999999 99999999999999999999997743
No 64
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.51 E-value=0.15 Score=45.88 Aligned_cols=118 Identities=16% Similarity=0.185 Sum_probs=48.0
Q ss_pred EEEEcccccCCCCeEEEEecCCcEEE---EEEEEEcCCCCeEEEEEcCCC---CCcceeecCCCCCCCCCCeEEEEecCC
Q 015960 126 HIVTNHHVIEGASSVKVTLFDKTTLD---AKVVGHDQGTDLAVLHIDAPN---HKLRSIPVGVSANLRIGQKVYAIGHPL 199 (397)
Q Consensus 126 ~ILT~aHvv~~~~~i~V~~~~g~~~~---a~vv~~d~~~DlAlL~v~~~~---~~~~~~~l~~s~~~~~G~~V~~iG~p~ 199 (397)
.++|++||..+...+. .+.+|+.++ .+.+..+...|++||+....- ...+.+.+.....+ .-|--
T Consensus 43 ~L~ta~Hv~~~~~~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~-------~~g~~- 113 (203)
T PF02122_consen 43 ALLTARHVWSRPSKVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQL-------AKGPV- 113 (203)
T ss_dssp EEEE-HHHHTSSS----EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SE-------EEEES-
T ss_pred ceecccccCCCcccee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhh-------CCCCe-
Confidence 6999999999855543 334555544 344556789999999998320 11222222111111 00100
Q ss_pred CCCCceeeeEEeeccccccCCCCCCcccEEEEccccCCCCcccceecCCccEEEEEeee
Q 015960 200 GRKFTCTAGIISAFGLEPITATGPPIQGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSI 258 (397)
Q Consensus 200 g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~ 258 (397)
..+....+............. ..+...-+...+|.||.|+++.+ ++||++...
T Consensus 114 -~~y~~~~~~~~~~sa~i~g~~----~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~ 166 (203)
T PF02122_consen 114 -SFYGFSSGEWPCSSAKIPGTE----GKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS 166 (203)
T ss_dssp -STTSEEEEEEEEEE-S----S----TTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred -eeeeecCCCceeccCcccccc----CcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence 011112211111111111111 23556667778999999999877 999999875
No 65
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=94.40 E-value=0.052 Score=57.85 Aligned_cols=55 Identities=29% Similarity=0.501 Sum_probs=43.9
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCC--HHHHHHHHhcCCCCCEEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSN--ANDLHNILDQCKVGDEVIVRILR 378 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~--~~d~~~~l~~~~~g~~v~l~v~R 378 (397)
.-++|..|.+|+|+ +|+|.+||.|++|||++|.+ |+.+.++++.++ +.|.|+|.+
T Consensus 75 rPviVr~VT~GGps------------~GKL~PGDQIl~vN~Epv~daprervIdlvRace--~sv~ltV~q 131 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPS------------IGKLQPGDQILAVNGEPVKDAPRERVIDLVRACE--SSVNLTVCQ 131 (1298)
T ss_pred CceEEEEecCCCCc------------cccccCCCeEEEecCcccccccHHHHHHHHHHHh--hhcceEEec
Confidence 45789999999987 34555599999999999985 677888887643 668888876
No 66
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=94.26 E-value=0.092 Score=51.63 Aligned_cols=70 Identities=26% Similarity=0.394 Sum_probs=52.0
Q ss_pred CCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEEC--CeEEEE
Q 015960 308 GISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRG--TQLEEI 385 (397)
Q Consensus 308 ~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~--g~~~~~ 385 (397)
+..|.-|.+|.++|+|.++||.+. -|.|++|||..+...+|..+.+.+... ++|+++++-. ...+.+
T Consensus 13 gteg~hvlkVqedSpa~~aglepf----------fdFIvSI~g~rL~~dnd~Lk~llk~~s-ekVkltv~n~kt~~~R~v 81 (462)
T KOG3834|consen 13 GTEGYHVLKVQEDSPAHKAGLEPF----------FDFIVSINGIRLNKDNDTLKALLKANS-EKVKLTVYNSKTQEVRIV 81 (462)
T ss_pred CceeEEEEEeecCChHHhcCcchh----------hhhhheeCcccccCchHHHHHHHHhcc-cceEEEEEecccceeEEE
Confidence 346788999999999999999985 899999999999977766665544332 4499998753 233444
Q ss_pred EEE
Q 015960 386 LII 388 (397)
Q Consensus 386 ~v~ 388 (397)
.|+
T Consensus 82 ~I~ 84 (462)
T KOG3834|consen 82 EIV 84 (462)
T ss_pred Eec
Confidence 443
No 67
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=94.23 E-value=0.24 Score=51.58 Aligned_cols=117 Identities=20% Similarity=0.298 Sum_probs=66.2
Q ss_pred CCCCeEEEEEcCCC-------CCc------ceeecCC------CCCCCCCCeEEEEecCCCCCCceeeeEEeeccccccC
Q 015960 159 QGTDLAVLHIDAPN-------HKL------RSIPVGV------SANLRIGQKVYAIGHPLGRKFTCTAGIISAFGLEPIT 219 (397)
Q Consensus 159 ~~~DlAlL~v~~~~-------~~~------~~~~l~~------s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~~~~~ 219 (397)
.-.|+|||+++... +.+ |.+.+.+ ...+..|..|+=+|.--+ .|.|.+.++.-.. .
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy-w 615 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY-W 615 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE-e
Confidence 34699999998531 111 2222211 123567889988876543 4566665543211 1
Q ss_pred CCCC-CcccEEEEc----cccCCCCcccceecCCcc------EEEEEeeeeccCCCccCceeeEeccchhHHHHH
Q 015960 220 ATGP-PIQGLIQID----AAINRGNSGGPLLDSSGS------LIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQ 283 (397)
Q Consensus 220 ~~~~-~~~~~i~~~----~~i~~G~SGGPlvn~~G~------vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~ 283 (397)
.++. ...+++... .-...||||+=+++.-+. |+||..+... ....+|.+.|+..|..=+++
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydg---e~kqfglftPi~~il~rl~~ 687 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDG---EQKQFGLFTPINEILDRLEE 687 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCC---ccceeeccCcHHHHHHHHHH
Confidence 1121 112333333 334579999999986444 9999887533 24458888887766554444
No 68
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=93.77 E-value=0.23 Score=49.12 Aligned_cols=58 Identities=31% Similarity=0.480 Sum_probs=48.8
Q ss_pred EEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCE---EEEEEEE-CCeEE
Q 015960 314 IFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDE---VIVRILR-GTQLE 383 (397)
Q Consensus 314 V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~---v~l~v~R-~g~~~ 383 (397)
+..+..++++..+|++. ||.|+++|++++.++++....+... .+.. +.+.+.| ++...
T Consensus 133 ~~~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~~~~~~i~~~~~~~~~~ 194 (375)
T COG0750 133 VGEVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVAA-AGDVFNLLTILVIRLDGEAH 194 (375)
T ss_pred eeecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHhc-cCCcccceEEEEEeccceee
Confidence 33688899999999999 9999999999999999998887653 4544 7889999 76663
No 69
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=93.39 E-value=0.059 Score=55.61 Aligned_cols=57 Identities=19% Similarity=0.285 Sum_probs=43.9
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILR 378 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R 378 (397)
.|++|.+|.|++.|++.|++. ||.|++|||+...+. .+.++..-.+.+..+++++.-
T Consensus 562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfeni-s~~KA~eiLrnnthLtltvKt 618 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENI-SAKKAEEILRNNTHLTLTVKT 618 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhh-hHHHHHHHhcCCceEEEEEec
Confidence 479999999999999999999 999999999988765 334443333445566666653
No 70
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=93.13 E-value=0.12 Score=55.34 Aligned_cols=59 Identities=20% Similarity=0.370 Sum_probs=44.3
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHH--HHHHHHhcCCCCCEEEEEEEECC
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNAN--DLHNILDQCKVGDEVIVRILRGT 380 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~--d~~~~l~~~~~g~~v~l~v~R~g 380 (397)
-|++|..|++|++|+ ++|+|..||.+++|||+..-... +..+++. +.|..|.++|...|
T Consensus 960 lGIYvKsVV~GgaAd----------~DGRL~aGDQLLsVdG~SLiGisQErAA~lmt--rtg~vV~leVaKqg 1020 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAAD----------HDGRLEAGDQLLSVDGHSLIGISQERAARLMT--RTGNVVHLEVAKQG 1020 (1629)
T ss_pred cceEEEEeccCCccc----------cccccccCceeeeecCcccccccHHHHHHHHh--ccCCeEEEehhhhh
Confidence 488999999999985 45566679999999999877543 3444443 56888999887543
No 71
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=92.59 E-value=1.5 Score=40.40 Aligned_cols=87 Identities=16% Similarity=0.214 Sum_probs=58.6
Q ss_pred CCceeEEEEEcCCCEEEEcccccCCC----CeEEEEecCCcEEE------EEEEEEc-----CCCCeEEEEEcCCC---C
Q 015960 112 PQATGTGFIWDEDGHIVTNHHVIEGA----SSVKVTLFDKTTLD------AKVVGHD-----QGTDLAVLHIDAPN---H 173 (397)
Q Consensus 112 ~~~~GSG~ii~~~G~ILT~aHvv~~~----~~i~V~~~~g~~~~------a~vv~~d-----~~~DlAlL~v~~~~---~ 173 (397)
+...|||++||++ |+|++..|+.+- ..+.+.+..++.+. -++..+| ++.+++||.++.+. .
T Consensus 26 G~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~fTr 104 (267)
T PF09342_consen 26 GRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPANFTR 104 (267)
T ss_pred CeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCccccee
Confidence 4568999999987 999999999863 34666666655432 1233333 68899999999873 2
Q ss_pred CcceeecCC-CCCCCCCCeEEEEecCC
Q 015960 174 KLRSIPVGV-SANLRIGQKVYAIGHPL 199 (397)
Q Consensus 174 ~~~~~~l~~-s~~~~~G~~V~~iG~p~ 199 (397)
.+.|.-+.+ +......+.++++|.-.
T Consensus 105 ~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 105 YVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eecccccccccCCCCCCCceEEEEccc
Confidence 344444433 23455566899998764
No 72
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=92.50 E-value=0.22 Score=41.25 Aligned_cols=30 Identities=23% Similarity=0.448 Sum_probs=24.6
Q ss_pred ccccCCCCcccceecCCccEEEEEeeeecc
Q 015960 232 DAAINRGNSGGPLLDSSGSLIGVNTSIITR 261 (397)
Q Consensus 232 ~~~i~~G~SGGPlvn~~G~vVGI~s~~~~~ 261 (397)
...-.+|+||-|++|..|+||||+-.+.+.
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence 344579999999999999999999877543
No 73
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=91.75 E-value=0.44 Score=44.15 Aligned_cols=59 Identities=22% Similarity=0.354 Sum_probs=45.5
Q ss_pred CCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC--CHHHHHHHHhcCCCCCEEEEEEEEC
Q 015960 309 ISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS--NANDLHNILDQCKVGDEVIVRILRG 379 (397)
Q Consensus 309 ~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~--~~~d~~~~l~~~~~g~~v~l~v~R~ 379 (397)
..|++|...+|++.|+..||-. ..|.|++|||.+|. +.+++.+++-.. ...+.++|.-.
T Consensus 193 vpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMvAN--shNLIiTVkPA 253 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMVAN--SHNLIITVKPA 253 (358)
T ss_pred cCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHhhc--ccceEEEeccc
Confidence 4899999999999999999865 39999999999987 567777766432 23466666543
No 74
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.58 E-value=0.29 Score=51.01 Aligned_cols=56 Identities=27% Similarity=0.487 Sum_probs=41.1
Q ss_pred CcEEEEecccCcccccC-ccccccCCCCcccCCcEEEEECCEecCCHH--HHHHHHhcCCCCCEEEEEEEECC
Q 015960 311 GGVIFIAVEEGPAGKAG-LRSTKFGANGKFILGDIIKAVNGEDVSNAN--DLHNILDQCKVGDEVIVRILRGT 380 (397)
Q Consensus 311 g~~V~~v~~~spa~~aG-l~~~~~~~~~~l~~GDiI~~vng~~v~~~~--d~~~~l~~~~~g~~v~l~v~R~g 380 (397)
+++|..+.+++||.+-| ++. ||.|++|||+....+. +..++++ .|....+.++|.|
T Consensus 924 ~LfVLRlAeDGPA~rdGrm~V-----------GDqi~eINGesTkgmtH~rAIelIk---~gg~~vll~Lr~g 982 (984)
T KOG3209|consen 924 DLFVLRLAEDGPAIRDGRMRV-----------GDQITEINGESTKGMTHDRAIELIK---QGGRRVLLLLRRG 982 (984)
T ss_pred ceEEEEeccCCCccccCceee-----------cceEEEecCcccCCCcHHHHHHHHH---hCCeEEEEEeccC
Confidence 67899999999998765 455 9999999999998764 3444443 3444556666654
No 75
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=91.39 E-value=0.41 Score=40.39 Aligned_cols=55 Identities=29% Similarity=0.381 Sum_probs=39.1
Q ss_pred CCcEEEEecccCccccc-CccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKA-GLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~a-Gl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~ 377 (397)
+-++|..+.|++.|++- ||+. ||.+++|||..|..- +-..++|+. ..| .|++.|+
T Consensus 115 spiyisriipggvadrhgglkr-----------gdqllsvngvsvege~hekavellka-a~g-svklvvr 172 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKR-----------GDQLLSVNGVSVEGEHHEKAVELLKA-AVG-SVKLVVR 172 (207)
T ss_pred CceEEEeecCCccccccCcccc-----------cceeEeecceeecchhhHHHHHHHHH-hcC-cEEEEEe
Confidence 56899999999998874 5777 999999999988753 333444544 233 4666553
No 76
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=90.54 E-value=0.4 Score=49.04 Aligned_cols=61 Identities=23% Similarity=0.254 Sum_probs=44.9
Q ss_pred CCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCe
Q 015960 309 ISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQ 381 (397)
Q Consensus 309 ~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~ 381 (397)
...++|.+|.|++||+-. |+. ||.|+.|||....+......+-.-.+.|+...++|.|..+
T Consensus 39 etSiViSDVlpGGPAeG~-LQe-----------nDrvvMVNGvsMenv~haFAvQqLrksgK~A~ItvkRprk 99 (1027)
T KOG3580|consen 39 ETSIVISDVLPGGPAEGL-LQE-----------NDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITVKRPRK 99 (1027)
T ss_pred ceeEEEeeccCCCCcccc-ccc-----------CCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEecccce
Confidence 356899999999999755 666 9999999999988765433221122467788899988644
No 77
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=89.57 E-value=1.7 Score=47.00 Aligned_cols=54 Identities=20% Similarity=0.237 Sum_probs=34.3
Q ss_pred eeEEEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEEEEEEEc--CCCCeEEEEEcCC
Q 015960 115 TGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDAKVVGHD--QGTDLAVLHIDAP 171 (397)
Q Consensus 115 ~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~~d--~~~DlAlL~v~~~ 171 (397)
.|...+|+++ ||+|.+|...+...+..--.+...| +++.+. +..|+.+-|++.-
T Consensus 66 ~G~aTLigpq-YiVSV~HN~~gy~~v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~ 121 (769)
T PF02395_consen 66 KGVATLIGPQ-YIVSVKHNGKGYNSVSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKF 121 (769)
T ss_dssp TSS-EEEETT-EEEBETTG-TSCCEECESCSSTCEE--EEEEEEBETTSTEBEEEESS-
T ss_pred CceEEEecCC-eEEEEEccCCCcCceeecccCCceE--EEEEccCCCCcccceeecCce
Confidence 3889999987 9999999985544433322234444 444443 3469999999863
No 78
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=89.32 E-value=0.58 Score=44.86 Aligned_cols=54 Identities=33% Similarity=0.394 Sum_probs=43.9
Q ss_pred cEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCC--HHHHHHHHhcCCCCCEEEEEEE
Q 015960 312 GVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSN--ANDLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 312 ~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~--~~d~~~~l~~~~~g~~v~l~v~ 377 (397)
++|..+.++..|+..|+-- .||-|+.|||.-|+. -+++..+|+. .||.|+++|.
T Consensus 82 vviSkI~kdQaAd~tG~LF----------vGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~ 137 (505)
T KOG3549|consen 82 VVISKIYKDQAADITGQLF----------VGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVK 137 (505)
T ss_pred EEeehhhhhhhhhhcCceE----------eeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeH
Confidence 6788888888888887642 399999999999885 5788888875 6899999885
No 79
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=88.09 E-value=0.45 Score=39.45 Aligned_cols=40 Identities=25% Similarity=0.475 Sum_probs=26.1
Q ss_pred cCCCCcccceecCCccEEEEEeeeeccCCCccCceeeEecc
Q 015960 235 INRGNSGGPLLDSSGSLIGVNTSIITRTDAFCGMACSIPID 275 (397)
Q Consensus 235 i~~G~SGGPlvn~~G~vVGI~s~~~~~~~~~~~~~~aip~~ 275 (397)
...|.||||++-.+|.+|||..+.....+....+-|. |.+
T Consensus 105 ~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~-P~e 144 (148)
T PF02907_consen 105 DLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI-PVE 144 (148)
T ss_dssp HHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-EHH
T ss_pred EEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-eee
Confidence 3579999999999999999987765543333334443 553
No 80
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=87.67 E-value=1.5 Score=36.14 Aligned_cols=33 Identities=24% Similarity=0.285 Sum_probs=24.4
Q ss_pred ccEEEEccccCCCCcccceecCCccEEEEEeeee
Q 015960 226 QGLIQIDAAINRGNSGGPLLDSSGSLIGVNTSII 259 (397)
Q Consensus 226 ~~~i~~~~~i~~G~SGGPlvn~~G~vVGI~s~~~ 259 (397)
.+.+....+..||+.||+|+-.- -|+||++++-
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tagg 110 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRCKH-GVIGIVTAGG 110 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEETT-CEEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeEeCC-CeEEEEEeCC
Confidence 34555667889999999999555 5999999873
No 81
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=87.10 E-value=2.8 Score=33.48 Aligned_cols=53 Identities=25% Similarity=0.348 Sum_probs=34.7
Q ss_pred EEEEcCCCEEEEcccccCCCCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCC
Q 015960 118 GFIWDEDGHIVTNHHVIEGASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGV 182 (397)
Q Consensus 118 G~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~ 182 (397)
++-|. +|..+|+.||.+..+.+ +|..+ +++. ...|+|+++.+.. .++..++++
T Consensus 3 avHIG-nG~~vt~tHva~~~~~v-----~g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSSDSV-----DGQPF--KIVK--TDGELCWVQSPLV--HLPAAQIGT 55 (105)
T ss_pred eEEeC-CCEEEEEEEEeccCceE-----cCcCc--EEEE--eccCEEEEECCCC--CCCeeEecc
Confidence 55565 68999999999876543 23322 2333 4569999998874 356666653
No 82
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=87.07 E-value=1 Score=49.62 Aligned_cols=51 Identities=35% Similarity=0.520 Sum_probs=40.7
Q ss_pred EEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEE
Q 015960 313 VIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRI 376 (397)
Q Consensus 313 ~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v 376 (397)
.|..|.+++||..+|++. ||.|+.+||+++... .++.+.+.. .|.++.+.+
T Consensus 661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~t 713 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLRT 713 (1205)
T ss_pred eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEEe
Confidence 578899999999999999 999999999999864 566666653 455565544
No 83
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=86.53 E-value=0.98 Score=45.47 Aligned_cols=60 Identities=20% Similarity=0.342 Sum_probs=39.8
Q ss_pred CCCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhc--CCCCCEEEEEEEE
Q 015960 308 GISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQ--CKVGDEVIVRILR 378 (397)
Q Consensus 308 ~~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~--~~~g~~v~l~v~R 378 (397)
+..|++|.++.+++..+. +|.+++||.|+.||.....++ .|..+.|.+ .++| .++++|-.
T Consensus 275 gDggIYVgsImkgGAVA~----------DGRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~~g-Pi~ltvAk 338 (626)
T KOG3571|consen 275 GDGGIYVGSIMKGGAVAL----------DGRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSRPG-PIKLTVAK 338 (626)
T ss_pred CCCceEEeeeccCceeec----------cCccCccceEEEeeecchhhcCchHHHHHHHHHhccCC-CeEEEEee
Confidence 447899999999886433 455566999999999887764 333333332 1344 36676654
No 84
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=86.34 E-value=1.6 Score=41.08 Aligned_cols=54 Identities=30% Similarity=0.431 Sum_probs=39.8
Q ss_pred CcEEEEecccCcccccC-ccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEE
Q 015960 311 GGVIFIAVEEGPAGKAG-LRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 311 g~~V~~v~~~spa~~aG-l~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~ 377 (397)
=++|..|..++||++-| ++. ||.|++|||..|..- .++.+.+... . +.|.+.+.
T Consensus 31 ClYiVQvFD~tPAa~dG~i~~-----------GDEi~avNg~svKGktKveVAkmIQ~~-~-~eV~IhyN 87 (429)
T KOG3651|consen 31 CLYIVQVFDKTPAAKDGRIRC-----------GDEIVAVNGISVKGKTKVEVAKMIQVS-L-NEVKIHYN 87 (429)
T ss_pred eEEEEEeccCCchhccCcccc-----------CCeeEEecceeecCccHHHHHHHHHHh-c-cceEEEeh
Confidence 36899999999998876 566 999999999999853 4555666542 2 34677664
No 85
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=82.66 E-value=2.1 Score=42.01 Aligned_cols=45 Identities=24% Similarity=0.336 Sum_probs=38.6
Q ss_pred CCcEEEEecccCccc-ccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhc
Q 015960 310 SGGVIFIAVEEGPAG-KAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQ 365 (397)
Q Consensus 310 ~g~~V~~v~~~spa~-~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~ 365 (397)
.|+.|+++...||+. .-||.+ ||+|+++||-+|++.+|..+-++.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence 689999999999854 248888 999999999999999998877753
No 86
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=81.16 E-value=2.6 Score=42.96 Aligned_cols=56 Identities=30% Similarity=0.513 Sum_probs=46.4
Q ss_pred CcEEEEecccCcccccCc-cccccCCCCcccCCcEEEEECCEecCC--HHHHHHHHhcCCCCCEEEEEEEEC
Q 015960 311 GGVIFIAVEEGPAGKAGL-RSTKFGANGKFILGDIIKAVNGEDVSN--ANDLHNILDQCKVGDEVIVRILRG 379 (397)
Q Consensus 311 g~~V~~v~~~spa~~aGl-~~~~~~~~~~l~~GDiI~~vng~~v~~--~~d~~~~l~~~~~g~~v~l~v~R~ 379 (397)
.++|..+..|+.+.+.|+ +. ||.|.+|||..+.+ ..++.+++.+.. | .+++++.-.
T Consensus 147 ~~~vARI~~GG~~~r~glL~~-----------GD~i~EvNGi~v~~~~~~e~q~~l~~~~-G-~itfkiiP~ 205 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGLLHV-----------GDEILEVNGISVANKSPEELQELLRNSR-G-SITFKIIPS 205 (542)
T ss_pred ccEEeeeccCCcchhccceee-----------ccchheecCeecccCCHHHHHHHHHhCC-C-cEEEEEccc
Confidence 578999999999999886 45 99999999999985 588999998866 5 477877643
No 87
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=80.51 E-value=2.6 Score=43.83 Aligned_cols=55 Identities=25% Similarity=0.416 Sum_probs=36.7
Q ss_pred EEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEEEEC
Q 015960 315 FIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRILRG 379 (397)
Q Consensus 315 ~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v~R~ 379 (397)
.....++||++. |+|+.||.|++|||...... ..-+.+++..+.-..|+++|.+=
T Consensus 678 Anmm~~GpAars----------gkLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~c 734 (829)
T KOG3605|consen 678 ANMMHGGPAARS----------GKLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSC 734 (829)
T ss_pred HhcccCChhhhc----------CCccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEecC
Confidence 334455666655 45667999999999877642 44455666656556778877653
No 88
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=80.31 E-value=2.4 Score=42.06 Aligned_cols=57 Identities=19% Similarity=0.324 Sum_probs=45.3
Q ss_pred EEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCCHHHHHHHHhcCCCCCEEEEEEEECCe
Q 015960 314 IFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSNANDLHNILDQCKVGDEVIVRILRGTQ 381 (397)
Q Consensus 314 V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~ 381 (397)
|-+|.++|||+.|||++. +|-|+-+-.......+|+...+..+ .++.+++-|+.-..
T Consensus 113 vl~V~p~SPaalAgl~~~----------~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVYN~D~ 169 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRPY----------TDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVYNHDT 169 (462)
T ss_pred eeecCCCCHHHhcccccc----------cceEecchhhhccchHHHHHHHHhc-cCCCcceeEeecCC
Confidence 778899999999999962 8999999555566778888888774 57888998886433
No 89
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=78.89 E-value=1.5 Score=42.83 Aligned_cols=55 Identities=33% Similarity=0.495 Sum_probs=38.7
Q ss_pred cEEEEecccCcccccC-ccccccCCCCcccCCcEEEEECCEecCCH--HHHHHHHhcCCCCCEEEEEE--EEC
Q 015960 312 GVIFIAVEEGPAGKAG-LRSTKFGANGKFILGDIIKAVNGEDVSNA--NDLHNILDQCKVGDEVIVRI--LRG 379 (397)
Q Consensus 312 ~~V~~v~~~spa~~aG-l~~~~~~~~~~l~~GDiI~~vng~~v~~~--~d~~~~l~~~~~g~~v~l~v--~R~ 379 (397)
++|.++.++-.|++.+ |.. ||.|++|||.+..+. ++..++|+ +.|+.|.++| .|+
T Consensus 112 IlISKIFkGlAADQt~aL~~-----------gDaIlSVNG~dL~~AtHdeAVqaLK--raGkeV~levKy~RE 171 (506)
T KOG3551|consen 112 ILISKIFKGLAADQTGALFL-----------GDAILSVNGEDLRDATHDEAVQALK--RAGKEVLLEVKYMRE 171 (506)
T ss_pred eehhHhccccccccccceee-----------ccEEEEecchhhhhcchHHHHHHHH--hhCceeeeeeeeehh
Confidence 5677777777777765 445 999999999988754 45556665 4677666655 454
No 90
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=78.71 E-value=1.7 Score=45.21 Aligned_cols=113 Identities=22% Similarity=0.418 Sum_probs=67.0
Q ss_pred CCCCccccee-----cCCccEEEEEeeeeccCCCccCceeeEeccchhHHHHHHHhcccccc------ccCCCch-hHHH
Q 015960 236 NRGNSGGPLL-----DSSGSLIGVNTSIITRTDAFCGMACSIPIDTVSGIVDQLVKFGKIIR------PYLGIAH-DQLL 303 (397)
Q Consensus 236 ~~G~SGGPlv-----n~~G~vVGI~s~~~~~~~~~~~~~~aip~~~i~~~~~~l~~~g~~~~------~~lGi~~-~~~~ 303 (397)
-.-++|||.- |...+++.||-..+ ..+|.+....+++.+++.-.|+. |..-+.. .+..
T Consensus 678 Anmm~~GpAarsgkLnIGDQiiaING~SL----------VGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd~ 747 (829)
T KOG3605|consen 678 ANMMHGGPAARSGKLNIGDQIMSINGTSL----------VGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPDL 747 (829)
T ss_pred HhcccCChhhhcCCccccceeEeecCcee----------ccccHHHHHHHHhcccccceEEEEEecCCCceEEEeecccc
Confidence 3457778874 33335565653322 23888888888888776554321 1111100 1111
Q ss_pred HHHcC--CCCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCEecCC--HHHHHHHHhcCCCCC
Q 015960 304 EKLMG--ISGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVSN--ANDLHNILDQCKVGD 370 (397)
Q Consensus 304 ~~~~~--~~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~~--~~d~~~~l~~~~~g~ 370 (397)
+-.+| +...+|=+...++.|++-|+|. |-.|++|||+.|.- -+-+.++|.. ..|+
T Consensus 748 kyQLGFSVQNGiICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~pHekIV~lLs~-aVGE 806 (829)
T KOG3605|consen 748 RYQLGFSVQNGIICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVATPHEKIVQLLSN-AVGE 806 (829)
T ss_pred hhhccceeeCcEeehhhcccchhccCcee-----------eeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence 11122 2345677888999999999999 99999999997763 2344455543 3443
No 91
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=74.95 E-value=9.7 Score=37.86 Aligned_cols=137 Identities=17% Similarity=0.260 Sum_probs=63.4
Q ss_pred ceeEEEEEcCCCEEEEcccccCCC-CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcCC-CCCcceeecCCCCCCCCCCe
Q 015960 114 ATGTGFIWDEDGHIVTNHHVIEGA-SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAP-NHKLRSIPVGVSANLRIGQK 191 (397)
Q Consensus 114 ~~GSG~ii~~~G~ILT~aHvv~~~-~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~-~~~~~~~~l~~s~~~~~G~~ 191 (397)
+.|-||-|++. ..+|+-||+... .++. |.. ..-+.++..-+++-+++..+ ..+++-+-|.. -...|..
T Consensus 379 GsGWGfWVS~~-lfITttHViP~g~~E~F-----Gv~--i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLEe--GapEGtV 448 (535)
T PF05416_consen 379 GSGWGFWVSPT-LFITTTHVIPPGAKEAF-----GVP--ISQIQVHKSGEFCRFRFPKPIRPDVTGMILEE--GAPEGTV 448 (535)
T ss_dssp TTEEEEESSSS-EEEEEGGGS-STTSEET-----TEE--CGGEEEEEETTEEEEEESS-SSTTS---EE-S--S--TT-E
T ss_pred CCceeeeecce-EEEEeeeecCCcchhhh-----CCC--hhHeEEeeccceEEEecCCCCCCCccceeecc--CCCCceE
Confidence 47899999998 999999999743 2211 111 11123344566777777664 23455555532 2334444
Q ss_pred EE-EEecCCCC--CCceeeeEEeeccccccCCCCCC-----cccEEEEccccCCCCcccceecCCcc---EEEEEeeeec
Q 015960 192 VY-AIGHPLGR--KFTCTAGIISAFGLEPITATGPP-----IQGLIQIDAAINRGNSGGPLLDSSGS---LIGVNTSIIT 260 (397)
Q Consensus 192 V~-~iG~p~g~--~~~~~~G~vs~~~~~~~~~~~~~-----~~~~i~~~~~i~~G~SGGPlvn~~G~---vVGI~s~~~~ 260 (397)
+. +|-.|.|. ...+..|......-......+.. -.+.--.|-...||+-|.|-+-..|+ |+|++++...
T Consensus 449 ~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~AAtr 528 (535)
T PF05416_consen 449 CSILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHAAATR 528 (535)
T ss_dssp EEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEEEE-S
T ss_pred EEEEEEcCCccchhhhhhhccceeEEEcceeecceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEehhcc
Confidence 32 23444442 22333343322211000000000 00111124556799999999976665 8899988654
No 92
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=74.83 E-value=2.3 Score=42.22 Aligned_cols=23 Identities=22% Similarity=0.553 Sum_probs=20.6
Q ss_pred ccCCCCcccceecCCccEEEEEe
Q 015960 234 AINRGNSGGPLLDSSGSLIGVNT 256 (397)
Q Consensus 234 ~i~~G~SGGPlvn~~G~vVGI~s 256 (397)
....|.||+.++|.+|++|||..
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 55689999999999999999974
No 93
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=72.62 E-value=2.7 Score=39.05 Aligned_cols=57 Identities=21% Similarity=0.362 Sum_probs=43.8
Q ss_pred CCcEEEEecccCccccc-CccccccCCCCcccCCcEEEEECCEecCCHH--HHHHHHhcCCCCCEEEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKA-GLRSTKFGANGKFILGDIIKAVNGEDVSNAN--DLHNILDQCKVGDEVIVRIL 377 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~a-Gl~~~~~~~~~~l~~GDiI~~vng~~v~~~~--d~~~~l~~~~~g~~v~l~v~ 377 (397)
.-++|..+.++|.-++. -++. ||.|-+|||+.|..+. ++.+.|+..+.|++.++.+.
T Consensus 149 GyAFIKrIkegsvidri~~i~V-----------Gd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 149 GYAFIKRIKEGSVIDRIEAICV-----------GDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI 208 (334)
T ss_pred ceeeeEeecCCchhhhhhheeH-----------HhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence 34677778888765543 3566 9999999999998764 56788888888998888765
No 94
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=62.97 E-value=15 Score=28.32 Aligned_cols=37 Identities=11% Similarity=0.255 Sum_probs=31.3
Q ss_pred ccCCCCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 133 VIEGASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 133 vv~~~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
++.....+.|.+.+++.+.+++.++|.+.++.|=.+.
T Consensus 10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 4445678999999999999999999999999876553
No 95
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=56.27 E-value=10 Score=34.32 Aligned_cols=45 Identities=13% Similarity=0.313 Sum_probs=34.0
Q ss_pred ccEEEEccccCCCCcccceecCC----ccEEEEEeeeeccCCCccCceeeEec
Q 015960 226 QGLIQIDAAINRGNSGGPLLDSS----GSLIGVNTSIITRTDAFCGMACSIPI 274 (397)
Q Consensus 226 ~~~i~~~~~i~~G~SGGPlvn~~----G~vVGI~s~~~~~~~~~~~~~~aip~ 274 (397)
...+++..+...|+-|||++-.+ -+++||+.++.. ..+.+||=++
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~i 216 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESI 216 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhh
Confidence 34577788889999999998432 479999998754 2457888666
No 96
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.74 E-value=36 Score=23.87 Aligned_cols=33 Identities=24% Similarity=0.386 Sum_probs=28.5
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
..+.|.+.||+.+.+.+.++|+..++.|-....
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence 478899999999999999999999988776643
No 97
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=55.37 E-value=49 Score=23.72 Aligned_cols=34 Identities=21% Similarity=0.260 Sum_probs=29.4
Q ss_pred CCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 137 ASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 137 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
...+.++...|+.++++|+.+|....+.+|+.+.
T Consensus 6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 3457788888999999999999999999999655
No 98
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=52.82 E-value=38 Score=24.50 Aligned_cols=33 Identities=18% Similarity=0.214 Sum_probs=29.4
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
..+.|.+.+|+.+.+++.++|+..++.|-....
T Consensus 11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 578899999999999999999999999887753
No 99
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=52.57 E-value=33 Score=24.91 Aligned_cols=32 Identities=19% Similarity=0.147 Sum_probs=28.1
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
..+.|.+.+|+.+.+++.++|...++.+=.+.
T Consensus 12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 57889999999999999999999998886653
No 100
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.53 E-value=36 Score=24.64 Aligned_cols=32 Identities=16% Similarity=0.113 Sum_probs=28.2
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
..+.|.+.+|+.+.+++.++|+..++.|=...
T Consensus 11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 57889999999999999999999999886653
No 101
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=52.51 E-value=38 Score=24.91 Aligned_cols=33 Identities=21% Similarity=0.230 Sum_probs=29.0
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
..+.|.+.+|+.+.+++.++|+..++.|-....
T Consensus 15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred CEEEEEECCCCEEEEEEEEEcccceeEEeeEEE
Confidence 468899999999999999999999998877643
No 102
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.08 E-value=34 Score=25.94 Aligned_cols=31 Identities=19% Similarity=0.224 Sum_probs=27.2
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHI 168 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v 168 (397)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 5788999999999999999999999886544
No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.80 E-value=42 Score=25.14 Aligned_cols=32 Identities=19% Similarity=0.300 Sum_probs=27.8
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
..+.|.+.||+.+.+.+.++|+..+|.|=...
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 57889999999999999999999998875553
No 104
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.71 E-value=48 Score=24.72 Aligned_cols=32 Identities=13% Similarity=0.228 Sum_probs=28.0
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
..+.|.+.||+.+.+.+.++|+..+|.|=...
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 57899999999999999999999998876553
No 105
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.78 E-value=42 Score=25.09 Aligned_cols=31 Identities=10% Similarity=0.262 Sum_probs=27.5
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHI 168 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v 168 (397)
..+.|.+.+|+.+.+++.++|...++.|=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 6788999999999999999999999887554
No 106
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=47.72 E-value=78 Score=23.12 Aligned_cols=43 Identities=19% Similarity=0.401 Sum_probs=30.3
Q ss_pred EEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEE
Q 015960 150 LDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAI 195 (397)
Q Consensus 150 ~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~i 195 (397)
++++++..+.....|++.... ....+.+..-.++++||+|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence 578999998889999998875 3344444444558999999874
No 107
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.28 E-value=49 Score=25.02 Aligned_cols=31 Identities=16% Similarity=0.194 Sum_probs=27.3
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHI 168 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v 168 (397)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 5788999999999999999999998887554
No 108
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.79 E-value=57 Score=24.05 Aligned_cols=31 Identities=13% Similarity=0.164 Sum_probs=27.3
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHI 168 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v 168 (397)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~ 41 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDA 41 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccE
Confidence 5788999999999999999999999887555
No 109
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=45.29 E-value=21 Score=24.19 Aligned_cols=20 Identities=35% Similarity=0.654 Sum_probs=17.1
Q ss_pred CCcccceecCCccEEEEEee
Q 015960 238 GNSGGPLLDSSGSLIGVNTS 257 (397)
Q Consensus 238 G~SGGPlvn~~G~vVGI~s~ 257 (397)
+-+.-|++|.+|+++|+++.
T Consensus 29 ~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 29 GISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp TSSEEEEESTTSBEEEEEEH
T ss_pred CCcEEEEEecCCEEEEEEEH
Confidence 45678999999999999875
No 110
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=44.85 E-value=59 Score=24.18 Aligned_cols=31 Identities=26% Similarity=0.175 Sum_probs=27.4
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHI 168 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v 168 (397)
..+.|.+.+|+.+.+.+.++|+..++.|=..
T Consensus 13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 5788999999999999999999998887554
No 111
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.38 E-value=72 Score=23.31 Aligned_cols=33 Identities=12% Similarity=0.190 Sum_probs=29.2
Q ss_pred CCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 137 ASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 137 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
...+.|.+.+|..+.+++..+|...++.+-.+.
T Consensus 10 g~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 10 GHIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CCEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 357889999999999999999999999988774
No 112
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=41.21 E-value=73 Score=22.61 Aligned_cols=33 Identities=21% Similarity=0.264 Sum_probs=28.2
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
..+.|.+.||+.+.+.+.++|+..++-|=....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~e 41 (67)
T smart00651 9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVEE 41 (67)
T ss_pred cEEEEEECCCcEEEEEEEEECccccEEEccEEE
Confidence 468899999999999999999999988766543
No 113
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=40.52 E-value=59 Score=23.14 Aligned_cols=34 Identities=24% Similarity=0.303 Sum_probs=29.9
Q ss_pred CCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 137 ASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 137 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
...+.|.+.+|+.+.+.+.++|+..++.|-....
T Consensus 8 g~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 8 GKRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp TSEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred CcEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 3578999999999999999999999998887754
No 114
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=39.57 E-value=1.1e+02 Score=27.03 Aligned_cols=28 Identities=25% Similarity=0.118 Sum_probs=25.2
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAV 348 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~v 348 (397)
+.+.|..+..+|||+++|+.- |+.|+++
T Consensus 122 ~~~~Vd~v~fgS~A~~~g~d~-----------d~~I~~v 149 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAEKAGIDF-----------DWEITEV 149 (183)
T ss_pred CEEEEEecCCCCHHHHcCCCC-----------CcEEEEE
Confidence 457899999999999999998 8989887
No 115
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.05 E-value=75 Score=23.44 Aligned_cols=32 Identities=25% Similarity=0.312 Sum_probs=27.8
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
..+.|.+.||+.+.+++.++|+..++.|=...
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 56888999999999999999999988876653
No 116
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=38.70 E-value=66 Score=24.02 Aligned_cols=33 Identities=24% Similarity=0.318 Sum_probs=28.6
Q ss_pred CeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
..+.|.+.+|+.+.+++.++|...++.|--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 678999999999999999999999888765543
No 117
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=37.18 E-value=78 Score=23.94 Aligned_cols=47 Identities=15% Similarity=0.400 Sum_probs=31.2
Q ss_pred EEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEE-EecC
Q 015960 150 LDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYA-IGHP 198 (397)
Q Consensus 150 ~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~-iG~p 198 (397)
+|++++..+...++|++.+-.-...+.. .|- ..+++.||+|++ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l-~Lv-~~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREVNL-DLV-GEEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEEEe-eee-cCccccCCEEEEEeeEE
Confidence 5778889988888999998763212221 221 126899999877 5654
No 118
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=34.54 E-value=34 Score=32.78 Aligned_cols=40 Identities=28% Similarity=0.431 Sum_probs=33.3
Q ss_pred EEEECCEecCCHHHHHHHHh-cCCCCCEEEEEEEECCeEEE
Q 015960 345 IKAVNGEDVSNANDLHNILD-QCKVGDEVIVRILRGTQLEE 384 (397)
Q Consensus 345 I~~vng~~v~~~~d~~~~l~-~~~~g~~v~l~v~R~g~~~~ 384 (397)
+-++.|.+|-+.+|+.++++ ...|||++++++.++|++..
T Consensus 269 Vae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~~ 309 (356)
T COG4956 269 VAELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEPG 309 (356)
T ss_pred HHhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCcccC
Confidence 44567788889999999997 45799999999999998764
No 119
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=33.03 E-value=40 Score=27.12 Aligned_cols=19 Identities=37% Similarity=0.670 Sum_probs=14.1
Q ss_pred cceecCCccEEEEEeeeec
Q 015960 242 GPLLDSSGSLIGVNTSIIT 260 (397)
Q Consensus 242 GPlvn~~G~vVGI~s~~~~ 260 (397)
.|++|.+|+++|+++.++.
T Consensus 94 ~PV~d~~g~viG~V~VG~~ 112 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGVS 112 (116)
T ss_dssp EEEE-TTS-EEEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEEE
Confidence 5899999999999987653
No 120
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=31.66 E-value=85 Score=23.47 Aligned_cols=49 Identities=12% Similarity=0.167 Sum_probs=33.5
Q ss_pred CcEEEEECCEecCCHHHHHHHHh-cCCCCC----EEEEEEEECCeEEEEEEEee
Q 015960 342 GDIIKAVNGEDVSNANDLHNILD-QCKVGD----EVIVRILRGTQLEEILIILE 390 (397)
Q Consensus 342 GDiI~~vng~~v~~~~d~~~~l~-~~~~g~----~v~l~v~R~g~~~~~~v~l~ 390 (397)
-|-.+.+||++..+....+.... ...+|. ++..++.|||+..+.+-++.
T Consensus 11 adAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~V~ 64 (75)
T TIGR03000 11 ADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRTVV 64 (75)
T ss_pred CCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEEEE
Confidence 58889999999988765554432 345665 46667789998776654443
No 121
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.41 E-value=1.4e+02 Score=22.08 Aligned_cols=33 Identities=15% Similarity=0.192 Sum_probs=29.0
Q ss_pred CCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEc
Q 015960 137 ASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHID 169 (397)
Q Consensus 137 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~ 169 (397)
...+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus 11 g~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 11 NHPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 357889999999999999999999999987763
No 122
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=25.81 E-value=89 Score=23.93 Aligned_cols=21 Identities=33% Similarity=0.487 Sum_probs=16.8
Q ss_pred CcccceecCCccEEEEEeeee
Q 015960 239 NSGGPLLDSSGSLIGVNTSII 259 (397)
Q Consensus 239 ~SGGPlvn~~G~vVGI~s~~~ 259 (397)
..|.|+++.+|+.||.++...
T Consensus 34 ~~g~~v~~~~g~~vG~vTS~~ 54 (95)
T PF08669_consen 34 RGGEPVYDEDGKPVGRVTSGA 54 (95)
T ss_dssp STTCEEEETTTEEEEEEEEEE
T ss_pred CCCCEEEECCCcEEeEEEEEe
Confidence 357899987999999987653
No 123
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=25.76 E-value=48 Score=26.23 Aligned_cols=21 Identities=33% Similarity=0.474 Sum_probs=16.6
Q ss_pred CCcccceecCCccEEEEEeee
Q 015960 238 GNSGGPLLDSSGSLIGVNTSI 258 (397)
Q Consensus 238 G~SGGPlvn~~G~vVGI~s~~ 258 (397)
+.+.=|++|.+|+++|+++..
T Consensus 98 ~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 98 GISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred CCceEEEECCCCcEEEEEeHH
Confidence 344569999999999998754
No 124
>PRK13922 rod shape-determining protein MreC; Provisional
Probab=24.22 E-value=5.9e+02 Score=23.75 Aligned_cols=73 Identities=14% Similarity=0.071 Sum_probs=41.3
Q ss_pred EEEEecCCcEEEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEEecCCCCCCceeeeEEeeccc
Q 015960 140 VKVTLFDKTTLDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAIGHPLGRKFTCTAGIISAFGL 215 (397)
Q Consensus 140 i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~vs~~~~ 215 (397)
+...+.....+.+.+. ...+-++++=......+..--+....+++.||.|+.-|...-....+.-|.|..+..
T Consensus 172 V~li~d~~~~v~v~i~---~~~~~gi~~G~g~~~~l~l~~i~~~~~i~~GD~VvTSGl~g~fP~Gi~VG~V~~v~~ 244 (276)
T PRK13922 172 VLLLTDPNSRVPVQVG---RNGIRGILSGNGSGDNLKLEFIPRSADIKVGDLVVTSGLGGIFPAGLPVGKVTSVER 244 (276)
T ss_pred EEEEEcCCCceEEEEE---cCCceEEEEecCCCCceEEEecCCCCCCCCCCEEEECCCCCcCCCCCEEEEEEEEEe
Confidence 3333333344555542 223346666553211123233334466999999999987655566778888887744
No 125
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=23.93 E-value=47 Score=24.48 Aligned_cols=17 Identities=35% Similarity=0.538 Sum_probs=13.5
Q ss_pred cceecCCccEEEEEeee
Q 015960 242 GPLLDSSGSLIGVNTSI 258 (397)
Q Consensus 242 GPlvn~~G~vVGI~s~~ 258 (397)
-|+.+.+|+++|+....
T Consensus 19 ~pi~~~~g~~~Gvv~~d 35 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGID 35 (81)
T ss_dssp EEEEETTTEEEEEEEEE
T ss_pred EEEECCCCCEEEEEEEE
Confidence 48888899999997543
No 126
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=23.89 E-value=1.8e+02 Score=22.42 Aligned_cols=34 Identities=15% Similarity=0.294 Sum_probs=29.5
Q ss_pred CCeEEEEecCCcEEEEEEEEEcCCCCeEEEEEcC
Q 015960 137 ASSVKVTLFDKTTLDAKVVGHDQGTDLAVLHIDA 170 (397)
Q Consensus 137 ~~~i~V~~~~g~~~~a~vv~~d~~~DlAlL~v~~ 170 (397)
...+.|.+.+|..|.+++..+|...++.+-.+..
T Consensus 11 g~~V~VeLKng~~~~G~L~~vD~~MNl~L~~a~~ 44 (90)
T cd01724 11 NETVTIELKNGTIVHGTITGVDPSMNTHLKNVKL 44 (90)
T ss_pred CCEEEEEECCCCEEEEEEEEEcCceeEEEEEEEE
Confidence 3578899999999999999999999998887643
No 127
>PF04225 OapA: Opacity-associated protein A LysM-like domain; InterPro: IPR007340 This entry includes the Haemophilus influenzae opacity-associated protein. This protein is required for efficient nasopharyngeal mucosal colonization, and its expression is associated with a distinctive transparent colony phenotype. OapA is thought to be a secreted protein, and its expression exhibits high-frequency phase variation.; PDB: 2GU1_A.
Probab=23.39 E-value=1.2e+02 Score=23.05 Aligned_cols=31 Identities=16% Similarity=0.109 Sum_probs=18.8
Q ss_pred HHHhcCCCCCEEEEEEEECCeEEEEEEEeee
Q 015960 361 NILDQCKVGDEVIVRILRGTQLEEILIILEV 391 (397)
Q Consensus 361 ~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~ 391 (397)
+.|...+||+++.+.+-.+|+...+.+....
T Consensus 38 k~L~~L~pGq~l~f~~d~~g~L~~L~~~~~~ 68 (85)
T PF04225_consen 38 KPLTRLKPGQTLEFQLDEDGQLTALRYERSP 68 (85)
T ss_dssp --GGG--TT-EEEEEE-TTS-EEEEEEEEET
T ss_pred chHhhCCCCCEEEEEECCCCCEEEEEEEcCC
Confidence 3556778999999999888988877776543
No 128
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro: IPR004365) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal.; GO: 0008855 exodeoxyribonuclease VII activity
Probab=22.79 E-value=76 Score=30.54 Aligned_cols=34 Identities=21% Similarity=0.487 Sum_probs=30.7
Q ss_pred eeEEEEEcCCCEEEEcccccCCCCeEEEEecCCc
Q 015960 115 TGTGFIWDEDGHIVTNHHVIEGASSVKVTLFDKT 148 (397)
Q Consensus 115 ~GSG~ii~~~G~ILT~aHvv~~~~~i~V~~~~g~ 148 (397)
.|-+++.+++|.++|+..-+...+.+.+.+.||.
T Consensus 281 RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~ 314 (319)
T PF02601_consen 281 RGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS 314 (319)
T ss_pred CceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence 5677888889999999999999999999999985
No 129
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=21.89 E-value=2.3e+02 Score=20.84 Aligned_cols=28 Identities=14% Similarity=0.177 Sum_probs=20.6
Q ss_pred CeEEEEecCCcEEEEEEEEEcC---CCCeEE
Q 015960 138 SSVKVTLFDKTTLDAKVVGHDQ---GTDLAV 165 (397)
Q Consensus 138 ~~i~V~~~~g~~~~a~vv~~d~---~~DlAl 165 (397)
..+.|++.||..|++-+...++ +.|++|
T Consensus 13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL 43 (77)
T PF14438_consen 13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL 43 (77)
T ss_dssp SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence 5788999999999999999988 666665
No 130
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=21.51 E-value=2.1e+02 Score=21.40 Aligned_cols=41 Identities=17% Similarity=0.379 Sum_probs=26.0
Q ss_pred EEEEEEEEcCCCCeEEEEEcCCCCCcceeecCCCCCCCCCCeEEEE
Q 015960 150 LDAKVVGHDQGTDLAVLHIDAPNHKLRSIPVGVSANLRIGQKVYAI 195 (397)
Q Consensus 150 ~~a~vv~~d~~~DlAlL~v~~~~~~~~~~~l~~s~~~~~G~~V~~i 195 (397)
++++++..+. +.|++.+... ...+.+.--.++++||+|++-
T Consensus 5 iP~~V~~i~~--~~A~v~~~G~---~~~v~l~lv~~~~vGD~VLVH 45 (76)
T TIGR00074 5 IPGQVVEIDE--NIALVEFCGI---KRDVSLDLVGEVKVGDYVLVH 45 (76)
T ss_pred cceEEEEEcC--CEEEEEcCCe---EEEEEEEeeCCCCCCCEEEEe
Confidence 4677877765 4688877652 223333333468999998873
No 131
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=21.24 E-value=71 Score=24.73 Aligned_cols=18 Identities=22% Similarity=0.346 Sum_probs=14.8
Q ss_pred cccceecCCccEEEEEee
Q 015960 240 SGGPLLDSSGSLIGVNTS 257 (397)
Q Consensus 240 SGGPlvn~~G~vVGI~s~ 257 (397)
+--|++|.+|+++|+++.
T Consensus 88 ~~lpVvd~~~~~~Giit~ 105 (111)
T cd04603 88 PVVAVVDKEGKLVGTIYE 105 (111)
T ss_pred CeEEEEcCCCeEEEEEEh
Confidence 345899988999999875
No 132
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=21.15 E-value=73 Score=24.63 Aligned_cols=20 Identities=40% Similarity=0.532 Sum_probs=16.0
Q ss_pred CcccceecCCccEEEEEeee
Q 015960 239 NSGGPLLDSSGSLIGVNTSI 258 (397)
Q Consensus 239 ~SGGPlvn~~G~vVGI~s~~ 258 (397)
...-|++|.+|+++|+++..
T Consensus 91 ~~~~pVvd~~~~~~Gvit~~ 110 (115)
T cd04620 91 IRHLPVLDDQGQLIGLVTAE 110 (115)
T ss_pred CceEEEEcCCCCEEEEEEhH
Confidence 34569999899999998753
No 133
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=21.01 E-value=52 Score=31.65 Aligned_cols=30 Identities=33% Similarity=0.534 Sum_probs=26.2
Q ss_pred EEEecccCcccccCccccccCCCCcccCCcEEEEECCEecC
Q 015960 314 IFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGEDVS 354 (397)
Q Consensus 314 V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~~v~ 354 (397)
+..|.+.+||+++|... ||-|+-+|+-++.
T Consensus 67 ~lrv~~~~~~e~~~~~~-----------~dyilg~n~Dp~~ 96 (417)
T COG5233 67 VLRVNPESPAEKAGMVV-----------GDYILGINEDPLR 96 (417)
T ss_pred heeccccChhHhhcccc-----------ceeEEeecCCcHH
Confidence 55788999999999999 9999999987654
No 134
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=20.87 E-value=67 Score=21.76 Aligned_cols=11 Identities=27% Similarity=0.839 Sum_probs=8.3
Q ss_pred cCCccEEEEEe
Q 015960 246 DSSGSLIGVNT 256 (397)
Q Consensus 246 n~~G~vVGI~s 256 (397)
|.+|++|||--
T Consensus 36 d~~G~ivGIEI 46 (50)
T PF10049_consen 36 DEDGRIVGIEI 46 (50)
T ss_pred CCCCCEEEEEE
Confidence 57888998854
No 135
>PF11325 DUF3127: Domain of unknown function (DUF3127); InterPro: IPR021474 This bacterial family of proteins has no known function.
Probab=20.81 E-value=2.8e+02 Score=21.26 Aligned_cols=64 Identities=22% Similarity=0.259 Sum_probs=40.5
Q ss_pred CCcEEEEecccCcccccCccccccCCCCcccCCcEEEEECCE-----ecCCHHHHHHHHhcCCCCCEEEEEEEECCeEEE
Q 015960 310 SGGVIFIAVEEGPAGKAGLRSTKFGANGKFILGDIIKAVNGE-----DVSNANDLHNILDQCKVGDEVIVRILRGTQLEE 384 (397)
Q Consensus 310 ~g~~V~~v~~~spa~~aGl~~~~~~~~~~l~~GDiI~~vng~-----~v~~~~d~~~~l~~~~~g~~v~l~v~R~g~~~~ 384 (397)
.|-+|..+.+-+--.+.|.+. -|.|++-+++ .+.-+.|-.+.|...++|+.|++.+.=++.+-.
T Consensus 3 ~Gkii~~l~~~~g~s~~Gw~K-----------re~Vlet~~qYP~~i~f~~~~dk~~~l~~~~~Gd~V~Vsf~i~~RE~~ 71 (84)
T PF11325_consen 3 TGKIIKVLPEQQGVSKNGWKK-----------REFVLETEEQYPQKICFEFWGDKIDLLDNFQVGDEVKVSFNIEGREWN 71 (84)
T ss_pred ccEEEEEecCcccCcCCCcEE-----------EEEEEeCCCcCCceEEEEEEcchhhhhccCCCCCEEEEEEEeeccEec
Confidence 455555554444444466777 7888887663 223345666667777899999988865555443
No 136
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=20.63 E-value=3.3e+02 Score=19.58 Aligned_cols=45 Identities=22% Similarity=0.322 Sum_probs=27.2
Q ss_pred CcEEEEECCEecCCH-HHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 015960 342 GDIIKAVNGEDVSNA-NDLHNILDQCKVGDEVIVRILRGTQLEEILI 387 (397)
Q Consensus 342 GDiI~~vng~~v~~~-~d~~~~l~~~~~g~~v~l~v~R~g~~~~~~v 387 (397)
.-.-+.+||..+.+. .++..++.....|++.++.+. .++...+++
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~Fk-S~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFK-SQRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEET-TS-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEe-cCcEEEEEc
Confidence 456678899999986 568888888889998877543 333333433
No 137
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=20.16 E-value=3.2e+02 Score=19.15 Aligned_cols=36 Identities=17% Similarity=0.105 Sum_probs=28.0
Q ss_pred CCCeEEEEecCCc-EEEEEEEEEcCCCCeEEEEEcCC
Q 015960 136 GASSVKVTLFDKT-TLDAKVVGHDQGTDLAVLHIDAP 171 (397)
Q Consensus 136 ~~~~i~V~~~~g~-~~~a~vv~~d~~~DlAlL~v~~~ 171 (397)
....+.++.++.. .|++++..+|...+++-++.+..
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~DG 44 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYEDG 44 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETTS
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcCC
Confidence 3456788888655 56999999999999999999874
Done!