Query 014786
Match_columns 418
No_of_seqs 361 out of 3361
Neff 7.5
Searched_HMMs 46136
Date Fri Mar 29 08:30:22 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/014786.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/014786hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 3E-51 6.5E-56 421.5 36.3 286 117-417 41-346 (455)
2 PRK10898 serine endoprotease; 100.0 5.9E-50 1.3E-54 400.3 36.8 287 112-417 41-335 (353)
3 TIGR02038 protease_degS peripl 100.0 7.5E-50 1.6E-54 399.7 36.8 287 112-417 41-334 (351)
4 PRK10942 serine endoprotease; 100.0 3.3E-49 7.2E-54 408.1 34.9 286 117-417 39-367 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 1.3E-47 2.9E-52 393.8 32.8 285 118-417 3-313 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 1.9E-38 4.2E-43 317.6 28.5 288 116-417 33-326 (347)
7 KOG1320 Serine protease [Postt 99.9 6.7E-27 1.4E-31 236.1 19.6 291 115-416 127-453 (473)
8 KOG1421 Predicted signaling-as 99.8 1.9E-20 4E-25 191.4 16.5 279 117-417 53-357 (955)
9 PF13365 Trypsin_2: Trypsin-li 99.7 2.2E-16 4.8E-21 132.6 12.5 109 154-293 1-120 (120)
10 PF00089 Trypsin: Trypsin; In 99.5 4E-12 8.7E-17 117.3 19.2 168 151-320 24-220 (220)
11 KOG1421 Predicted signaling-as 99.5 1.1E-12 2.4E-17 135.0 15.5 288 122-416 524-827 (955)
12 KOG1320 Serine protease [Postt 99.4 2.6E-12 5.7E-17 130.6 12.7 252 122-394 56-319 (473)
13 PF13180 PDZ_2: PDZ domain; PD 99.3 5.8E-12 1.3E-16 99.7 8.6 70 332-417 1-70 (82)
14 cd00190 Tryp_SPc Trypsin-like 99.3 9E-11 1.9E-15 109.0 17.1 169 151-321 24-230 (232)
15 smart00020 Tryp_SPc Trypsin-li 99.2 3.5E-10 7.7E-15 105.3 15.9 147 151-298 25-208 (229)
16 COG3591 V8-like Glu-specific e 99.0 1.7E-08 3.6E-13 95.5 16.4 158 153-324 65-250 (251)
17 cd00987 PDZ_serine_protease PD 99.0 3E-09 6.6E-14 85.0 8.7 75 332-417 1-80 (90)
18 cd00991 PDZ_archaeal_metallopr 98.9 4.8E-09 1E-13 82.4 7.9 57 350-417 10-66 (79)
19 cd00136 PDZ PDZ domain, also c 98.9 4.7E-09 1E-13 80.0 7.6 68 332-417 1-70 (70)
20 TIGR01713 typeII_sec_gspC gene 98.8 1E-08 2.2E-13 98.6 8.2 89 314-417 159-247 (259)
21 cd00990 PDZ_glycyl_aminopeptid 98.8 1.2E-08 2.7E-13 79.8 7.2 65 332-417 1-65 (80)
22 cd00989 PDZ_metalloprotease PD 98.7 5.7E-08 1.2E-12 75.7 7.9 66 333-417 2-67 (79)
23 cd00986 PDZ_LON_protease PDZ d 98.7 7.2E-08 1.6E-12 75.5 7.7 56 350-417 8-63 (79)
24 cd00988 PDZ_CTP_protease PDZ d 98.6 1.1E-07 2.4E-12 75.2 7.4 66 333-417 3-70 (85)
25 TIGR02037 degP_htrA_DO peripla 98.5 2.6E-07 5.7E-12 95.4 9.0 76 331-417 337-418 (428)
26 PF00863 Peptidase_C4: Peptida 98.5 9.5E-06 2E-10 76.3 17.5 164 122-313 13-184 (235)
27 cd00992 PDZ_signaling PDZ doma 98.5 6.7E-07 1.5E-11 70.0 8.2 69 331-416 11-81 (82)
28 PF00595 PDZ: PDZ domain (Also 98.4 3.4E-07 7.5E-12 71.9 5.2 71 330-416 8-80 (81)
29 smart00228 PDZ Domain present 98.3 1.4E-06 3E-11 68.4 7.0 71 332-417 12-82 (85)
30 PRK10779 zinc metallopeptidase 98.2 1.4E-06 3.1E-11 90.5 6.3 54 353-417 129-182 (449)
31 KOG3627 Trypsin [Amino acid tr 98.2 9.6E-05 2.1E-09 70.4 18.0 145 153-299 39-229 (256)
32 TIGR00054 RIP metalloprotease 98.0 1.3E-05 2.7E-10 82.7 7.4 55 350-416 203-257 (420)
33 TIGR00225 prc C-terminal pepti 98.0 1.9E-05 4.1E-10 78.9 7.7 67 332-417 51-119 (334)
34 PRK10779 zinc metallopeptidase 97.9 2E-05 4.4E-10 81.9 7.6 54 351-416 222-275 (449)
35 PF14685 Tricorn_PDZ: Tricorn 97.9 3.9E-05 8.5E-10 61.4 7.2 68 333-417 2-77 (88)
36 PRK10139 serine endoprotease; 97.9 2.5E-05 5.3E-10 81.3 7.4 55 350-417 390-444 (455)
37 PLN00049 carboxyl-terminal pro 97.8 6.2E-05 1.4E-09 76.8 8.3 56 350-417 102-159 (389)
38 PF05579 Peptidase_S32: Equine 97.8 0.00034 7.4E-09 66.2 12.2 117 151-298 111-229 (297)
39 PRK10942 serine endoprotease; 97.8 5.7E-05 1.2E-09 79.0 7.5 55 350-417 408-462 (473)
40 TIGR00054 RIP metalloprotease 97.8 4.3E-05 9.3E-10 78.8 6.3 54 350-416 128-181 (420)
41 COG0793 Prc Periplasmic protea 97.7 0.00013 2.9E-09 74.6 8.5 71 329-417 97-169 (406)
42 PF04495 GRASP55_65: GRASP55/6 97.6 0.0001 2.3E-09 64.1 6.0 74 332-417 26-99 (138)
43 TIGR02860 spore_IV_B stage IV 97.6 0.00012 2.6E-09 74.1 7.1 55 350-416 105-167 (402)
44 COG3480 SdrC Predicted secrete 97.4 0.0003 6.4E-09 68.2 6.8 55 350-416 130-184 (342)
45 PRK11186 carboxy-terminal prot 97.3 0.00061 1.3E-08 73.7 7.8 68 331-417 243-318 (667)
46 COG5640 Secreted trypsin-like 97.2 0.015 3.4E-07 57.4 15.5 54 272-325 223-279 (413)
47 PF12812 PDZ_1: PDZ-like domai 97.2 0.0013 2.8E-08 51.5 6.3 65 332-407 9-76 (78)
48 COG3975 Predicted protease wit 97.1 0.00065 1.4E-08 70.0 4.8 49 350-417 462-510 (558)
49 KOG3553 Tax interaction protei 97.0 0.00066 1.4E-08 54.7 3.0 32 350-392 59-90 (124)
50 PF05580 Peptidase_S55: SpoIVB 96.7 0.022 4.7E-07 52.8 11.1 164 145-314 13-213 (218)
51 PF00548 Peptidase_C3: 3C cyst 96.6 0.049 1.1E-06 49.2 12.8 136 151-297 24-170 (172)
52 PF03761 DUF316: Domain of unk 96.6 0.16 3.4E-06 49.3 17.2 91 197-297 159-254 (282)
53 KOG3129 26S proteasome regulat 96.0 0.012 2.6E-07 53.9 5.4 56 351-417 140-197 (231)
54 PF08192 Peptidase_S64: Peptid 96.0 0.042 9.1E-07 58.4 10.0 117 198-323 542-688 (695)
55 PRK09681 putative type II secr 95.9 0.013 2.7E-07 56.7 5.3 50 356-416 210-262 (276)
56 PF10459 Peptidase_S46: Peptid 95.8 0.053 1.1E-06 59.2 10.0 22 152-173 47-68 (698)
57 KOG3209 WW domain-containing p 95.6 0.014 3.1E-07 62.0 4.8 52 354-417 782-835 (984)
58 PF00949 Peptidase_S7: Peptida 95.4 0.016 3.5E-07 49.9 3.6 32 268-299 88-119 (132)
59 TIGR02860 spore_IV_B stage IV 95.3 0.16 3.5E-06 51.8 11.1 38 272-313 355-392 (402)
60 KOG3580 Tight junction protein 95.3 0.024 5.2E-07 59.2 5.1 55 350-415 429-485 (1027)
61 PF10459 Peptidase_S46: Peptid 95.3 0.014 3E-07 63.6 3.6 31 265-295 621-651 (698)
62 KOG3532 Predicted protein kina 95.0 0.046 9.9E-07 58.0 6.1 62 332-411 386-447 (1051)
63 PF02122 Peptidase_S39: Peptid 94.4 0.13 2.7E-06 47.8 6.8 134 164-314 43-182 (203)
64 KOG1892 Actin filament-binding 93.4 0.1 2.2E-06 57.3 5.0 57 350-416 960-1016(1629)
65 KOG3550 Receptor targeting pro 93.0 0.17 3.7E-06 44.1 4.8 54 350-416 115-171 (207)
66 COG3031 PulC Type II secretory 92.6 0.22 4.7E-06 46.8 5.2 48 359-417 216-263 (275)
67 PF00947 Pico_P2A: Picornaviru 92.5 1 2.2E-05 38.3 8.7 33 264-297 77-109 (127)
68 PF09342 DUF1986: Domain of un 92.4 1.7 3.7E-05 41.2 10.9 88 149-237 25-131 (267)
69 KOG3606 Cell polarity protein 92.3 0.44 9.5E-06 45.5 6.9 66 332-407 171-243 (358)
70 KOG3834 Golgi reassembly stack 91.8 0.18 3.9E-06 51.0 4.0 56 350-417 15-71 (462)
71 PF00944 Peptidase_S3: Alphavi 91.5 0.26 5.7E-06 42.2 4.1 28 271-298 100-127 (158)
72 KOG3209 WW domain-containing p 91.1 0.47 1E-05 50.9 6.3 55 352-416 373-429 (984)
73 KOG3549 Syntrophins (type gamm 90.9 0.36 7.9E-06 47.6 4.9 54 351-416 81-136 (505)
74 KOG0609 Calcium/calmodulin-dep 90.6 0.42 9.2E-06 49.8 5.3 67 332-416 134-202 (542)
75 KOG2921 Intramembrane metallop 88.8 0.64 1.4E-05 46.7 4.8 45 350-405 220-265 (484)
76 KOG3605 Beta amyloid precursor 88.6 0.6 1.3E-05 49.6 4.7 101 274-396 677-791 (829)
77 KOG3552 FERM domain protein FR 87.9 0.61 1.3E-05 51.5 4.4 64 332-417 65-130 (1298)
78 KOG3542 cAMP-regulated guanine 87.5 0.46 9.9E-06 50.7 3.1 55 350-416 562-616 (1283)
79 KOG3834 Golgi reassembly stack 87.0 0.69 1.5E-05 47.0 3.9 53 354-417 113-165 (462)
80 KOG3580 Tight junction protein 86.9 0.85 1.8E-05 48.1 4.6 55 350-416 40-94 (1027)
81 COG0750 Predicted membrane-ass 84.6 2.3 4.9E-05 42.9 6.5 44 356-411 135-178 (375)
82 KOG3551 Syntrophins (type beta 84.6 0.86 1.9E-05 45.7 3.2 70 331-416 95-166 (506)
83 PF03510 Peptidase_C24: 2C end 84.2 4.3 9.3E-05 33.5 6.6 53 156-220 3-55 (105)
84 PF02395 Peptidase_S6: Immunog 83.9 3.7 8.1E-05 45.6 8.1 54 153-209 66-121 (769)
85 KOG3605 Beta amyloid precursor 83.5 1.3 2.8E-05 47.2 4.2 58 350-417 673-732 (829)
86 PF02907 Peptidase_S29: Hepati 82.7 0.62 1.3E-05 40.0 1.2 115 155-299 15-130 (148)
87 KOG3571 Dishevelled 3 and rela 82.3 2.1 4.6E-05 44.4 5.0 37 350-396 277-313 (626)
88 KOG0606 Microtubule-associated 74.7 4 8.6E-05 46.4 4.6 50 353-415 661-712 (1205)
89 PF01732 DUF31: Putative pepti 73.2 2.5 5.4E-05 42.9 2.6 23 273-295 351-373 (374)
90 KOG3651 Protein kinase C, alph 69.4 10 0.00022 37.1 5.5 47 350-406 30-78 (429)
91 PF05416 Peptidase_C37: Southa 62.1 26 0.00056 36.0 7.0 135 151-298 378-527 (535)
92 PF11874 DUF3394: Domain of un 53.6 56 0.0012 29.8 7.1 37 335-389 114-150 (183)
93 cd00600 Sm_like The eukaryotic 52.4 41 0.00088 24.3 5.2 33 176-208 7-39 (63)
94 cd01735 LSm12_N LSm12 belongs 50.6 57 0.0012 24.2 5.5 33 176-208 7-39 (61)
95 cd01720 Sm_D2 The eukaryotic S 50.1 31 0.00067 27.5 4.4 37 171-207 10-46 (87)
96 KOG3938 RGS-GAIP interacting p 50.0 16 0.00035 35.2 3.1 56 352-417 151-208 (334)
97 cd06168 LSm9 The eukaryotic Sm 46.0 52 0.0011 25.4 4.9 32 176-207 11-42 (75)
98 cd01731 archaeal_Sm1 The archa 46.0 53 0.0011 24.5 4.9 33 176-208 11-43 (68)
99 PRK00737 small nuclear ribonuc 46.0 52 0.0011 25.0 4.9 33 176-208 15-47 (72)
100 cd01726 LSm6 The eukaryotic Sm 46.0 48 0.001 24.7 4.7 32 176-207 11-42 (67)
101 cd01722 Sm_F The eukaryotic Sm 45.6 44 0.00095 25.0 4.4 32 176-207 12-43 (68)
102 PF00571 CBS: CBS domain CBS d 41.7 20 0.00044 25.0 2.0 21 276-296 28-48 (57)
103 cd01717 Sm_B The eukaryotic Sm 41.7 59 0.0013 25.1 4.8 32 176-207 11-42 (79)
104 cd01730 LSm3 The eukaryotic Sm 41.0 52 0.0011 25.7 4.4 31 176-206 12-42 (82)
105 cd01729 LSm7 The eukaryotic Sm 40.9 66 0.0014 25.1 4.9 31 176-206 13-43 (81)
106 COG5233 GRH1 Peripheral Golgi 39.7 17 0.00037 35.8 1.6 30 354-394 67-96 (417)
107 cd01732 LSm5 The eukaryotic Sm 38.7 64 0.0014 24.9 4.5 31 176-206 14-44 (76)
108 cd01728 LSm1 The eukaryotic Sm 38.5 77 0.0017 24.3 4.9 31 176-206 13-43 (74)
109 PF12381 Peptidase_C3G: Tungro 38.4 23 0.0005 33.1 2.2 55 265-323 168-228 (231)
110 PF09122 DUF1930: Domain of un 38.3 78 0.0017 23.6 4.5 35 382-416 19-54 (68)
111 cd01719 Sm_G The eukaryotic Sm 38.0 82 0.0018 24.0 4.9 32 176-207 11-42 (72)
112 smart00651 Sm snRNP Sm protein 36.9 87 0.0019 22.9 4.9 32 176-207 9-40 (67)
113 cd01721 Sm_D3 The eukaryotic S 35.6 92 0.002 23.5 4.9 32 176-207 11-42 (70)
114 PF01423 LSM: LSM domain ; In 35.5 62 0.0014 23.7 3.9 33 176-208 9-41 (67)
115 cd01727 LSm8 The eukaryotic Sm 34.2 93 0.002 23.7 4.7 32 176-207 10-41 (74)
116 COG0298 HypC Hydrogenase matur 32.6 67 0.0014 25.2 3.6 47 188-236 5-52 (82)
117 COG1958 LSM1 Small nuclear rib 31.6 93 0.002 23.9 4.4 33 176-208 18-50 (79)
118 PF02601 Exonuc_VII_L: Exonucl 30.3 60 0.0013 32.0 3.9 35 152-186 280-314 (319)
119 PF14827 Cache_3: Sensory doma 28.6 49 0.0011 27.3 2.5 18 281-298 94-111 (116)
120 PF05578 Peptidase_S31: Pestiv 26.5 1.5E+02 0.0032 26.2 5.1 131 152-297 51-182 (211)
121 COG0061 nadF NAD kinase [Coenz 26.2 27 0.00058 34.0 0.6 32 2-33 179-210 (281)
122 PF09465 LBR_tudor: Lamin-B re 25.7 2.7E+02 0.0058 20.3 5.6 35 174-208 8-43 (55)
123 cd01723 LSm4 The eukaryotic Sm 25.6 1.8E+02 0.0039 22.2 5.1 32 176-207 12-43 (76)
124 PF01455 HupF_HypC: HupF/HypC 24.1 2.1E+02 0.0045 21.6 5.0 43 188-233 5-47 (68)
125 PF02743 Cache_1: Cache domain 23.3 58 0.0013 24.7 1.9 30 281-323 19-48 (81)
126 PF01732 DUF31: Putative pepti 23.2 51 0.0011 33.4 2.0 23 151-173 35-67 (374)
127 PF14438 SM-ATX: Ataxin 2 SM d 23.2 1.9E+02 0.0041 21.9 4.8 28 176-203 13-43 (77)
128 cd04627 CBS_pair_14 The CBS do 22.4 61 0.0013 26.2 2.0 21 277-297 98-118 (123)
129 cd01725 LSm2 The eukaryotic Sm 20.7 2.4E+02 0.0052 21.9 4.9 32 176-207 12-43 (81)
130 cd04603 CBS_pair_KefB_assoc Th 20.5 74 0.0016 25.3 2.1 20 277-296 86-105 (111)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=3e-51 Score=421.49 Aligned_cols=286 Identities=39% Similarity=0.627 Sum_probs=249.3
Q ss_pred hHHHHHHHcCCceEEEEeeecccC------c----ccc---cc-ccCCCeEEEEEEEcC-CcEEEecccccCCCCeEEEE
Q 014786 117 ATVRLFQENTPSVVNITNLAARQD------A----FTL---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT 181 (418)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~------~----~~~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHvv~~~~~i~V~ 181 (418)
.+.++++++.||||.|.+...... . |.. +. ....++.||||||++ +||||||+|||++++.+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 588999999999999987653221 1 110 00 112357899999985 79999999999999999999
Q ss_pred eCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCC
Q 014786 182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG 261 (418)
Q Consensus 182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~ 261 (418)
+.|++.++|++++.|+++||||||++.+ ..+++++++++..+++||+|+++|||++...+++.|+|++..+.....
T Consensus 121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~--- 196 (455)
T PRK10139 121 LNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL--- 196 (455)
T ss_pred ECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCC---
Confidence 9999999999999999999999999854 368999999999999999999999999999999999999887752211
Q ss_pred CCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeecc-
Q 014786 262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP- 340 (418)
Q Consensus 262 ~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~- 340 (418)
..+.+++++|+++++|+|||||+|.+|+||||+++.+.++++..+++|+||++.+++++++|+++|++.|+|||+.+++
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l 276 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM 276 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence 2245789999999999999999999999999999988776677899999999999999999999999999999999886
Q ss_pred -chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 341 -DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 341 -~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.+.++.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.++.|+.+.+...++|++++++|
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V 345 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL 345 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 344666776 699999999999999999999 99999999999999999999998878899998887
Q ss_pred E
Q 014786 417 F 417 (418)
Q Consensus 417 ~ 417 (418)
+
T Consensus 346 ~ 346 (455)
T PRK10139 346 L 346 (455)
T ss_pred E
Confidence 4
No 2
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=5.9e-50 Score=400.28 Aligned_cols=287 Identities=34% Similarity=0.527 Sum_probs=245.5
Q ss_pred CccchhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE
Q 014786 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (418)
Q Consensus 112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (418)
...+....++++++.||||.|......... .......+.||||+|+++||||||+|||++++.+.|++.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~ 117 (353)
T PRK10898 41 DETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEAL 117 (353)
T ss_pred ccccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEE
Confidence 334457889999999999999886532211 11112357899999999999999999999999999999999999999
Q ss_pred EEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEc
Q 014786 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (418)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~ 271 (418)
++++|+.+||||||++.. .+++++++++..+++||+|+++|||++...+++.|+|++..+.... .....+++++|
T Consensus 118 vv~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~---~~~~~~~iqtd 192 (353)
T PRK10898 118 LVGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLS---PTGRQNFLQTD 192 (353)
T ss_pred EEEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccC---CccccceEEec
Confidence 999999999999999873 5888999988889999999999999998889999999987765322 12234789999
Q ss_pred cccCCCCCCCceeCCCceEEEEEeeeeCCCC---CCCCccceeecccchhhhhhhhhcccccccccCeeeccc--hhhhh
Q 014786 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQ 346 (418)
Q Consensus 272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~---~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~--~~~~~ 346 (418)
+++++|+|||||+|.+||||||+++.+...+ ...+++|+||++.+++++++|+++|++.++|||+.+++. ..++.
T Consensus 193 a~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~ 272 (353)
T PRK10898 193 ASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQG 272 (353)
T ss_pred cccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHh
Confidence 9999999999999999999999998775432 236899999999999999999999999999999998753 22334
Q ss_pred hCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 347 LGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 347 ~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
+++ .|++|.+|.+++||+++||++ ||+|++|||++|.+..|+.+.+...++|++++++|+
T Consensus 273 ~~~~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~ 335 (353)
T PRK10898 273 GGIDQLQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVM 335 (353)
T ss_pred cCCCCCCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEE
Confidence 443 799999999999999999999 999999999999999999999988889999998874
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=7.5e-50 Score=399.67 Aligned_cols=287 Identities=36% Similarity=0.604 Sum_probs=248.3
Q ss_pred CccchhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE
Q 014786 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (418)
Q Consensus 112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~ 191 (418)
...+.+..++++++.||||.|.+.....+.+ ......+.||||+|+++||||||+|||.+++.+.|.+.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~~---~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~ 117 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNSL---NQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE 117 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEecccccc---ccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence 4455678899999999999999865433211 1123357899999999999999999999999999999999999999
Q ss_pred EEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEc
Q 014786 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (418)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~ 271 (418)
++++|+.+||||||++.. .+++++++++..+++||+|+++|||++...+++.|+|+...+.... .....+++++|
T Consensus 118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~---~~~~~~~iqtd 192 (351)
T TIGR02038 118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS---SVGRQNFIQTD 192 (351)
T ss_pred EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC---CCCcceEEEEC
Confidence 999999999999999874 4888999888889999999999999999899999999988764321 12235789999
Q ss_pred cccCCCCCCCceeCCCceEEEEEeeeeCCC--CCCCCccceeecccchhhhhhhhhcccccccccCeeecc--chhhhhh
Q 014786 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQL 347 (418)
Q Consensus 272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~--~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~~ 347 (418)
+.+++|+|||||+|.+|+||||+++.+... +...+++|+||++.+++++++++++|++.|+|||+.+++ ...++.+
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l 272 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL 272 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence 999999999999999999999999876432 234689999999999999999999999999999999886 3345567
Q ss_pred Cc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 348 ~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
|+ .|++|.+|.+++||+++||++ ||+|++|||++|.+++|+.+.+...++|++++++|+
T Consensus 273 gl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~ 334 (351)
T TIGR02038 273 GLPDLRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVL 334 (351)
T ss_pred CCCccccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEE
Confidence 76 599999999999999999999 999999999999999999999998889999998874
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=3.3e-49 Score=408.11 Aligned_cols=286 Identities=38% Similarity=0.600 Sum_probs=248.9
Q ss_pred hHHHHHHHcCCceEEEEeeecccC-----------ccccc--------------------------cccCCCeEEEEEEE
Q 014786 117 ATVRLFQENTPSVVNITNLAARQD-----------AFTLD--------------------------VLEVPQGSGSGFVW 159 (418)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~-----------~~~~~--------------------------~~~~~~~~GSGfiI 159 (418)
++.++++++.||||.|.+...... +|... .....++.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 588999999999999988653211 01100 00112468999999
Q ss_pred cC-CcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCC
Q 014786 160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG 238 (418)
Q Consensus 160 ~~-~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g 238 (418)
++ +||||||+|||.++++++|.+.|++.|+|++++.|+.+||||||++.. ..+++++++++..+++||+|+++|||++
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g 197 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG 197 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence 96 599999999999999999999999999999999999999999999753 3689999999999999999999999999
Q ss_pred CCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchh
Q 014786 239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNG 318 (418)
Q Consensus 239 ~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~ 318 (418)
...+++.|+|+++.+.... ...+.+++++|+++++|+|||||+|.+|+||||+++.+.++++..+++|+||++.+++
T Consensus 198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~ 274 (473)
T PRK10942 198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN 274 (473)
T ss_pred CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence 9999999999988764211 1234578999999999999999999999999999998887776789999999999999
Q ss_pred hhhhhhhcccccccccCeeecc--chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEe
Q 014786 319 IVDQLVKFGKVTRPILGIKFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (418)
Q Consensus 319 ~l~~l~~~g~v~~~~lGv~~~~--~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v 393 (418)
++++|+++|++.|+|||+.+++ ...++.+++ .|++|.+|.+++||+++||+. ||+|++|||++|
T Consensus 275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V 343 (473)
T PRK10942 275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPI 343 (473)
T ss_pred HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEEC
Confidence 9999999999999999999886 344667776 599999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 394 SNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 394 ~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.+++|+.+.+...++|++++++|+
T Consensus 344 ~s~~dl~~~l~~~~~g~~v~l~v~ 367 (473)
T PRK10942 344 SSFAALRAQVGTMPVGSKLTLGLL 367 (473)
T ss_pred CCHHHHHHHHHhcCCCCEEEEEEE
Confidence 999999999998889999998874
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli, which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.3e-47 Score=393.81 Aligned_cols=285 Identities=45% Similarity=0.682 Sum_probs=249.5
Q ss_pred HHHHHHHcCCceEEEEeeecccC-------------ccccc--------cccCCCeEEEEEEEcCCcEEEecccccCCCC
Q 014786 118 TVRLFQENTPSVVNITNLAARQD-------------AFTLD--------VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS 176 (418)
Q Consensus 118 ~~~~~~~~~~SVV~I~~~~~~~~-------------~~~~~--------~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~ 176 (418)
+.++++++.||||.|.+...... +|... ......+.||||+|+++||||||+||+.++.
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~ 82 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD 82 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence 56889999999999988652211 11100 0112457899999999999999999999999
Q ss_pred eEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeec
Q 014786 177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS 256 (418)
Q Consensus 177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~ 256 (418)
.+.|++.|++.++|++++.|+.+||||||++.. ..+++++++++..+++|++|+++|||++...+++.|+|+...+...
T Consensus 83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL 161 (428)
T ss_pred eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence 999999999999999999999999999999875 3689999998888999999999999999999999999998776521
Q ss_pred ccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCe
Q 014786 257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGI 336 (418)
Q Consensus 257 ~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv 336 (418)
....+..++++|+++++|+|||||+|.+|+||||+++.+...++..+++|+||++.+++++++++++|++.++|||+
T Consensus 162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi 238 (428)
T TIGR02037 162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV 238 (428)
T ss_pred ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence 12334578999999999999999999999999999998877666778999999999999999999999999999999
Q ss_pred eecc--chhhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 337 KFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 337 ~~~~--~~~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
.+++ ...++.+|+ .|++|.+|.+++||+++||++ ||+|++|||++|.++.|+.+.+...++|++
T Consensus 239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~ 307 (428)
T TIGR02037 239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK 307 (428)
T ss_pred EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence 9986 345677887 699999999999999999999 999999999999999999999998889999
Q ss_pred EEEEEE
Q 014786 412 VSCFTF 417 (418)
Q Consensus 412 v~l~v~ 417 (418)
++++|+
T Consensus 308 v~l~v~ 313 (428)
T TIGR02037 308 VTLGIL 313 (428)
T ss_pred EEEEEE
Confidence 999874
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.9e-38 Score=317.61 Aligned_cols=288 Identities=45% Similarity=0.673 Sum_probs=248.3
Q ss_pred hhHHHHHHHcCCceEEEEeeecccC-ccccccc-c-CCCeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEE
Q 014786 116 LATVRLFQENTPSVVNITNLAARQD-AFTLDVL-E-VPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI 192 (418)
Q Consensus 116 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~~~~~-~-~~~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~v 192 (418)
..+..+++++.|+||.|........ .|..... . ...+.||||+++++|||+|+.||+.++..+.+.+.||+.+++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~ 112 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL 112 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence 5778899999999999988654332 1110000 0 01589999999989999999999999999999999999999999
Q ss_pred EEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcc
Q 014786 193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA 272 (418)
Q Consensus 193 v~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~ 272 (418)
++.|+..|+|++|++.... ++.+.++++..++.|++++++|+|++...+++.|+++...+. . ......+.++||+|+
T Consensus 113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~-v~~~~~~~~~IqtdA 189 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G-VGSAGGYVNFIQTDA 189 (347)
T ss_pred EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c-ccCcccccchhhccc
Confidence 9999999999999998543 888899999999999999999999999999999999999886 1 111122568899999
Q ss_pred ccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeeccchhhhhhCc---
Q 014786 273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGV--- 349 (418)
Q Consensus 273 ~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~~~--- 349 (418)
++++|+||||++|.+|++|||+++.+...++..+++|+||++.++.+++++.++|++.++|+|+.+.+......+|+
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~ 269 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA 269 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence 99999999999999999999999998876656779999999999999999999889999999999876332222443
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|.++.+++||+++|++. ||+|+++||+++.+..++.+.+...++|+++.++++
T Consensus 270 ~G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~ 326 (347)
T COG0265 270 AGAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLL 326 (347)
T ss_pred CceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEE
Confidence 799999999999999999999 999999999999999999999999999999999875
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.95 E-value=6.7e-27 Score=236.05 Aligned_cols=291 Identities=37% Similarity=0.543 Sum_probs=232.0
Q ss_pred chhHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCCCC-----------eEEEEeC
Q 014786 115 ELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFA 183 (418)
Q Consensus 115 ~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~~~-----------~i~V~~~ 183 (418)
......+.++-.+++|.|+...-..........+.+...||||||+.+|.++||+||+.... .+.|...
T Consensus 127 ~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa 206 (473)
T KOG1320|consen 127 KAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAA 206 (473)
T ss_pred hhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEe
Confidence 34456788999999999998544333322334456788999999999999999999997543 2677776
Q ss_pred CC--cEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCC-
Q 014786 184 DQ--SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAAT- 260 (418)
Q Consensus 184 dg--~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~- 260 (418)
+| ..+++.+.+.|+..|+|+++++.+..-.++++++.+..+..|+++..+|.|++..++.+.|++++..|.......
T Consensus 207 ~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 207 IGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred ecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 66 889999999999999999999765433788888888899999999999999999999999999988886544332
Q ss_pred -CCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccc---------c
Q 014786 261 -GRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV---------T 330 (418)
Q Consensus 261 -~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v---------~ 330 (418)
+....+.+++|+++++|+||+|++|.+|++||+++......+-..+++|++|.+.+..++.+..+.... .
T Consensus 287 ~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~ 366 (473)
T KOG1320|consen 287 TGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPV 366 (473)
T ss_pred cceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCcccc
Confidence 245678899999999999999999999999999988765444456899999999999988887443322 2
Q ss_pred ccccCeeecc-------chhhhhh----C-ccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHH
Q 014786 331 RPILGIKFAP-------DQSVEQL----G-VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSD 398 (418)
Q Consensus 331 ~~~lGv~~~~-------~~~~~~~----~-~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~d 398 (418)
+.|+|....- ....+.+ + .+|++|.+|.+++++...+++. ||+|++|||++|.+..+
T Consensus 367 ~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~ 435 (473)
T KOG1320|consen 367 HQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKH 435 (473)
T ss_pred cccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHH
Confidence 4577754321 0011111 1 2699999999999999999999 99999999999999999
Q ss_pred HHHHHhcCCCCCEEEEEE
Q 014786 399 LYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 399 l~~~l~~~~~g~~v~l~v 416 (418)
|.++++.+.++++|.+..
T Consensus 436 l~~~i~~~~~~~~v~vl~ 453 (473)
T KOG1320|consen 436 LYELIEECSTEDKVAVLD 453 (473)
T ss_pred HHHHHHhcCcCceEEEEE
Confidence 999999888888877754
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.85 E-value=1.9e-20 Score=191.40 Aligned_cols=279 Identities=27% Similarity=0.348 Sum_probs=216.6
Q ss_pred hHHHHHHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCC-cEEEecccccCC-CCeEEEEeCCCcEEEEEEEE
Q 014786 117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIRG-ASDIRVTFADQSAYDAKIVG 194 (418)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~vv~ 194 (418)
.....+..+-+|||.|....... |..+ .-..+.+|||++++. ||+|||+||+.. .-...+.+.+..+.+.-.+.
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~~--fdte--sag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy 128 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVRA--FDTE--SAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY 128 (955)
T ss_pred hhhhhhhhhcccEEEEEehheee--cccc--cccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence 66678889999999998764321 2111 124678999999976 899999999974 44567778888788888899
Q ss_pred EcCCCCeEEEEecCCC---CCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCC---CCcccEE
Q 014786 195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG---RPIQDVI 268 (418)
Q Consensus 195 ~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i 268 (418)
.|+-+|+.+++.+... ..+..+.++. +-.++|.++.++|+..+.-.+...|.++.+.+....+... .-....+
T Consensus 129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~ 207 (955)
T KOG1421|consen 129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI 207 (955)
T ss_pred CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence 9999999999998643 2344444542 3467899999999988877788899999888877665421 1123456
Q ss_pred EEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccccccCeeecc--chhhhh
Q 014786 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQ 346 (418)
Q Consensus 269 ~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~--~~~~~~ 346 (418)
|..+....|.||+|++|.+|..|.++..+... .+-+|++|++.+.+.+.-++++.-+.|+.|.++|.+ .+..++
T Consensus 208 QaasstsggssgspVv~i~gyAVAl~agg~~s----sas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr 283 (955)
T KOG1421|consen 208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSIS----SASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR 283 (955)
T ss_pred eehhcCCCCCCCCceecccceEEeeecCCccc----ccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence 77778889999999999999999999876442 356799999999999999988888999999999986 344566
Q ss_pred hCc---------------cCcE-EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCC
Q 014786 347 LGV---------------SGVL-VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGD 410 (418)
Q Consensus 347 ~~~---------------~G~~-V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~ 410 (418)
+|+ .|++ |..+.+++|+++. |+.||++++||+.-++++.++.+.|++. .|+
T Consensus 284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~------------Le~GDillavN~t~l~df~~l~~iLDeg-vgk 350 (955)
T KOG1421|consen 284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK------------LEPGDILLAVNSTCLNDFEALEQILDEG-VGK 350 (955)
T ss_pred cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc------------cCCCcEEEEEcceehHHHHHHHHHHhhc-cCc
Confidence 665 3444 4555667766554 4559999999999999999999999976 899
Q ss_pred EEEEEEE
Q 014786 411 EVSCFTF 417 (418)
Q Consensus 411 ~v~l~v~ 417 (418)
.+.|+|+
T Consensus 351 ~l~LtI~ 357 (955)
T KOG1421|consen 351 NLELTIQ 357 (955)
T ss_pred eEEEEEE
Confidence 9888875
No 9
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.70 E-value=2.2e-16 Score=132.63 Aligned_cols=109 Identities=38% Similarity=0.599 Sum_probs=74.8
Q ss_pred EEEEEEcCCcEEEecccccC--------CCCeEEEEeCCCcEEE--EEEEEEcCC-CCeEEEEecCCCCCCcccccCCCC
Q 014786 154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA 222 (418)
Q Consensus 154 GSGfiI~~~G~ILT~aHvv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~~~l~~~~ 222 (418)
||||+|+++|+||||+||+. ....+.+...+++.+. ++++..|+. .|+|||+++. .
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------~ 67 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------W 67 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC-------------E
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec-------------c
Confidence 89999999899999999998 4567888888888888 999999999 9999999980 0
Q ss_pred CCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEE
Q 014786 223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI 293 (418)
Q Consensus 223 ~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI 293 (418)
...+...... ............ ......+ +++.+.+|+||||+||.+|+||||
T Consensus 68 -~~~~~~~~~~------------~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 68 -TGVGGGVRVP------------GSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp -EEEEEEEEEE------------EEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred -cceeeeeEee------------eeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 0000000000 000000000000 0001114 899999999999999999999997
No 10
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region. 
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.47 E-value=4e-12 Score=117.33 Aligned_cols=168 Identities=24% Similarity=0.369 Sum_probs=109.6
Q ss_pred CeEEEEEEEcCCcEEEecccccCCCCeEEEEeC-------CC--cEEEEEEEEE----cC---CCCeEEEEecCC---CC
Q 014786 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFA-------DQ--SAYDAKIVGF----DQ---DKDVAVLRIDAP---KD 211 (418)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~-------dg--~~~~a~vv~~----d~---~~DlAlLkv~~~---~~ 211 (418)
...|+|++|+++ +|||++||+.+...+.+.+. ++ ..+..+-+.. +. .+|+|||+++.+ ..
T Consensus 24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~ 102 (220)
T PF00089_consen 24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD 102 (220)
T ss_dssp EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 457999999987 99999999999656666543 22 2344443333 22 469999999987 24
Q ss_pred CCcccccCCC-CCCCCCCEEEEEeCCCCCCC----ceeEeEEeeeeeeeccc-CCCCCcccEEEEcc----ccCCCCCCC
Q 014786 212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSA-ATGRPIQDVIQTDA----AINPGNSGG 281 (418)
Q Consensus 212 ~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~----~~~~G~Vs~~~~~~~~~-~~~~~~~~~i~~~~----~i~~G~SGG 281 (418)
...++.+... ..+..|+.+.++||+..... ......+.-+....... .........+.... ..+.|+|||
T Consensus 103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~ 182 (220)
T PF00089_consen 103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG 182 (220)
T ss_dssp SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 5667777652 34688999999999875322 33434443333321111 11112345566655 788999999
Q ss_pred ceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhh
Q 014786 282 PLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIV 320 (418)
Q Consensus 282 Pl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l 320 (418)
|+++.++.|+||++.. ..++.....+++.++....+++
T Consensus 183 pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 183 PLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence 9998877899999987 3333333357778887776654
No 11
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.46 E-value=1.1e-12 Score=135.04 Aligned_cols=288 Identities=18% Similarity=0.207 Sum_probs=193.8
Q ss_pred HHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCC-cEEEecccccC-CCCeEEEEeCCCcEEEEEEEEEcCCC
Q 014786 122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDK 199 (418)
Q Consensus 122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~-~~~~i~V~~~dg~~~~a~vv~~d~~~ 199 (418)
.+.+..+.|.+.... ++..+........|||.|++.. |++++.+.++. +..+.+|...|...+.|.+.+.++..
T Consensus 524 ~~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~ 599 (955)
T KOG1421|consen 524 SADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTE 599 (955)
T ss_pred hhHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCcc
Confidence 345566666665532 1222222223457999999955 89999999996 56788999999999999999999999
Q ss_pred CeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeee----ecccCCCCCcccEEEEccccC
Q 014786 200 DVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRRE----ISSAATGRPIQDVIQTDAAIN 275 (418)
Q Consensus 200 DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~----~~~~~~~~~~~~~i~~~~~i~ 275 (418)
++|.+|.+... ...+.+.+ ..+..||++...|+......-.....|..+... ...........+.|.+++.+.
T Consensus 600 n~a~~kydp~~--~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nls 676 (955)
T KOG1421|consen 600 NVASFKYDPAL--EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLS 676 (955)
T ss_pred ceeEeccChhH--hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEecccc
Confidence 99999998743 34555543 458889999999987654432222222222111 001111123356777777776
Q ss_pred CCCCCCceeCCCceEEEEEeeeeCC--CCCCCCccceeecccchhhhhhhhhcccccccccCeeeccchh--hhhhCccC
Q 014786 276 PGNSGGPLLDSSGSLIGINTAIYSP--SGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQS--VEQLGVSG 351 (418)
Q Consensus 276 ~G~SGGPl~n~~G~VVGI~s~~~~~--~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~--~~~~~~~G 351 (418)
.++--|-+.|.+|+|+|++-..+.. ++.+-..-|.+.+.+++..++.|+..++..-..+|++|....+ ++.+|++-
T Consensus 677 T~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~ 756 (955)
T KOG1421|consen 677 TSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPS 756 (955)
T ss_pred ccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCH
Confidence 6666778899999999997644432 2233345677888999999999998888877788888876433 44566655
Q ss_pred cEEEecCCCChhhhcCcccccc--ccCCCccCCcEEEEECCEEeCCHHHHHHHHh----cCCCCCEEEEEE
Q 014786 352 VLVLDAPPNGPAGKAGLLSTKR--DAYGRLILGDIITSVNGKKVSNGSDLYRILD----QCKVGDEVSCFT 416 (418)
Q Consensus 352 ~~V~~v~~~~pa~~aGl~~~~~--~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~----~~~~g~~v~l~v 416 (418)
.++.+...++.-.++-+..++. ...+-|..||||+++||+.|+...||.+... -.+.|..++++|
T Consensus 757 e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~eid~~ilrdg~~~~iki 827 (955)
T KOG1421|consen 757 EFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFEEIDAVILRDGIEMEIKI 827 (955)
T ss_pred HHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhhhhheeeeecCcEEEEEe
Confidence 5555555555544443333222 2234567799999999999999999987543 126788777765
No 12
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.39 E-value=2.6e-12 Score=130.59 Aligned_cols=252 Identities=18% Similarity=0.196 Sum_probs=179.2
Q ss_pred HHHcCCceEEEEeeecccCcccccccc-CCCeEEEEEEEcCCcEEEecccccC---CCCeEEEEe-CCCcEEEEEEEEEc
Q 014786 122 FQENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFD 196 (418)
Q Consensus 122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHvv~---~~~~i~V~~-~dg~~~~a~vv~~d 196 (418)
.+....|++.+............|... .....|+||.+... .++|++|++. +...+.+.- ..-+.|.+++...-
T Consensus 56 ~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 56 VDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred ccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 445567888888776666544444433 33567999999754 8999999999 555555552 23356888888888
Q ss_pred CCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCC
Q 014786 197 QDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINP 276 (418)
Q Consensus 197 ~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~ 276 (418)
.++|+|++.++........-++...+-+...+.++++| |....++.|.|...... .+..+......+++++++++
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~--~y~~~~~~l~~vqi~aa~~~ 209 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVG---GDGIIVTNGHVVRVEPR--IYAHSSTVLLRVQIDAAIGP 209 (473)
T ss_pred hcccceEEEEeeccccCCCcccccCCCcccCccEEEEc---CCcEEEEeeEEEEEEec--cccCCCcceeeEEEEEeecC
Confidence 99999999998743211111233233456678899998 66779999999876543 23334445568999999999
Q ss_pred CCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccc-cccccCeeeccch---hhhhhCc---
Q 014786 277 GNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKV-TRPILGIKFAPDQ---SVEQLGV--- 349 (418)
Q Consensus 277 G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v-~~~~lGv~~~~~~---~~~~~~~--- 349 (418)
|+||+|.+...+++.|+........ +.+++.||.-.+.++.......+.. .+++++...+... ..+.+.+
T Consensus 210 ~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~ 286 (473)
T KOG1320|consen 210 GNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLE 286 (473)
T ss_pred CccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCcc
Confidence 9999999988789999998876422 2689999999999998877666654 4677776665422 2333333
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS 394 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~ 394 (418)
.|+.+.++.+-+.|- ..++.||+|+++||..|.
T Consensus 287 ~g~~i~~~~qtd~ai------------~~~nsg~~ll~~DG~~Ig 319 (473)
T KOG1320|consen 287 TGVLISKINQTDAAI------------NPGNSGGPLLNLDGEVIG 319 (473)
T ss_pred cceeeeeecccchhh------------hcccCCCcEEEecCcEee
Confidence 578888888766553 345559999999999983
No 13
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.33 E-value=5.8e-12 Score=99.73 Aligned_cols=70 Identities=40% Similarity=0.598 Sum_probs=61.5
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
||||+.+..... ..|++|.+|.+++||+++||++ ||+|++|||++|++..|+.+.+...++|++
T Consensus 1 ~~lGv~~~~~~~-----~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD-----TGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC-----SSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC-----CCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 689999877542 2699999999999999999999 999999999999999999999999999999
Q ss_pred EEEEEE
Q 014786 412 VSCFTF 417 (418)
Q Consensus 412 v~l~v~ 417 (418)
++|+|+
T Consensus 65 v~l~v~ 70 (82)
T PF13180_consen 65 VTLTVL 70 (82)
T ss_dssp EEEEEE
T ss_pred EEEEEE
Confidence 999885
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.31 E-value=9e-11 Score=109.04 Aligned_cols=169 Identities=21% Similarity=0.260 Sum_probs=98.6
Q ss_pred CeEEEEEEEcCCcEEEecccccCCC--CeEEEEeCC---------CcEEEEEEEEEc-------CCCCeEEEEecCCC--
Q 014786 151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFAD---------QSAYDAKIVGFD-------QDKDVAVLRIDAPK-- 210 (418)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~--~~i~V~~~d---------g~~~~a~vv~~d-------~~~DlAlLkv~~~~-- 210 (418)
...|+|++|+++ +|||+|||+.+. ..+.|.+.. ...+..+-+..+ ...|||||+++.+.
T Consensus 24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~ 102 (232)
T cd00190 24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL 102 (232)
T ss_pred cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence 468999999986 999999999875 456666532 122334444443 35799999998753
Q ss_pred -CCCcccccCCCC-CCCCCCEEEEEeCCCCCCC-----ceeEeEEeeeeeeecccCCC---CCcccEEEE-----ccccC
Q 014786 211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (418)
Q Consensus 211 -~~~~~~~l~~~~-~~~~G~~V~~vG~p~g~~~-----~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~ 275 (418)
..+.++.+.... .+..|+.+.++||...... ......+.-+....+..... ......+.. +...+
T Consensus 103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 182 (232)
T cd00190 103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC 182 (232)
T ss_pred CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence 235677775543 5778999999998754322 12222222111111110000 011122222 34577
Q ss_pred CCCCCCceeCCC---ceEEEEEeeeeCCCCCCCCccceeecccchhhhh
Q 014786 276 PGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVD 321 (418)
Q Consensus 276 ~G~SGGPl~n~~---G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~ 321 (418)
+|+||||++... +.++||.++... ++.....+.+..+...+++++
T Consensus 183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 183 QGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred cCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhh
Confidence 899999999764 789999998653 221123333444444444443
No 15
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.22 E-value=3.5e-10 Score=105.33 Aligned_cols=147 Identities=23% Similarity=0.329 Sum_probs=89.8
Q ss_pred CeEEEEEEEcCCcEEEecccccCCCC--eEEEEeCCC--------cEEEEEEEEEc-------CCCCeEEEEecCCC---
Q 014786 151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK--- 210 (418)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~--- 210 (418)
...|+|++|+++ +|||+|||+.+.. .+.|.+... ..+.+.-+..+ ...|+|||+++.+.
T Consensus 25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~ 103 (229)
T smart00020 25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS 103 (229)
T ss_pred CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence 457999999976 9999999998753 677776432 22334433322 45799999998752
Q ss_pred CCCcccccCCC-CCCCCCCEEEEEeCCCCCC------CceeEeEEeeeeeeecccCCC---CCcccEEEE-----ccccC
Q 014786 211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (418)
Q Consensus 211 ~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----~~~i~ 275 (418)
..+.++.+... ..+..++.+.+.||+.... .......+.-+.......... ......+.. ....+
T Consensus 104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 183 (229)
T smart00020 104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC 183 (229)
T ss_pred CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence 24566666543 3467789999999876542 112222222211111110000 001112211 35578
Q ss_pred CCCCCCceeCCCc--eEEEEEeeee
Q 014786 276 PGNSGGPLLDSSG--SLIGINTAIY 298 (418)
Q Consensus 276 ~G~SGGPl~n~~G--~VVGI~s~~~ 298 (418)
+|+||||++...+ .++||++...
T Consensus 184 ~gdsG~pl~~~~~~~~l~Gi~s~g~ 208 (229)
T smart00020 184 QGDSGGPLVCNDGRWVLVGIVSWGS 208 (229)
T ss_pred CCCCCCeeEEECCCEEEEEEEEECC
Confidence 8999999996543 8999999865
No 16
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=99.00 E-value=1.7e-08 Score=95.54 Aligned_cols=158 Identities=18% Similarity=0.228 Sum_probs=93.3
Q ss_pred EEEEEEEcCCcEEEecccccCCCC----eEEEEe----CCCc-EE--EEEEEEEc-C---CCCeEEEEecCCCC------
Q 014786 153 SGSGFVWDSKGHVVTNYHVIRGAS----DIRVTF----ADQS-AY--DAKIVGFD-Q---DKDVAVLRIDAPKD------ 211 (418)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~----~i~V~~----~dg~-~~--~a~vv~~d-~---~~DlAlLkv~~~~~------ 211 (418)
.+++|+|+++ .+||++||+.... ++.+.. .++. .+ ........ . +.|.+...+.....
T Consensus 65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~ 143 (251)
T COG3591 65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI 143 (251)
T ss_pred eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence 4466999987 9999999996443 222211 1111 11 11122112 2 34666666643211
Q ss_pred --CCcccccCCCCCCCCCCEEEEEeCCCCCCCc----eeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeC
Q 014786 212 --KLRPIPIGVSADLLVGQKVYAIGNPFGLDHT----LTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD 285 (418)
Q Consensus 212 --~~~~~~l~~~~~~~~G~~V~~vG~p~g~~~~----~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n 285 (418)
-.....+......+.++.+.++|||.+.... ...+.|... ....+.+++.+.+|+||+|+++
T Consensus 144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~------------~~~~l~y~~dT~pG~SGSpv~~ 211 (251)
T COG3591 144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSI------------KGNKLFYDADTLPGSSGSPVLI 211 (251)
T ss_pred cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEE------------ecceEEEEecccCCCCCCceEe
Confidence 1122223334457789999999999775532 223333211 1246889999999999999999
Q ss_pred CCceEEEEEeeeeCCCCCCCCcccee-ecccchhhhhhhh
Q 014786 286 SSGSLIGINTAIYSPSGASSGVGFSI-PVDTVNGIVDQLV 324 (418)
Q Consensus 286 ~~G~VVGI~s~~~~~~~~~~~~~~aI-p~~~i~~~l~~l~ 324 (418)
.+.+|+|+++.+....++ ...++++ -...+++++++++
T Consensus 212 ~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 212 SKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred cCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence 988999999977553332 2344333 3456666666654
No 17
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.96 E-value=3e-09 Score=84.98 Aligned_cols=75 Identities=44% Similarity=0.659 Sum_probs=63.3
Q ss_pred cccCeeeccchhh--hhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcC
Q 014786 332 PILGIKFAPDQSV--EQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (418)
Q Consensus 332 ~~lGv~~~~~~~~--~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~ 406 (418)
+|+|+.+++.... ..+++ .|++|.+|.+++||+++||+. ||+|++|||++|.++.++.+.+...
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~ 69 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAEL 69 (90)
T ss_pred CccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhc
Confidence 5889999874322 22333 599999999999999999999 9999999999999999999999887
Q ss_pred CCCCEEEEEEE
Q 014786 407 KVGDEVSCFTF 417 (418)
Q Consensus 407 ~~g~~v~l~v~ 417 (418)
..|+.+.+++.
T Consensus 70 ~~~~~i~l~v~ 80 (90)
T cd00987 70 KPGDKVTLTVL 80 (90)
T ss_pred CCCCEEEEEEE
Confidence 77899888764
No 18
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.91 E-value=4.8e-09 Score=82.43 Aligned_cols=57 Identities=28% Similarity=0.485 Sum_probs=53.3
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|.+|.+++||+++||++ ||+|++|||++|.+++|+.+.+...++|+++.++++
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~ 66 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVL 66 (79)
T ss_pred CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEE
Confidence 699999999999999999999 999999999999999999999988778999888764
No 19
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.91 E-value=4.7e-09 Score=79.96 Aligned_cols=68 Identities=35% Similarity=0.541 Sum_probs=59.0
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g 409 (418)
+++|+.+..... .|++|.++.+++||+++||++ ||+|++|||+++.++ +++.+++.... |
T Consensus 1 ~~~G~~~~~~~~------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~~-g 62 (70)
T cd00136 1 GGLGFSIRGGTE------GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKEV-G 62 (70)
T ss_pred CCccEEEecCCC------CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhCC-C
Confidence 357777765431 489999999999999999999 999999999999999 99999998765 9
Q ss_pred CEEEEEEE
Q 014786 410 DEVSCFTF 417 (418)
Q Consensus 410 ~~v~l~v~ 417 (418)
++++|+++
T Consensus 63 ~~v~l~v~ 70 (70)
T cd00136 63 EKVTLTVR 70 (70)
T ss_pred CeEEEEEC
Confidence 99999874
No 20
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.82 E-value=1e-08 Score=98.57 Aligned_cols=89 Identities=15% Similarity=0.143 Sum_probs=78.6
Q ss_pred ccchhhhhhhhhcccccccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEe
Q 014786 314 DTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (418)
Q Consensus 314 ~~i~~~l~~l~~~g~v~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v 393 (418)
..++++++++++++++-+.|+|+...... -...|++|..+.+++|++++||+. ||+|++|||+++
T Consensus 159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~----g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i 223 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN----DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDL 223 (259)
T ss_pred hhHHHHHHHHHHCHHhhhheEeEEEEEeC----CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEc
Confidence 46778999999999999999999975422 123799999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 394 SNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 394 ~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
++++++.+++.+.+++++++++|.
T Consensus 224 ~~~~~~~~~l~~~~~~~~v~l~V~ 247 (259)
T TIGR01713 224 RDPEQAFQALQMLREETNLTLTVE 247 (259)
T ss_pred CCHHHHHHHHHhcCCCCeEEEEEE
Confidence 999999999999889999988764
No 21
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.82 E-value=1.2e-08 Score=79.80 Aligned_cols=65 Identities=34% Similarity=0.497 Sum_probs=53.0
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
+|+|+.+.... .|+.|.+|.+++||+++||++ ||+|++|||+++.++.++ +...+.|+.
T Consensus 1 ~~~G~~~~~~~-------~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~ 59 (80)
T cd00990 1 PYLGLTLDKEE-------GLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDP 59 (80)
T ss_pred CcccEEEEccC-------CcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCE
Confidence 58898886543 579999999999999999999 999999999999985554 444456778
Q ss_pred EEEEEE
Q 014786 412 VSCFTF 417 (418)
Q Consensus 412 v~l~v~ 417 (418)
+.++++
T Consensus 60 v~l~v~ 65 (80)
T cd00990 60 VELTVF 65 (80)
T ss_pred EEEEEE
Confidence 777653
No 22
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.71 E-value=5.7e-08 Score=75.72 Aligned_cols=66 Identities=29% Similarity=0.406 Sum_probs=54.6
Q ss_pred ccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEE
Q 014786 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEV 412 (418)
Q Consensus 333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v 412 (418)
|+|+.+.... ..+.|.++.+++||+++||++ ||+|++|||+++.+++|+...+... .++.+
T Consensus 2 ~~~~~~g~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~ 62 (79)
T cd00989 2 ILGFVPGGPP-------IEPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPL 62 (79)
T ss_pred eeeEeccCCc-------cCcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceE
Confidence 5666655432 347899999999999999999 9999999999999999999999865 47777
Q ss_pred EEEEE
Q 014786 413 SCFTF 417 (418)
Q Consensus 413 ~l~v~ 417 (418)
.+++.
T Consensus 63 ~l~v~ 67 (79)
T cd00989 63 TLTVE 67 (79)
T ss_pred EEEEE
Confidence 77663
No 23
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.68 E-value=7.2e-08 Score=75.49 Aligned_cols=56 Identities=30% Similarity=0.374 Sum_probs=50.9
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|.+|.+++||+. ||++ ||+|++|||++|.+++++..++...++|+.+.++++
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~ 63 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVK 63 (79)
T ss_pred cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEE
Confidence 6899999999999986 8998 999999999999999999999987778998888764
No 24
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.63 E-value=1.1e-07 Score=75.25 Aligned_cols=66 Identities=30% Similarity=0.552 Sum_probs=56.0
Q ss_pred ccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCC
Q 014786 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (418)
Q Consensus 333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~ 410 (418)
-||+.+.... .++.|..+.+++||+++||++ ||+|++|||+++.++ .++.+++.. ..|+
T Consensus 3 ~lG~~~~~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~ 63 (85)
T cd00988 3 GIGLELKYDD-------GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAGT 63 (85)
T ss_pred EEEEEEEEcC-------CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCCC
Confidence 3666665432 688999999999999999999 999999999999999 899988876 4688
Q ss_pred EEEEEEE
Q 014786 411 EVSCFTF 417 (418)
Q Consensus 411 ~v~l~v~ 417 (418)
.+.+++.
T Consensus 64 ~i~l~v~ 70 (85)
T cd00988 64 KVRLTLK 70 (85)
T ss_pred EEEEEEE
Confidence 8888774
No 25
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.53 E-value=2.6e-07 Score=95.41 Aligned_cols=76 Identities=36% Similarity=0.572 Sum_probs=65.6
Q ss_pred ccccCeeeccch--hhhhhCc----cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHh
Q 014786 331 RPILGIKFAPDQ--SVEQLGV----SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (418)
Q Consensus 331 ~~~lGv~~~~~~--~~~~~~~----~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~ 404 (418)
+.|+|+.+.+.. ..+.+++ .|++|.+|.+++||+++||++ ||+|++|||++|.+.+|+.+++.
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD 405 (428)
T ss_pred ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence 468999887632 3444554 599999999999999999999 99999999999999999999999
Q ss_pred cCCCCCEEEEEEE
Q 014786 405 QCKVGDEVSCFTF 417 (418)
Q Consensus 405 ~~~~g~~v~l~v~ 417 (418)
..+.|+++.++|+
T Consensus 406 ~~~~g~~v~l~v~ 418 (428)
T TIGR02037 406 RAKKGGRVALLIL 418 (428)
T ss_pred hcCCCCEEEEEEE
Confidence 8888999998874
No 26
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.49 E-value=9.5e-06 Score=76.32 Aligned_cols=164 Identities=16% Similarity=0.291 Sum_probs=87.5
Q ss_pred HHHcCCceEEEEeeecccCccccccccCCCeEEEEEEEcCCcEEEecccccCC-CCeEEEEeCCCcEEEEE-----EEEE
Q 014786 122 FQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGF 195 (418)
Q Consensus 122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHvv~~-~~~i~V~~~dg~~~~a~-----vv~~ 195 (418)
+..+...|++|....... ...=-|+... .||+|++|.++. ...++|...-|. |... -+..
T Consensus 13 yn~Ia~~ic~l~n~s~~~-----------~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~-f~v~nt~~lkv~~ 78 (235)
T PF00863_consen 13 YNPIASNICRLTNESDGG-----------TRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGE-FTVPNTTQLKVHP 78 (235)
T ss_dssp -HHHHTTEEEEEEEETTE-----------EEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEE-EEECEGGGSEEEE
T ss_pred cchhhheEEEEEEEeCCC-----------eEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceE-EEcCCccccceEE
Confidence 345566788888643221 1233477775 399999999964 456777776653 2221 2233
Q ss_pred cCCCCeEEEEecCCCCCCcccccC-CCCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcccc
Q 014786 196 DQDKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (418)
Q Consensus 196 d~~~DlAlLkv~~~~~~~~~~~l~-~~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i 274 (418)
-+..||.++|+.. ++||.+-. .-..++.+|.|.++|.-+.... ....|+........ ....+.......
T Consensus 79 i~~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~--~~s~vSesS~i~p~-----~~~~fWkHwIsT 148 (235)
T PF00863_consen 79 IEGRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKS--ISSTVSESSWIYPE-----ENSHFWKHWIST 148 (235)
T ss_dssp -TCSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCC--CEEEEEEEEEEEEE-----TTTTEEEE-C--
T ss_pred eCCccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCC--eeEEECCceEEeec-----CCCCeeEEEecC
Confidence 4688999999976 35555421 2356889999999997544322 22233332222211 123566777778
Q ss_pred CCCCCCCceeCC-CceEEEEEeeeeCCCCCCCCccceeec
Q 014786 275 NPGNSGGPLLDS-SGSLIGINTAIYSPSGASSGVGFSIPV 313 (418)
Q Consensus 275 ~~G~SGGPl~n~-~G~VVGI~s~~~~~~~~~~~~~~aIp~ 313 (418)
..|+=|.|+++. +|++|||++..... ...+|+.|+
T Consensus 149 k~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f 184 (235)
T PF00863_consen 149 KDGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF 184 (235)
T ss_dssp -TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred CCCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence 899999999984 99999999977543 345677665
No 27
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.48 E-value=6.7e-07 Score=69.96 Aligned_cols=69 Identities=30% Similarity=0.453 Sum_probs=56.1
Q ss_pred ccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC--CHHHHHHHHhcCCC
Q 014786 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKV 408 (418)
Q Consensus 331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~--~~~dl~~~l~~~~~ 408 (418)
...+|+.+...... ..|++|.++.+++||+++||++ ||+|++|||+++. +.+++.+.+....
T Consensus 11 ~~~~G~~~~~~~~~----~~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~~- 74 (82)
T cd00992 11 GGGLGFSLRGGKDS----GGGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNSG- 74 (82)
T ss_pred CCCcCEEEeCcccC----CCCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhCC-
Confidence 35678887653211 3689999999999999999999 9999999999999 8999999998643
Q ss_pred CCEEEEEE
Q 014786 409 GDEVSCFT 416 (418)
Q Consensus 409 g~~v~l~v 416 (418)
..+.+++
T Consensus 75 -~~v~l~v 81 (82)
T cd00992 75 -DEVTLTV 81 (82)
T ss_pred -CeEEEEE
Confidence 2676665
No 28
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.43 E-value=3.4e-07 Score=71.94 Aligned_cols=71 Identities=27% Similarity=0.433 Sum_probs=56.0
Q ss_pred cccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCC
Q 014786 330 TRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCK 407 (418)
Q Consensus 330 ~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~ 407 (418)
....+|+.+....... ..|++|.++.+++||+++||++ ||.|++|||+++.++ .++.+++...
T Consensus 8 ~~~~lG~~l~~~~~~~---~~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~- 72 (81)
T PF00595_consen 8 GNGPLGFTLRGGSDND---EKGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA- 72 (81)
T ss_dssp TTSBSSEEEEEESTSS---SEEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS-
T ss_pred CCCCcCEEEEecCCCC---cCCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC-
Confidence 3567888887643211 2589999999999999999999 999999999999977 4666667664
Q ss_pred CCCEEEEEE
Q 014786 408 VGDEVSCFT 416 (418)
Q Consensus 408 ~g~~v~l~v 416 (418)
+.+|+|+|
T Consensus 73 -~~~v~L~V 80 (81)
T PF00595_consen 73 -SNPVTLTV 80 (81)
T ss_dssp -TSEEEEEE
T ss_pred -CCcEEEEE
Confidence 34888876
No 29
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.35 E-value=1.4e-06 Score=68.39 Aligned_cols=71 Identities=32% Similarity=0.416 Sum_probs=54.9
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
..+|+.+....... .|++|..+.+++||+++||++ ||+|++|||+++.+..+..........++.
T Consensus 12 ~~~G~~~~~~~~~~----~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~ 76 (85)
T smart00228 12 GGLGFSLVGGKDEG----GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGK 76 (85)
T ss_pred CcccEEEECCCCCC----CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence 67788876532111 689999999999999999999 999999999999987766555443334668
Q ss_pred EEEEEE
Q 014786 412 VSCFTF 417 (418)
Q Consensus 412 v~l~v~ 417 (418)
+.+++.
T Consensus 77 ~~l~i~ 82 (85)
T smart00228 77 VTLTVL 82 (85)
T ss_pred EEEEEE
Confidence 887764
No 30
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.25 E-value=1.4e-06 Score=90.51 Aligned_cols=54 Identities=17% Similarity=0.169 Sum_probs=50.8
Q ss_pred EEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 353 ~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
+|.+|.++|||++|||++ ||+|++|||++|++++|+...+....+|++++++|+
T Consensus 129 lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~ 182 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVA 182 (449)
T ss_pred cccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEE
Confidence 789999999999999999 999999999999999999999988888998888874
No 31
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.22 E-value=9.6e-05 Score=70.43 Aligned_cols=145 Identities=25% Similarity=0.301 Sum_probs=81.8
Q ss_pred EEEEEEEcCCcEEEecccccCCCC--eEEEEeCC---------C---cEEEE-EEEEEc-------CC-CCeEEEEecCC
Q 014786 153 SGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFAD---------Q---SAYDA-KIVGFD-------QD-KDVAVLRIDAP 209 (418)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~--~i~V~~~d---------g---~~~~a-~vv~~d-------~~-~DlAlLkv~~~ 209 (418)
.+.|.+|+++ ||+|++||+.+.. .+.|.+.. + ..... +++ .+ .. +|||||+++.+
T Consensus 39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~ 116 (256)
T KOG3627|consen 39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP 116 (256)
T ss_pred eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence 6777788665 9999999999875 66666531 1 11111 222 22 13 79999999874
Q ss_pred C---CCCcccccCCCCC---CCCCCEEEEEeCCCCCC------CceeEeEEeeeeeeecccCCCC---CcccEEEEc---
Q 014786 210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQTD--- 271 (418)
Q Consensus 210 ~---~~~~~~~l~~~~~---~~~G~~V~~vG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~~~--- 271 (418)
. ..+.++.+..... ...+...++.||+.... .......+.-+........... .....+...
T Consensus 117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 196 (256)
T KOG3627|consen 117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE 196 (256)
T ss_pred cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence 3 3456666642322 34458888899754211 1122222222221111111110 011223332
Q ss_pred --cccCCCCCCCceeCCC---ceEEEEEeeeeC
Q 014786 272 --AAINPGNSGGPLLDSS---GSLIGINTAIYS 299 (418)
Q Consensus 272 --~~i~~G~SGGPl~n~~---G~VVGI~s~~~~ 299 (418)
...|.|+|||||+-.+ ..++||++++..
T Consensus 197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~ 229 (256)
T KOG3627|consen 197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSG 229 (256)
T ss_pred CCCccccCCCCCeEEEeeCCcEEEEEEEEecCC
Confidence 2368899999999764 699999999865
No 32
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.00 E-value=1.3e-05 Score=82.71 Aligned_cols=55 Identities=33% Similarity=0.514 Sum_probs=49.6
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.|+.|.+|.+++||+++||++ ||+|++|||++|.+++|+.+.+.. .+|+++.+++
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v 257 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKV 257 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEE
Confidence 478999999999999999999 999999999999999999999986 4677777765
No 33
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=97.96 E-value=1.9e-05 Score=78.94 Aligned_cols=67 Identities=30% Similarity=0.455 Sum_probs=54.0
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g 409 (418)
..+|+.+.... .+++|.+|.+++||+++||++ ||+|++|||++|.++ .++...+.. +.|
T Consensus 51 ~~lG~~~~~~~-------~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g 111 (334)
T TIGR00225 51 EGIGIQVGMDD-------GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKG 111 (334)
T ss_pred EEEEEEEEEEC-------CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCC
Confidence 45777665432 578999999999999999999 999999999999986 567666654 568
Q ss_pred CEEEEEEE
Q 014786 410 DEVSCFTF 417 (418)
Q Consensus 410 ~~v~l~v~ 417 (418)
+++.+++.
T Consensus 112 ~~v~l~v~ 119 (334)
T TIGR00225 112 TKVSLEIL 119 (334)
T ss_pred CEEEEEEE
Confidence 88888764
No 34
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=97.93 E-value=2e-05 Score=81.89 Aligned_cols=54 Identities=24% Similarity=0.419 Sum_probs=49.1
Q ss_pred CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
+++|.+|.++|||+++||++ ||+|++|||++|++++|+.+.+.. .+|+.+.+++
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v 275 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEI 275 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEE
Confidence 57899999999999999999 999999999999999999999976 4677777765
No 35
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=97.92 E-value=3.9e-05 Score=61.38 Aligned_cols=68 Identities=29% Similarity=0.516 Sum_probs=45.1
Q ss_pred ccCeeeccchhhhhhCccCcEEEecCCC--------ChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHh
Q 014786 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPN--------GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (418)
Q Consensus 333 ~lGv~~~~~~~~~~~~~~G~~V~~v~~~--------~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~ 404 (418)
.||..+.... .+..|.++.++ ||-.+.|+. +++||+|++|||+++....+++.+|.
T Consensus 2 ~LGAd~~~~~-------~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL~ 65 (88)
T PF14685_consen 2 LLGADFSYDN-------GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLLE 65 (88)
T ss_dssp B-SEEEEEET-------TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHHH
T ss_pred ccceEEEEcC-------CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHhc
Confidence 5666665432 45668888775 666666665 35699999999999999999999998
Q ss_pred cCCCCCEEEEEEE
Q 014786 405 QCKVGDEVSCFTF 417 (418)
Q Consensus 405 ~~~~g~~v~l~v~ 417 (418)
. +.|+.|.|+|.
T Consensus 66 ~-~agk~V~Ltv~ 77 (88)
T PF14685_consen 66 G-KAGKQVLLTVN 77 (88)
T ss_dssp T-TTTSEEEEEEE
T ss_pred c-cCCCEEEEEEe
Confidence 5 57999999874
No 36
>PRK10139 serine endoprotease; Provisional
Probab=97.90 E-value=2.5e-05 Score=81.29 Aligned_cols=55 Identities=22% Similarity=0.361 Sum_probs=49.2
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|.+|.+++||+++||++ ||+|++|||++|.+++|+.+++.+. + +++.++|+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~ 444 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIV 444 (455)
T ss_pred CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEE
Confidence 589999999999999999999 9999999999999999999999864 2 67777653
No 37
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=97.80 E-value=6.2e-05 Score=76.82 Aligned_cols=56 Identities=27% Similarity=0.471 Sum_probs=47.7
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|..|.+++||+++||+. ||+|++|||++|.+. .++...+. .+.|+.|.++|.
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~-g~~g~~v~ltv~ 159 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQ-GPEGSSVELTLR 159 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHh-cCCCCEEEEEEE
Confidence 489999999999999999999 999999999999864 67777775 457888888763
No 38
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins [].; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.79 E-value=0.00034 Score=66.16 Aligned_cols=117 Identities=26% Similarity=0.383 Sum_probs=63.0
Q ss_pred CeEEEEEEEcCCc--EEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCC
Q 014786 151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ 228 (418)
Q Consensus 151 ~~~GSGfiI~~~G--~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~ 228 (418)
.+.|||=+...+| .|+|+.||+. .+...|... +.... ..++..-|+|.-.+++-...+|.++++.. ..|.
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~~-g~~~~---~tF~~~GDfA~~~~~~~~G~~P~~k~a~~---~~Gr 182 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSGV-GTRRM---LTFKKNGDFAEADITNWPGAAPKYKFAQN---YTGR 182 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEET-TEEEE---EEEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred ecccccceEEECCeEEEEEEEEEcC-CCeEEEEec-ceEEE---EEEeccCcEEEEECCCCCCCCCceeecCC---cccc
Confidence 4456655555444 6999999998 455555443 33322 24455679999999554445677776522 1232
Q ss_pred EEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeee
Q 014786 229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIY 298 (418)
Q Consensus 229 ~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~ 298 (418)
--+.- ..-+..|.|... ..+ +-..+|+||+|++..+|.+|||++..-
T Consensus 183 AyW~t------~tGvE~G~ig~~--------------~~~---~fT~~GDSGSPVVt~dg~liGVHTGSn 229 (297)
T PF05579_consen 183 AYWLT------STGVEPGFIGGG--------------GAV---CFTGPGDSGSPVVTEDGDLIGVHTGSN 229 (297)
T ss_dssp EEEEE------TTEEEEEEEETT--------------EEE---ESS-GGCTT-EEEETTC-EEEEEEEEE
T ss_pred eEEEc------ccCcccceecCc--------------eEE---EEcCCCCCCCccCcCCCCEEEEEecCC
Confidence 22211 112344444311 112 234579999999999999999999864
No 39
>PRK10942 serine endoprotease; Provisional
Probab=97.76 E-value=5.7e-05 Score=78.95 Aligned_cols=55 Identities=33% Similarity=0.458 Sum_probs=49.2
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.|++|.+|.+++||+++||++ ||+|++|||++|.+++|+.+++... ++.+.++|.
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~~--~~~v~l~V~ 462 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDSK--PSVLALNIQ 462 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC--CCeEEEEEE
Confidence 589999999999999999999 9999999999999999999999873 367777653
No 40
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=97.75 E-value=4.3e-05 Score=78.81 Aligned_cols=54 Identities=28% Similarity=0.260 Sum_probs=47.7
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.|++|.+|.++|||++|||++ ||+|+++||+++.++.|+.+.+.... +++.+++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I 181 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEI 181 (420)
T ss_pred CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEE
Confidence 588999999999999999999 99999999999999999999887655 4555544
No 41
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.67 E-value=0.00013 Score=74.62 Aligned_cols=71 Identities=28% Similarity=0.438 Sum_probs=59.0
Q ss_pred ccccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcC
Q 014786 329 VTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQC 406 (418)
Q Consensus 329 v~~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~ 406 (418)
.....+|++++.... .++.|.++.+++||+++||++ ||+|++|||+++.... +..+.+. .
T Consensus 97 ~~~~GiG~~i~~~~~------~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~ir-G 158 (406)
T COG0793 97 GEFGGIGIELQMEDI------GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIR-G 158 (406)
T ss_pred ccccceeEEEEEecC------CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhC-C
Confidence 366888888876432 678999999999999999999 9999999999999884 4666665 4
Q ss_pred CCCCEEEEEEE
Q 014786 407 KVGDEVSCFTF 417 (418)
Q Consensus 407 ~~g~~v~l~v~ 417 (418)
++|..|+|++.
T Consensus 159 ~~Gt~V~L~i~ 169 (406)
T COG0793 159 KPGTKVTLTIL 169 (406)
T ss_pred CCCCeEEEEEE
Confidence 68999999874
No 42
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system [].; PDB: 3RLE_A 4EDJ_A.
Probab=97.64 E-value=0.0001 Score=64.07 Aligned_cols=74 Identities=26% Similarity=0.438 Sum_probs=53.1
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
+.||+.++-.... ...-.+.-|.+|.++|||++|||++ ..|.|+.+|+....+.++|.+.++.+ .++.
T Consensus 26 g~LG~sv~~~~~~-~~~~~~~~Vl~V~p~SPA~~AGL~p----------~~DyIig~~~~~l~~~~~l~~~v~~~-~~~~ 93 (138)
T PF04495_consen 26 GLLGISVRFESFE-GAEEEGWHVLRVAPNSPAAKAGLEP----------FFDYIIGIDGGLLDDEDDLFELVEAN-ENKP 93 (138)
T ss_dssp SSS-EEEEEEE-T-TGCCCEEEEEEE-TTSHHHHTT--T----------TTEEEEEETTCE--STCHHHHHHHHT-TTS-
T ss_pred CCCcEEEEEeccc-ccccceEEEeEecCCCHHHHCCccc----------cccEEEEccceecCCHHHHHHHHHHc-CCCc
Confidence 7788887643321 1112678899999999999999997 26999999999999999999999875 5889
Q ss_pred EEEEEE
Q 014786 412 VSCFTF 417 (418)
Q Consensus 412 v~l~v~ 417 (418)
+.+.||
T Consensus 94 l~L~Vy 99 (138)
T PF04495_consen 94 LQLYVY 99 (138)
T ss_dssp EEEEEE
T ss_pred EEEEEE
Confidence 999886
No 43
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.62 E-value=0.00012 Score=74.13 Aligned_cols=55 Identities=31% Similarity=0.558 Sum_probs=46.4
Q ss_pred cCcEEEecC--------CCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAP--------PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~--------~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
+||+|.... .++||+++||++ ||+|++|||++|++++|+.+++...+ |+.+.++|
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV 167 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLINKAG-GEKLTLTI 167 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEE
Confidence 577775542 368999999999 99999999999999999999998764 78887776
No 44
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=97.44 E-value=0.0003 Score=68.17 Aligned_cols=55 Identities=35% Similarity=0.506 Sum_probs=50.3
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.||||..+..++|+. +.|+.||-|++|||+++.+.+|+.+.+...++||+|++++
T Consensus 130 ~gvyv~~v~~~~~~~------------gkl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~ 184 (342)
T COG3480 130 AGVYVLSVIDNSPFK------------GKLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDY 184 (342)
T ss_pred eeEEEEEccCCcchh------------ceeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEE
Confidence 799999999999874 4566699999999999999999999999999999999975
No 45
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.29 E-value=0.00061 Score=73.66 Aligned_cols=68 Identities=24% Similarity=0.313 Sum_probs=52.6
Q ss_pred ccccCeeeccchhhhhhCccCcEEEecCCCChhhhc-CccccccccCCCccCCcEEEEEC--CEEeCC-----HHHHHHH
Q 014786 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVN--GKKVSN-----GSDLYRI 402 (418)
Q Consensus 331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~a-Gl~~~~~~~~~~l~~GDiIl~in--g~~v~~-----~~dl~~~ 402 (418)
..-+|+.++... .++.|.+|.+|+||+++ ||++ ||+|++|| |+++.+ .+++.+.
T Consensus 243 ~~GIGa~l~~~~-------~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~l 304 (667)
T PRK11186 243 LEGIGAVLQMDD-------DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVAL 304 (667)
T ss_pred eeEEEEEEEEeC-------CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHH
Confidence 345677776533 46899999999999998 9999 99999999 555443 3477777
Q ss_pred HhcCCCCCEEEEEEE
Q 014786 403 LDQCKVGDEVSCFTF 417 (418)
Q Consensus 403 l~~~~~g~~v~l~v~ 417 (418)
|. .+.|.+|.|+|.
T Consensus 305 ir-G~~Gt~V~LtV~ 318 (667)
T PRK11186 305 IK-GPKGSKVRLEIL 318 (667)
T ss_pred hc-CCCCCEEEEEEE
Confidence 75 467999999874
No 46
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.19 E-value=0.015 Score=57.39 Aligned_cols=54 Identities=19% Similarity=0.281 Sum_probs=36.1
Q ss_pred cccCCCCCCCceeCC--Cce-EEEEEeeeeCCCCCCCCccceeecccchhhhhhhhh
Q 014786 272 AAINPGNSGGPLLDS--SGS-LIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVK 325 (418)
Q Consensus 272 ~~i~~G~SGGPl~n~--~G~-VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~ 325 (418)
...|.|+||||+|-. +|+ -+||++|+...++...--+..--++....+++...+
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~ 279 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTN 279 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhc
Confidence 356889999999953 454 799999987766544433434445666666666443
No 47
>PF12812 PDZ_1: PDZ-like domain
Probab=97.15 E-value=0.0013 Score=51.48 Aligned_cols=65 Identities=28% Similarity=0.381 Sum_probs=51.8
Q ss_pred cccCeeecc--chhhhhhCc-cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCC
Q 014786 332 PILGIKFAP--DQSVEQLGV-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCK 407 (418)
Q Consensus 332 ~~lGv~~~~--~~~~~~~~~-~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~ 407 (418)
-|.|..+.+ .+.++.+++ -|+++.....++++..-|+.. |-+|++|||+++.+.+++.+.+++.+
T Consensus 9 ~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~ip 76 (78)
T PF12812_consen 9 EVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKIP 76 (78)
T ss_pred EEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhCC
Confidence 477888876 455777776 345555667888877666888 99999999999999999999998764
No 48
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.06 E-value=0.00065 Score=69.98 Aligned_cols=49 Identities=39% Similarity=0.532 Sum_probs=42.6
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.+..|..|.++|||++|||.+ ||.|++|||. .+.+...++++.|++.++
T Consensus 462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~ 510 (558)
T COG3975 462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVF 510 (558)
T ss_pred CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEc
Confidence 467999999999999999999 9999999999 456667788888888764
No 49
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.96 E-value=0.00066 Score=54.70 Aligned_cols=32 Identities=38% Similarity=0.453 Sum_probs=30.4
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKK 392 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~ 392 (418)
.|+||++|.++|||+.|||+. +|.|+.+||-.
T Consensus 59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~D 90 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWD 90 (124)
T ss_pred ccEEEEEeccCChhhhhccee-----------cceEEEecCce
Confidence 799999999999999999999 99999999954
No 50
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis [, ].
Probab=96.68 E-value=0.022 Score=52.78 Aligned_cols=164 Identities=18% Similarity=0.283 Sum_probs=85.9
Q ss_pred ccccCCCeEEEEEEEcCC-cEEEecccccCCCCe-EEEEeCCCcEEEEEEEEEcCC----------------CCeEEEEe
Q 014786 145 DVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQD----------------KDVAVLRI 206 (418)
Q Consensus 145 ~~~~~~~~~GSGfiI~~~-G~ILT~aHvv~~~~~-i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLkv 206 (418)
|......+.||=.+++++ +..--=.|.+.+.+. ..+.+.+|+.|++++....+. .-+.-+.-
T Consensus 13 wVRD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~ 92 (218)
T PF05580_consen 13 WVRDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEK 92 (218)
T ss_pred EEEeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEe
Confidence 444455788998999874 555555888887664 456667888888877655321 11222221
Q ss_pred cC----------C----CCCCcccccCCCCCCCCCCEEEEEeCCCCCC-CceeEeEEeeeeeeecccCCC----CCcccE
Q 014786 207 DA----------P----KDKLRPIPIGVSADLLVGQKVYAIGNPFGLD-HTLTTGVISGLRREISSAATG----RPIQDV 267 (418)
Q Consensus 207 ~~----------~----~~~~~~~~l~~~~~~~~G~~V~~vG~p~g~~-~~~~~G~Vs~~~~~~~~~~~~----~~~~~~ 267 (418)
.. . ....++++++...++++|..-+..-. .|.. ..+..-++ .+.+.......+ ....+.
T Consensus 93 Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~-~v~~~~~~~~k~~vi~vtd~~L 170 (218)
T PF05580_consen 93 NTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIE-KVLPQSSPSGKGMVIKVTDPRL 170 (218)
T ss_pred ccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEE-EEccCCCCCCCcEEEEECCcch
Confidence 11 1 01234555555556777754321111 1111 11111111 111110000000 000112
Q ss_pred EEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecc
Q 014786 268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD 314 (418)
Q Consensus 268 i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~ 314 (418)
+....-+..||||+|++ .+|++||=++..+.+ +...||.++++
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie 213 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIE 213 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHH
Confidence 22233466899999999 799999999977753 35678888764
No 51
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.60 E-value=0.049 Score=49.24 Aligned_cols=136 Identities=19% Similarity=0.285 Sum_probs=78.7
Q ss_pred CeEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEE--EEEEcC---CCCeEEEEecCCCCCCcccc--cCCCCC
Q 014786 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK--IVGFDQ---DKDVAVLRIDAPKDKLRPIP--IGVSAD 223 (418)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~--vv~~d~---~~DlAlLkv~~~~~~~~~~~--l~~~~~ 223 (418)
...++++.|..+ ++|...| -.....+.+ +|+.++.. +...+. ..|+++++++... +++-+. +.+. .
T Consensus 24 ~~t~l~~gi~~~-~~lvp~H-~~~~~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~-kfrDIrk~~~~~-~ 96 (172)
T PF00548_consen 24 EFTMLALGIYDR-YFLVPTH-EEPEDTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNP-KFRDIRKFFPES-I 96 (172)
T ss_dssp EEEEEEEEEEBT-EEEEEGG-GGGCSEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS--B--GGGGSBSS-G
T ss_pred eEEEecceEeee-EEEEECc-CCCcEEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCc-ccCchhhhhccc-c
Confidence 457888888765 9999999 223333433 45555432 223443 4599999997643 343332 1111 1
Q ss_pred CCCCCEEEEEeCCCCCCC-ceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCC---CceEEEEEeee
Q 014786 224 LLVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDS---SGSLIGINTAI 297 (418)
Q Consensus 224 ~~~G~~V~~vG~p~g~~~-~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~---~G~VVGI~s~~ 297 (418)
....+...++=.. .... ....+.+...... . ..+......+.++++..+|+-||||+.. .++++||+.++
T Consensus 97 ~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~--~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 97 PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-N--LSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-E--ETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred ccCCCcEEEEECC-CCccEEEEEEEEeecCcc-c--cCCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 2334444444332 2332 3344445443332 1 1234456788899999999999999952 67999999985
No 52
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.56 E-value=0.16 Score=49.27 Aligned_cols=91 Identities=19% Similarity=0.216 Sum_probs=54.9
Q ss_pred CCCCeEEEEecCC-CCCCcccccCCC-CCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEcccc
Q 014786 197 QDKDVAVLRIDAP-KDKLRPIPIGVS-ADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (418)
Q Consensus 197 ~~~DlAlLkv~~~-~~~~~~~~l~~~-~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i 274 (418)
...+++||.++.+ .....++-|+++ .....++.+.+.|+.. ........+.-.... .....+......
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~~~~~i~~~~--------~~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKHRKLKITNCT--------KCAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEEEEEEEEEee--------ccceeEeccccc
Confidence 3569999999886 234566666654 3467899999999821 112222222111110 012345556667
Q ss_pred CCCCCCCceeC---CCceEEEEEeee
Q 014786 275 NPGNSGGPLLD---SSGSLIGINTAI 297 (418)
Q Consensus 275 ~~G~SGGPl~n---~~G~VVGI~s~~ 297 (418)
+.|++|||++. .+..||||.+..
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccC
Confidence 89999999984 344599997644
No 53
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.01 E-value=0.012 Score=53.88 Aligned_cols=56 Identities=29% Similarity=0.248 Sum_probs=44.4
Q ss_pred CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHH--HhcCCCCCEEEEEEE
Q 014786 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI--LDQCKVGDEVSCFTF 417 (418)
Q Consensus 351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~--l~~~~~g~~v~l~v~ 417 (418)
=+.|.+|.++|||+++||+. ||.|++++...--++..|..+ +.+...++.+.++|+
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~ 197 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVI 197 (231)
T ss_pred eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEe
Confidence 46799999999999999999 999999999888887766543 334456666666653
No 54
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.97 E-value=0.042 Score=58.45 Aligned_cols=117 Identities=20% Similarity=0.399 Sum_probs=72.4
Q ss_pred CCCeEEEEecCCC-------CC------CcccccCC------CCCCCCCCEEEEEeCCCCCCCceeEeEEeeeeeeeccc
Q 014786 198 DKDVAVLRIDAPK-------DK------LRPIPIGV------SADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSA 258 (418)
Q Consensus 198 ~~DlAlLkv~~~~-------~~------~~~~~l~~------~~~~~~G~~V~~vG~p~g~~~~~~~G~Vs~~~~~~~~~ 258 (418)
-.|+|||+++... ++ -|.+.+.+ ...+..|..|+-+|...| .+.|.+.++.-.. +
T Consensus 542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvy--w 615 (695)
T PF08192_consen 542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVY--W 615 (695)
T ss_pred ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEE--e
Confidence 3599999998532 11 12222221 123567999999997655 5677777664332 2
Q ss_pred CCCCCc-ccEEEEc----cccCCCCCCCceeCCCc------eEEEEEeeeeCCCCCCCCccceeecccchhhhhhh
Q 014786 259 ATGRPI-QDVIQTD----AAINPGNSGGPLLDSSG------SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL 323 (418)
Q Consensus 259 ~~~~~~-~~~i~~~----~~i~~G~SGGPl~n~~G------~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l 323 (418)
..+... .+++... .-...|+||+=|++.-+ .|+||.++.. +....+|.+.|+..|.+=+++.
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsyd---ge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYD---GEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecC---CccceeeccCcHHHHHHHHHHh
Confidence 222222 3344433 23457999999998533 3999988752 4456799998888776666553
No 55
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=95.89 E-value=0.013 Score=56.74 Aligned_cols=50 Identities=16% Similarity=0.288 Sum_probs=42.7
Q ss_pred ecCCCCh---hhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 356 DAPPNGP---AGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 356 ~v~~~~p---a~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
++.|+.. -+++||+. |||+++|||.++++.++..+++++.+..++++|+|
T Consensus 210 rl~Pgkd~~lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltV 262 (276)
T PRK09681 210 AVKPGADRSLFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTV 262 (276)
T ss_pred EECCCCcHHHHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEE
Confidence 3456543 56789999 99999999999999999999999888888888876
No 56
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.75 E-value=0.053 Score=59.20 Aligned_cols=22 Identities=36% Similarity=0.359 Sum_probs=20.1
Q ss_pred eEEEEEEEcCCcEEEecccccC
Q 014786 152 GSGSGFVWDSKGHVVTNYHVIR 173 (418)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~ 173 (418)
+-|||-+|+++|.|+||.||+.
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred CceeEEEEcCCceEEecchhhh
Confidence 3599999999999999999975
No 57
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.64 E-value=0.014 Score=61.97 Aligned_cols=52 Identities=29% Similarity=0.450 Sum_probs=43.6
Q ss_pred EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEE
Q 014786 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~v~ 417 (418)
|-++.++|||+|. |+|++||-|++|||+.|.+.+ |+..+++. .|-+|+|+|+
T Consensus 782 iGrIieGSPAdRC----------gkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIi 835 (984)
T KOG3209|consen 782 IGRIIEGSPADRC----------GKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTII 835 (984)
T ss_pred ccccccCChhHhh----------ccccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEc
Confidence 7788899999886 445569999999999999985 77888875 6889999985
No 58
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases belonging to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.41 E-value=0.016 Score=49.86 Aligned_cols=32 Identities=22% Similarity=0.456 Sum_probs=22.9
Q ss_pred EEEccccCCCCCCCceeCCCceEEEEEeeeeC
Q 014786 268 IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (418)
Q Consensus 268 i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~ 299 (418)
...+..+.+|.||+|+||.+|++|||......
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~ 119 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVE 119 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEcccee
Confidence 34455577999999999999999999887654
No 59
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=95.34 E-value=0.16 Score=51.79 Aligned_cols=38 Identities=26% Similarity=0.602 Sum_probs=29.7
Q ss_pred cccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeec
Q 014786 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPV 313 (418)
Q Consensus 272 ~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~ 313 (418)
..+..||||+|++ .+|++||=++..+-+ +...||+|-+
T Consensus 355 gGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~i 392 (402)
T TIGR02860 355 GGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYI 392 (402)
T ss_pred CCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehH
Confidence 3466799999999 899999998877664 3456788744
No 60
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.31 E-value=0.024 Score=59.18 Aligned_cols=55 Identities=29% Similarity=0.398 Sum_probs=46.7
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCF 415 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~ 415 (418)
-|++|..|.+++||++.||+. ||.||.||.++..+. ++...+|...++|+.|++.
T Consensus 429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtil 485 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTIL 485 (1027)
T ss_pred eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeeh
Confidence 589999999999999999999 999999999998876 3444455567899998873
No 61
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.30 E-value=0.014 Score=63.64 Aligned_cols=31 Identities=32% Similarity=0.593 Sum_probs=27.7
Q ss_pred ccEEEEccccCCCCCCCceeCCCceEEEEEe
Q 014786 265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINT 295 (418)
Q Consensus 265 ~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s 295 (418)
.-.+.++..+..||||+|++|.+|+|||++.
T Consensus 621 pv~FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 621 PVNFLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred eeEEEeccCcCCCCCCCccCCCCceEEEEee
Confidence 3457788899999999999999999999987
No 62
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=94.99 E-value=0.046 Score=57.98 Aligned_cols=62 Identities=24% Similarity=0.380 Sum_probs=50.2
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
-.+|+.|.... -+-|-|..|.+++||.|+.+++ ||++++|||+||++..+..+.++... |+.
T Consensus 386 ~~ig~vf~~~~------~~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~~-~~~ 447 (1051)
T KOG3532|consen 386 SPIGLVFDKNT------NRAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQSTT-GDL 447 (1051)
T ss_pred CceeEEEecCC------ceEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhcc-cce
Confidence 45666664321 1557799999999999999999 99999999999999999999998763 443
No 63
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide [].
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=94.36 E-value=0.13 Score=47.78 Aligned_cols=134 Identities=15% Similarity=0.178 Sum_probs=48.0
Q ss_pred EEEecccccCCCCeEEEEeCCCcEEE---EEEEEEcCCCCeEEEEecCCC---CCCcccccCCCCCCCCCCEEEEEeCCC
Q 014786 164 HVVTNYHVIRGASDIRVTFADQSAYD---AKIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPF 237 (418)
Q Consensus 164 ~ILT~aHvv~~~~~i~V~~~dg~~~~---a~vv~~d~~~DlAlLkv~~~~---~~~~~~~l~~~~~~~~G~~V~~vG~p~ 237 (418)
.++|+.||..+...+.. ..+|+.++ -+.+..+...|++||++.... ...+.+.+.....+..| .+..+
T Consensus 43 ~L~ta~Hv~~~~~~~~~-~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~~~~g----~~~~y- 116 (203)
T PF02122_consen 43 ALLTARHVWSRPSKVTS-LKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQLAKG----PVSFY- 116 (203)
T ss_dssp EEEE-HHHHTSSS---E-EETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----SEEEE----ESSTT-
T ss_pred ceecccccCCCccceeE-cCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhhhCCC----Ceeee-
Confidence 69999999998665543 34454443 234456788899999997421 12233333211111100 11111
Q ss_pred CCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCCCceEEEEEeeeeCCCCCCCCccceeecc
Q 014786 238 GLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVD 314 (418)
Q Consensus 238 g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~ 314 (418)
....+........+. +. ...+...-+...+|.||.|+++.+ ++||+++.. ......+.+++..|+.
T Consensus 117 ----~~~~~~~~~~sa~i~----g~-~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~-~~~~~~~n~n~~spip 182 (203)
T PF02122_consen 117 ----GFSSGEWPCSSAKIP----GT-EGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS-PSGSNRENNNRMSPIP 182 (203)
T ss_dssp ----SEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE-----------------
T ss_pred ----eecCCCceeccCccc----cc-cCcCCceEcCCCCCCCCCCeEECC-CceEeecCc-cccccccccccccccc
Confidence 111111111111111 11 123566667888999999999888 999999975 2223344566555443
No 64
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=93.44 E-value=0.1 Score=57.31 Aligned_cols=57 Identities=28% Similarity=0.283 Sum_probs=46.0
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
-|+||.+|.+|++| |+.|+|+.||-+|+|||+......+-..+-...+.|..|.+.|
T Consensus 960 lGIYvKsVV~GgaA----------d~DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leV 1016 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAA----------DHDGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEV 1016 (1629)
T ss_pred cceEEEEeccCCcc----------ccccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEeh
Confidence 48999999999998 4567888899999999999988766554444455688998876
No 65
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.03 E-value=0.17 Score=44.06 Aligned_cols=54 Identities=28% Similarity=0.380 Sum_probs=38.6
Q ss_pred cCcEEEecCCCChhhh-cCccccccccCCCccCCcEEEEECCEEeCCHHH--HHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGK-AGLLSTKRDAYGRLILGDIITSVNGKKVSNGSD--LYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~-aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~d--l~~~l~~~~~g~~v~l~v 416 (418)
.-+||+++.|++-|++ -||+. ||.++++||..|..-.. -.++|+.. -..|++.|
T Consensus 115 spiyisriipggvadrhgglkr-----------gdqllsvngvsvege~hekavellkaa--~gsvklvv 171 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKR-----------GDQLLSVNGVSVEGEHHEKAVELLKAA--VGSVKLVV 171 (207)
T ss_pred CceEEEeecCCccccccCcccc-----------cceeEeecceeecchhhHHHHHHHHHh--cCcEEEEE
Confidence 4589999999999887 47777 99999999999976432 23344432 33566554
No 66
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=92.59 E-value=0.22 Score=46.83 Aligned_cols=48 Identities=21% Similarity=0.337 Sum_probs=41.7
Q ss_pred CCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 359 PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 359 ~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
.++--++.|||. |||.+++|+..+++.+++..+++..+.-+.++++|.
T Consensus 216 d~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~ 263 (275)
T COG3031 216 DGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVI 263 (275)
T ss_pred CcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEE
Confidence 445577889998 999999999999999999999998877778888774
No 67
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=92.51 E-value=1 Score=38.30 Aligned_cols=33 Identities=30% Similarity=0.449 Sum_probs=24.9
Q ss_pred cccEEEEccccCCCCCCCceeCCCceEEEEEeee
Q 014786 264 IQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (418)
Q Consensus 264 ~~~~i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~ 297 (418)
..+++....+..||+-||+|+-.. -|+||++++
T Consensus 77 Q~~~l~g~Gp~~PGdCGg~L~C~H-GViGi~Tag 109 (127)
T PF00947_consen 77 QYNLLIGEGPAEPGDCGGILRCKH-GVIGIVTAG 109 (127)
T ss_dssp EECEEEEE-SSSTT-TCSEEEETT-CEEEEEEEE
T ss_pred ecCceeecccCCCCCCCceeEeCC-CeEEEEEeC
Confidence 345666677899999999999655 599999987
No 68
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=92.44 E-value=1.7 Score=41.22 Aligned_cols=88 Identities=17% Similarity=0.244 Sum_probs=60.0
Q ss_pred CCCeEEEEEEEcCCcEEEecccccCCC----CeEEEEeCCCcEEE------EEEEEEc-----CCCCeEEEEecCCC---
Q 014786 149 VPQGSGSGFVWDSKGHVVTNYHVIRGA----SDIRVTFADQSAYD------AKIVGFD-----QDKDVAVLRIDAPK--- 210 (418)
Q Consensus 149 ~~~~~GSGfiI~~~G~ILT~aHvv~~~----~~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLkv~~~~--- 210 (418)
++...++|++||++ |||++-.|+.+- ..+.+.++.++.+. -++..+| ++.++++|.++.+.
T Consensus 25 dG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~fT 103 (267)
T PF09342_consen 25 DGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPANFT 103 (267)
T ss_pred cCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCcccce
Confidence 45678999999987 999999999873 34667777776543 1233333 68899999998764
Q ss_pred CCCcccccCC-CCCCCCCCEEEEEeCCC
Q 014786 211 DKLRPIPIGV-SADLLVGQKVYAIGNPF 237 (418)
Q Consensus 211 ~~~~~~~l~~-~~~~~~G~~V~~vG~p~ 237 (418)
..+.|.-+.+ ..+....+..+++|.-.
T Consensus 104 r~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 104 RYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeecccccccccCCCCCCCceEEEEccc
Confidence 2234444433 23455566899999643
No 69
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=92.29 E-value=0.44 Score=45.52 Aligned_cols=66 Identities=26% Similarity=0.403 Sum_probs=49.3
Q ss_pred cccCeeeccchhhh--hhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCC--HHHHHHHHh
Q 014786 332 PILGIKFAPDQSVE--QLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILD 404 (418)
Q Consensus 332 ~~lGv~~~~~~~~~--~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~--~~dl~~~l~ 404 (418)
..||+.+.+....+ ..|+ +|+.|.+..+|+-|+..||-. +.|.|++|||.+|.. .+++-++|-
T Consensus 171 kPLGFYIRDG~SVRVtp~GlekvpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMv 240 (358)
T KOG3606|consen 171 KPLGFYIRDGTSVRVTPHGLEKVPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMV 240 (358)
T ss_pred CCceEEEecCceEEeccccccccCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHh
Confidence 46777665532221 2233 899999999999999999975 499999999999964 578888776
Q ss_pred cCC
Q 014786 405 QCK 407 (418)
Q Consensus 405 ~~~ 407 (418)
.+.
T Consensus 241 ANs 243 (358)
T KOG3606|consen 241 ANS 243 (358)
T ss_pred hcc
Confidence 543
No 70
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=91.83 E-value=0.18 Score=51.04 Aligned_cols=56 Identities=30% Similarity=0.478 Sum_probs=45.6
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHH-HhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI-LDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~-l~~~~~g~~v~l~v~ 417 (418)
.|.-|.+|..++|+.+|||.. .-|-|++|||.+++...|..+. +++.- +.|+++||
T Consensus 15 eg~hvlkVqedSpa~~aglep----------ffdFIvSI~g~rL~~dnd~Lk~llk~~s--ekVkltv~ 71 (462)
T KOG3834|consen 15 EGYHVLKVQEDSPAHKAGLEP----------FFDFIVSINGIRLNKDNDTLKALLKANS--EKVKLTVY 71 (462)
T ss_pred eeEEEEEeecCChHHhcCcch----------hhhhhheeCcccccCchHHHHHHHHhcc--cceEEEEE
Confidence 677899999999999999997 3799999999999988765555 44443 33999886
No 71
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=91.55 E-value=0.26 Score=42.20 Aligned_cols=28 Identities=32% Similarity=0.565 Sum_probs=23.5
Q ss_pred ccccCCCCCCCceeCCCceEEEEEeeee
Q 014786 271 DAAINPGNSGGPLLDSSGSLIGINTAIY 298 (418)
Q Consensus 271 ~~~i~~G~SGGPl~n~~G~VVGI~s~~~ 298 (418)
...-.+|+||-|++|..|+||||+-.+.
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ 127 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGA 127 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCC
Confidence 4456799999999999999999998764
No 72
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.11 E-value=0.47 Score=50.91 Aligned_cols=55 Identities=27% Similarity=0.413 Sum_probs=44.9
Q ss_pred cEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEE
Q 014786 352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 352 ~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v 416 (418)
+-|..|.+++||++-|- |..||+|+.|||.-+-.- .|.-+.++..+.|+.|+|++
T Consensus 373 LqVKsvl~DGPAa~dGk----------le~GDviV~INg~cvlGhTHAqaV~~fqaiPvg~~V~L~l 429 (984)
T KOG3209|consen 373 LQVKSVLKDGPAAQDGK----------LETGDVIVHINGECVLGHTHAQAVKRFQAIPVGQSVDLVL 429 (984)
T ss_pred eeeeecccCCchhhcCc----------cccCcEEEEECCceeccccHHHHHHHhhccccCCeeeEEE
Confidence 35778899999987654 445999999999999765 57778888889999999976
No 73
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=90.92 E-value=0.36 Score=47.59 Aligned_cols=54 Identities=31% Similarity=0.427 Sum_probs=43.2
Q ss_pred CcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEE
Q 014786 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 351 G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v 416 (418)
-++|+++.++-.|+..|+- -+||-|+.|||.-|+.- +|+..+|.. .||+|.++|
T Consensus 81 PvviSkI~kdQaAd~tG~L----------FvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV 136 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQL----------FVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTV 136 (505)
T ss_pred cEEeehhhhhhhhhhcCce----------EeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEe
Confidence 3578888888777766654 34999999999999876 577788865 699999987
No 74
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=90.55 E-value=0.42 Score=49.80 Aligned_cols=67 Identities=28% Similarity=0.427 Sum_probs=50.4
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g 409 (418)
-.+|+.+..... .-++|.+++.|+-+++.|+-. +||.|++|||..|.+. .++.++|...+
T Consensus 134 eplG~Tik~~e~------~~~~vARI~~GG~~~r~glL~----------~GD~i~EvNGi~v~~~~~~e~q~~l~~~~-- 195 (542)
T KOG0609|consen 134 EPLGATIRVEED------TKVVVARIMHGGMADRQGLLH----------VGDEILEVNGISVANKSPEELQELLRNSR-- 195 (542)
T ss_pred CccceEEEeccC------CccEEeeeccCCcchhcccee----------eccchheecCeecccCCHHHHHHHHHhCC--
Confidence 356666654321 247999999999999998743 4999999999999875 68999998765
Q ss_pred CEEEEEE
Q 014786 410 DEVSCFT 416 (418)
Q Consensus 410 ~~v~l~v 416 (418)
..+++.|
T Consensus 196 G~itfki 202 (542)
T KOG0609|consen 196 GSITFKI 202 (542)
T ss_pred CcEEEEE
Confidence 3555544
No 75
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=88.78 E-value=0.64 Score=46.75 Aligned_cols=45 Identities=38% Similarity=0.499 Sum_probs=38.9
Q ss_pred cCcEEEecCCCChhhh-cCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhc
Q 014786 350 SGVLVLDAPPNGPAGK-AGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQ 405 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~-aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~ 405 (418)
.|+.|.+|...||..- .||.+ ||+|+++||.+|++.+|-.+.++.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence 8999999999999533 47887 999999999999999998877763
No 76
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=88.57 E-value=0.6 Score=49.64 Aligned_cols=101 Identities=24% Similarity=0.376 Sum_probs=69.1
Q ss_pred cCCCCCCCcee-----CCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhhhhcccccc------cccCeeeccch
Q 014786 274 INPGNSGGPLL-----DSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTR------PILGIKFAPDQ 342 (418)
Q Consensus 274 i~~G~SGGPl~-----n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l~~~g~v~~------~~lGv~~~~~~ 342 (418)
+..-|+|||.- |...+++.|+-... ..+|.+..+..++.+++.-.|+. |..-+.+.-.+
T Consensus 677 iAnmm~~GpAarsgkLnIGDQiiaING~SL----------VGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd 746 (829)
T KOG3605|consen 677 IANMMHGGPAARSGKLNIGDQIMSINGTSL----------VGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD 746 (829)
T ss_pred HHhcccCChhhhcCCccccceeEeecCcee----------ccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc
Confidence 34557888874 44456777764332 24889999999988876554432 22223333333
Q ss_pred hhhhhCc---cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH
Q 014786 343 SVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG 396 (418)
Q Consensus 343 ~~~~~~~---~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~ 396 (418)
....+|. .| +|-....|+-|+|-|+++ |-.|++|||+.|.--
T Consensus 747 ~kyQLGFSVQNG-iICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~ 791 (829)
T KOG3605|consen 747 LRYQLGFSVQNG-IICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVAT 791 (829)
T ss_pred chhhccceeeCc-EeehhhcccchhccCcee-----------eeeEEEECCceEEec
Confidence 3445664 56 566889999999999999 999999999988643
No 77
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=87.95 E-value=0.61 Score=51.48 Aligned_cols=64 Identities=28% Similarity=0.476 Sum_probs=46.4
Q ss_pred cccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCC
Q 014786 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (418)
Q Consensus 332 ~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g 409 (418)
+.||+-|... .-|+|..|.+|+|+ .|+|+.||.|+.|||++|... +-+.+++..+ .
T Consensus 65 ~~lGFgfvag--------rPviVr~VT~GGps------------~GKL~PGDQIl~vN~Epv~daprervIdlvRac--e 122 (1298)
T KOG3552|consen 65 ASLGFGFVAG--------RPVIVRFVTEGGPS------------IGKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--E 122 (1298)
T ss_pred ccccceeecC--------CceEEEEecCCCCc------------cccccCCCeEEEecCcccccccHHHHHHHHHHH--h
Confidence 5556555432 45789999999996 477888999999999999875 4666777665 3
Q ss_pred CEEEEEEE
Q 014786 410 DEVSCFTF 417 (418)
Q Consensus 410 ~~v~l~v~ 417 (418)
+.|.++|.
T Consensus 123 ~sv~ltV~ 130 (1298)
T KOG3552|consen 123 SSVNLTVC 130 (1298)
T ss_pred hhcceEEe
Confidence 45666653
No 78
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=87.47 E-value=0.46 Score=50.69 Aligned_cols=55 Identities=25% Similarity=0.358 Sum_probs=40.6
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.|++|.+|.+++.|++.||+. ||.|++|||+.-.+... .++..-.+....+.++|
T Consensus 562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfenis~-~KA~eiLrnnthLtltv 616 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENISA-KKAEEILRNNTHLTLTV 616 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhhhH-HHHHHHhcCCceEEEEE
Confidence 689999999999999999999 99999999998776543 23332223344444443
No 79
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=86.97 E-value=0.69 Score=47.01 Aligned_cols=53 Identities=26% Similarity=0.513 Sum_probs=45.2
Q ss_pred EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEEE
Q 014786 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v~ 417 (418)
|.+|.+++||++|||+. .+|.|+-+-+.--...+||...|..+ .++.+++-||
T Consensus 113 vl~V~p~SPaalAgl~~----------~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVY 165 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRP----------YTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVY 165 (462)
T ss_pred eeecCCCCHHHhccccc----------ccceEecchhhhccchHHHHHHHHhc-cCCCcceeEe
Confidence 78899999999999995 38999999666677789999999876 4888888776
No 80
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=86.86 E-value=0.85 Score=48.11 Aligned_cols=55 Identities=25% Similarity=0.378 Sum_probs=39.4
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~v~l~v 416 (418)
.-++|.+|.+|+||+ |.||.||-|+.|||....+...-..+-+-.+-|+..+++|
T Consensus 40 tSiViSDVlpGGPAe------------G~LQenDrvvMVNGvsMenv~haFAvQqLrksgK~A~Itv 94 (1027)
T KOG3580|consen 40 TSIVISDVLPGGPAE------------GLLQENDRVVMVNGVSMENVLHAFAVQQLRKSGKVAAITV 94 (1027)
T ss_pred eeEEEeeccCCCCcc------------cccccCCeEEEEcCcchhhhHHHHHHHHHHhhccceeEEe
Confidence 457899999999985 6677799999999999887765443322223455555554
No 81
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=84.63 E-value=2.3 Score=42.90 Aligned_cols=44 Identities=39% Similarity=0.605 Sum_probs=38.7
Q ss_pred ecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHHHHHHHHhcCCCCCE
Q 014786 356 DAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (418)
Q Consensus 356 ~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~dl~~~l~~~~~g~~ 411 (418)
.+..++++..+|++. ||.|+++|++++.+++++.+.+... .|..
T Consensus 135 ~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~ 178 (375)
T COG0750 135 EVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVAA-AGDV 178 (375)
T ss_pred ecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHhc-cCCc
Confidence 688899999999999 9999999999999999999888754 3444
No 82
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=84.55 E-value=0.86 Score=45.71 Aligned_cols=70 Identities=27% Similarity=0.362 Sum_probs=46.8
Q ss_pred ccccCeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCC
Q 014786 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKV 408 (418)
Q Consensus 331 ~~~lGv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~ 408 (418)
-+-||+.+.-....+ --++|+++-++-.|++.+ .|-.||.|++|||....+.. +-.++|+ +.
T Consensus 95 ~gGLGISIKGGreNk----MPIlISKIFkGlAADQt~----------aL~~gDaIlSVNG~dL~~AtHdeAVqaLK--ra 158 (506)
T KOG3551|consen 95 AGGLGISIKGGRENK----MPILISKIFKGLAADQTG----------ALFLGDAILSVNGEDLRDATHDEAVQALK--RA 158 (506)
T ss_pred CCcceEEeecCcccC----CceehhHhcccccccccc----------ceeeccEEEEecchhhhhcchHHHHHHHH--hh
Confidence 466777665422111 346788888887776653 34459999999999988764 3444554 57
Q ss_pred CCEEEEEE
Q 014786 409 GDEVSCFT 416 (418)
Q Consensus 409 g~~v~l~v 416 (418)
|++|.+.|
T Consensus 159 GkeV~lev 166 (506)
T KOG3551|consen 159 GKEVLLEV 166 (506)
T ss_pred Cceeeeee
Confidence 99988866
No 83
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=84.22 E-value=4.3 Score=33.51 Aligned_cols=53 Identities=23% Similarity=0.339 Sum_probs=35.1
Q ss_pred EEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCC
Q 014786 156 GFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGV 220 (418)
Q Consensus 156 GfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~ 220 (418)
++=|. +|..+|+.||.+.++.+. |..+ +++. ...|+|+++.+.. .++.+++++
T Consensus 3 avHIG-nG~~vt~tHva~~~~~v~-----g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig~ 55 (105)
T PF03510_consen 3 AVHIG-NGRYVTVTHVAKSSDSVD-----GQPF--KIVK--TDGELCWVQSPLV--HLPAAQIGT 55 (105)
T ss_pred eEEeC-CCEEEEEEEEeccCceEc-----CcCc--EEEE--eccCEEEEECCCC--CCCeeEecc
Confidence 55565 589999999999877652 3222 2222 3559999999763 356666653
No 84
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=83.89 E-value=3.7 Score=45.58 Aligned_cols=54 Identities=19% Similarity=0.256 Sum_probs=34.2
Q ss_pred EEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEc--CCCCeEEEEecCC
Q 014786 153 SGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFD--QDKDVAVLRIDAP 209 (418)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d--~~~DlAlLkv~~~ 209 (418)
.|...+|+++ ||+|.+|+..+...+..-..++..|. ++... +..|+.+-|++.-
T Consensus 66 ~G~aTLigpq-YiVSV~HN~~gy~~v~FG~~g~~~Y~--iV~RNn~~~~Df~~pRLnK~ 121 (769)
T PF02395_consen 66 KGVATLIGPQ-YIVSVKHNGKGYNSVSFGNEGQNTYK--IVDRNNYPSGDFHMPRLNKF 121 (769)
T ss_dssp TSS-EEEETT-EEEBETTG-TSCCEECESCSSTCEEE--EEEEEBETTSTEBEEEESS-
T ss_pred CceEEEecCC-eEEEEEccCCCcCceeecccCCceEE--EEEccCCCCcccceeecCce
Confidence 4779999986 99999999866554433222344553 33333 4469999999763
No 85
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=83.54 E-value=1.3 Score=47.21 Aligned_cols=58 Identities=28% Similarity=0.383 Sum_probs=46.1
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH--HHHHHHHhcCCCCCEEEEEEE
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~--~dl~~~l~~~~~g~~v~l~v~ 417 (418)
+-|+|...+.++||+|.| +|-.||-|++|||...... +..+.+++..|.-..|+++|.
T Consensus 673 PTVViAnmm~~GpAarsg----------kLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV 732 (829)
T KOG3605|consen 673 PTVVIANMMHGGPAARSG----------KLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIV 732 (829)
T ss_pred hHHHHHhcccCChhhhcC----------CccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEe
Confidence 556777888899998774 4556999999999988775 567788888877778888774
No 86
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitis C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=82.66 E-value=0.62 Score=39.99 Aligned_cols=115 Identities=21% Similarity=0.254 Sum_probs=54.7
Q ss_pred EEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEEe
Q 014786 155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIG 234 (418)
Q Consensus 155 SGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~vG 234 (418)
-|+.|+ |-.-|.+|--... . +--+.| +..-.+.+.+.|+..-....-...+.+..-+. +.+|++-
T Consensus 15 mgt~vn--GV~wT~~HGagsr-t--lAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~-------~dlylVt 79 (148)
T PF02907_consen 15 MGTCVN--GVMWTVYHGAGSR-T--LAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS-------SDLYLVT 79 (148)
T ss_dssp EEEEET--TEEEEEHHHHTTS-E--EEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSSS-------SEEEEE-
T ss_pred ehhEEc--cEEEEEEecCCcc-c--ccCCCC---cceEeEEcCCCCCcccccccccccCCccccCC-------ccEEEEe
Confidence 477775 7888888864321 1 111111 12233566777888777654333344433321 3566664
Q ss_pred CCCCCCCceeEeEEeeeeeeecccCCCCCcccE-EEEccccCCCCCCCceeCCCceEEEEEeeeeC
Q 014786 235 NPFGLDHTLTTGVISGLRREISSAATGRPIQDV-IQTDAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (418)
Q Consensus 235 ~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~-i~~~~~i~~G~SGGPl~n~~G~VVGI~s~~~~ 299 (418)
+-. .+-.+ ++. ++..... .-.......|.||||++..+|.+|||..+...
T Consensus 80 r~~----~v~p~-----rr~------gd~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~ 130 (148)
T PF02907_consen 80 RDA----DVIPV-----RRR------GDSRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC 130 (148)
T ss_dssp TTS-----EEEE-----EEE------STTEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred ccC----cEeee-----EEc------CCCceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence 321 11111 111 0100011 11112234799999999999999999876544
No 87
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=82.28 E-value=2.1 Score=44.40 Aligned_cols=37 Identities=30% Similarity=0.370 Sum_probs=29.6
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCH
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG 396 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~ 396 (418)
.|+||.++++++.-+ +.|++..||+||.||.....++
T Consensus 277 ggIYVgsImkgGAVA----------~DGRIe~GDMiLQVNevsFENm 313 (626)
T KOG3571|consen 277 GGIYVGSIMKGGAVA----------LDGRIEPGDMILQVNEVSFENM 313 (626)
T ss_pred CceEEeeeccCceee----------ccCccCccceEEEeeecchhhc
Confidence 799999999998643 3466666999999999776665
No 88
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=74.67 E-value=4 Score=46.41 Aligned_cols=50 Identities=34% Similarity=0.444 Sum_probs=40.4
Q ss_pred EEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEE
Q 014786 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCF 415 (418)
Q Consensus 353 ~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~ 415 (418)
.|-.|.+++||..+|+++ ||.|+.+||++|.... ++.+.|.+ -|..+.+.
T Consensus 661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~ 712 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLR 712 (1205)
T ss_pred eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEE
Confidence 477889999999999999 9999999999998874 66666654 35565554
No 89
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=73.19 E-value=2.5 Score=42.92 Aligned_cols=23 Identities=26% Similarity=0.557 Sum_probs=20.7
Q ss_pred ccCCCCCCCceeCCCceEEEEEe
Q 014786 273 AINPGNSGGPLLDSSGSLIGINT 295 (418)
Q Consensus 273 ~i~~G~SGGPl~n~~G~VVGI~s 295 (418)
.+..|.||+.|+|.+|++|||..
T Consensus 351 ~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 351 SLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred CCCCCCCcCeEECCCCCEEEEeC
Confidence 55689999999999999999975
No 90
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=69.39 E-value=10 Score=37.11 Aligned_cols=47 Identities=26% Similarity=0.458 Sum_probs=36.7
Q ss_pred cCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcC
Q 014786 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQC 406 (418)
Q Consensus 350 ~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~ 406 (418)
+-+||..|-.++||.+-| .++.||-|++|||..|.... ++.++++..
T Consensus 30 PClYiVQvFD~tPAa~dG----------~i~~GDEi~avNg~svKGktKveVAkmIQ~~ 78 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDG----------RIRCGDEIVAVNGISVKGKTKVEVAKMIQVS 78 (429)
T ss_pred CeEEEEEeccCCchhccC----------ccccCCeeEEecceeecCccHHHHHHHHHHh
Confidence 457899999999998754 33449999999999998764 566777654
No 91
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=62.13 E-value=26 Score=36.02 Aligned_cols=135 Identities=20% Similarity=0.296 Sum_probs=63.5
Q ss_pred CeEEEEEEEcCCcEEEecccccCCCC-eEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCC-CCCcccccCCCCCCCCCC
Q 014786 151 QGSGSGFVWDSKGHVVTNYHVIRGAS-DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVGQ 228 (418)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHvv~~~~-~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~~~l~~~~~~~~G~ 228 (418)
-+.|-||-|+++ ..+|+-||+.... ++. | .+..-+.++..-+++-+++..+. .+++.+-|. +-...|.
T Consensus 378 fGsGWGfWVS~~-lfITttHViP~g~~E~F-----G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEGt 447 (535)
T PF05416_consen 378 FGSGWGFWVSPT-LFITTTHVIPPGAKEAF-----G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEGT 447 (535)
T ss_dssp ETTEEEEESSSS-EEEEEGGGS-STTSEET-----T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT-
T ss_pred cCCceeeeecce-EEEEeeeecCCcchhhh-----C--CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCce
Confidence 367999999987 9999999997532 211 0 01111233344577777776643 234444442 1233454
Q ss_pred EEEE-EeCCCCCC--CceeEeEEeeeeeeecccCCCCCcccEEEE-------ccccCCCCCCCceeCCCce---EEEEEe
Q 014786 229 KVYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSGS---LIGINT 295 (418)
Q Consensus 229 ~V~~-vG~p~g~~--~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~-------~~~i~~G~SGGPl~n~~G~---VVGI~s 295 (418)
-+.+ |-.+.|.- ..+.-|......-.-.. ..+ ...++.+ |-...||+-|.|-+-..|+ |+|+++
T Consensus 448 V~siLiKR~sGEllpLAvRMgt~AsmkIqgr~-v~G--Q~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~ 524 (535)
T PF05416_consen 448 VCSILIKRPSGELLPLAVRMGTHASMKIQGRT-VHG--QMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHA 524 (535)
T ss_dssp EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEE-EEE--EEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEE
T ss_pred EEEEEEEcCCccchhhhhhhccceeEEEccee-ecc--eeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEe
Confidence 4333 33444421 23344433322211000 000 1122222 3345689999999976654 999998
Q ss_pred eee
Q 014786 296 AIY 298 (418)
Q Consensus 296 ~~~ 298 (418)
+..
T Consensus 525 AAt 527 (535)
T PF05416_consen 525 AAT 527 (535)
T ss_dssp EE-
T ss_pred hhc
Confidence 764
No 92
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=53.59 E-value=56 Score=29.83 Aligned_cols=37 Identities=30% Similarity=0.333 Sum_probs=29.5
Q ss_pred CeeeccchhhhhhCccCcEEEecCCCChhhhcCccccccccCCCccCCcEEEEEC
Q 014786 335 GIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVN 389 (418)
Q Consensus 335 Gv~~~~~~~~~~~~~~G~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~in 389 (418)
|+.+.++. ..++|..|.-+|||+++|+.- |+.|+++-
T Consensus 114 GL~l~~e~-------~~~~Vd~v~fgS~A~~~g~d~-----------d~~I~~v~ 150 (183)
T PF11874_consen 114 GLTLMEEG-------GKVIVDEVEFGSPAEKAGIDF-----------DWEITEVE 150 (183)
T ss_pred CCEEEeeC-------CEEEEEecCCCCHHHHcCCCC-----------CcEEEEEE
Confidence 66655533 557899999999999999998 88887763
No 93
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=52.40 E-value=41 Score=24.30 Aligned_cols=33 Identities=18% Similarity=0.415 Sum_probs=28.3
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.|.+.||+.+.+.+..+|...++.+-....
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEEE
Confidence 468899999999999999999998888776643
No 94
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=50.57 E-value=57 Score=24.20 Aligned_cols=33 Identities=15% Similarity=0.303 Sum_probs=28.5
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.+.+-.|..++++|+++|....+.+|+-.+
T Consensus 7 s~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 7 SQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred cEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 456777788999999999999999999998654
No 95
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.13 E-value=31 Score=27.49 Aligned_cols=37 Identities=5% Similarity=0.297 Sum_probs=30.9
Q ss_pred ccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 171 vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
++.....+.|.+.+++.+.+++.++|...++.|=...
T Consensus 10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 3445578899999999999999999999998876554
No 96
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=50.03 E-value=16 Score=35.15 Aligned_cols=56 Identities=14% Similarity=0.303 Sum_probs=42.5
Q ss_pred cEEEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeCCHH--HHHHHHhcCCCCCEEEEEEE
Q 014786 352 VLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDEVSCFTF 417 (418)
Q Consensus 352 ~~V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~~~~--dl~~~l~~~~~g~~v~l~v~ 417 (418)
+.|..+.++|--++.- -..+||.|-+|||+.|.... ++.++|+..+.|++.++.++
T Consensus 151 AFIKrIkegsvidri~----------~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 151 AFIKRIKEGSVIDRIE----------AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI 208 (334)
T ss_pred eeeEeecCCchhhhhh----------heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence 4566666666655432 22349999999999999886 66789999999999988754
No 97
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.99 E-value=52 Score=25.38 Aligned_cols=32 Identities=13% Similarity=0.215 Sum_probs=27.6
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.||+.+.+.+.++|...+|.+=...
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 46889999999999999999999998765553
No 98
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=45.98 E-value=53 Score=24.53 Aligned_cols=33 Identities=9% Similarity=0.245 Sum_probs=29.1
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.|.+.+|+.+.+++.++|...++.+-....
T Consensus 11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 568899999999999999999999998877653
No 99
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=45.98 E-value=52 Score=25.00 Aligned_cols=33 Identities=12% Similarity=0.320 Sum_probs=28.6
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.|.+.||+.|.+++.++|...++-+=....
T Consensus 15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred CEEEEEECCCCEEEEEEEEEcccceeEEeeEEE
Confidence 468899999999999999999999988877643
No 100
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=45.97 E-value=48 Score=24.72 Aligned_cols=32 Identities=13% Similarity=0.233 Sum_probs=27.8
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.+|+.|.+++.++|...++-+=...
T Consensus 11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 46889999999999999999999888876654
No 101
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=45.65 E-value=44 Score=25.05 Aligned_cols=32 Identities=13% Similarity=0.217 Sum_probs=27.7
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.||+.|.+++.++|...++.+=.+.
T Consensus 12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 46889999999999999999999888876554
No 102
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations [].
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=41.69 E-value=20 Score=24.97 Aligned_cols=21 Identities=38% Similarity=0.616 Sum_probs=17.7
Q ss_pred CCCCCCceeCCCceEEEEEee
Q 014786 276 PGNSGGPLLDSSGSLIGINTA 296 (418)
Q Consensus 276 ~G~SGGPl~n~~G~VVGI~s~ 296 (418)
.+.+.-|++|.+|+++|+.+.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 356778999999999999874
No 103
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.68 E-value=59 Score=25.11 Aligned_cols=32 Identities=19% Similarity=0.429 Sum_probs=27.5
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.||+.+.+.+.++|...++.|=...
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 46889999999999999999999988875553
No 104
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=41.03 E-value=52 Score=25.67 Aligned_cols=31 Identities=10% Similarity=0.271 Sum_probs=26.8
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (418)
..+.|.+.+|+.+.+++.++|...+|.|=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 5688999999999999999999998876444
No 105
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=40.91 E-value=66 Score=25.12 Aligned_cols=31 Identities=23% Similarity=0.334 Sum_probs=26.8
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (418)
..+.|.+.||+.+.+++.++|...+|.+=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 4688999999999999999999988876544
No 106
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=39.66 E-value=17 Score=35.83 Aligned_cols=30 Identities=40% Similarity=0.684 Sum_probs=26.9
Q ss_pred EEecCCCChhhhcCccccccccCCCccCCcEEEEECCEEeC
Q 014786 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS 394 (418)
Q Consensus 354 V~~v~~~~pa~~aGl~~~~~~~~~~l~~GDiIl~ing~~v~ 394 (418)
+.+|.+.+|++++|.-. ||.|+-+|+-++.
T Consensus 67 ~lrv~~~~~~e~~~~~~-----------~dyilg~n~Dp~~ 96 (417)
T COG5233 67 VLRVNPESPAEKAGMVV-----------GDYILGINEDPLR 96 (417)
T ss_pred heeccccChhHhhcccc-----------ceeEEeecCCcHH
Confidence 67889999999999998 9999999987764
No 107
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.74 E-value=64 Score=24.90 Aligned_cols=31 Identities=16% Similarity=0.409 Sum_probs=27.1
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (418)
..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 5788999999999999999999998886554
No 108
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.53 E-value=77 Score=24.33 Aligned_cols=31 Identities=16% Similarity=0.185 Sum_probs=27.0
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEe
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (418)
..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus 13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 5688999999999999999999988877554
No 109
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase [, ].
Probab=38.36 E-value=23 Score=33.06 Aligned_cols=55 Identities=15% Similarity=0.412 Sum_probs=38.0
Q ss_pred ccEEEEccccCCCCCCCceeCC----CceEEEEEeeeeCCCCCCCCccceeec--ccchhhhhhh
Q 014786 265 QDVIQTDAAINPGNSGGPLLDS----SGSLIGINTAIYSPSGASSGVGFSIPV--DTVNGIVDQL 323 (418)
Q Consensus 265 ~~~i~~~~~i~~G~SGGPl~n~----~G~VVGI~s~~~~~~~~~~~~~~aIp~--~~i~~~l~~l 323 (418)
...+++..+...|+=|||++-. .-+++||+.++.. ..+.+||-++ +.+++.++.|
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~itQEDL~~A~~~l 228 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESITQEDLMRAINKL 228 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhhhHHHHHHHHHhh
Confidence 4566778888999999999843 3479999998753 3467788655 4444444443
No 110
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase []. ; PDB: 1OKG_A.
Probab=38.35 E-value=78 Score=23.58 Aligned_cols=35 Identities=23% Similarity=0.374 Sum_probs=24.2
Q ss_pred CcEEEEECCEEeCCHH-HHHHHHhcCCCCCEEEEEE
Q 014786 382 GDIITSVNGKKVSNGS-DLYRILDQCKVGDEVSCFT 416 (418)
Q Consensus 382 GDiIl~ing~~v~~~~-dl~~~l~~~~~g~~v~l~v 416 (418)
.-.-+.+||..+.+++ +|..++.....|+..++.+
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~F 54 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFF 54 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEE
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEE
Confidence 5677889999999996 7888888778899887753
No 111
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=37.97 E-value=82 Score=23.96 Aligned_cols=32 Identities=9% Similarity=0.165 Sum_probs=27.2
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.+|+.+.+++.++|...+|.+=...
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence 46788999999999999999998888775553
No 112
>smart00651 Sm snRNP Sm proteins. Small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing.
Probab=36.94 E-value=87 Score=22.89 Aligned_cols=32 Identities=19% Similarity=0.413 Sum_probs=27.5
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.||+.+.+.+..+|...++-+=...
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 46889999999999999999999888876554
No 113
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=35.63 E-value=92 Score=23.47 Aligned_cols=32 Identities=9% Similarity=0.272 Sum_probs=28.7
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus 11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 46889999999999999999999999888774
No 114
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=35.46 E-value=62 Score=23.74 Aligned_cols=33 Identities=21% Similarity=0.444 Sum_probs=29.1
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.|.+.||+.+.+.+..+|...++-+-....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 568899999999999999999998888877754
No 115
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=34.19 E-value=93 Score=23.67 Aligned_cols=32 Identities=19% Similarity=0.267 Sum_probs=27.4
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.||+.+.+++.++|...++.+=...
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 46788999999999999999999888776653
No 116
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=32.60 E-value=67 Score=25.19 Aligned_cols=47 Identities=19% Similarity=0.376 Sum_probs=31.0
Q ss_pred EEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEE-EeCC
Q 014786 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA-IGNP 236 (418)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~-vG~p 236 (418)
++++++..|...++|++.+-.-.... .+.+-. .+++.|+.|.+ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV-~l~Lv~-~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREV-NLDLVG-EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEE-Eeeeec-CccccCCEEEEEeeEE
Confidence 46788889988789999986532111 122211 26889999876 6764
No 117
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=31.62 E-value=93 Score=23.92 Aligned_cols=33 Identities=21% Similarity=0.459 Sum_probs=28.6
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecC
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.|.+.+|+.|.+++.++|...++.+--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 678999999999999999999999888776544
No 118
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (IPR004365 from INTERPRO) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=30.30 E-value=60 Score=31.96 Aligned_cols=35 Identities=29% Similarity=0.499 Sum_probs=31.2
Q ss_pred eEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCc
Q 014786 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS 186 (418)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~ 186 (418)
..|-.++.+++|.++|+..-+...+.+++.+.||.
T Consensus 280 ~RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~ 314 (319)
T PF02601_consen 280 KRGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS 314 (319)
T ss_pred hCceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence 45778888888999999999999999999999995
No 119
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=28.56 E-value=49 Score=27.30 Aligned_cols=18 Identities=33% Similarity=0.617 Sum_probs=13.2
Q ss_pred CceeCCCceEEEEEeeee
Q 014786 281 GPLLDSSGSLIGINTAIY 298 (418)
Q Consensus 281 GPl~n~~G~VVGI~s~~~ 298 (418)
.|++|.+|++||++..++
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 578889999999998653
No 120
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is a Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=26.50 E-value=1.5e+02 Score=26.25 Aligned_cols=131 Identities=18% Similarity=0.250 Sum_probs=63.0
Q ss_pred eEEEEEEEcCCcEEEecccccCCCCeEEEEeCCCcEEEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEE
Q 014786 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVY 231 (418)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHvv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~ 231 (418)
+.-+|+-+..+|-|-.--||-.+.+.+ |.-.=|+. +++..+... +... ..+- +....-...|...|
T Consensus 51 gletgwaythqggissvdhvt~gkd~l-vcdsmgrt---rvvcqsnnk------~tde-~eyg---vktdsgcp~garcy 116 (211)
T PF05578_consen 51 GLETGWAYTHQGGISSVDHVTAGKDLL-VCDSMGRT---RVVCQSNNK------MTDE-TEYG---VKTDSGCPDGARCY 116 (211)
T ss_pred cccccceeeccCCcccceeeecCCceE-EecCCCce---EEEEccCCc------ccch-hhcc---cccCCCCCCCcEEE
Confidence 345677777677777777776664432 22222221 222222110 0000 0010 11112245578888
Q ss_pred EEeCCCCCCCceeEeEEeeeeeeecccCCCCCcccEEEEccccCCCCCCCceeCC-CceEEEEEeee
Q 014786 232 AIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDS-SGSLIGINTAI 297 (418)
Q Consensus 232 ~vG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~~~~i~~G~SGGPl~n~-~G~VVGI~s~~ 297 (418)
++ +|...+.+-+.|.+-.+...-.+...-.....--.+|..-..|.||=|+|.. .|++||=.-.+
T Consensus 117 v~-npea~nisgtkga~vhlqk~ggef~cvta~gtpaf~~~knlkg~s~~pifeassgr~vgr~k~g 182 (211)
T PF05578_consen 117 VL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGTPAFFDLKNLKGWSGLPIFEASSGRVVGRVKVG 182 (211)
T ss_pred Ee-CCcccccccCcceEEEEeccCCceEEEeccCCcceeeccccCCCCCCceeeccCCcEEEEEEec
Confidence 87 6655555555665544333211000000000001223334569999999985 89999987654
No 121
>COG0061 nadF NAD kinase [Coenzyme metabolism]
Probab=26.20 E-value=27 Score=34.04 Aligned_cols=32 Identities=28% Similarity=0.525 Sum_probs=29.4
Q ss_pred cccccccceeeeecCCCCccCCCccccccCCc
Q 014786 2 AYSLISSSTFLLSRSPNTTLAPLNKHNFPLRP 33 (418)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33 (418)
||||+---|.+.-+.+.+.+.|+++|++++||
T Consensus 179 AY~lSAGGPIv~P~l~ai~ltpi~p~~l~~Rp 210 (281)
T COG0061 179 AYNLSAGGPILHPGLDAIQLTPICPHSLSFRP 210 (281)
T ss_pred HHhhhcCCCccCCCCCeEEEeecCCCcccCCC
Confidence 79999999999999999999999999998764
No 122
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope []. ; PDB: 2L8D_A 2DIG_A.
Probab=25.68 E-value=2.7e+02 Score=20.27 Aligned_cols=35 Identities=17% Similarity=0.335 Sum_probs=27.4
Q ss_pred CCCeEEEEeCCCcE-EEEEEEEEcCCCCeEEEEecC
Q 014786 174 GASDIRVTFADQSA-YDAKIVGFDQDKDVAVLRIDA 208 (418)
Q Consensus 174 ~~~~i~V~~~dg~~-~~a~vv~~d~~~DlAlLkv~~ 208 (418)
..+.+.++.++... |++++..+|...++.-++.+.
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D 43 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED 43 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence 44568888888776 599999999999999999976
No 123
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=25.62 E-value=1.8e+02 Score=22.21 Aligned_cols=32 Identities=13% Similarity=0.270 Sum_probs=28.4
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.+|+.+.+++..+|...++.+-.+.
T Consensus 12 ~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 12 HPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 56889999999999999999999999887664
No 124
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=24.10 E-value=2.1e+02 Score=21.60 Aligned_cols=43 Identities=23% Similarity=0.395 Sum_probs=29.3
Q ss_pred EEEEEEEEcCCCCeEEEEecCCCCCCcccccCCCCCCCCCCEEEEE
Q 014786 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI 233 (418)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~~~l~~~~~~~~G~~V~~v 233 (418)
++++++..+.....|++..... ...+.+.--.++++||+|.+-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G~---~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGGV---RREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETTE---EEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCCc---EEEEEEEEeCCCCCCCEEEEe
Confidence 5788888888889999988752 334444334458999998874
No 125
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=23.32 E-value=58 Score=24.68 Aligned_cols=30 Identities=27% Similarity=0.626 Sum_probs=21.6
Q ss_pred CceeCCCceEEEEEeeeeCCCCCCCCccceeecccchhhhhhh
Q 014786 281 GPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL 323 (418)
Q Consensus 281 GPl~n~~G~VVGI~s~~~~~~~~~~~~~~aIp~~~i~~~l~~l 323 (418)
-|+++.+|+++|+... .+..+.+.++++++
T Consensus 19 ~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~ 48 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI-------------DISLDQLSEIISNI 48 (81)
T ss_dssp EEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTS
T ss_pred EEEECCCCCEEEEEEE-------------EeccceeeeEEEee
Confidence 4678789999999754 36666777766664
No 126
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=23.21 E-value=51 Score=33.37 Aligned_cols=23 Identities=35% Similarity=0.525 Sum_probs=18.7
Q ss_pred CeEEEEEEEcC----Cc------EEEecccccC
Q 014786 151 QGSGSGFVWDS----KG------HVVTNYHVIR 173 (418)
Q Consensus 151 ~~~GSGfiI~~----~G------~ILT~aHvv~ 173 (418)
...|||.|+|- ++ |+.||.||+.
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~ 67 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVAS 67 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhc
Confidence 46899999982 22 6999999998
No 127
>PF14438 SM-ATX: Ataxin 2 SM domain; PDB: 1M5Q_1.
Probab=23.19 E-value=1.9e+02 Score=21.94 Aligned_cols=28 Identities=21% Similarity=0.313 Sum_probs=20.5
Q ss_pred CeEEEEeCCCcEEEEEEEEEcC---CCCeEE
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQ---DKDVAV 203 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~---~~DlAl 203 (418)
..++|++.||..|++-+...++ +.|+.|
T Consensus 13 ~~V~V~~~~G~~yeGif~s~s~~~~~~~vvL 43 (77)
T PF14438_consen 13 QTVEVTTKNGSVYEGIFHSASPESNEFDVVL 43 (77)
T ss_dssp SEEEEEETTS-EEEEEEEEE-T---T--EEE
T ss_pred CEEEEEECCCCEEEEEEEeCCCcccceeEEE
Confidence 5689999999999999999988 556655
No 128
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=22.43 E-value=61 Score=26.21 Aligned_cols=21 Identities=33% Similarity=0.482 Sum_probs=16.8
Q ss_pred CCCCCceeCCCceEEEEEeee
Q 014786 277 GNSGGPLLDSSGSLIGINTAI 297 (418)
Q Consensus 277 G~SGGPl~n~~G~VVGI~s~~ 297 (418)
+.+.=|++|.+|+++|+++..
T Consensus 98 ~~~~lpVvd~~~~~vGiit~~ 118 (123)
T cd04627 98 GISSVAVVDNQGNLIGNISVT 118 (123)
T ss_pred CCceEEEECCCCcEEEEEeHH
Confidence 445578999999999998853
No 129
>cd01725 LSm2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm2 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=20.67 E-value=2.4e+02 Score=21.90 Aligned_cols=32 Identities=13% Similarity=0.276 Sum_probs=28.3
Q ss_pred CeEEEEeCCCcEEEEEEEEEcCCCCeEEEEec
Q 014786 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (418)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (418)
..+.|.+.+|..+.+++..+|...++-+-.++
T Consensus 12 ~~V~VeLKng~~~~G~L~~vD~~MNi~L~n~~ 43 (81)
T cd01725 12 KEVTVELKNDLSIRGTLHSVDQYLNIKLTNIS 43 (81)
T ss_pred CEEEEEECCCcEEEEEEEEECCCcccEEEEEE
Confidence 46889999999999999999999998887764
No 130
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=20.53 E-value=74 Score=25.26 Aligned_cols=20 Identities=20% Similarity=0.270 Sum_probs=15.8
Q ss_pred CCCCCceeCCCceEEEEEee
Q 014786 277 GNSGGPLLDSSGSLIGINTA 296 (418)
Q Consensus 277 G~SGGPl~n~~G~VVGI~s~ 296 (418)
+.+--|++|.+|+++|+++.
T Consensus 86 ~~~~lpVvd~~~~~~Giit~ 105 (111)
T cd04603 86 EPPVVAVVDKEGKLVGTIYE 105 (111)
T ss_pred CCCeEEEEcCCCeEEEEEEh
Confidence 44446899988999999874
Done!