Query 019504
Match_columns 340
No_of_seqs 313 out of 2547
Neff 8.9
Searched_HMMs 46136
Date Fri Mar 29 10:00:04 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/019504.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/019504hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 6.1E-52 1.3E-56 397.6 32.6 289 22-340 41-349 (455)
2 TIGR02038 protease_degS peripl 100.0 3.9E-51 8.5E-56 381.5 32.7 291 16-340 40-337 (351)
3 PRK10898 serine endoprotease; 100.0 2.8E-50 6E-55 375.5 31.9 287 20-340 44-338 (353)
4 PRK10942 serine endoprotease; 100.0 2.9E-49 6.2E-54 380.9 30.9 289 22-340 39-370 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 1.1E-48 2.3E-53 375.7 30.9 288 23-340 3-316 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 1.1E-38 2.5E-43 298.7 26.5 290 21-339 33-328 (347)
7 KOG1320 Serine protease [Postt 100.0 5.7E-29 1.2E-33 233.0 19.0 290 17-317 124-436 (473)
8 KOG1421 Predicted signaling-as 99.9 6.3E-24 1.4E-28 201.4 18.0 287 18-340 49-360 (955)
9 KOG1421 Predicted signaling-as 99.8 1E-17 2.2E-22 159.6 14.3 274 27-320 524-809 (955)
10 PF13365 Trypsin_2: Trypsin-li 99.7 2.6E-17 5.7E-22 130.1 11.2 117 59-210 1-120 (120)
11 KOG1320 Serine protease [Postt 99.5 3.7E-13 8E-18 127.1 11.4 256 26-312 55-319 (473)
12 PF00089 Trypsin: Trypsin; In 99.4 1.3E-11 2.8E-16 107.5 16.1 167 58-237 26-220 (220)
13 cd00190 Tryp_SPc Trypsin-like 99.3 1.3E-10 2.8E-15 101.9 15.0 172 57-239 25-231 (232)
14 PF13180 PDZ_2: PDZ domain; PD 99.2 1.2E-11 2.7E-16 91.1 3.1 72 249-340 1-73 (82)
15 smart00020 Tryp_SPc Trypsin-li 99.2 1.2E-09 2.5E-14 96.0 15.2 164 57-231 26-223 (229)
16 COG3591 V8-like Glu-specific e 98.9 6.7E-08 1.4E-12 84.8 14.3 195 27-241 33-250 (251)
17 cd00991 PDZ_archaeal_metallopr 98.8 3.7E-09 8.1E-14 77.2 3.8 60 267-340 9-69 (79)
18 PF00863 Peptidase_C4: Peptida 98.8 7.8E-08 1.7E-12 83.7 12.0 167 27-230 13-184 (235)
19 TIGR01713 typeII_sec_gspC gene 98.8 5.5E-09 1.2E-13 93.4 4.6 92 230-340 158-250 (259)
20 cd00987 PDZ_serine_protease PD 98.8 1.2E-08 2.6E-13 76.2 5.3 77 250-340 2-83 (90)
21 cd00989 PDZ_metalloprotease PD 98.7 1.1E-08 2.5E-13 74.4 3.8 58 268-339 12-69 (79)
22 cd00990 PDZ_glycyl_aminopeptid 98.7 1.6E-08 3.5E-13 73.8 4.4 67 249-339 1-67 (80)
23 cd00988 PDZ_CTP_protease PDZ d 98.7 2.3E-08 5E-13 73.9 5.0 69 249-339 2-72 (85)
24 cd00136 PDZ PDZ domain, also c 98.6 2.6E-08 5.7E-13 70.7 3.5 55 268-336 13-69 (70)
25 cd00986 PDZ_LON_protease PDZ d 98.6 5E-08 1.1E-12 71.1 3.2 58 268-340 8-66 (79)
26 TIGR02037 degP_htrA_DO peripla 98.4 2.8E-07 6E-12 89.1 5.2 77 250-340 339-421 (428)
27 smart00228 PDZ Domain present 98.3 8E-07 1.7E-11 65.3 3.7 60 268-340 26-85 (85)
28 TIGR00225 prc C-terminal pepti 98.2 2.5E-06 5.4E-11 79.7 6.7 70 248-339 50-121 (334)
29 cd00992 PDZ_signaling PDZ doma 98.2 1.1E-06 2.4E-11 64.2 2.6 38 268-316 26-65 (82)
30 PLN00049 carboxyl-terminal pro 98.2 3.6E-06 7.9E-11 80.1 6.4 79 247-339 83-161 (389)
31 PRK10779 zinc metallopeptidase 98.1 8.4E-07 1.8E-11 86.1 2.0 57 270-340 128-185 (449)
32 TIGR00054 RIP metalloprotease 98.1 1.7E-06 3.6E-11 83.2 3.4 59 268-340 203-261 (420)
33 KOG3627 Trypsin [Amino acid tr 98.1 0.00023 5.1E-09 63.6 16.1 124 117-240 106-253 (256)
34 PF00595 PDZ: PDZ domain (Also 98.0 5.9E-06 1.3E-10 60.4 4.4 38 268-316 25-62 (81)
35 PRK10139 serine endoprotease; 98.0 2.7E-06 5.8E-11 82.5 3.1 58 268-340 390-447 (455)
36 PRK10779 zinc metallopeptidase 98.0 3E-06 6.5E-11 82.3 3.2 58 269-340 222-279 (449)
37 PF14685 Tricorn_PDZ: Tricorn 98.0 3.6E-06 7.8E-11 62.1 2.7 60 268-339 12-79 (88)
38 PRK10942 serine endoprotease; 98.0 4.2E-06 9.1E-11 81.6 3.1 58 268-340 408-465 (473)
39 TIGR00054 RIP metalloprotease 97.9 4.9E-06 1.1E-10 80.0 2.7 57 268-339 128-184 (420)
40 COG0793 Prc Periplasmic protea 97.9 1.6E-05 3.4E-10 75.9 5.4 74 247-339 98-171 (406)
41 TIGR02860 spore_IV_B stage IV 97.8 2.3E-05 5.1E-10 73.6 3.9 66 251-340 98-171 (402)
42 PF05579 Peptidase_S32: Equine 97.7 0.00028 6E-09 61.8 9.6 117 57-216 112-230 (297)
43 KOG3553 Tax interaction protei 97.6 4.5E-05 9.7E-10 56.3 3.2 34 268-312 59-92 (124)
44 PRK11186 carboxy-terminal prot 97.6 7.4E-05 1.6E-09 75.2 5.5 73 247-338 242-319 (667)
45 PF03761 DUF316: Domain of unk 97.4 0.0048 1E-07 56.1 14.4 111 114-235 158-273 (282)
46 PF00548 Peptidase_C3: 3C cyst 97.2 0.015 3.2E-07 48.9 13.5 135 58-214 26-170 (172)
47 PF10459 Peptidase_S46: Peptid 97.1 0.0032 7E-08 63.9 10.4 23 58-80 48-70 (698)
48 PF04495 GRASP55_65: GRASP55/6 97.1 0.00025 5.5E-09 57.1 1.9 57 268-338 43-100 (138)
49 PF12812 PDZ_1: PDZ-like domai 97.0 0.00049 1.1E-08 49.7 2.7 62 250-325 10-73 (78)
50 COG5640 Secreted trypsin-like 97.0 0.017 3.6E-07 53.0 12.5 55 189-243 223-280 (413)
51 PF05580 Peptidase_S55: SpoIVB 96.9 0.032 6.9E-07 47.9 12.6 44 185-232 171-214 (218)
52 COG3480 SdrC Predicted secrete 96.8 0.00064 1.4E-08 61.1 2.0 54 268-337 130-183 (342)
53 PF08192 Peptidase_S64: Peptid 96.7 0.014 2.9E-07 57.8 10.6 118 115-240 541-688 (695)
54 PF10459 Peptidase_S46: Peptid 96.7 0.0028 6E-08 64.3 5.8 57 184-240 623-686 (698)
55 COG3975 Predicted protease wit 96.6 0.0036 7.8E-08 60.1 5.2 31 268-309 462-492 (558)
56 PF02122 Peptidase_S39: Peptid 96.5 0.032 6.9E-07 48.0 10.1 148 57-231 30-182 (203)
57 KOG3209 WW domain-containing p 96.4 0.0016 3.4E-08 64.2 2.1 54 272-337 782-835 (984)
58 PF00949 Peptidase_S7: Peptida 96.4 0.0069 1.5E-07 48.2 5.2 33 185-217 88-120 (132)
59 KOG3129 26S proteasome regulat 96.4 0.0018 4E-08 54.7 2.0 60 269-339 140-199 (231)
60 PRK09681 putative type II secr 96.1 0.003 6.6E-08 56.6 2.1 53 275-340 211-266 (276)
61 KOG3532 Predicted protein kina 95.9 0.0029 6.3E-08 62.1 1.2 54 268-338 398-451 (1051)
62 KOG3580 Tight junction protein 94.9 0.017 3.6E-07 56.2 2.4 78 251-340 200-279 (1027)
63 TIGR02860 spore_IV_B stage IV 94.8 0.26 5.6E-06 46.8 10.0 96 132-232 294-394 (402)
64 KOG3580 Tight junction protein 94.5 0.014 3E-07 56.8 0.8 60 266-338 427-486 (1027)
65 COG3031 PulC Type II secretory 94.4 0.016 3.5E-07 50.1 1.0 59 269-340 208-266 (275)
66 PF02907 Peptidase_S29: Hepati 93.4 0.15 3.2E-06 40.3 4.5 131 59-233 14-146 (148)
67 KOG3550 Receptor targeting pro 93.3 0.096 2.1E-06 42.0 3.3 38 268-315 115-152 (207)
68 PF00944 Peptidase_S3: Alphavi 92.8 0.19 4.1E-06 39.6 4.3 30 188-217 100-129 (158)
69 KOG2921 Intramembrane metallop 92.0 0.095 2E-06 48.7 2.0 43 263-316 215-258 (484)
70 PF03510 Peptidase_C24: 2C end 91.9 1 2.2E-05 34.3 7.2 56 59-139 1-56 (105)
71 KOG3552 FERM domain protein FR 91.9 0.14 2.9E-06 52.6 3.1 36 268-315 75-110 (1298)
72 KOG3209 WW domain-containing p 91.8 0.092 2E-06 52.3 1.8 60 268-340 923-982 (984)
73 KOG3605 Beta amyloid precursor 91.1 0.38 8.3E-06 47.6 5.3 101 191-313 677-790 (829)
74 KOG3542 cAMP-regulated guanine 90.9 0.091 2E-06 52.0 0.9 37 268-315 562-598 (1283)
75 PF02395 Peptidase_S6: Immunog 90.6 3.7 8E-05 42.7 12.0 49 191-240 213-266 (769)
76 PF05416 Peptidase_C37: Southa 89.3 0.84 1.8E-05 43.1 5.7 140 53-216 375-528 (535)
77 KOG1892 Actin filament-binding 88.7 0.19 4E-06 51.8 1.1 59 268-338 960-1018(1629)
78 KOG3606 Cell polarity protein 88.5 0.45 9.7E-06 42.1 3.1 61 268-338 194-261 (358)
79 PF09342 DUF1986: Domain of un 88.1 7.8 0.00017 34.2 10.4 92 56-155 27-131 (267)
80 KOG3651 Protein kinase C, alph 86.7 0.75 1.6E-05 41.4 3.5 38 268-315 30-67 (429)
81 KOG0606 Microtubule-associated 84.5 0.3 6.6E-06 51.3 0.1 35 270-315 660-694 (1205)
82 KOG3549 Syntrophins (type gamm 84.3 0.36 7.8E-06 44.2 0.4 38 269-316 81-118 (505)
83 KOG3834 Golgi reassembly stack 81.3 0.59 1.3E-05 44.1 0.6 58 268-338 15-72 (462)
84 PF00947 Pico_P2A: Picornaviru 81.2 3.3 7.2E-05 32.5 4.6 32 183-215 79-110 (127)
85 PF01732 DUF31: Putative pepti 81.0 1.2 2.5E-05 42.4 2.5 24 189-212 350-373 (374)
86 COG0750 Predicted membrane-ass 80.8 1.2 2.7E-05 42.0 2.7 32 274-316 135-166 (375)
87 KOG3571 Dishevelled 3 and rela 79.3 1.7 3.8E-05 41.9 3.0 39 268-316 277-315 (626)
88 KOG0609 Calcium/calmodulin-dep 79.3 1.3 2.9E-05 43.1 2.3 37 269-315 147-183 (542)
89 KOG3551 Syntrophins (type beta 77.2 0.36 7.9E-06 44.8 -2.0 38 269-316 111-148 (506)
90 PF12381 Peptidase_C3G: Tungro 63.3 9.3 0.0002 33.0 3.5 56 182-241 168-229 (231)
91 cd00600 Sm_like The eukaryotic 61.2 26 0.00056 23.5 5.0 32 93-126 8-39 (63)
92 PF11874 DUF3394: Domain of un 61.0 33 0.00071 29.0 6.4 28 268-306 122-149 (183)
93 cd01731 archaeal_Sm1 The archa 59.4 27 0.00058 24.1 4.8 32 93-126 12-43 (68)
94 cd01726 LSm6 The eukaryotic Sm 58.2 26 0.00057 24.1 4.6 32 93-126 12-43 (67)
95 cd01730 LSm3 The eukaryotic Sm 58.2 22 0.00047 25.7 4.3 30 93-124 13-42 (82)
96 COG0298 HypC Hydrogenase matur 57.5 23 0.00049 25.5 4.1 47 106-154 5-52 (82)
97 PRK00737 small nuclear ribonuc 57.1 30 0.00065 24.3 4.8 32 93-126 16-47 (72)
98 cd01722 Sm_F The eukaryotic Sm 56.8 27 0.00059 24.1 4.5 31 93-125 13-43 (68)
99 cd01732 LSm5 The eukaryotic Sm 56.6 27 0.00059 24.9 4.5 30 93-124 15-44 (76)
100 cd01720 Sm_D2 The eukaryotic S 56.1 29 0.00062 25.5 4.7 32 93-126 16-47 (87)
101 cd01717 Sm_B The eukaryotic Sm 56.1 28 0.0006 24.9 4.6 31 93-125 12-42 (79)
102 PF00571 CBS: CBS domain CBS d 55.6 14 0.00031 23.8 2.9 21 193-213 28-48 (57)
103 cd06168 LSm9 The eukaryotic Sm 55.4 32 0.0007 24.4 4.7 31 93-125 12-42 (75)
104 cd01729 LSm7 The eukaryotic Sm 54.4 33 0.00071 24.8 4.7 31 93-125 14-44 (81)
105 cd01735 LSm12_N LSm12 belongs 53.0 54 0.0012 22.3 5.2 32 93-126 8-39 (61)
106 KOG3605 Beta amyloid precursor 51.9 6.9 0.00015 39.2 1.1 35 270-314 675-709 (829)
107 cd01719 Sm_G The eukaryotic Sm 51.2 42 0.0009 23.6 4.7 31 93-125 12-42 (72)
108 cd01728 LSm1 The eukaryotic Sm 50.4 42 0.00092 23.8 4.7 31 93-125 14-44 (74)
109 smart00651 Sm snRNP Sm protein 49.6 46 0.001 22.6 4.8 32 93-126 10-41 (67)
110 COG1868 FliM Flagellar motor s 48.7 65 0.0014 30.1 6.8 40 204-243 191-230 (332)
111 cd01727 LSm8 The eukaryotic Sm 47.1 48 0.001 23.3 4.6 31 93-125 11-41 (74)
112 PF01423 LSM: LSM domain ; In 44.0 66 0.0014 21.8 4.8 33 93-127 10-42 (67)
113 COG1958 LSM1 Small nuclear rib 43.5 55 0.0012 23.3 4.5 32 93-126 19-50 (79)
114 PF08669 GCV_T_C: Glycine clea 42.8 44 0.00096 24.5 4.1 33 195-227 34-66 (95)
115 KOG3834 Golgi reassembly stack 41.2 10 0.00022 36.2 0.3 35 272-316 113-147 (462)
116 PF02743 Cache_1: Cache domain 39.2 34 0.00074 24.1 2.9 32 197-241 18-49 (81)
117 PF01455 HupF_HypC: HupF/HypC 38.2 1.2E+02 0.0026 21.1 5.3 43 106-151 5-47 (68)
118 PF02601 Exonuc_VII_L: Exonucl 38.1 45 0.00097 30.8 4.2 38 58-109 281-318 (319)
119 PF14827 Cache_3: Sensory doma 35.9 40 0.00086 25.9 3.0 18 198-215 94-111 (116)
120 COG4820 EutJ Ethanolamine util 35.5 1.1E+02 0.0024 26.5 5.6 103 198-307 43-167 (277)
121 COG0260 PepB Leucyl aminopepti 33.6 43 0.00093 33.0 3.3 15 298-312 316-330 (485)
122 PTZ00138 small nuclear ribonuc 32.2 1E+02 0.0022 22.7 4.4 33 93-125 28-60 (89)
123 COG5233 GRH1 Peripheral Golgi 32.0 30 0.00064 31.8 1.8 33 271-314 66-98 (417)
124 PF01732 DUF31: Putative pepti 30.7 29 0.00063 32.9 1.7 24 56-79 35-68 (374)
125 KOG1738 Membrane-associated gu 30.7 53 0.0012 33.0 3.4 63 236-315 200-262 (638)
126 COG2524 Predicted transcriptio 30.3 2.9E+02 0.0062 24.9 7.5 20 193-213 201-220 (294)
127 cd01739 LSm11_C The eukaryotic 30.2 94 0.002 21.4 3.5 35 93-127 12-46 (66)
128 PRK06437 hypothetical protein; 29.0 1.4E+02 0.003 20.5 4.4 30 298-338 33-62 (67)
129 PF10049 DUF2283: Protein of u 28.1 43 0.00094 21.6 1.7 11 202-212 36-46 (50)
130 cd04627 CBS_pair_14 The CBS do 26.9 53 0.0011 24.8 2.3 21 193-213 97-117 (123)
131 cd01721 Sm_D3 The eukaryotic S 26.9 2.2E+02 0.0047 19.7 7.4 32 93-126 12-43 (70)
132 PF08605 Rad9_Rad53_bind: Fung 26.6 1.4E+02 0.003 23.8 4.6 56 93-152 15-70 (131)
133 cd01733 LSm10 The eukaryotic S 26.6 2.4E+02 0.0052 20.1 7.5 32 93-126 21-52 (78)
134 COG2104 ThiS Sulfur transfer p 25.4 1.3E+02 0.0028 21.0 3.7 34 298-338 30-63 (68)
135 cd04603 CBS_pair_KefB_assoc Th 25.2 60 0.0013 24.1 2.3 21 193-213 85-105 (111)
136 cd04620 CBS_pair_7 The CBS dom 24.7 62 0.0013 23.9 2.3 20 194-213 90-109 (115)
137 TIGR00739 yajC preprotein tran 24.3 1.3E+02 0.0027 21.9 3.7 40 296-340 37-78 (84)
138 cd01724 Sm_D1 The eukaryotic S 22.6 3.2E+02 0.0069 20.1 7.2 60 93-156 13-72 (90)
139 TIGR03279 cyano_FeS_chp putati 21.8 41 0.00089 32.5 0.9 21 271-291 1-21 (433)
140 cd04597 CBS_pair_DRTGG_assoc2 21.6 91 0.002 23.4 2.7 21 193-213 87-107 (113)
141 cd01723 LSm4 The eukaryotic Sm 21.0 3.1E+02 0.0066 19.3 7.6 32 93-126 13-44 (76)
142 PRK10413 hydrogenase 2 accesso 20.3 2.1E+02 0.0047 20.7 4.1 48 106-153 5-55 (82)
143 PF05578 Peptidase_S31: Pestiv 20.3 2.9E+02 0.0062 22.6 5.2 127 56-214 50-182 (211)
144 cd04592 CBS_pair_EriC_assoc_eu 20.2 99 0.0021 24.2 2.7 20 194-213 23-42 (133)
145 TIGR00074 hypC_hupF hydrogenas 20.0 2.5E+02 0.0055 20.0 4.4 41 106-151 5-45 (76)
146 smart00116 CBS Domain in cysta 20.0 97 0.0021 17.9 2.2 20 194-213 22-41 (49)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=6.1e-52 Score=397.64 Aligned_cols=289 Identities=39% Similarity=0.564 Sum_probs=244.0
Q ss_pred HHHHHHHHhCCCeEEEEeeeecccc--------ccCCc------ccccCCceEEEEEcC-CCEEEeCccccCCCCCCCCC
Q 019504 22 RIAQLFEKNTYSVVNIFDVTLRPTL--------NVTGL------VEIPEGNGSGVVWDG-KGHIVTNFHVIGSALSRKPA 86 (340)
Q Consensus 22 ~~~~~~~~~~~svV~I~~~~~~~~~--------~~~~~------~~~~~~~GsGfiI~~-~G~IlT~~Hvv~~~~~~~~~ 86 (340)
.+.++++++.+|||.|.+....... .+++. .....+.||||+|++ +||||||+|||.++.
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~----- 115 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQ----- 115 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCC-----
Confidence 6899999999999999875432110 01111 112347899999985 799999999999886
Q ss_pred CCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEE
Q 019504 87 EGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVI 166 (340)
Q Consensus 87 ~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~v 166 (340)
.+.|++.||+ .++|++++.|+.+||||||++.+. .++++.|+++..+++|++|+++|+|+++..+++.|+|
T Consensus 116 ------~i~V~~~dg~--~~~a~vvg~D~~~DlAvlkv~~~~-~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~Giv 186 (455)
T PRK10139 116 ------KISIQLNDGR--EFDAKLIGSDDQSDIALLQIQNPS-KLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGII 186 (455)
T ss_pred ------EEEEEECCCC--EEEEEEEEEcCCCCEEEEEecCCC-CCceeEecCccccCCCCEEEEEecCCCCCCceEEEEE
Confidence 8999999997 899999999999999999998643 4789999999999999999999999999999999999
Q ss_pred eeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCce
Q 019504 167 SGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKV 246 (340)
Q Consensus 167 s~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~ 246 (340)
++..+...... .+.++|++|+.+++|+|||||||.+|+||||+++.+...++..+++|+||++.+++++++|+++|++
T Consensus 187 S~~~r~~~~~~--~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v 264 (455)
T PRK10139 187 SALGRSGLNLE--GLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEI 264 (455)
T ss_pred ccccccccCCC--CcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcc
Confidence 99877533222 2346899999999999999999999999999999877666678999999999999999999999999
Q ss_pred eeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCcee
Q 019504 247 VRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRI 322 (340)
Q Consensus 247 ~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~ 322 (340)
.++|||+.+. +..+++.++++ .|++|.+|.++|||+++|| ++||+|++|||++|.+..| +..
T Consensus 265 ~r~~LGv~~~~l~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL-----------~~GDvIl~InG~~V~s~~d---l~~ 330 (455)
T PRK10139 265 KRGLLGIKGTEMSADIAKAFNLDVQRGAFVSEVLPNSGSAKAGV-----------KAGDIITSLNGKPLNSFAE---LRS 330 (455)
T ss_pred cccceeEEEEECCHHHHHhcCCCCCCceEEEEECCCChHHHCCC-----------CCCCEEEEECCEECCCHHH---HHH
Confidence 9999999764 57788888876 7999999999999999999 9999999999999999998 555
Q ss_pred EEEe-eCCCCceEEEEeCC
Q 019504 323 YLIC-AEPNQDHLTCLKSS 340 (340)
Q Consensus 323 ~~~~-~~~~~~~~~~~~~~ 340 (340)
.+.. ......+++++|++
T Consensus 331 ~l~~~~~g~~v~l~V~R~G 349 (455)
T PRK10139 331 RIATTEPGTKVKLGLLRNG 349 (455)
T ss_pred HHHhcCCCCEEEEEEEECC
Confidence 5554 33344678888864
No 2
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of PDZ domain (in contrast to DegP with two copies). A critical role of this DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=3.9e-51 Score=381.49 Aligned_cols=291 Identities=36% Similarity=0.496 Sum_probs=244.4
Q ss_pred CCcchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEE
Q 019504 16 LLPNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVN 95 (340)
Q Consensus 16 ~~~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~ 95 (340)
..+.+..+.++++++.+|||.|+......+ .+......+.||||+|+++||||||+|||.++. .+.
T Consensus 40 ~~~~~~~~~~~~~~~~psVV~I~~~~~~~~---~~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~-----------~i~ 105 (351)
T TIGR02038 40 NNTVEISFNKAVRRAAPAVVNIYNRSISQN---SLNQLSIQGLGSGVIMSKEGYILTNYHVIKKAD-----------QIV 105 (351)
T ss_pred ccccchhHHHHHHhcCCcEEEEEeEecccc---ccccccccceEEEEEEeCCeEEEecccEeCCCC-----------EEE
Confidence 345556889999999999999987544332 122233457899999999999999999999876 799
Q ss_pred EEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeecccccc
Q 019504 96 ILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFS 175 (340)
Q Consensus 96 v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~ 175 (340)
|++.||+ .++|++++.|+.+||||||++... +++++++++..+++|++|+++|||++...+++.|+|+...+....
T Consensus 106 V~~~dg~--~~~a~vv~~d~~~DlAvlkv~~~~--~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~ 181 (351)
T TIGR02038 106 VALQDGR--KFEAELVGSDPLTDLAVLKIEGDN--LPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLS 181 (351)
T ss_pred EEECCCC--EEEEEEEEecCCCCEEEEEecCCC--CceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccC
Confidence 9999997 799999999999999999999754 788899888889999999999999999999999999998775432
Q ss_pred CCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCC--CcCceEEEEehHhHHHHHHHHHHcCceeeeeeeE
Q 019504 176 QAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTG--TSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNV 253 (340)
Q Consensus 176 ~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~--~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~ 253 (340)
..+ ..++|++|+.+++|+|||||+|.+|+||||+++.+.... ...+++|+||++.+++++++|+++|++.++|||+
T Consensus 182 ~~~--~~~~iqtda~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv 259 (351)
T TIGR02038 182 SVG--RQNFIQTDAAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGV 259 (351)
T ss_pred CCC--cceEEEECCccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeee
Confidence 221 236799999999999999999999999999998765422 2368999999999999999999999999999999
Q ss_pred Eec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eC
Q 019504 254 DIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AE 328 (340)
Q Consensus 254 ~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~ 328 (340)
.+. +...++.++++ .|++|.+|.++|||+++|| ++||+|++|||++|.+..| +...+.. ..
T Consensus 260 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Ing~~V~s~~d---l~~~l~~~~~ 325 (351)
T TIGR02038 260 SGEDINSVVAQGLGLPDLRGIVITGVDPNGPAARAGI-----------LVRDVILKYDGKDVIGAEE---LMDRIAETRP 325 (351)
T ss_pred EEEECCHHHHHhcCCCccccceEeecCCCChHHHCCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhcCC
Confidence 875 46667788887 6999999999999999999 9999999999999999988 5555544 33
Q ss_pred CCCceEEEEeCC
Q 019504 329 PNQDHLTCLKSS 340 (340)
Q Consensus 329 ~~~~~~~~~~~~ 340 (340)
....+++++|++
T Consensus 326 g~~v~l~v~R~g 337 (351)
T TIGR02038 326 GSKVMVTVLRQG 337 (351)
T ss_pred CCEEEEEEEECC
Confidence 445678888863
No 3
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=2.8e-50 Score=375.46 Aligned_cols=287 Identities=36% Similarity=0.500 Sum_probs=237.4
Q ss_pred hhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEec
Q 019504 20 EERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILAS 99 (340)
Q Consensus 20 ~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~ 99 (340)
...+.++++++.+|||.|........ ........+.||||+|+++||||||+|||.++. .+.|++.
T Consensus 44 ~~~~~~~~~~~~psvV~v~~~~~~~~---~~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~-----------~i~V~~~ 109 (353)
T PRK10898 44 PASYNQAVRRAAPAVVNVYNRSLNST---SHNQLEIRTLGSGVIMDQRGYILTNKHVINDAD-----------QIIVALQ 109 (353)
T ss_pred cchHHHHHHHhCCcEEEEEeEecccc---CcccccccceeeEEEEeCCeEEEecccEeCCCC-----------EEEEEeC
Confidence 34788999999999999987543221 112233447899999999999999999999876 8999999
Q ss_pred CCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCc
Q 019504 100 DGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGV 179 (340)
Q Consensus 100 ~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~ 179 (340)
||+ .++|++++.|+.+||||||++... +++++|+++..+++|+.|+++|||++...+++.|+|+...+......+
T Consensus 110 dg~--~~~a~vv~~d~~~DlAvl~v~~~~--l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~~~- 184 (353)
T PRK10898 110 DGR--VFEALLVGSDSLTDLAVLKINATN--LPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSPTG- 184 (353)
T ss_pred CCC--EEEEEEEEEcCCCCEEEEEEcCCC--CCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCCcc-
Confidence 997 799999999999999999998753 788999888889999999999999998899999999988775432222
Q ss_pred eecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCC---CcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEec
Q 019504 180 TIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTG---TSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIA 256 (340)
Q Consensus 180 ~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~---~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~ 256 (340)
..++|++|+.+++|+|||||+|.+|+||||+++.+...+ ...+++|+||++.+++++++|+++|++.++|||+...
T Consensus 185 -~~~~iqtda~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~ 263 (353)
T PRK10898 185 -RQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGR 263 (353)
T ss_pred -ccceEEeccccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEE
Confidence 235799999999999999999999999999998775432 2368999999999999999999999999999999764
Q ss_pred c--HHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eCCCC
Q 019504 257 P--DLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AEPNQ 331 (340)
Q Consensus 257 ~--~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~~~~ 331 (340)
+ ...+..++++ .|++|.+|.++|||+++|| ++||+|++|||++|.+..| +...+.. .....
T Consensus 264 ~~~~~~~~~~~~~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Ing~~V~s~~~---l~~~l~~~~~g~~ 329 (353)
T PRK10898 264 EIAPLHAQGGGIDQLQGIVVNEVSPDGPAAKAGI-----------QVNDLIISVNNKPAISALE---TMDQVAEIRPGSV 329 (353)
T ss_pred ECCHHHHHhcCCCCCCeEEEEEECCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhcCCCCE
Confidence 3 4444555554 8999999999999999999 9999999999999999887 5544444 33344
Q ss_pred ceEEEEeCC
Q 019504 332 DHLTCLKSS 340 (340)
Q Consensus 332 ~~~~~~~~~ 340 (340)
.+++++|++
T Consensus 330 v~l~v~R~g 338 (353)
T PRK10898 330 IPVVVMRDD 338 (353)
T ss_pred EEEEEEECC
Confidence 678888863
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=2.9e-49 Score=380.91 Aligned_cols=289 Identities=39% Similarity=0.552 Sum_probs=242.4
Q ss_pred HHHHHHHHhCCCeEEEEeeeeccc---------cccCCc----------------------------ccccCCceEEEEE
Q 019504 22 RIAQLFEKNTYSVVNIFDVTLRPT---------LNVTGL----------------------------VEIPEGNGSGVVW 64 (340)
Q Consensus 22 ~~~~~~~~~~~svV~I~~~~~~~~---------~~~~~~----------------------------~~~~~~~GsGfiI 64 (340)
.+.++++++.+|||.|++...... ..+++. .....+.||||+|
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 599999999999999987553211 001110 0112468999999
Q ss_pred cC-CCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCC
Q 019504 65 DG-KGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLK 143 (340)
Q Consensus 65 ~~-~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~ 143 (340)
++ +||||||+||+.++. ++.|++.||+ .++|++++.|+.+||||||++... .++++.|+++..++
T Consensus 119 ~~~~G~IlTn~HVv~~a~-----------~i~V~~~dg~--~~~a~vv~~D~~~DlAvlki~~~~-~l~~~~lg~s~~l~ 184 (473)
T PRK10942 119 DADKGYVVTNNHVVDNAT-----------KIKVQLSDGR--KFDAKVVGKDPRSDIALIQLQNPK-NLTAIKMADSDALR 184 (473)
T ss_pred ECCCCEEEeChhhcCCCC-----------EEEEEECCCC--EEEEEEEEecCCCCEEEEEecCCC-CCceeEecCccccC
Confidence 86 599999999999886 8999999997 899999999999999999997543 47899999999999
Q ss_pred CCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCc
Q 019504 144 VGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAG 223 (340)
Q Consensus 144 ~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~ 223 (340)
+|++|+++|+|+++..+++.|+|+...+..... ..+.++|++|+.+++|+|||||+|.+|+||||+++.+...++..+
T Consensus 185 ~G~~V~aiG~P~g~~~tvt~GiVs~~~r~~~~~--~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g 262 (473)
T PRK10942 185 VGDYTVAIGNPYGLGETVTSGIVSALGRSGLNV--ENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIG 262 (473)
T ss_pred CCCEEEEEcCCCCCCcceeEEEEEEeecccCCc--ccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCccc
Confidence 999999999999999999999999987653221 123467999999999999999999999999999998877666678
Q ss_pred eEEEEehHhHHHHHHHHHHcCceeeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcC
Q 019504 224 VGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIIL 299 (340)
Q Consensus 224 ~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~ 299 (340)
++|+||++.+++++++|++++++.++|||+.+. +..+++.++++ .|++|.+|.++|||+++|| ++
T Consensus 263 ~gfaIP~~~~~~v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL-----------~~ 331 (473)
T PRK10942 263 IGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGI-----------KA 331 (473)
T ss_pred EEEEEEHHHHHHHHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCC-----------CC
Confidence 999999999999999999999999999998774 57788888886 7999999999999999999 99
Q ss_pred CcEEEEECCEEccCCCCCCCceeEEEee-CCCCceEEEEeCC
Q 019504 300 GDIIVAVNNKPVSFSCLSIPSRIYLICA-EPNQDHLTCLKSS 340 (340)
Q Consensus 300 GDvi~~i~g~~v~~~~d~~~~~~~~~~~-~~~~~~~~~~~~~ 340 (340)
||+|++|||++|.+.+| ++..+... .....+++|+|++
T Consensus 332 GDvIl~InG~~V~s~~d---l~~~l~~~~~g~~v~l~v~R~G 370 (473)
T PRK10942 332 GDVITSLNGKPISSFAA---LRAQVGTMPVGSKLTLGLLRDG 370 (473)
T ss_pred CCEEEEECCEECCCHHH---HHHHHHhcCCCCEEEEEEEECC
Confidence 99999999999999998 55555443 3334678888764
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=1.1e-48 Score=375.71 Aligned_cols=288 Identities=43% Similarity=0.571 Sum_probs=242.9
Q ss_pred HHHHHHHhCCCeEEEEeeeecccc-----------ccCCc----------ccccCCceEEEEEcCCCEEEeCccccCCCC
Q 019504 23 IAQLFEKNTYSVVNIFDVTLRPTL-----------NVTGL----------VEIPEGNGSGVVWDGKGHIVTNFHVIGSAL 81 (340)
Q Consensus 23 ~~~~~~~~~~svV~I~~~~~~~~~-----------~~~~~----------~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~ 81 (340)
+.++++++.+|||.|.+....... .+++. .....+.||||+|+++||||||+||+.++.
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~ 82 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD 82 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence 778999999999999875422110 11111 112457899999999999999999999876
Q ss_pred CCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCce
Q 019504 82 SRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTL 161 (340)
Q Consensus 82 ~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~ 161 (340)
.+.|++.|++ .++|++++.|+.+|||||+++.. ..++++.|+++..+++|++|+++|||++...++
T Consensus 83 -----------~i~V~~~~~~--~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~ 148 (428)
T TIGR02037 83 -----------EITVTLSDGR--EFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTV 148 (428)
T ss_pred -----------eEEEEeCCCC--EEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcE
Confidence 8999999997 79999999999999999999875 348999999888899999999999999999999
Q ss_pred eEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHH
Q 019504 162 TVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLI 241 (340)
Q Consensus 162 ~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~ 241 (340)
+.|+|+...+.... ...+..++++|+.+++|+|||||||.+|+||||+++.+...++..+++|+||++.+++++++|+
T Consensus 149 t~G~vs~~~~~~~~--~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~ 226 (428)
T TIGR02037 149 TSGIVSALGRSGLG--IGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLI 226 (428)
T ss_pred EEEEEEecccCccC--CCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHH
Confidence 99999988765321 1223467999999999999999999999999999988776555678999999999999999999
Q ss_pred HcCceeeeeeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCC
Q 019504 242 QYGKVVRAGLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLS 317 (340)
Q Consensus 242 ~~~~~~~~~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~ 317 (340)
+++++.++|||+.+. +..+++.++++ .|++|.+|.++|||+++|| ++||+|++|||++|.+..|
T Consensus 227 ~~g~~~~~~lGi~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL-----------~~GDvI~~Vng~~i~~~~~- 294 (428)
T TIGR02037 227 EGGKVQRGWLGVTIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGL-----------KAGDVILSVNGKPISSFAD- 294 (428)
T ss_pred hcCcCcCCcCceEeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCC-----------CCCCEEEEECCEEcCCHHH-
Confidence 999999999999875 47788889986 8999999999999999999 9999999999999999987
Q ss_pred CCceeEEEee-CCCCceEEEEeCC
Q 019504 318 IPSRIYLICA-EPNQDHLTCLKSS 340 (340)
Q Consensus 318 ~~~~~~~~~~-~~~~~~~~~~~~~ 340 (340)
+...+... .....+++++|++
T Consensus 295 --~~~~l~~~~~g~~v~l~v~R~g 316 (428)
T TIGR02037 295 --LRRAIGTLKPGKKVTLGILRKG 316 (428)
T ss_pred --HHHHHHhcCCCCEEEEEEEECC
Confidence 55555443 3345678888863
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=1.1e-38 Score=298.68 Aligned_cols=290 Identities=42% Similarity=0.538 Sum_probs=238.4
Q ss_pred hHHHHHHHHhCCCeEEEEeeeeccccccC-Ccc-cc-cCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEE
Q 019504 21 ERIAQLFEKNTYSVVNIFDVTLRPTLNVT-GLV-EI-PEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNIL 97 (340)
Q Consensus 21 ~~~~~~~~~~~~svV~I~~~~~~~~~~~~-~~~-~~-~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~ 97 (340)
..+..+++++.++||.++.........++ ... .. ..+.||||+++++|||+|+.|++..+. ++.+.
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~-----------~i~v~ 101 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAE-----------EITVT 101 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcc-----------eEEEE
Confidence 57889999999999999875433320000 000 00 147899999998999999999999965 88999
Q ss_pred ecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCC
Q 019504 98 ASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQA 177 (340)
Q Consensus 98 ~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~ 177 (340)
+.||+ .+++++++.|+..|+|+|+++.... ++.+.++++..++.|+.++++|+|+++..+++.|+++...+.....
T Consensus 102 l~dg~--~~~a~~vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~~v~~- 177 (347)
T COG0265 102 LADGR--EVPAKLVGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRTGVGS- 177 (347)
T ss_pred eCCCC--EEEEEEEecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccccccC-
Confidence 99997 8999999999999999999998654 7888999999999999999999999999999999999998862211
Q ss_pred CceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEecc
Q 019504 178 GVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAP 257 (340)
Q Consensus 178 ~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~ 257 (340)
...+.++||+|+.+++|+||||++|.+|++|||++..+...++..+++|+||++.+..++.++...|++.++++|+.+.+
T Consensus 178 ~~~~~~~IqtdAain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~ 257 (347)
T COG0265 178 AGGYVNFIQTDAAINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEP 257 (347)
T ss_pred cccccchhhcccccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEE
Confidence 11145789999999999999999999999999999988876655679999999999999999999889999999988754
Q ss_pred HHHHhhcCC--CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC-CCCceE
Q 019504 258 DLVASQLNV--GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE-PNQDHL 334 (340)
Q Consensus 258 ~~~~~~~~~--~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~-~~~~~~ 334 (340)
......+|+ ..|++|.+|.+++||+++|+ +.||+|+++||+++.+..| +...+.... .....+
T Consensus 258 ~~~~~~~g~~~~~G~~V~~v~~~spa~~agi-----------~~Gdii~~vng~~v~~~~~---l~~~v~~~~~g~~v~~ 323 (347)
T COG0265 258 LTADIALGLPVAAGAVVLGVLPGSPAAKAGI-----------KAGDIITAVNGKPVASLSD---LVAAVASNRPGDEVAL 323 (347)
T ss_pred cccccccCCCCCCceEEEecCCCChHHHcCC-----------CCCCEEEEECCEEccCHHH---HHHHHhccCCCCEEEE
Confidence 222122443 48999999999999999999 8999999999999999998 454444433 445677
Q ss_pred EEEeC
Q 019504 335 TCLKS 339 (340)
Q Consensus 335 ~~~~~ 339 (340)
+++|.
T Consensus 324 ~~~r~ 328 (347)
T COG0265 324 KLLRG 328 (347)
T ss_pred EEEEC
Confidence 77775
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=5.7e-29 Score=233.05 Aligned_cols=290 Identities=38% Similarity=0.464 Sum_probs=224.7
Q ss_pred CcchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEE
Q 019504 17 LPNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNI 96 (340)
Q Consensus 17 ~~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v 96 (340)
.......+++.++...|+|.|..........++.....+...||||+++.+|+++||+||+................+.+
T Consensus 124 ~k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi 203 (473)
T KOG1320|consen 124 RKYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQI 203 (473)
T ss_pred hhhhhhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEE
Confidence 33456778899999999999987544333333444456677899999999999999999998654322222223346888
Q ss_pred EecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccC
Q 019504 97 LASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQ 176 (340)
Q Consensus 97 ~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~ 176 (340)
..++|.....++.+.+.|+..|+|+++++.+....++++++.+..++.|+++..+|.|++..++.+.|+++...|....-
T Consensus 204 ~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~l 283 (473)
T KOG1320|consen 204 DAAIGPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKL 283 (473)
T ss_pred EEeecCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeeccccccccccccc
Confidence 88887445889999999999999999997765447888888888999999999999999999999999999988866542
Q ss_pred C---CceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCcee------
Q 019504 177 A---GVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVV------ 247 (340)
Q Consensus 177 ~---~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~------ 247 (340)
. +....+++++|+.++.|+||+|++|.+|++||+++......+-..+++|++|.+.++.++.+..++...-
T Consensus 284 g~~~g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~ 363 (473)
T KOG1320|consen 284 GLETGVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPL 363 (473)
T ss_pred CcccceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCc
Confidence 2 2355688999999999999999999999999999987776555678999999999999888874443211
Q ss_pred ---eeeeeEEe-------ccHHHHhhcC----CCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccC
Q 019504 248 ---RAGLNVDI-------APDLVASQLN----VGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSF 313 (340)
Q Consensus 248 ---~~~lg~~~-------~~~~~~~~~~----~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~ 313 (340)
+.++|... .-..+.+.+. ...+++|.+|.+++++...++ ++||+|++|||++|.+
T Consensus 364 ~p~~~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~-----------~~g~~V~~vng~~V~n 432 (473)
T KOG1320|consen 364 VPVHQYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGL-----------KPGDQVVKVNGKPVKN 432 (473)
T ss_pred ccccccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccc-----------cCCCEEEEECCEEeec
Confidence 12333221 1111122232 226899999999999999998 8999999999999999
Q ss_pred CCCC
Q 019504 314 SCLS 317 (340)
Q Consensus 314 ~~d~ 317 (340)
..|+
T Consensus 433 ~~~l 436 (473)
T KOG1320|consen 433 LKHL 436 (473)
T ss_pred hHHH
Confidence 9873
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.92 E-value=6.3e-24 Score=201.44 Aligned_cols=287 Identities=23% Similarity=0.272 Sum_probs=221.5
Q ss_pred cchhHHHHHHHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCC-CEEEeCccccCCCCCCCCCCCccEEEEEE
Q 019504 18 PNEERIAQLFEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGK-GHIVTNFHVIGSALSRKPAEGQVVARVNI 96 (340)
Q Consensus 18 ~~~~~~~~~~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~-G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v 96 (340)
+....+...+.++-+|||.|.......- .......+.||||++++. ||||||+|++.... +.-.+
T Consensus 49 ~~~e~w~~~ia~VvksvVsI~~S~v~~f----dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP----------~va~a 114 (955)
T KOG1421|consen 49 ATSEDWRNTIANVVKSVVSIRFSAVRAF----DTESAGESEATGFVVDKKLGYILTNRHVVAPGP----------FVASA 114 (955)
T ss_pred chhhhhhhhhhhhcccEEEEEehheeec----ccccccccceeEEEEecccceEEEeccccCCCC----------ceeEE
Confidence 3344888999999999999986543331 111223456999999976 99999999998765 24455
Q ss_pred EecCCceeEEEEEEEEeCCCCcEEEEEEecCC---CCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeecccc
Q 019504 97 LASDGVQKNFEGKLVGADRAKDLAVLKIEASE---DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDI 173 (340)
Q Consensus 97 ~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~---~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~ 173 (340)
.|.+-. +.+--.++.|+.+|+.+++.++.. ..+..+.++ ....++|.+++++|+..+...++..|.++.+.+..
T Consensus 115 vf~n~e--e~ei~pvyrDpVhdfGf~r~dps~ir~s~vt~i~la-p~~akvgseirvvgNDagEklsIlagflSrldr~a 191 (955)
T KOG1421|consen 115 VFDNHE--EIEIYPVYRDPVHDFGFFRYDPSTIRFSIVTEICLA-PELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNA 191 (955)
T ss_pred Eecccc--cCCcccccCCchhhcceeecChhhcceeeeeccccC-ccccccCCceEEecCCccceEEeehhhhhhccCCC
Confidence 565554 667778899999999999998763 123444554 34458999999999988888899999999998876
Q ss_pred ccCCCceec----ceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeee
Q 019504 174 FSQAGVTIG----GGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRA 249 (340)
Q Consensus 174 ~~~~~~~~~----~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~ 249 (340)
+......+. .++|.-+....|.||+|++|.+|..|.++..+... ....|++|++.+++.+.-++++..++|+
T Consensus 192 pdyg~~~yndfnTfy~QaasstsggssgspVv~i~gyAVAl~agg~~s----sas~ffLpLdrV~RaL~clq~n~PItRG 267 (955)
T KOG1421|consen 192 PDYGEDTYNDFNTFYIQAASSTSGGSSGSPVVDIPGYAVALNAGGSIS----SASDFFLPLDRVVRALRCLQNNTPITRG 267 (955)
T ss_pred ccccccccccccceeeeehhcCCCCCCCCceecccceEEeeecCCccc----ccccceeeccchhhhhhhhhcCCCcccc
Confidence 544333222 35788888889999999999999999998876543 4457999999999999999989999999
Q ss_pred eeeEEeccHHH--HhhcCCC--------------CCcEEE-eeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504 250 GLNVDIAPDLV--ASQLNVG--------------NGALVL-QVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS 312 (340)
Q Consensus 250 ~lg~~~~~~~~--~~~~~~~--------------~g~~V~-~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~ 312 (340)
.|-++|..... ++.+|++ .|++|. .|.+++||++. |++||++++||+.-+.
T Consensus 268 tLqvefl~k~~de~rrlGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~------------Le~GDillavN~t~l~ 335 (955)
T KOG1421|consen 268 TLQVEFLHKLFDECRRLGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKK------------LEPGDILLAVNSTCLN 335 (955)
T ss_pred eEEEEEehhhhHHHHhcCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhc------------cCCCcEEEEEcceehH
Confidence 99999976433 4556653 566654 59999999976 5999999999998888
Q ss_pred CCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 313 FSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 313 ~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
++.. ....|...-.+..++||.|-+
T Consensus 336 df~~---l~~iLDegvgk~l~LtI~Rgg 360 (955)
T KOG1421|consen 336 DFEA---LEQILDEGVGKNLELTIQRGG 360 (955)
T ss_pred HHHH---HHHHHhhccCceEEEEEEeCC
Confidence 8876 777787777777899998854
No 9
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.76 E-value=1e-17 Score=159.58 Aligned_cols=274 Identities=13% Similarity=0.125 Sum_probs=193.2
Q ss_pred HHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCC-CEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeE
Q 019504 27 FEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGK-GHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKN 105 (340)
Q Consensus 27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~-G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~ 105 (340)
.+++..+.|.+........++...+. ..|||.|++.+ |++++++.++.... .+.++++.|.. .
T Consensus 524 ~~~i~~~~~~v~~~~~~~l~g~s~~i----~kgt~~i~d~~~g~~vvsr~~vp~d~----------~d~~vt~~dS~--~ 587 (955)
T KOG1421|consen 524 SADISNCLVDVEPMMPVNLDGVSSDI----YKGTALIMDTSKGLGVVSRSVVPSDA----------KDQRVTEADSD--G 587 (955)
T ss_pred hhHHhhhhhhheeceeeccccchhhh----hcCceEEEEccCCceeEecccCCchh----------hceEEeecccc--c
Confidence 57788888888776555544433322 34999999855 99999999998654 37888888775 7
Q ss_pred EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeee-----ccccccCCCce
Q 019504 106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGL-----NRDIFSQAGVT 180 (340)
Q Consensus 106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~-----~~~~~~~~~~~ 180 (340)
++|.+.+.++..++|.++.++.. ...+.|. ...+..||++...|+............++.+ .+.........
T Consensus 588 i~a~~~fL~~t~n~a~~kydp~~--~~~~kl~-~~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~ 664 (955)
T KOG1421|consen 588 IPANVSFLHPTENVASFKYDPAL--EVQLKLT-DTTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRAT 664 (955)
T ss_pred ccceeeEecCccceeEeccChhH--hhhhccc-eeeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeec
Confidence 89999999999999999999864 3455664 3567889999999988654322222222222 22222222222
Q ss_pred ecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCC--CCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEeccH
Q 019504 181 IGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQT--GTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAPD 258 (340)
Q Consensus 181 ~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~--~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~~ 258 (340)
..+.|.+++.+..++--|-+.|.+|+|+|++-....+. +....+-|.+.+..+++.++.|+.+.......+|++|...
T Consensus 665 n~e~Is~~~nlsT~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i 744 (955)
T KOG1421|consen 665 NLEVISFMDNLSTSCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHI 744 (955)
T ss_pred ceEEEEEeccccccccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeE
Confidence 23567787777766666788999999999988766542 2234567788999999999999988777666678888765
Q ss_pred HHHh--hcCCCCCcEEEeeCCCChhhhcCCCccccCC--CCCCcCCcEEEEECCEEccCCCCCCCc
Q 019504 259 LVAS--QLNVGNGALVLQVPGNSLAAKAGILPTTRGF--AGNIILGDIIVAVNNKPVSFSCLSIPS 320 (340)
Q Consensus 259 ~~~~--~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~--~~~l~~GDvi~~i~g~~v~~~~d~~~~ 320 (340)
++++ .+|++ ..++.+.+.+|.-.++-+..+|... -..|..||||+++|||.|+.+.|+-++
T Consensus 745 ~laqar~lglp-~e~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~ 809 (955)
T KOG1421|consen 745 TLAQARTLGLP-SEFIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDF 809 (955)
T ss_pred EeehhhccCCC-HHHHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhh
Confidence 5554 45665 5566666666666666665555532 234789999999999999999984333
No 10
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.73 E-value=2.6e-17 Score=130.07 Aligned_cols=117 Identities=32% Similarity=0.471 Sum_probs=74.8
Q ss_pred eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEE--EEEEEeCCC-CcEEEEEEecCCCCcccee
Q 019504 59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFE--GKLVGADRA-KDLAVLKIEASEDLLKPIN 135 (340)
Q Consensus 59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~--a~v~~~d~~-~DlAlL~v~~~~~~~~~~~ 135 (340)
||||+|+++|+||||+||+.......... ...+.+.+.++. ... +++++.|+. .|+|||+++.
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~---~~~~~~~~~~~~--~~~~~~~~~~~~~~~~D~All~v~~--------- 66 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPD---NSSVEVVFPDGR--RVPPVAEVVYFDPDDYDLALLKVDP--------- 66 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-T---CSEEEEEETTSC--EEETEEEEEEEETT-TTEEEEEESC---------
T ss_pred CEEEEEcCCceEEEchhheecccccccCC---CCEEEEEecCCC--EEeeeEEEEEECCccccEEEEEEec---------
Confidence 79999999899999999999754321100 127888888887 456 999999999 9999999990
Q ss_pred ecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEE
Q 019504 136 VGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGI 210 (340)
Q Consensus 136 l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi 210 (340)
.. ..+......+ ............... .++ +++.+.+|+||||+||.+|+||||
T Consensus 67 ----~~-~~~~~~~~~~------------~~~~~~~~~~~~~~~---~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 67 ----WT-GVGGGVRVPG------------STSGVSPTSTNDNRM---LYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp ----EE-EEEEEEEEEE------------EEEEEEEEEEEETEE---EEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ----cc-ceeeeeEeee------------eccccccccCcccce---eEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00 0000001000 000000000000000 014 799999999999999999999997
No 11
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.46 E-value=3.7e-13 Score=127.06 Aligned_cols=256 Identities=21% Similarity=0.326 Sum_probs=184.0
Q ss_pred HHHHhCCCeEEEEeeeeccccccCCccc-ccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCcee
Q 019504 26 LFEKNTYSVVNIFDVTLRPTLNVTGLVE-IPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQK 104 (340)
Q Consensus 26 ~~~~~~~svV~I~~~~~~~~~~~~~~~~-~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~ 104 (340)
..+....|++.+.+....+....+|+.. +....|+||.+... .++|++|++....+. ..+.+. ..|...
T Consensus 55 ~~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~~-~lltn~~~v~~~~~~--------~~v~v~-~~gs~~ 124 (473)
T KOG1320|consen 55 VVDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYGK-KLLTNAHVVAPNNDH--------KFVTVK-KHGSPR 124 (473)
T ss_pred CccccccceeEEEeecccccccCcceeeehhcccccchhhccc-ceeecCccccccccc--------cccccc-cCCCch
Confidence 4566777899998877777766667654 44567999999854 899999999854321 144444 555555
Q ss_pred EEEEEEEEeCCCCcEEEEEEecCC--CCccceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceec
Q 019504 105 NFEGKLVGADRAKDLAVLKIEASE--DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIG 182 (340)
Q Consensus 105 ~~~a~v~~~d~~~DlAlL~v~~~~--~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~ 182 (340)
.+.+++...-.+.|+|++.++..+ ....|+.+.+ -+...+-++++| +....++.|.|+......+......+
T Consensus 125 k~~~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~--ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~y~~~~~~l- 198 (473)
T KOG1320|consen 125 KYKAFVAAVFEECDLAVVYIESEEFWKGMNPFELGD--IPSLNGSGFVVG---GDGIIVTNGHVVRVEPRIYAHSSTVL- 198 (473)
T ss_pred hhhhhHHHhhhcccceEEEEeeccccCCCcccccCC--CcccCccEEEEc---CCcEEEEeeEEEEEEeccccCCCcce-
Confidence 788888888899999999999743 2334455543 345568899998 77789999999988765544432221
Q ss_pred ceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCce-eeeeeeEE---eccH
Q 019504 183 GGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKV-VRAGLNVD---IAPD 258 (340)
Q Consensus 183 ~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~-~~~~lg~~---~~~~ 258 (340)
..+++|+...+|+||+|.+...+++.|+.....+.. ..+.+.+|.-.+.++.......+.. .++.++.. +.+.
T Consensus 199 ~~vqi~aa~~~~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~ 275 (473)
T KOG1320|consen 199 LRVQIDAAIGPGNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSG 275 (473)
T ss_pred eeEEEEEeecCCccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeeccccc
Confidence 348999999999999999987799999998877543 2678889988887776554443322 12222222 2234
Q ss_pred HHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504 259 LVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS 312 (340)
Q Consensus 259 ~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~ 312 (340)
.+++.+.+. .|+.+.++.+-+.|-+.| +.||+|+++||..|.
T Consensus 276 ~~R~~~~lg~~~g~~i~~~~qtd~ai~~~------------nsg~~ll~~DG~~Ig 319 (473)
T KOG1320|consen 276 QLRKSFKLGLETGVLISKINQTDAAINPG------------NSGGPLLNLDGEVIG 319 (473)
T ss_pred ccccccccCcccceeeeeecccchhhhcc------------cCCCcEEEecCcEee
Confidence 555555555 569999999988777664 899999999999996
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine proteases have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.40 E-value=1.3e-11 Score=107.51 Aligned_cols=167 Identities=23% Similarity=0.315 Sum_probs=105.3
Q ss_pred ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEe-------cCCceeEEEEEEEEe----CC---CCcEEEEE
Q 019504 58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILA-------SDGVQKNFEGKLVGA----DR---AKDLAVLK 123 (340)
Q Consensus 58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~-------~~g~~~~~~a~v~~~----d~---~~DlAlL~ 123 (340)
.++|++|+++ +|||++||+.... .+.+.+ .++....+...-+.. +. ..|+|||+
T Consensus 26 ~C~G~li~~~-~vLTaahC~~~~~-----------~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~ 93 (220)
T PF00089_consen 26 FCTGTLISPR-WVLTAAHCVDGAS-----------DIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLK 93 (220)
T ss_dssp EEEEEEEETT-EEEEEGGGHTSGG-----------SEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEE
T ss_pred eEeEEecccc-ccccccccccccc-----------ccccccccccccccccccccccccccccccccccccccccccccc
Confidence 4999999987 9999999999821 222222 222112333333322 22 56999999
Q ss_pred EecC---CCCccceeecCC-CCCCCCCEEEEEecCCCCCC----ceeEEEEeeecccc-cc-CCCceecceEEEee----
Q 019504 124 IEAS---EDLLKPINVGQS-SFLKVGQQCLAIGNPFGFDH----TLTVGVISGLNRDI-FS-QAGVTIGGGIQTDA---- 189 (340)
Q Consensus 124 v~~~---~~~~~~~~l~~~-~~~~~G~~v~~iG~p~g~~~----~~~~G~vs~~~~~~-~~-~~~~~~~~~i~~d~---- 189 (340)
++.+ ...+.++.+... ..++.|+.+.++||+..... .+....+..+.... .. .........+....
T Consensus 94 L~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~ 173 (220)
T PF00089_consen 94 LDRPITFGDNIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSG 173 (220)
T ss_dssp ESSSSEHBSSBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 9987 345677777652 34578999999999875322 34444443332211 00 00001113344444
Q ss_pred ccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHH
Q 019504 190 AINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIV 237 (340)
Q Consensus 190 ~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l 237 (340)
..+.|+|||||++.++.|+||++.. ..++......++.+++.+++|+
T Consensus 174 ~~~~g~sG~pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 174 DACQGDSGGPLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp BGGTTTTTSEEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred cccccccccccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence 7889999999998766799999987 3343333468889999888875
No 13
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.27 E-value=1.3e-10 Score=101.94 Aligned_cols=172 Identities=19% Similarity=0.189 Sum_probs=98.7
Q ss_pred CceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCC-------ceeEEEEEEEEeC-------CCCcEEEE
Q 019504 57 GNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDG-------VQKNFEGKLVGAD-------RAKDLAVL 122 (340)
Q Consensus 57 ~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g-------~~~~~~a~v~~~d-------~~~DlAlL 122 (340)
...+|.+|+++ +|||+|||+..... ..+.+.+... ....+..+-+..+ ...|||||
T Consensus 25 ~~C~GtlIs~~-~VLTaAhC~~~~~~---------~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll 94 (232)
T cd00190 25 HFCGGSLISPR-WVLTAAHCVYSSAP---------SNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALL 94 (232)
T ss_pred EEEEEEEeeCC-EEEECHHhcCCCCC---------ccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEE
Confidence 35899999977 99999999987421 1333333211 1112333333333 35799999
Q ss_pred EEecCC---CCccceeecCCC-CCCCCCEEEEEecCCCCCC-----ceeEEEEeeecccc---ccCC-CceecceEE---
Q 019504 123 KIEASE---DLLKPINVGQSS-FLKVGQQCLAIGNPFGFDH-----TLTVGVISGLNRDI---FSQA-GVTIGGGIQ--- 186 (340)
Q Consensus 123 ~v~~~~---~~~~~~~l~~~~-~~~~G~~v~~iG~p~g~~~-----~~~~G~vs~~~~~~---~~~~-~~~~~~~i~--- 186 (340)
+++.+. ..+.|+.|.... .+..|+.+.+.||...... ......+.-+.... .... .......+-
T Consensus 95 ~L~~~~~~~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~ 174 (232)
T cd00190 95 KLKRPVTLSDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGG 174 (232)
T ss_pred EECCcccCCCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCC
Confidence 998763 236788886553 5678999999998654321 12222222221100 0000 000001111
Q ss_pred --EeeccCCCCccceeecCC---CcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHH
Q 019504 187 --TDAAINPGNSGGPLLDSK---GNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQ 239 (340)
Q Consensus 187 --~d~~i~~G~SGGPl~d~~---G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~ 239 (340)
.+...+.|+|||||+... +.++||.++... ++.......+..+...++|+++
T Consensus 175 ~~~~~~~c~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~~ 231 (232)
T cd00190 175 LEGGKDACQGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQK 231 (232)
T ss_pred CCCCCccccCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhhc
Confidence 134577899999999764 789999998654 3321233445666777777653
No 14
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.18 E-value=1.2e-11 Score=91.06 Aligned_cols=72 Identities=36% Similarity=0.360 Sum_probs=55.8
Q ss_pred eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEE-Eee
Q 019504 249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYL-ICA 327 (340)
Q Consensus 249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~-~~~ 327 (340)
+|||+.+....- ..|++|.+|.++|||+++|| ++||+|++|||++|++..| +..++ ...
T Consensus 1 ~~lGv~~~~~~~------~~g~~V~~V~~~spA~~aGl-----------~~GD~I~~ing~~v~~~~~---~~~~l~~~~ 60 (82)
T PF13180_consen 1 GGLGVTVQNLSD------TGGVVVVSVIPGSPAAKAGL-----------QPGDIILAINGKPVNSSED---LVNILSKGK 60 (82)
T ss_dssp -E-SEEEEECSC------SSSEEEEEESTTSHHHHTTS------------TTEEEEEETTEESSSHHH---HHHHHHCSS
T ss_pred CEECeEEEEccC------CCeEEEEEeCCCCcHHHCCC-----------CCCcEEEEECCEEcCCHHH---HHHHHHhCC
Confidence 477777754321 36999999999999999999 9999999999999999998 66666 345
Q ss_pred CCCCceEEEEeCC
Q 019504 328 EPNQDHLTCLKSS 340 (340)
Q Consensus 328 ~~~~~~~~~~~~~ 340 (340)
.....+++++|.+
T Consensus 61 ~g~~v~l~v~R~g 73 (82)
T PF13180_consen 61 PGDTVTLTVLRDG 73 (82)
T ss_dssp TTSEEEEEEEETT
T ss_pred CCCEEEEEEEECC
Confidence 5566788888864
No 15
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.16 E-value=1.2e-09 Score=95.98 Aligned_cols=164 Identities=19% Similarity=0.202 Sum_probs=92.3
Q ss_pred CceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCce------eEEEEEEEEe-------CCCCcEEEEE
Q 019504 57 GNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQ------KNFEGKLVGA-------DRAKDLAVLK 123 (340)
Q Consensus 57 ~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~------~~~~a~v~~~-------d~~~DlAlL~ 123 (340)
...+|.+|+++ +|||+|||+..... ..+.|.+..... ......-+.. ....|+|||+
T Consensus 26 ~~C~GtlIs~~-~VLTaahC~~~~~~---------~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~ 95 (229)
T smart00020 26 HFCGGSLISPR-WVLTAAHCVYGSDP---------SNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLK 95 (229)
T ss_pred cEEEEEEecCC-EEEECHHHcCCCCC---------cceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEE
Confidence 35899999977 99999999987530 134444432210 1233333332 2457999999
Q ss_pred EecCC---CCccceeecCC-CCCCCCCEEEEEecCCCCC------CceeEEEEeeecccccc---CCCceec-ceEE---
Q 019504 124 IEASE---DLLKPINVGQS-SFLKVGQQCLAIGNPFGFD------HTLTVGVISGLNRDIFS---QAGVTIG-GGIQ--- 186 (340)
Q Consensus 124 v~~~~---~~~~~~~l~~~-~~~~~G~~v~~iG~p~g~~------~~~~~G~vs~~~~~~~~---~~~~~~~-~~i~--- 186 (340)
++.+. ..+.|+.|... ..+..+..+.+.||+.... .......+..+....-. .....+. ..+-
T Consensus 96 L~~~i~~~~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~ 175 (229)
T smart00020 96 LKSPVTLSDNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGG 175 (229)
T ss_pred ECcccCCCCceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecC
Confidence 98762 34677777653 2566789999999875532 12222222222210000 0000000 0111
Q ss_pred --EeeccCCCCccceeecCCC--cEEEEEeeeeeCCCCcCceEEEEehH
Q 019504 187 --TDAAINPGNSGGPLLDSKG--NLIGINTAIITQTGTSAGVGFAIPSS 231 (340)
Q Consensus 187 --~d~~i~~G~SGGPl~d~~G--~VVGi~~~~~~~~~~~~~~~~aip~~ 231 (340)
.....++|+||||++...+ .++||.+... .+........+..+.
T Consensus 176 ~~~~~~~c~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~ 223 (229)
T smart00020 176 LEGGKDACQGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVS 223 (229)
T ss_pred CCCCCcccCCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEec
Confidence 1355788999999997543 8999999875 333222333444444
No 16
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.87 E-value=6.7e-08 Score=84.80 Aligned_cols=195 Identities=19% Similarity=0.185 Sum_probs=105.2
Q ss_pred HHHhCCCeEEEEeeeeccccccCCcccc-cCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEE----ecCC
Q 019504 27 FEKNTYSVVNIFDVTLRPTLNVTGLVEI-PEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNIL----ASDG 101 (340)
Q Consensus 27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~-~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~----~~~g 101 (340)
.......+.+|......+-......... .....++|+|+++ .+||++||+....... . ++.+. -.++
T Consensus 33 ~~~~~d~r~~V~dt~~~Py~av~~~~~~tG~~~~~~~lI~pn-tvLTa~Hc~~s~~~G~-~------~~~~~p~g~~~~~ 104 (251)
T COG3591 33 TASAEDDRTQVTDTTQFPYSAVVQFEAATGRLCTAATLIGPN-TVLTAGHCIYSPDYGE-D------DIAAAPPGVNSDG 104 (251)
T ss_pred hhcCCCCeeecccCCCCCcceeEEeecCCCcceeeEEEEcCc-eEEEeeeEEecCCCCh-h------hhhhcCCcccCCC
Confidence 3344466777754443332221111000 0111245999987 9999999998765321 0 11111 1111
Q ss_pred c-eeEEEEEEEEeC-C---CCcEEEEEEecCC--------CCccceeecCCCCCCCCCEEEEEecCCCCCCce----eEE
Q 019504 102 V-QKNFEGKLVGAD-R---AKDLAVLKIEASE--------DLLKPINVGQSSFLKVGQQCLAIGNPFGFDHTL----TVG 164 (340)
Q Consensus 102 ~-~~~~~a~v~~~d-~---~~DlAlL~v~~~~--------~~~~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~----~~G 164 (340)
. ...+........ . ..|.+...+.... .......+......+.++.+.++|||.+..... ..+
T Consensus 105 ~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~ 184 (251)
T COG3591 105 GPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINIGDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTG 184 (251)
T ss_pred CCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCccccccccccccccccccCceeEEEeccCCCCcceeEeeecc
Confidence 1 111221111111 2 3455555543221 112223333445668899999999997754222 222
Q ss_pred EEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEE-ehHhHHHHHHHHH
Q 019504 165 VISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAI-PSSTVLKIVPQLI 241 (340)
Q Consensus 165 ~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~ai-p~~~i~~~l~~l~ 241 (340)
.|..+. ...+++++.+++|+||+|+++.+.+|||++.......++ ...++++ -...++++++++.
T Consensus 185 ~v~~~~-----------~~~l~y~~dT~pG~SGSpv~~~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 185 KVNSIK-----------GNKLFYDADTLPGSSGSPVLISKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred eeEEEe-----------cceEEEEecccCCCCCCceEecCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence 222221 124899999999999999999988999999987654332 3344433 3466788877764
No 17
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.81 E-value=3.7e-09 Score=77.17 Aligned_cols=60 Identities=23% Similarity=0.175 Sum_probs=49.6
Q ss_pred CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC-CCCceEEEEeCC
Q 019504 267 GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE-PNQDHLTCLKSS 340 (340)
Q Consensus 267 ~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~-~~~~~~~~~~~~ 340 (340)
..|++|.+|.++|||+++|| ++||+|++|||++|.+..| +...+.... .....+++.|++
T Consensus 9 ~~Gv~V~~V~~~spa~~aGL-----------~~GDiI~~Ing~~v~~~~d---~~~~l~~~~~g~~v~l~v~r~g 69 (79)
T cd00991 9 VAGVVIVGVIVGSPAENAVL-----------HTGDVIYSINGTPITTLED---FMEALKPTKPGEVITVTVLPST 69 (79)
T ss_pred CCcEEEEEECCCChHHhcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHhcCCCCCEEEEEEEECC
Confidence 37999999999999999999 9999999999999999888 666665542 344677777753
No 18
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases that belong to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.80 E-value=7.8e-08 Score=83.67 Aligned_cols=167 Identities=14% Similarity=0.195 Sum_probs=89.1
Q ss_pred HHHhCCCeEEEEeeeeccccccCCcccccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEE
Q 019504 27 FEKNTYSVVNIFDVTLRPTLNVTGLVEIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNF 106 (340)
Q Consensus 27 ~~~~~~svV~I~~~~~~~~~~~~~~~~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~ 106 (340)
+.-+...|++|....... ...=-||... + ||+|++|...... ..+.+....|. ..+
T Consensus 13 yn~Ia~~ic~l~n~s~~~-----------~~~l~gigyG-~-~iItn~HLf~~nn----------g~L~i~s~hG~-f~v 68 (235)
T PF00863_consen 13 YNPIASNICRLTNESDGG-----------TRSLYGIGYG-S-YIITNAHLFKRNN----------GELTIKSQHGE-FTV 68 (235)
T ss_dssp -HHHHTTEEEEEEEETTE-----------EEEEEEEEET-T-EEEEEGGGGSSTT----------CEEEEEETTEE-EEE
T ss_pred cchhhheEEEEEEEeCCC-----------eEEEEEEeEC-C-EEEEChhhhccCC----------CeEEEEeCceE-EEc
Confidence 445667788886321111 1123477776 3 9999999998766 37888888774 222
Q ss_pred E---EEEEEeCCCCcEEEEEEecCCCCccceee-cCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceec
Q 019504 107 E---GKLVGADRAKDLAVLKIEASEDLLKPINV-GQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIG 182 (340)
Q Consensus 107 ~---a~v~~~d~~~DlAlL~v~~~~~~~~~~~l-~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~ 182 (340)
+ .--+..=+..||.++|++.+ ++|.+- .....++.+|+|.++|.-+.. ......||...........
T Consensus 69 ~nt~~lkv~~i~~~DiviirmPkD---fpPf~~kl~FR~P~~~e~v~mVg~~fq~--k~~~s~vSesS~i~p~~~~---- 139 (235)
T PF00863_consen 69 PNTTQLKVHPIEGRDIVIIRMPKD---FPPFPQKLKFRAPKEGERVCMVGSNFQE--KSISSTVSESSWIYPEENS---- 139 (235)
T ss_dssp CEGGGSEEEE-TCSSEEEEE--TT---S----S---B----TT-EEEEEEEECSS--CCCEEEEEEEEEEEEETTT----
T ss_pred CCccccceEEeCCccEEEEeCCcc---cCCcchhhhccCCCCCCEEEEEEEEEEc--CCeeEEECCceEEeecCCC----
Confidence 1 11233346789999999875 333321 123467899999999964432 2233344443332322222
Q ss_pred ceEEEeeccCCCCccceeecC-CCcEEEEEeeeeeCCCCcCceEEEEeh
Q 019504 183 GGIQTDAAINPGNSGGPLLDS-KGNLIGINTAIITQTGTSAGVGFAIPS 230 (340)
Q Consensus 183 ~~i~~d~~i~~G~SGGPl~d~-~G~VVGi~~~~~~~~~~~~~~~~aip~ 230 (340)
.+..+-.....|+=|.|+++. +|++|||++..... ...||+.|+
T Consensus 140 ~fWkHwIsTk~G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f 184 (235)
T PF00863_consen 140 HFWKHWISTKDGDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF 184 (235)
T ss_dssp TEEEE-C---TT-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred CeeEEEecCCCCccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence 345666666789999999987 89999999976543 345677776
No 19
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=98.78 E-value=5.5e-09 Score=93.38 Aligned_cols=92 Identities=17% Similarity=0.054 Sum_probs=74.4
Q ss_pred hHhHHHHHHHHHHcCceeeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCE
Q 019504 230 SSTVLKIVPQLIQYGKVVRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNK 309 (340)
Q Consensus 230 ~~~i~~~l~~l~~~~~~~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~ 309 (340)
...++++++++++.++..+.++|+...... +-..|++|..+.+++||+++|| ++||+|++|||+
T Consensus 158 ~~~~~~v~~~l~~~g~~~~~~lgi~p~~~~-----g~~~G~~v~~v~~~s~a~~aGL-----------r~GDvIv~ING~ 221 (259)
T TIGR01713 158 IVVSRRIIEELTKDPQKMFDYIRLSPVMKN-----DKLEGYRLNPGKDPSLFYKSGL-----------QDGDIAVALNGL 221 (259)
T ss_pred hhhHHHHHHHHHHCHHhhhheEeEEEEEeC-----CceeEEEEEecCCCCHHHHcCC-----------CCCCEEEEECCE
Confidence 346788999999999998999998864321 3337999999999999999999 999999999999
Q ss_pred EccCCCCCCCceeEEEeeCC-CCceEEEEeCC
Q 019504 310 PVSFSCLSIPSRIYLICAEP-NQDHLTCLKSS 340 (340)
Q Consensus 310 ~v~~~~d~~~~~~~~~~~~~-~~~~~~~~~~~ 340 (340)
++++.++ +...+....+ ...+++|.|++
T Consensus 222 ~i~~~~~---~~~~l~~~~~~~~v~l~V~R~G 250 (259)
T TIGR01713 222 DLRDPEQ---AFQALQMLREETNLTLTVERDG 250 (259)
T ss_pred EcCCHHH---HHHHHHhcCCCCeEEEEEEECC
Confidence 9999988 5555555433 46789999874
No 20
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.76 E-value=1.2e-08 Score=76.16 Aligned_cols=77 Identities=36% Similarity=0.401 Sum_probs=55.7
Q ss_pred eeeEEec--cHHHHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEE
Q 019504 250 GLNVDIA--PDLVASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLI 325 (340)
Q Consensus 250 ~lg~~~~--~~~~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~ 325 (340)
|+|+.+. +....+.++++ .|++|.+|.++|||+++|| ++||+|++|||+++.+..| +...+.
T Consensus 2 ~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl-----------~~GD~I~~Ing~~i~~~~~---~~~~l~ 67 (90)
T cd00987 2 WLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGL-----------KPGDVILAVNGKPVKSVAD---LRRALA 67 (90)
T ss_pred ccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCC-----------CcCCEEEEECCEECCCHHH---HHHHHH
Confidence 4565553 44444444443 6999999999999999999 9999999999999999887 444444
Q ss_pred eeC-CCCceEEEEeCC
Q 019504 326 CAE-PNQDHLTCLKSS 340 (340)
Q Consensus 326 ~~~-~~~~~~~~~~~~ 340 (340)
... .....+++.|++
T Consensus 68 ~~~~~~~i~l~v~r~g 83 (90)
T cd00987 68 ELKPGDKVTLTVLRGG 83 (90)
T ss_pred hcCCCCEEEEEEEECC
Confidence 332 344667777653
No 21
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.72 E-value=1.1e-08 Score=74.42 Aligned_cols=58 Identities=24% Similarity=0.112 Sum_probs=47.1
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS 339 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~ 339 (340)
..+.|.+|.++|||+++|| ++||+|++|||+++.+..| +...+.........+++.|+
T Consensus 12 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~ing~~i~~~~~---~~~~l~~~~~~~~~l~v~r~ 69 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGL-----------KAGDRILAINGQKIKSWED---LVDAVQENPGKPLTLTVERN 69 (79)
T ss_pred cCcEEEeECCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHHHHHHCCCceEEEEEEEC
Confidence 3589999999999999999 9999999999999999887 55555444334567777775
No 22
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.71 E-value=1.6e-08 Score=73.81 Aligned_cols=67 Identities=25% Similarity=0.224 Sum_probs=48.7
Q ss_pred eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC
Q 019504 249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE 328 (340)
Q Consensus 249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~ 328 (340)
+++|+.+... ..|+.|.+|.++|||+++|| ++||+|++|||+++.++.+.+ ..+ ..
T Consensus 1 ~~~G~~~~~~--------~~~~~V~~V~~~s~a~~aGl-----------~~GD~I~~Ing~~v~~~~~~l---~~~--~~ 56 (80)
T cd00990 1 PYLGLTLDKE--------EGLGKVTFVRDDSPADKAGL-----------VAGDELVAVNGWRVDALQDRL---KEY--QA 56 (80)
T ss_pred CcccEEEEcc--------CCcEEEEEECCCChHHHhCC-----------CCCCEEEEECCEEhHHHHHHH---Hhc--CC
Confidence 3567766432 25799999999999999999 999999999999999865421 111 12
Q ss_pred CCCceEEEEeC
Q 019504 329 PNQDHLTCLKS 339 (340)
Q Consensus 329 ~~~~~~~~~~~ 339 (340)
.....+++.|+
T Consensus 57 ~~~v~l~v~r~ 67 (80)
T cd00990 57 GDPVELTVFRD 67 (80)
T ss_pred CCEEEEEEEEC
Confidence 23466677665
No 23
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.70 E-value=2.3e-08 Score=73.90 Aligned_cols=69 Identities=32% Similarity=0.336 Sum_probs=52.8
Q ss_pred eeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC--CCCCCceeEEEe
Q 019504 249 AGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS--CLSIPSRIYLIC 326 (340)
Q Consensus 249 ~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~--~d~~~~~~~~~~ 326 (340)
.+||+.+.... .++.|..|.++|||+++|| ++||+|++|||+++.+. .| +..++..
T Consensus 2 ~~lG~~~~~~~--------~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~vng~~i~~~~~~~---~~~~l~~ 59 (85)
T cd00988 2 GGIGLELKYDD--------GGLVITSVLPGSPAAKAGI-----------KAGDIIVAIDGEPVDGLSLED---VVKLLRG 59 (85)
T ss_pred eEEEEEEEEcC--------CeEEEEEecCCCCHHHcCC-----------CCCCEEEEECCEEcCCCCHHH---HHHHhcC
Confidence 35777764322 6899999999999999999 99999999999999998 65 4445544
Q ss_pred eCCCCceEEEEeC
Q 019504 327 AEPNQDHLTCLKS 339 (340)
Q Consensus 327 ~~~~~~~~~~~~~ 339 (340)
.......+++.|+
T Consensus 60 ~~~~~i~l~v~r~ 72 (85)
T cd00988 60 KAGTKVRLTLKRG 72 (85)
T ss_pred CCCCEEEEEEEcC
Confidence 4434466777765
No 24
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.63 E-value=2.6e-08 Score=70.67 Aligned_cols=55 Identities=33% Similarity=0.303 Sum_probs=43.6
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC--CCCCCceeEEEeeCCCCceEEE
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS--CLSIPSRIYLICAEPNQDHLTC 336 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~--~d~~~~~~~~~~~~~~~~~~~~ 336 (340)
.+++|.+|.+++||+++|| ++||+|++|||+++.+. .+ ...++........++++
T Consensus 13 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~Ing~~v~~~~~~~---~~~~l~~~~g~~v~l~v 69 (70)
T cd00136 13 GGVVVLSVEPGSPAERAGL-----------QAGDVILAVNGTDVKNLTLED---VAELLKKEVGEKVTLTV 69 (70)
T ss_pred CCEEEEEeCCCCHHHHcCC-----------CCCCEEEEECCEECCCCCHHH---HHHHHhhCCCCeEEEEE
Confidence 4999999999999999999 99999999999999998 65 55555554423344443
No 25
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.55 E-value=5e-08 Score=71.15 Aligned_cols=58 Identities=26% Similarity=0.254 Sum_probs=46.3
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe-eCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC-AEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~-~~~~~~~~~~~~~~ 340 (340)
.|++|.+|.++|||+. || ++||+|++|||+++.+.+| +...+.. ......++++.|++
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL-----------~~GD~I~~Ing~~v~~~~~---~~~~l~~~~~~~~v~l~v~r~g 66 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KL-----------KAGDHIIAVDGKPFKEAEE---LIDYIQSKKEGDTVKLKVKREE 66 (79)
T ss_pred cCEEEEEECCCCchhh-CC-----------CCCCEEEEECCEECCCHHH---HHHHHHhCCCCCEEEEEEEECC
Confidence 5899999999999986 78 9999999999999999887 5555543 23334678887763
No 26
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens.// The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.41 E-value=2.8e-07 Score=89.07 Aligned_cols=77 Identities=31% Similarity=0.380 Sum_probs=60.7
Q ss_pred eeeEEec--cHHHHhhcCCC---CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEE
Q 019504 250 GLNVDIA--PDLVASQLNVG---NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYL 324 (340)
Q Consensus 250 ~lg~~~~--~~~~~~~~~~~---~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~ 324 (340)
++|+.+. +...++.++++ .|++|.+|.++|||+++|| ++||+|++|||++|.+..| +...+
T Consensus 339 ~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL-----------~~GDvI~~Ing~~V~s~~d---~~~~l 404 (428)
T TIGR02037 339 FLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGL-----------QPGDVILSVNQQPVSSVAE---LRKVL 404 (428)
T ss_pred ccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHH
Confidence 3555443 35556667765 6999999999999999999 9999999999999999998 66666
Q ss_pred Eee-CCCCceEEEEeCC
Q 019504 325 ICA-EPNQDHLTCLKSS 340 (340)
Q Consensus 325 ~~~-~~~~~~~~~~~~~ 340 (340)
... ......++|.|++
T Consensus 405 ~~~~~g~~v~l~v~R~g 421 (428)
T TIGR02037 405 DRAKKGGRVALLILRGG 421 (428)
T ss_pred HhcCCCCEEEEEEEECC
Confidence 553 3455788888864
No 27
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.27 E-value=8e-07 Score=65.27 Aligned_cols=60 Identities=32% Similarity=0.336 Sum_probs=45.3
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
.|++|..|.+++||+++|| ++||+|++|||+++.+..+ ......+.. .+...++++.|.+
T Consensus 26 ~~~~i~~v~~~s~a~~~gl-----------~~GD~I~~In~~~v~~~~~-~~~~~~~~~-~~~~~~l~i~r~~ 85 (85)
T smart00228 26 GGVVVSSVVPGSPAAKAGL-----------KVGDVILEVNGTSVEGLTH-LEAVDLLKK-AGGKVTLTVLRGG 85 (85)
T ss_pred CCEEEEEECCCCHHHHcCC-----------CCCCEEEEECCEECCCCCH-HHHHHHHHh-CCCeEEEEEEeCC
Confidence 5999999999999999999 9999999999999998764 111112222 2235788887763
No 28
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.22 E-value=2.5e-06 Score=79.70 Aligned_cols=70 Identities=23% Similarity=0.256 Sum_probs=51.5
Q ss_pred eeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC--CCCCceeEEE
Q 019504 248 RAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC--LSIPSRIYLI 325 (340)
Q Consensus 248 ~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~--d~~~~~~~~~ 325 (340)
..++|+.+.... .+++|.+|.++|||+++|| ++||+|++|||++|.+.+ + +...+.
T Consensus 50 ~~~lG~~~~~~~--------~~~~V~~V~~~spA~~aGL-----------~~GD~I~~Ing~~v~~~~~~~---~~~~l~ 107 (334)
T TIGR00225 50 LEGIGIQVGMDD--------GEIVIVSPFEGSPAEKAGI-----------KPGDKIIKINGKSVAGMSLDD---AVALIR 107 (334)
T ss_pred eEEEEEEEEEEC--------CEEEEEEeCCCChHHHcCC-----------CCCCEEEEECCEECCCCCHHH---HHHhcc
Confidence 456777764322 5899999999999999999 999999999999998763 2 333443
Q ss_pred eeCCCCceEEEEeC
Q 019504 326 CAEPNQDHLTCLKS 339 (340)
Q Consensus 326 ~~~~~~~~~~~~~~ 339 (340)
.......++++.|.
T Consensus 108 ~~~g~~v~l~v~R~ 121 (334)
T TIGR00225 108 GKKGTKVSLEILRA 121 (334)
T ss_pred CCCCCEEEEEEEeC
Confidence 33444566777664
No 29
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.17 E-value=1.1e-06 Score=64.20 Aligned_cols=38 Identities=29% Similarity=0.366 Sum_probs=35.0
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc--CCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS--FSCL 316 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~--~~~d 316 (340)
.|++|.+|.++|||+++|| ++||+|++|||+++. +..+
T Consensus 26 ~~~~V~~v~~~s~a~~~gl-----------~~GD~I~~ing~~i~~~~~~~ 65 (82)
T cd00992 26 GGIFVSRVEPGGPAERGGL-----------RVGDRILEVNGVSVEGLTHEE 65 (82)
T ss_pred CCeEEEEECCCChHHhCCC-----------CCCCEEEEECCEEcCccCHHH
Confidence 5899999999999999999 999999999999999 5554
No 30
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.15 E-value=3.6e-06 Score=80.13 Aligned_cols=79 Identities=22% Similarity=0.212 Sum_probs=51.9
Q ss_pred eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe
Q 019504 247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC 326 (340)
Q Consensus 247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~ 326 (340)
...++|+.+..... ..+...|++|..|.++|||+++|| ++||+|++|||++|.+... ..+...+..
T Consensus 83 ~~~GiG~~~~~~~~--~~~~~~g~~V~~V~~~SPA~~aGl-----------~~GD~Iv~InG~~v~~~~~-~~~~~~l~g 148 (389)
T PLN00049 83 AVTGVGLEVGYPTG--SDGPPAGLVVVAPAPGGPAARAGI-----------RPGDVILAIDGTSTEGLSL-YEAADRLQG 148 (389)
T ss_pred CceEEEEEEEEccC--CCCccCcEEEEEeCCCChHHHcCC-----------CCCCEEEEECCEECCCCCH-HHHHHHHhc
Confidence 35678887643210 001124899999999999999999 9999999999999986521 012333333
Q ss_pred eCCCCceEEEEeC
Q 019504 327 AEPNQDHLTCLKS 339 (340)
Q Consensus 327 ~~~~~~~~~~~~~ 339 (340)
.....++++|.|+
T Consensus 149 ~~g~~v~ltv~r~ 161 (389)
T PLN00049 149 PEGSSVELTLRRG 161 (389)
T ss_pred CCCCEEEEEEEEC
Confidence 3334456666654
No 31
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.15 E-value=8.4e-07 Score=86.11 Aligned_cols=57 Identities=23% Similarity=0.176 Sum_probs=45.5
Q ss_pred cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCC-CCceEEEEeCC
Q 019504 270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEP-NQDHLTCLKSS 340 (340)
Q Consensus 270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~-~~~~~~~~~~~ 340 (340)
.+|.+|.++|||++||| |+||+|++|||++|.+.+| ++..+....+ ...++|++|++
T Consensus 128 ~lV~~V~~~SpA~kAGL-----------k~GDvI~~vnG~~V~~~~~---l~~~v~~~~~g~~v~v~v~R~g 185 (449)
T PRK10779 128 PVVGEIAPNSIAAQAQI-----------APGTELKAVDGIETPDWDA---VRLALVSKIGDESTTITVAPFG 185 (449)
T ss_pred ccccccCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhhccCCceEEEEEeCC
Confidence 47899999999999999 9999999999999999998 4444443332 34677887763
No 32
>TIGR00054 RIP metalloprotease RseP. A model that detects fragments as well matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.12 E-value=1.7e-06 Score=83.24 Aligned_cols=59 Identities=19% Similarity=0.106 Sum_probs=49.3
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
.|+.|.+|.++|||+++|| ++||+|++|||++|.+.+| +...+........++++.|++
T Consensus 203 ~g~vV~~V~~~SpA~~aGL-----------~~GD~Iv~Vng~~V~s~~d---l~~~l~~~~~~~v~l~v~R~g 261 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGL-----------KEGDYIQSINGEKLRSWTD---FVSAVKENPGKSMDIKVERNG 261 (420)
T ss_pred cCcEEEEECCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHHHHHhCCCCceEEEEEECC
Confidence 4899999999999999999 9999999999999999998 666665544444678887763
No 33
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.07 E-value=0.00023 Score=63.59 Aligned_cols=124 Identities=23% Similarity=0.209 Sum_probs=67.2
Q ss_pred CcEEEEEEecCC---CCccceeecCCCC---CCCCCEEEEEecCCCC------CCceeEEEEeeecccc-ccCCCc--ee
Q 019504 117 KDLAVLKIEASE---DLLKPINVGQSSF---LKVGQQCLAIGNPFGF------DHTLTVGVISGLNRDI-FSQAGV--TI 181 (340)
Q Consensus 117 ~DlAlL~v~~~~---~~~~~~~l~~~~~---~~~G~~v~~iG~p~g~------~~~~~~G~vs~~~~~~-~~~~~~--~~ 181 (340)
.|||||+++.+. ..+.|+.|..... ...+..+++.||+... ........+.-+.... ...... ..
T Consensus 106 nDiall~l~~~v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~ 185 (256)
T KOG3627|consen 106 NDIALLRLSEPVTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTI 185 (256)
T ss_pred CCEEEEEECCCcccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCcccc
Confidence 799999999752 4466777743332 3445888889975421 1122222222221100 000000 00
Q ss_pred -cceEE-----EeeccCCCCccceeecCC---CcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504 182 -GGGIQ-----TDAAINPGNSGGPLLDSK---GNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL 240 (340)
Q Consensus 182 -~~~i~-----~d~~i~~G~SGGPl~d~~---G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l 240 (340)
...+- .....|.|+|||||+-.+ ..++||++++...++....-+....+....+++++.
T Consensus 186 ~~~~~Ca~~~~~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~~ 253 (256)
T KOG3627|consen 186 TDTMLCAGGPEGGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKEN 253 (256)
T ss_pred CCCEEeeCccCCCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHHH
Confidence 01121 223368899999999654 699999999876333321223355566677776654
No 34
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates [, ]. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences []. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 microns. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.05 E-value=5.9e-06 Score=60.38 Aligned_cols=38 Identities=32% Similarity=0.410 Sum_probs=35.8
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
.+++|.+|.+++||+++|| ++||.|++|||+++.+...
T Consensus 25 ~~~~V~~v~~~~~a~~~gl-----------~~GD~Il~INg~~v~~~~~ 62 (81)
T PF00595_consen 25 KGVFVSSVVPGSPAERAGL-----------KVGDRILEINGQSVRGMSH 62 (81)
T ss_dssp EEEEEEEECTTSHHHHHTS-----------STTEEEEEETTEESTTSBH
T ss_pred CCEEEEEEeCCChHHhccc-----------chhhhhheeCCEeCCCCCH
Confidence 5999999999999999998 9999999999999998864
No 35
>PRK10139 serine endoprotease; Provisional
Probab=98.04 E-value=2.7e-06 Score=82.51 Aligned_cols=58 Identities=26% Similarity=0.260 Sum_probs=50.1
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
.|++|.+|.++|||+++|| ++||+|++|||++|.+.+| +...+..+. +...++|+|++
T Consensus 390 ~Gv~V~~V~~~spA~~aGL-----------~~GD~I~~Ing~~v~~~~~---~~~~l~~~~-~~v~l~v~R~g 447 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGL-----------QKDDVIIGVNRDRVNSIAE---MRKVLAAKP-AIIALQIVRGN 447 (455)
T ss_pred CceEEEEeCCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCC-CeEEEEEEECC
Confidence 5899999999999999999 9999999999999999998 666665543 56788888864
No 36
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.03 E-value=3e-06 Score=82.28 Aligned_cols=58 Identities=24% Similarity=0.218 Sum_probs=47.6
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
+++|.+|.++|||+++|| ++||+|++|||++|.+.+| +...+.........+++.|++
T Consensus 222 ~~vV~~V~~~SpA~~AGL-----------~~GDvIl~Ing~~V~s~~d---l~~~l~~~~~~~v~l~v~R~g 279 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGL-----------QAGDRIVKVDGQPLTQWQT---FVTLVRDNPGKPLALEIERQG 279 (449)
T ss_pred CcEEEeeCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCCCCEEEEEEEECC
Confidence 689999999999999999 9999999999999999888 555554434344677777763
No 37
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.02 E-value=3.6e-06 Score=62.11 Aligned_cols=60 Identities=25% Similarity=0.289 Sum_probs=42.0
Q ss_pred CCcEEEeeCCC--------ChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504 268 NGALVLQVPGN--------SLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS 339 (340)
Q Consensus 268 ~g~~V~~v~~~--------spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~ 339 (340)
.+..|.++.++ ||-.+.|+ ++++||+|++|||+++....+ +..+|..+....+.||+.+.
T Consensus 12 ~~y~I~~I~~gd~~~~~~~sPL~~pGv---------~v~~GD~I~aInG~~v~~~~~---~~~lL~~~agk~V~Ltv~~~ 79 (88)
T PF14685_consen 12 GGYRIARIYPGDPWNPNARSPLAQPGV---------DVREGDYILAINGQPVTADAN---PYRLLEGKAGKQVLLTVNRK 79 (88)
T ss_dssp TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB----HHHHHHTTTTSEEEEEEE-S
T ss_pred CEEEEEEEeCCCCCCccccCCccCCCC---------CCCCCCEEEEECCEECCCCCC---HHHHhcccCCCEEEEEEecC
Confidence 57778888775 66667777 348999999999999999887 78888888878888888775
No 38
>PRK10942 serine endoprotease; Provisional
Probab=97.97 E-value=4.2e-06 Score=81.57 Aligned_cols=58 Identities=26% Similarity=0.301 Sum_probs=50.2
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
.|++|.+|.++|||+++|| ++||+|++|||++|.+.+| +...+.... ....++|.|.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL-----------~~GDvIv~VNg~~V~s~~d---l~~~l~~~~-~~v~l~V~R~g 465 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGL-----------KKGDVIIGANQQPVKNIAE---LRKILDSKP-SVLALNIQRGD 465 (473)
T ss_pred CCeEEEEeCCCChHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhCC-CeEEEEEEECC
Confidence 4899999999999999999 9999999999999999998 666665533 56788888864
No 39
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of peptidase family S2C. The matching region appears not to overlap the active site domain.
Probab=97.93 E-value=4.9e-06 Score=80.02 Aligned_cols=57 Identities=25% Similarity=0.230 Sum_probs=47.1
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS 339 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~ 339 (340)
.|.+|.+|.++|||++||| ++||+|+++||+++.+..| +...+.... +...++++|+
T Consensus 128 ~g~~V~~V~~~SpA~~AGL-----------~~GDvI~~vng~~v~~~~d---l~~~ia~~~-~~v~~~I~r~ 184 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGI-----------EPGDEILSVNGNKIPGFKD---VRQQIADIA-GEPMVEILAE 184 (420)
T ss_pred CCceeeccCCCCHHHHcCC-----------CCCCEEEEECCEEcCCHHH---HHHHHHhhc-ccceEEEEEe
Confidence 6899999999999999999 9999999999999999998 444444433 4567777763
No 40
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=97.90 E-value=1.6e-05 Score=75.91 Aligned_cols=74 Identities=26% Similarity=0.311 Sum_probs=57.7
Q ss_pred eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEe
Q 019504 247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLIC 326 (340)
Q Consensus 247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~ 326 (340)
.+.|+|+++.-... .++.|.++.+++||+++|| ++||+|++|||+++....- -+....+..
T Consensus 98 ~~~GiG~~i~~~~~-------~~~~V~s~~~~~PA~kagi-----------~~GD~I~~IdG~~~~~~~~-~~av~~irG 158 (406)
T COG0793 98 EFGGIGIELQMEDI-------GGVKVVSPIDGSPAAKAGI-----------KPGDVIIKIDGKSVGGVSL-DEAVKLIRG 158 (406)
T ss_pred cccceeEEEEEecC-------CCcEEEecCCCChHHHcCC-----------CCCCEEEEECCEEccCCCH-HHHHHHhCC
Confidence 56788888754221 6899999999999999999 9999999999999998861 112345666
Q ss_pred eCCCCceEEEEeC
Q 019504 327 AEPNQDHLTCLKS 339 (340)
Q Consensus 327 ~~~~~~~~~~~~~ 339 (340)
++...+++||.|+
T Consensus 159 ~~Gt~V~L~i~r~ 171 (406)
T COG0793 159 KPGTKVTLTILRA 171 (406)
T ss_pred CCCCeEEEEEEEc
Confidence 6667788888885
No 41
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=97.75 E-value=2.3e-05 Score=73.64 Aligned_cols=66 Identities=23% Similarity=0.243 Sum_probs=51.1
Q ss_pred eeEEeccHHHHhhcCCCCCcEEEeeC--------CCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCcee
Q 019504 251 LNVDIAPDLVASQLNVGNGALVLQVP--------GNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRI 322 (340)
Q Consensus 251 lg~~~~~~~~~~~~~~~~g~~V~~v~--------~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~ 322 (340)
+|+.+.+ .|++|.... .+|||+++|| ++||+|++|||++|.+.+| +..
T Consensus 98 iGI~l~t----------~GVlVvg~~~v~~~~g~~~SPAa~AGL-----------q~GDiIvsING~~V~s~~D---L~~ 153 (402)
T TIGR02860 98 IGVKLNT----------KGVLVVGFSDIETEKGKIHSPGEEAGI-----------QIGDRILKINGEKIKNMDD---LAN 153 (402)
T ss_pred EEEEEec----------CEEEEEEEEcccccCCCCCCHHHHcCC-----------CCCCEEEEECCEECCCHHH---HHH
Confidence 6776644 589986642 2699999999 9999999999999999998 665
Q ss_pred EEEeeCCCCceEEEEeCC
Q 019504 323 YLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 323 ~~~~~~~~~~~~~~~~~~ 340 (340)
.+.........+++.|++
T Consensus 154 iL~~~~g~~V~LtV~R~G 171 (402)
T TIGR02860 154 LINKAGGEKLTLTIERGG 171 (402)
T ss_pred HHHhCCCCeEEEEEEECC
Confidence 665555555778888763
No 42
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.71 E-value=0.00028 Score=61.79 Aligned_cols=117 Identities=24% Similarity=0.383 Sum_probs=61.9
Q ss_pred CceEEEEEcCC--CEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccce
Q 019504 57 GNGSGVVWDGK--GHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPI 134 (340)
Q Consensus 57 ~~GsGfiI~~~--G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~ 134 (340)
..|||=+...+ -.|+|+.||+... ...+... +. -+...++..-|+|.-.++.-+...|.+
T Consensus 112 s~Gsggvft~~~~~vvvTAtHVlg~~------------~a~v~~~-g~-----~~~~tF~~~GDfA~~~~~~~~G~~P~~ 173 (297)
T PF05579_consen 112 SVGSGGVFTIGGNTVVVTATHVLGGN------------TARVSGV-GT-----RRMLTFKKNGDFAEADITNWPGAAPKY 173 (297)
T ss_dssp SEEEEEEEECTTEEEEEEEHHHCBTT------------EEEEEET-TE-----EEEEEEEEETTEEEEEETTS-S---B-
T ss_pred cccccceEEECCeEEEEEEEEEcCCC------------eEEEEec-ce-----EEEEEEeccCcEEEEECCCCCCCCCce
Confidence 44555555444 4799999999853 3333332 22 123345566799999995444446777
Q ss_pred eecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEEeee
Q 019504 135 NVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGINTAI 214 (340)
Q Consensus 135 ~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~ 214 (340)
++++. ..|.--+.. ..-+..|.|..-. ++ +-..+|+||+|++..+|.+||+++..
T Consensus 174 k~a~~---~~GrAyW~t------~tGvE~G~ig~~~-------------~~---~fT~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 174 KFAQN---YTGRAYWLT------STGVEPGFIGGGG-------------AV---CFTGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp -B-TT----SEEEEEEE------TTEEEEEEEETTE-------------EE---ESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred eecCC---cccceEEEc------ccCcccceecCce-------------EE---EEcCCCCCCCccCcCCCCEEEEEecC
Confidence 77522 223322222 2234455543211 12 22346999999999999999999975
Q ss_pred ee
Q 019504 215 IT 216 (340)
Q Consensus 215 ~~ 216 (340)
-+
T Consensus 229 n~ 230 (297)
T PF05579_consen 229 NK 230 (297)
T ss_dssp ET
T ss_pred CC
Confidence 43
No 43
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=97.64 E-value=4.5e-05 Score=56.28 Aligned_cols=34 Identities=32% Similarity=0.372 Sum_probs=31.7
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEcc
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVS 312 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~ 312 (340)
.|+||++|.+||||+.||| +.+|-|+.+||-..+
T Consensus 59 ~GiYvT~V~eGsPA~~AGL-----------rihDKIlQvNG~DfT 92 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGL-----------RIHDKILQVNGWDFT 92 (124)
T ss_pred ccEEEEEeccCChhhhhcc-----------eecceEEEecCceeE
Confidence 8999999999999999999 899999999996654
No 44
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.62 E-value=7.4e-05 Score=75.21 Aligned_cols=73 Identities=19% Similarity=0.153 Sum_probs=49.2
Q ss_pred eeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhc-CCCccccCCCCCCcCCcEEEEEC--CEEccCCCC--CCCce
Q 019504 247 VRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKA-GILPTTRGFAGNIILGDIIVAVN--NKPVSFSCL--SIPSR 321 (340)
Q Consensus 247 ~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~-gl~~~~~~~~~~l~~GDvi~~i~--g~~v~~~~d--~~~~~ 321 (340)
...|+|+.+.... .++.|.+|.+|+||+++ || ++||+|++|| |+++.+... +-...
T Consensus 242 ~~~GIGa~l~~~~--------~~~~V~~vipGsPA~ka~gL-----------k~GD~IlaVn~~g~~~~dv~g~~~~~vv 302 (667)
T PRK11186 242 SLEGIGAVLQMDD--------DYTVINSLVAGGPAAKSKKL-----------SVGDKIVGVGQDGKPIVDVIGWRLDDVV 302 (667)
T ss_pred ceeEEEEEEEEeC--------CeEEEEEccCCChHHHhCCC-----------CCCCEEEEECCCCCcccccccCCHHHHH
Confidence 4567888875432 47899999999999998 88 9999999999 665544321 00123
Q ss_pred eEEEeeCCCCceEEEEe
Q 019504 322 IYLICAEPNQDHLTCLK 338 (340)
Q Consensus 322 ~~~~~~~~~~~~~~~~~ 338 (340)
.++...+...++|||.|
T Consensus 303 ~lirG~~Gt~V~LtV~r 319 (667)
T PRK11186 303 ALIKGPKGSKVRLEILP 319 (667)
T ss_pred HHhcCCCCCEEEEEEEe
Confidence 34444454556666665
No 45
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=97.43 E-value=0.0048 Score=56.14 Aligned_cols=111 Identities=19% Similarity=0.179 Sum_probs=64.9
Q ss_pred CCCCcEEEEEEecC-CCCccceeecCCC-CCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeecc
Q 019504 114 DRAKDLAVLKIEAS-EDLLKPINVGQSS-FLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAI 191 (340)
Q Consensus 114 d~~~DlAlL~v~~~-~~~~~~~~l~~~~-~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i 191 (340)
....+++||+++.+ .....|+-|+++. .+..++.+.+.|+... ..+....+.-..... ....+..+...
T Consensus 158 ~~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~~--~~~~~~~~~i~~~~~-------~~~~~~~~~~~ 228 (282)
T PF03761_consen 158 NRPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNST--GKLKHRKLKITNCTK-------CAYSICTKQYS 228 (282)
T ss_pred ccccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCCC--CeEEEEEEEEEEeec-------cceeEeccccc
Confidence 35579999999887 3346788887653 4678999999997211 112222221111100 11225555667
Q ss_pred CCCCccceeecC-C--CcEEEEEeeeeeCCCCcCceEEEEehHhHHH
Q 019504 192 NPGNSGGPLLDS-K--GNLIGINTAIITQTGTSAGVGFAIPSSTVLK 235 (340)
Q Consensus 192 ~~G~SGGPl~d~-~--G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~ 235 (340)
+.|++|||++.. + ..|||+.+...... .....+++.+..+++
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~~~~~--~~~~~~f~~v~~~~~ 273 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASGNYEC--NKNNSYFFNVSWYQD 273 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccCCCcc--cccccEEEEHHHhhh
Confidence 789999999833 4 45999976543221 112456666665543
No 46
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=97.19 E-value=0.015 Score=48.92 Aligned_cols=135 Identities=22% Similarity=0.323 Sum_probs=77.2
Q ss_pred ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCC---CCcEEEEEEecCCCC---c
Q 019504 58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADR---AKDLAVLKIEASEDL---L 131 (340)
Q Consensus 58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~---~~DlAlL~v~~~~~~---~ 131 (340)
.++++.|..+ ++|...|.-. .. .+.+ +|........+...+. ..|+++++++...+. .
T Consensus 26 t~l~~gi~~~-~~lvp~H~~~-~~-----------~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~~kfrDIr 89 (172)
T PF00548_consen 26 TMLALGIYDR-YFLVPTHEEP-ED-----------TIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRNPKFRDIR 89 (172)
T ss_dssp EEEEEEEEBT-EEEEEGGGGG-CS-----------EEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESSS-B--GG
T ss_pred EEecceEeee-EEEEECcCCC-cE-----------EEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCCcccCchh
Confidence 4778888755 9999999211 11 3443 3442233334333444 459999999775421 1
Q ss_pred cceeecCCCCCCCCCEEEEEecCCCCCC-ceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecC---CCcE
Q 019504 132 KPINVGQSSFLKVGQQCLAIGNPFGFDH-TLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDS---KGNL 207 (340)
Q Consensus 132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~-~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~---~G~V 207 (340)
+.+. +.. ....+..+++=.+ .... ....+.++....- ..++......+.++++..+|+-||||+.. .+++
T Consensus 90 k~~~--~~~-~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i--~~~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i 163 (172)
T PF00548_consen 90 KFFP--ESI-PEYPECVLLVNST-KFPRMIVEVGFVTNFGFI--NLSGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKI 163 (172)
T ss_dssp GGSB--SSG-GTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE--EETTEEEEEEEEEESEEETTGTTEEEEESCGGTTEE
T ss_pred hhhc--ccc-ccCCCcEEEEECC-CCccEEEEEEEEeecCcc--ccCCCEeeEEEEEccCCCCCccCCeEEEeeccCccE
Confidence 2222 111 1344555555333 2222 2333444433321 22334445678999999999999999953 6899
Q ss_pred EEEEeee
Q 019504 208 IGINTAI 214 (340)
Q Consensus 208 VGi~~~~ 214 (340)
+||+.++
T Consensus 164 ~GiHvaG 170 (172)
T PF00548_consen 164 IGIHVAG 170 (172)
T ss_dssp EEEEEEE
T ss_pred EEEEecc
Confidence 9999875
No 47
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=97.12 E-value=0.0032 Score=63.88 Aligned_cols=23 Identities=30% Similarity=0.280 Sum_probs=20.5
Q ss_pred ceEEEEEcCCCEEEeCccccCCC
Q 019504 58 NGSGVVWDGKGHIVTNFHVIGSA 80 (340)
Q Consensus 58 ~GsGfiI~~~G~IlT~~Hvv~~~ 80 (340)
-+||-||+++|+|+||.||.-++
T Consensus 48 GCSgsfVS~~GLvlTNHHC~~~~ 70 (698)
T PF10459_consen 48 GCSGSFVSPDGLVLTNHHCGYGA 70 (698)
T ss_pred ceeEEEEcCCceEEecchhhhhH
Confidence 38999999999999999998653
No 48
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.10 E-value=0.00025 Score=57.12 Aligned_cols=57 Identities=26% Similarity=0.152 Sum_probs=40.5
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcC-CcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIIL-GDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~-GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
.+.-|.+|.++|||+.||| ++ .|.|+.+|+..+.+.++ |..++.........+.|+.
T Consensus 43 ~~~~Vl~V~p~SPA~~AGL-----------~p~~DyIig~~~~~l~~~~~---l~~~v~~~~~~~l~L~Vyn 100 (138)
T PF04495_consen 43 EGWHVLRVAPNSPAAKAGL-----------EPFFDYIIGIDGGLLDDEDD---LFELVEANENKPLQLYVYN 100 (138)
T ss_dssp CEEEEEEE-TTSHHHHTT-------------TTTEEEEEETTCE--STCH---HHHHHHHTTTS-EEEEEEE
T ss_pred ceEEEeEecCCCHHHHCCc-----------cccccEEEEccceecCCHHH---HHHHHHHcCCCcEEEEEEE
Confidence 6889999999999999999 76 69999999999987776 6666665555555665553
No 49
>PF12812 PDZ_1: PDZ-like domain
Probab=97.04 E-value=0.00049 Score=49.73 Aligned_cols=62 Identities=27% Similarity=0.177 Sum_probs=50.0
Q ss_pred eeeEEe--ccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEE
Q 019504 250 GLNVDI--APDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLI 325 (340)
Q Consensus 250 ~lg~~~--~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~ 325 (340)
+.|..| ++.+.++.++++-|.++.....++++..-|+ ..|-+|.+|||+++.+.++ |...+.
T Consensus 10 ~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i-----------~~g~iI~~Vn~kpt~~Ld~---f~~vvk 73 (78)
T PF12812_consen 10 VCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGI-----------SKGFIITSVNGKPTPDLDD---FIKVVK 73 (78)
T ss_pred EcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCC-----------CCCeEEEeECCcCCcCHHH---HHHHHH
Confidence 456655 4688899999997777778888999887668 8899999999999999997 554443
No 50
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=96.98 E-value=0.017 Score=53.05 Aligned_cols=55 Identities=18% Similarity=0.211 Sum_probs=38.9
Q ss_pred eccCCCCccceeecC--CCc-EEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHc
Q 019504 189 AAINPGNSGGPLLDS--KGN-LIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQY 243 (340)
Q Consensus 189 ~~i~~G~SGGPl~d~--~G~-VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~ 243 (340)
...|.|+||||+|-. +|+ -+||++|+-..++...-.+..--++....|++..++.
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence 346789999999954 454 6899999887666433333444577888888886553
No 51
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=96.87 E-value=0.032 Score=47.88 Aligned_cols=44 Identities=27% Similarity=0.497 Sum_probs=33.5
Q ss_pred EEEeeccCCCCccceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHh
Q 019504 185 IQTDAAINPGNSGGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSST 232 (340)
Q Consensus 185 i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~ 232 (340)
+..+.-+..||||+|++ .+|++||=++..+.+ .+..+|.++++.
T Consensus 171 l~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie~ 214 (218)
T PF05580_consen 171 LEKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIEW 214 (218)
T ss_pred hhhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHHH
Confidence 33344566899999999 699999998877754 366788888754
No 52
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=96.80 E-value=0.00064 Score=61.09 Aligned_cols=54 Identities=33% Similarity=0.384 Sum_probs=46.5
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEE
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCL 337 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~ 337 (340)
.|+||..|..++|+. |. |+.||-|++|||+++.+.+| +-.++..++++ ++|||-
T Consensus 130 ~gvyv~~v~~~~~~~--gk----------l~~gD~i~avdg~~f~s~~e---~i~~v~~~k~G-d~VtI~ 183 (342)
T COG3480 130 AGVYVLSVIDNSPFK--GK----------LEAGDTIIAVDGEPFTSSDE---LIDYVSSKKPG-DEVTID 183 (342)
T ss_pred eeEEEEEccCCcchh--ce----------eccCCeEEeeCCeecCCHHH---HHHHHhccCCC-CeEEEE
Confidence 799999999999986 43 69999999999999999999 77888887765 777763
No 53
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus, where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=96.73 E-value=0.014 Score=57.79 Aligned_cols=118 Identities=19% Similarity=0.323 Sum_probs=72.7
Q ss_pred CCCcEEEEEEecCC-------C------CccceeecC------CCCCCCCCEEEEEecCCCCCCceeEEEEeeecccccc
Q 019504 115 RAKDLAVLKIEASE-------D------LLKPINVGQ------SSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFS 175 (340)
Q Consensus 115 ~~~DlAlL~v~~~~-------~------~~~~~~l~~------~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~ 175 (340)
+-.|+||++++... + .-|.+.+.+ -..+.+|..|+=+|...+ .+.|++.++.-...
T Consensus 541 ~LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw- 615 (695)
T PF08192_consen 541 RLSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYW- 615 (695)
T ss_pred cccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEe-
Confidence 44599999997542 1 112233321 124567999999997665 46677766543222
Q ss_pred CCCc-eecceEEEe----eccCCCCccceeecCCCc------EEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504 176 QAGV-TIGGGIQTD----AAINPGNSGGPLLDSKGN------LIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL 240 (340)
Q Consensus 176 ~~~~-~~~~~i~~d----~~i~~G~SGGPl~d~~G~------VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l 240 (340)
.++. ...+++... .-...|+||+-+++.-+. |+||.+++-.. ...++++.|+..|..=|++.
T Consensus 616 ~dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsydge---~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 616 ADGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYDGE---QKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred cCCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecCCc---cceeeccCcHHHHHHHHHHh
Confidence 2222 112333333 224579999999986444 99998875332 45789999998877766654
No 54
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=96.68 E-value=0.0028 Score=64.34 Aligned_cols=57 Identities=23% Similarity=0.351 Sum_probs=41.6
Q ss_pred eEEEeeccCCCCccceeecCCCcEEEEEeeeeeCC-C------CcCceEEEEehHhHHHHHHHH
Q 019504 184 GIQTDAAINPGNSGGPLLDSKGNLIGINTAIITQT-G------TSAGVGFAIPSSTVLKIVPQL 240 (340)
Q Consensus 184 ~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~~-~------~~~~~~~aip~~~i~~~l~~l 240 (340)
++.++..+..||||+|++|.+|+|||+++-..-.. . ....-+..+.+..++.+++++
T Consensus 623 ~FlstnDitGGNSGSPvlN~~GeLVGl~FDgn~Esl~~D~~fdp~~~R~I~VDiRyvL~~ldkv 686 (698)
T PF10459_consen 623 NFLSTNDITGGNSGSPVLNAKGELVGLAFDGNWESLSGDIAFDPELNRTIHVDIRYVLWALDKV 686 (698)
T ss_pred EEEeccCcCCCCCCCccCCCCceEEEEeecCchhhcccccccccccceeEEEEHHHHHHHHHHH
Confidence 47888899999999999999999999987432111 0 012335567778888888765
No 55
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=96.55 E-value=0.0036 Score=60.14 Aligned_cols=31 Identities=35% Similarity=0.302 Sum_probs=29.7
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCE
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNK 309 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~ 309 (340)
.+.+|..|.++|||++||| .+||-|++|||.
T Consensus 462 g~~~i~~V~~~gPA~~AGl-----------~~Gd~ivai~G~ 492 (558)
T COG3975 462 GHEKITFVFPGGPAYKAGL-----------SPGDKIVAINGI 492 (558)
T ss_pred CeeEEEecCCCChhHhccC-----------CCccEEEEEcCc
Confidence 6789999999999999999 899999999999
No 56
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide [].
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=96.47 E-value=0.032 Score=47.97 Aligned_cols=148 Identities=21% Similarity=0.208 Sum_probs=49.4
Q ss_pred CceEEEEE-cCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCcee-EEEEEEEEeCCCCcEEEEEEecCC---CCc
Q 019504 57 GNGSGVVW-DGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQK-NFEGKLVGADRAKDLAVLKIEASE---DLL 131 (340)
Q Consensus 57 ~~GsGfiI-~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~-~~~a~v~~~d~~~DlAlL~v~~~~---~~~ 131 (340)
+.++.+-. +-+-.++|++||..... .+. .+.+|... .-..+.+..+...|++||+....- ..+
T Consensus 30 Gya~cv~l~~g~~~L~ta~Hv~~~~~-----------~~~-~~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~ 97 (203)
T PF02122_consen 30 GYATCVRLFDGEDALLTARHVWSRPS-----------KVT-SLKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGV 97 (203)
T ss_dssp ----EEEE----EEEEE-HHHHTSSS---------------EEETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-
T ss_pred ccceEEECcCCccceecccccCCCcc-----------cee-EcCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCc
Confidence 34555442 22237999999999843 222 23344211 112345567889999999998431 113
Q ss_pred cceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEeeccCCCCccceeecCCCcEEEEE
Q 019504 132 KPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTDAAINPGNSGGPLLDSKGNLIGIN 211 (340)
Q Consensus 132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPl~d~~G~VVGi~ 211 (340)
+.+.+.....+..| .+.. +....+.............+ .+..+-+...+|.||.|+|+.+ +++|++
T Consensus 98 k~~~~~~~~~~~~g----~~~~-----y~~~~~~~~~~sa~i~g~~~----~~~~vls~T~~G~SGtp~y~g~-~vvGvH 163 (203)
T PF02122_consen 98 KAAQLSQNSQLAKG----PVSF-----YGFSSGEWPCSSAKIPGTEG----KFASVLSNTSPGWSGTPYYSGK-NVVGVH 163 (203)
T ss_dssp ----B----SEEEE----ESST-----TSEEEEEEEEEE-S----ST----TEEEE-----TT-TT-EEE-SS--EEEEE
T ss_pred ccccccchhhhCCC----Ceee-----eeecCCCceeccCccccccC----cCCceEcCCCCCCCCCCeEECC-CceEee
Confidence 44444322221100 1111 11122111111111222221 2467777888999999999977 999999
Q ss_pred eeeeeCCCCcCceEEEEehH
Q 019504 212 TAIITQTGTSAGVGFAIPSS 231 (340)
Q Consensus 212 ~~~~~~~~~~~~~~~aip~~ 231 (340)
... .......+.++..|+.
T Consensus 164 ~G~-~~~~~~~n~n~~spip 182 (203)
T PF02122_consen 164 TGS-PSGSNRENNNRMSPIP 182 (203)
T ss_dssp EEE-----------------
T ss_pred cCc-cccccccccccccccc
Confidence 975 2222234555555543
No 57
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=96.44 E-value=0.0016 Score=64.20 Aligned_cols=54 Identities=24% Similarity=0.238 Sum_probs=40.5
Q ss_pred EEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEE
Q 019504 272 VLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCL 337 (340)
Q Consensus 272 V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~ 337 (340)
|-+|.+||||+|.|- |++||-|++|||+.|.+.+. +--..|++.....++|||+
T Consensus 782 iGrIieGSPAdRCgk----------LkVGDrilAVNG~sI~~lsH--adiv~LIKdaGlsVtLtIi 835 (984)
T KOG3209|consen 782 IGRIIEGSPADRCGK----------LKVGDRILAVNGQSILNLSH--ADIVSLIKDAGLSVTLTII 835 (984)
T ss_pred ccccccCChhHhhcc----------ccccceEEEecCeeeeccCc--hhHHHHHHhcCceEEEEEc
Confidence 888999999999974 69999999999999998875 1122344444455666664
No 58
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=96.40 E-value=0.0069 Score=48.20 Aligned_cols=33 Identities=21% Similarity=0.473 Sum_probs=23.2
Q ss_pred EEEeeccCCCCccceeecCCCcEEEEEeeeeeC
Q 019504 185 IQTDAAINPGNSGGPLLDSKGNLIGINTAIITQ 217 (340)
Q Consensus 185 i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~ 217 (340)
...+....+|.||+|+||.+|++|||.......
T Consensus 88 ~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 88 GAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred EeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 444555778999999999999999998876654
No 59
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=96.39 E-value=0.0018 Score=54.67 Aligned_cols=60 Identities=22% Similarity=0.108 Sum_probs=42.5
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKS 339 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~ 339 (340)
=+.|.+|.++|||+++|| +.||-|+++....-.++.++.............-..||++|.
T Consensus 140 Fa~V~sV~~~SPA~~aGl-----------~~gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~ 199 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGL-----------CVGDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIRE 199 (231)
T ss_pred eEEEeecCCCChhhhhCc-----------ccCceEEEecccccccchhHHHHHHHHHhccCcceeEEEecC
Confidence 467999999999999999 999999999988777776421111222223334466777775
No 60
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=96.12 E-value=0.003 Score=56.60 Aligned_cols=53 Identities=23% Similarity=0.191 Sum_probs=37.5
Q ss_pred eCCCChh---hhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 275 VPGNSLA---AKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 275 v~~~spa---~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
|.|+..+ .++|| |+|||+++|||.+++++++.+.....|. +....++||.|++
T Consensus 211 l~Pgkd~~lF~~~GL-----------q~GDva~sING~dL~D~~qa~~l~~~L~--~~tei~ltVeRdG 266 (276)
T PRK09681 211 VKPGADRSLFDASGF-----------KEGDIAIALNQQDFTDPRAMIALMRQLP--SMDSIQLTVLRKG 266 (276)
T ss_pred ECCCCcHHHHHHcCC-----------CCCCEEEEeCCeeCCCHHHHHHHHHHhc--cCCeEEEEEEECC
Confidence 4565433 47798 9999999999999999997433333333 3345788888864
No 61
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=95.94 E-value=0.0029 Score=62.09 Aligned_cols=54 Identities=35% Similarity=0.434 Sum_probs=41.9
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
.-+.|..|++++||.++.+ ++|||+++|||.||++..+ ...++.... -+|+++|
T Consensus 398 ~~v~v~tv~~ns~a~k~~~-----------~~gdvlvai~~~pi~s~~q---~~~~~~s~~---~~~~~l~ 451 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAF-----------KPGDVLVAINNVPIRSERQ---ATRFLQSTT---GDLTVLV 451 (1051)
T ss_pred eEEEEEEecCCChhhHhcC-----------CCcceEEEecCccchhHHH---HHHHHHhcc---cceEEEE
Confidence 4677888999999999998 8999999999999999987 444444333 3455554
No 62
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.88 E-value=0.017 Score=56.20 Aligned_cols=78 Identities=23% Similarity=0.314 Sum_probs=54.1
Q ss_pred eeEEeccHHHHhhcCC--CCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeC
Q 019504 251 LNVDIAPDLVASQLNV--GNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAE 328 (340)
Q Consensus 251 lg~~~~~~~~~~~~~~--~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~ 328 (340)
+++.+....-.+.||+ ..-+.|+++...+.|++-|= |+.|||||+|||....+++ +...+ .|+.+-
T Consensus 200 ~kv~LvKsR~nEEyGlrLgSqIFvKeit~~gLAardgn----------lqEGDiiLkINGtvteNmS-LtDar-~LIEkS 267 (1027)
T KOG3580|consen 200 IKVLLVKSRANEEYGLRLGSQIFVKEITRTGLAARDGN----------LQEGDIILKINGTVTENMS-LTDAR-KLIEKS 267 (1027)
T ss_pred ceEEEEeeccchhhcccccchhhhhhhcccchhhccCC----------cccccEEEEECcEeecccc-chhHH-HHHHhc
Confidence 3444433333344554 47889999998888888753 5999999999999999887 33333 444555
Q ss_pred CCCceEEEEeCC
Q 019504 329 PNQDHLTCLKSS 340 (340)
Q Consensus 329 ~~~~~~~~~~~~ 340 (340)
.++..+-|+|.+
T Consensus 268 ~GKL~lvVlRD~ 279 (1027)
T KOG3580|consen 268 RGKLQLVVLRDS 279 (1027)
T ss_pred cCceEEEEEecC
Confidence 566888888864
No 63
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.79 E-value=0.26 Score=46.83 Aligned_cols=96 Identities=19% Similarity=0.174 Sum_probs=52.0
Q ss_pred cceeecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCcee-----cceEEEeeccCCCCccceeecCCCc
Q 019504 132 KPINVGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTI-----GGGIQTDAAINPGNSGGPLLDSKGN 206 (340)
Q Consensus 132 ~~~~l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~-----~~~i~~d~~i~~G~SGGPl~d~~G~ 206 (340)
.+++++....+++|......-. .+.......-.|..+............ .+++..+..+..||||+|++ .+|+
T Consensus 294 ~~~~va~~~ev~~G~a~i~t~~-~g~~~~~~~iei~~v~~~~~~~~k~~~i~~td~~ll~~tgGivqGMSGSPi~-q~gk 371 (402)
T TIGR02860 294 KPMPVALRDEVKEGPAKILTVI-DGEKVEKFDIEIVKLVPQNSPATKGMVIKITDPRLLEKTGGIVQGMSGSPII-QNGK 371 (402)
T ss_pred cEEeEEEHHHcccccEEEEEEE-cCCEEEEEEEEEEEEccCCCCCCceEEEEEcCccHhhHhCCEEecccCCCEE-ECCE
Confidence 5667776778888887532221 222111112222233222111111110 11233344566899999999 6999
Q ss_pred EEEEEeeeeeCCCCcCceEEEEehHh
Q 019504 207 LIGINTAIITQTGTSAGVGFAIPSST 232 (340)
Q Consensus 207 VVGi~~~~~~~~~~~~~~~~aip~~~ 232 (340)
+||=++-.+.+ .+..+|+|-++.
T Consensus 372 liGAvtHVfvn---dpt~GYGi~ie~ 394 (402)
T TIGR02860 372 VIGAVTHVFVN---DPTSGYGVYIEW 394 (402)
T ss_pred EEEEEEEEEec---CCCcceeehHHH
Confidence 99987776664 355678886644
No 64
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=94.51 E-value=0.014 Score=56.79 Aligned_cols=60 Identities=23% Similarity=0.263 Sum_probs=45.8
Q ss_pred CCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 266 VGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 266 ~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
..-|+.|..|..+|||+..|| +.||-||.||..+.++.--. +.-.+|.... ++.++||+-
T Consensus 427 NDVGIFVaGvqegspA~~eGl-----------qEGDQIL~VN~vdF~nl~RE-eAVlfLL~lP-kGEevtila 486 (1027)
T KOG3580|consen 427 NDVGIFVAGVQEGSPAEQEGL-----------QEGDQILKVNTVDFRNLVRE-EAVLFLLELP-KGEEVTILA 486 (1027)
T ss_pred CceeEEEeecccCCchhhccc-----------cccceeEEeccccchhhhHH-HHHHHHhcCC-CCcEEeehh
Confidence 346999999999999999999 99999999999988775310 1223444444 568999873
No 65
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=94.41 E-value=0.016 Score=50.15 Aligned_cols=59 Identities=25% Similarity=0.161 Sum_probs=43.9
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
|..+.=..+++.-+..|| |.|||.+++|+..+++++|.+.....+.... .-.+||+|.+
T Consensus 208 Gyr~~pgkd~slF~~sgl-----------q~GDIavaiNnldltdp~~m~~llq~l~~m~--s~qlTv~R~G 266 (275)
T COG3031 208 GYRFEPGKDGSLFYKSGL-----------QRGDIAVAINNLDLTDPEDMFRLLQMLRNMP--SLQLTVIRRG 266 (275)
T ss_pred EEEecCCCCcchhhhhcC-----------CCcceEEEecCcccCCHHHHHHHHHhhhcCc--ceEEEEEecC
Confidence 333333455677778888 9999999999999999998655555555443 4789999975
No 66
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4.
It has been suggested that the NS3 serine protease of hepatitus C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=93.43 E-value=0.15 Score=40.28 Aligned_cols=131 Identities=21% Similarity=0.291 Sum_probs=65.4
Q ss_pred eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecC
Q 019504 59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQ 138 (340)
Q Consensus 59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~ 138 (340)
--|+.|+ |.+-|.+|--.... + --+-| +..-.+.+...|+..-....-...+.|-.-+.
T Consensus 14 fmgt~vn--GV~wT~~HGagsrt------------l--Agp~G-----pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg~ 72 (148)
T PF02907_consen 14 FMGTCVN--GVMWTVYHGAGSRT------------L--AGPKG-----PVNQMYTSVDDDLVGWPAPPGARSLTPCTCGS 72 (148)
T ss_dssp EEEEEET--TEEEEEHHHHTTSE------------E--EBTTS-----EB-ESEEETTTTEEEEE-STTB--BBB-SSSS
T ss_pred eehhEEc--cEEEEEEecCCccc------------c--cCCCC-----cceEeEEcCCCCCcccccccccccCCccccCC
Confidence 3577786 78999999654321 0 01111 22334567778888777765544444444431
Q ss_pred CCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccccCCCceecceEEEe--eccCCCCccceeecCCCcEEEEEeeeee
Q 019504 139 SSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIFSQAGVTIGGGIQTD--AAINPGNSGGPLLDSKGNLIGINTAIIT 216 (340)
Q Consensus 139 ~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~d--~~i~~G~SGGPl~d~~G~VVGi~~~~~~ 216 (340)
..+|++-+-.. +..+. +.. .... .+..- .....|.||||++-.+|.+|||..+...
T Consensus 73 -------~dlylVtr~~~----v~p~r-----r~g--d~~~----~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~ 130 (148)
T PF02907_consen 73 -------SDLYLVTRDAD----VIPVR-----RRG--DSRA----SLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVC 130 (148)
T ss_dssp -------SEEEEE-TTS-----EEEEE-----EES--TTEE----EEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEE
T ss_pred -------ccEEEEeccCc----EeeeE-----EcC--CCce----EecCCceeEEEecCCCCcccCCCCCEEEEEEEEEE
Confidence 35566643211 11111 000 0000 01111 1234799999999889999999887665
Q ss_pred CCCCcCceEEEEehHhH
Q 019504 217 QTGTSAGVGFAIPSSTV 233 (340)
Q Consensus 217 ~~~~~~~~~~aip~~~i 233 (340)
..+....+-| +|++.+
T Consensus 131 trgvak~i~f-~P~e~l 146 (148)
T PF02907_consen 131 TRGVAKAIDF-IPVETL 146 (148)
T ss_dssp ETTEEEEEEE-EEHHHH
T ss_pred cCCceeeEEE-Eeeeec
Confidence 4433344555 487654
No 67
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=93.25 E-value=0.096 Score=42.02 Aligned_cols=38 Identities=18% Similarity=0.228 Sum_probs=32.7
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
+-+||+++.||+-|+|-|= |+.||-++++||..|..-.
T Consensus 115 spiyisriipggvadrhgg----------lkrgdqllsvngvsvege~ 152 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGG----------LKRGDQLLSVNGVSVEGEH 152 (207)
T ss_pred CceEEEeecCCccccccCc----------ccccceeEeecceeecchh
Confidence 6799999999999998753 4999999999999887544
No 68
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=92.78 E-value=0.19 Score=39.61 Aligned_cols=30 Identities=27% Similarity=0.550 Sum_probs=24.5
Q ss_pred eeccCCCCccceeecCCCcEEEEEeeeeeC
Q 019504 188 DAAINPGNSGGPLLDSKGNLIGINTAIITQ 217 (340)
Q Consensus 188 d~~i~~G~SGGPl~d~~G~VVGi~~~~~~~ 217 (340)
...-.+|+||-|++|..|+||||+..+..+
T Consensus 100 ~g~g~~GDSGRpi~DNsGrVVaIVLGG~ne 129 (158)
T PF00944_consen 100 TGVGKPGDSGRPIFDNSGRVVAIVLGGANE 129 (158)
T ss_dssp TTS-STTSTTEEEESTTSBEEEEEEEEEEE
T ss_pred cCCCCCCCCCCccCcCCCCEEEEEecCCCC
Confidence 344578999999999999999999876653
No 69
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=91.96 E-value=0.095 Score=48.74 Aligned_cols=43 Identities=28% Similarity=0.387 Sum_probs=36.0
Q ss_pred hcCCCCCcEEEeeCCCChhhhc-CCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 263 QLNVGNGALVLQVPGNSLAAKA-GILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 263 ~~~~~~g~~V~~v~~~spa~~~-gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
.|-...|+.|++|...||+..- || .+||+|+++||.+|++.+|
T Consensus 215 fya~g~gV~Vtev~~~Spl~gprGL-----------~vgdvitsldgcpV~~v~d 258 (484)
T KOG2921|consen 215 FYAHGEGVTVTEVPSVSPLFGPRGL-----------SVGDVITSLDGCPVHKVSD 258 (484)
T ss_pred hhhcCceEEEEeccccCCCcCcccC-----------CccceEEecCCcccCCHHH
Confidence 3444589999999999987622 55 9999999999999999998
No 70
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein [], while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=91.94 E-value=1 Score=34.30 Aligned_cols=56 Identities=29% Similarity=0.303 Sum_probs=36.3
Q ss_pred eEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecC
Q 019504 59 GSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQ 138 (340)
Q Consensus 59 GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~ 138 (340)
|-++-|. +|..+|+.||.+..+ .| +|. +-+++. ...|+|+++.+... ++.+++++
T Consensus 1 G~avHIG-nG~~vt~tHva~~~~-----------~v-----~g~----~f~~~~--~~ge~~~v~~~~~~--~p~~~ig~ 55 (105)
T PF03510_consen 1 GWAVHIG-NGRYVTVTHVAKSSD-----------SV-----DGQ----PFKIVK--TDGELCWVQSPLVH--LPAAQIGT 55 (105)
T ss_pred CceEEeC-CCEEEEEEEEeccCc-----------eE-----cCc----CcEEEE--eccCEEEEECCCCC--CCeeEecc
Confidence 3467776 689999999998764 22 122 223333 34599999998764 56666754
Q ss_pred C
Q 019504 139 S 139 (340)
Q Consensus 139 ~ 139 (340)
.
T Consensus 56 g 56 (105)
T PF03510_consen 56 G 56 (105)
T ss_pred C
Confidence 3
No 71
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=91.87 E-value=0.14 Score=52.59 Aligned_cols=36 Identities=31% Similarity=0.360 Sum_probs=31.2
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
.-++|..|.+|+|+. |+|++||-|++|||++|.+..
T Consensus 75 rPviVr~VT~GGps~------------GKL~PGDQIl~vN~Epv~dap 110 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSI------------GKLQPGDQILAVNGEPVKDAP 110 (1298)
T ss_pred CceEEEEecCCCCcc------------ccccCCCeEEEecCccccccc
Confidence 368899999999986 457999999999999998764
No 72
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=91.78 E-value=0.092 Score=52.26 Aligned_cols=60 Identities=18% Similarity=0.239 Sum_probs=47.2
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
-+++|-++..++||.+.|= +++||-|++|||+....+.. .+.+-.-+..+...+++||.+
T Consensus 923 M~LfVLRlAeDGPA~rdGr----------m~VGDqi~eINGesTkgmtH---~rAIelIk~gg~~vll~Lr~g 982 (984)
T KOG3209|consen 923 MDLFVLRLAEDGPAIRDGR----------MRVGDQITEINGESTKGMTH---DRAIELIKQGGRRVLLLLRRG 982 (984)
T ss_pred cceEEEEeccCCCccccCc----------eeecceEEEecCcccCCCcH---HHHHHHHHhCCeEEEEEeccC
Confidence 4799999999999999974 59999999999999999886 344333344556777888763
No 73
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=91.15 E-value=0.38 Score=47.59 Aligned_cols=101 Identities=24% Similarity=0.287 Sum_probs=69.6
Q ss_pred cCCCCccceee-----cCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeee------eeeEEeccHH
Q 019504 191 INPGNSGGPLL-----DSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRA------GLNVDIAPDL 259 (340)
Q Consensus 191 i~~G~SGGPl~-----d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~------~lg~~~~~~~ 259 (340)
+..=++|||.- |.--+++.|+-.. -..+|.+.....++.+++.-.+... -.-+...-+.
T Consensus 677 iAnmm~~GpAarsgkLnIGDQiiaING~S----------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~RPd 746 (829)
T KOG3605|consen 677 IANMMHGGPAARSGKLNIGDQIMSINGTS----------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRRPD 746 (829)
T ss_pred HHhcccCChhhhcCCccccceeEeecCce----------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeeccc
Confidence 33457788874 3334566664332 2368999999999988776655432 2334444566
Q ss_pred HHhhcCCC--CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccC
Q 019504 260 VASQLNVG--NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSF 313 (340)
Q Consensus 260 ~~~~~~~~--~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~ 313 (340)
+..++|.. +|++-+ ...|+=|+|-|+ ++|--|++|||+.|.-
T Consensus 747 ~kyQLGFSVQNGiICS-LlRGGIAERGGV-----------RVGHRIIEINgQSVVA 790 (829)
T KOG3605|consen 747 LRYQLGFSVQNGIICS-LLRGGIAERGGV-----------RVGHRIIEINGQSVVA 790 (829)
T ss_pred chhhccceeeCcEeeh-hhcccchhccCc-----------eeeeeEEEECCceEEe
Confidence 66677766 777544 567888999999 8999999999998853
No 74
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=90.92 E-value=0.091 Score=52.04 Aligned_cols=37 Identities=32% Similarity=0.349 Sum_probs=34.4
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
.|++|.+|.|++.|++.|| +.||-|++|||+...+..
T Consensus 562 fgifV~~V~pgskAa~~Gl-----------KRgDqilEVNgQnfenis 598 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGL-----------KRGDQILEVNGQNFENIS 598 (1283)
T ss_pred ceeEEeeecCCchHHHhhh-----------hhhhhhhhccccchhhhh
Confidence 7999999999999999999 899999999999877665
No 75
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These peptidases cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=90.56 E-value=3.7 Score=42.66 Aligned_cols=49 Identities=29% Similarity=0.357 Sum_probs=30.0
Q ss_pred cCCCCccceee--cC---CCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHH
Q 019504 191 INPGNSGGPLL--DS---KGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQL 240 (340)
Q Consensus 191 i~~G~SGGPl~--d~---~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l 240 (340)
..+|+||+||| |. +.-++|+.+......+ .......+|.+++.++.++.
T Consensus 213 ~~~GDSGSPlF~YD~~~kKWvl~Gv~~~~~~~~g-~~~~~~~~~~~f~~~~~~~d 266 (769)
T PF02395_consen 213 GSPGDSGSPLFAYDKEKKKWVLVGVLSGGNGYNG-KGNWWNVIPPDFINQIKQND 266 (769)
T ss_dssp --TT-TT-EEEEEETTTTEEEEEEEEEEECCCCH-SEEEEEEECHHHHHHHHHHC
T ss_pred cccCcCCCceEEEEccCCeEEEEEEEccccccCC-ccceeEEecHHHHHHHHhhh
Confidence 45899999998 32 3459999876543322 22455678888887777663
No 76
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=89.29 E-value=0.84 Score=43.06 Aligned_cols=140 Identities=19% Similarity=0.252 Sum_probs=68.9
Q ss_pred cccCCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCC-CCc
Q 019504 53 EIPEGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASE-DLL 131 (340)
Q Consensus 53 ~~~~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~-~~~ 131 (340)
-..-+.|-||.++++ +++|+.||+..... ++. | .+-.-+..+..-+++-+++..+- ..+
T Consensus 375 iv~fGsGWGfWVS~~-lfITttHViP~g~~----------E~F-----G----v~i~~i~vh~sGeF~~~rFpk~iRPDv 434 (535)
T PF05416_consen 375 IVKFGSGWGFWVSPT-LFITTTHVIPPGAK----------EAF-----G----VPISQIQVHKSGEFCRFRFPKPIRPDV 434 (535)
T ss_dssp EEEETTEEEEESSSS-EEEEEGGGS-STTS----------EET-----T----EECGGEEEEEETTEEEEEESS-SSTTS
T ss_pred heecCCceeeeecce-EEEEeeeecCCcch----------hhh-----C----CChhHeEEeeccceEEEecCCCCCCCc
Confidence 344577999999987 99999999987542 111 1 11111233444577777777652 234
Q ss_pred cceeecCCCCCCCCCEEEE-EecCCCC--CCceeEEEEeeeccccccCCCceecceEEE-------eeccCCCCccceee
Q 019504 132 KPINVGQSSFLKVGQQCLA-IGNPFGF--DHTLTVGVISGLNRDIFSQAGVTIGGGIQT-------DAAINPGNSGGPLL 201 (340)
Q Consensus 132 ~~~~l~~~~~~~~G~~v~~-iG~p~g~--~~~~~~G~vs~~~~~~~~~~~~~~~~~i~~-------d~~i~~G~SGGPl~ 201 (340)
+-+.|. .-...|.-+.+ +=.|.|. +..+..|......-.-..-.+. ..++.+ |....||+-|.|-+
T Consensus 435 tgmiLE--eGapEGtV~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v~GQ--~GMLLTGaNAK~mDLGT~PGDCGcPYv 510 (535)
T PF05416_consen 435 TGMILE--EGAPEGTVCSILIKRPSGELLPLAVRMGTHASMKIQGRTVHGQ--MGMLLTGANAKGMDLGTIPGDCGCPYV 510 (535)
T ss_dssp ---EE---SS--TT-EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEEEEE--EEEETTSTT-SSTTTS--TTGTT-EEE
T ss_pred cceeec--cCCCCceEEEEEEEcCCccchhhhhhhccceeEEEcceeecce--eeeeeecCCccccccCCCCCCCCCcee
Confidence 555553 22344655533 4455442 2344555544332110000000 122322 33456899999999
Q ss_pred cCCC---cEEEEEeeeee
Q 019504 202 DSKG---NLIGINTAIIT 216 (340)
Q Consensus 202 d~~G---~VVGi~~~~~~ 216 (340)
-..| -|+|++++...
T Consensus 511 yKrgNd~VV~GVH~AAtr 528 (535)
T PF05416_consen 511 YKRGNDWVVIGVHAAATR 528 (535)
T ss_dssp EEETTEEEEEEEEEEE-S
T ss_pred eecCCcEEEEEEEehhcc
Confidence 6555 49999998654
No 77
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=88.70 E-value=0.19 Score=51.82 Aligned_cols=59 Identities=20% Similarity=0.177 Sum_probs=42.2
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
-|+||++|.+|++|+.-|= |+.||-+|+|||+.+--+.++- ...++. ...+-+++.|.|
T Consensus 960 lGIYvKsVV~GgaAd~DGR----------L~aGDQLLsVdG~SLiGisQEr-AA~lmt-rtg~vV~leVaK 1018 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADHDGR----------LEAGDQLLSVDGHSLIGISQER-AARLMT-RTGNVVHLEVAK 1018 (1629)
T ss_pred cceEEEEeccCCccccccc----------cccCceeeeecCcccccccHHH-HHHHHh-ccCCeEEEehhh
Confidence 5999999999999987764 5999999999999998877621 112332 233336665544
No 78
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=88.49 E-value=0.45 Score=42.13 Aligned_cols=61 Identities=18% Similarity=0.304 Sum_probs=43.4
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC-----CCCC--CceeEEEeeCCCCceEEEEe
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS-----CLSI--PSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~-----~d~~--~~~~~~~~~~~~~~~~~~~~ 338 (340)
.|+.|.+..+|+.|+..|| |.+.|-|++|||..|.-- .|+| ....++...+|.++.=.+.|
T Consensus 194 pGIFISRlVpGGLAeSTGL----------LaVnDEVlEVNGIEVaGKTLDQVTDMMvANshNLIiTVkPANQRnnvvr 261 (358)
T KOG3606|consen 194 PGIFISRLVPGGLAESTGL----------LAVNDEVLEVNGIEVAGKTLDQVTDMMVANSHNLIITVKPANQRNNVVR 261 (358)
T ss_pred CceEEEeecCCccccccce----------eeecceeEEEcCEEeccccHHHHHHHHhhcccceEEEecccccccceee
Confidence 7999999999999999999 689999999999988633 2211 12234555555555544444
No 79
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases, ndl, gd and snk, process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=88.09 E-value=7.8 Score=34.19 Aligned_cols=92 Identities=16% Similarity=0.212 Sum_probs=58.4
Q ss_pred CCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEE-E---EEEEEeC-----CCCcEEEEEEec
Q 019504 56 EGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNF-E---GKLVGAD-----RAKDLAVLKIEA 126 (340)
Q Consensus 56 ~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~-~---a~v~~~d-----~~~DlAlL~v~~ 126 (340)
.-+.||++|+++ |||++..|+.+..-.. -++.+.+..++.+.. . -++...| ++.+++||.++.
T Consensus 27 ~~~CsgvLlD~~-WlLvsssCl~~I~L~~-------~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~ 98 (267)
T PF09342_consen 27 RYWCSGVLLDPH-WLLVSSSCLRGISLSH-------HYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQ 98 (267)
T ss_pred eEEEEEEEeccc-eEEEeccccCCccccc-------ceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecC
Confidence 346899999987 9999999998743100 156666655541110 0 1333333 678999999998
Q ss_pred CC---CCccceeecC-CCCCCCCCEEEEEecCC
Q 019504 127 SE---DLLKPINVGQ-SSFLKVGQQCLAIGNPF 155 (340)
Q Consensus 127 ~~---~~~~~~~l~~-~~~~~~G~~v~~iG~p~ 155 (340)
+. ..+.|.-+.+ .......+.++++|.-.
T Consensus 99 ~~~fTr~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 99 PANFTRYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred cccceeeecccccccccCCCCCCCceEEEEccc
Confidence 74 3345555543 23444566899999654
No 80
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=86.68 E-value=0.75 Score=41.39 Aligned_cols=38 Identities=39% Similarity=0.329 Sum_probs=33.1
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
.-+||..|..++||++-|- ++.||-|++|||..|..-.
T Consensus 30 PClYiVQvFD~tPAa~dG~----------i~~GDEi~avNg~svKGkt 67 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDGR----------IRCGDEIVAVNGISVKGKT 67 (429)
T ss_pred CeEEEEEeccCCchhccCc----------cccCCeeEEecceeecCcc
Confidence 5789999999999999974 4999999999999987543
No 81
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=84.55 E-value=0.3 Score=51.31 Aligned_cols=35 Identities=37% Similarity=0.367 Sum_probs=31.9
Q ss_pred cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
-.|..|.++|||..+|+ +.||.|+.+||++|....
T Consensus 660 h~v~sv~egsPA~~agl-----------s~~DlIthvnge~v~gl~ 694 (1205)
T KOG0606|consen 660 HSVGSVEEGSPAFEAGL-----------SAGDLITHVNGEPVHGLV 694 (1205)
T ss_pred eeeeeecCCCCccccCC-----------CccceeEeccCcccchhh
Confidence 56888999999999999 999999999999998765
No 82
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=84.30 E-value=0.36 Score=44.21 Aligned_cols=38 Identities=21% Similarity=0.265 Sum_probs=32.4
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
-++|+++..+-.|+..|+ |-.||-|++|||..|+.-..
T Consensus 81 PvviSkI~kdQaAd~tG~----------LFvGDAilqvNGi~v~~c~H 118 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQ----------LFVGDAILQVNGIYVTACPH 118 (505)
T ss_pred cEEeehhhhhhhhhhcCc----------eEeeeeeEEeccEEeecCCh
Confidence 578899988888888876 58999999999999987653
No 83
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=81.32 E-value=0.59 Score=44.11 Aligned_cols=58 Identities=29% Similarity=0.275 Sum_probs=43.5
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
.|.-|.+|..+|||.+|||. -==|-|++|||..+....|. ...+.++...++++|++-
T Consensus 15 eg~hvlkVqedSpa~~agle----------pffdFIvSI~g~rL~~dnd~---Lk~llk~~sekVkltv~n 72 (462)
T KOG3834|consen 15 EGYHVLKVQEDSPAHKAGLE----------PFFDFIVSINGIRLNKDNDT---LKALLKANSEKVKLTVYN 72 (462)
T ss_pred eeEEEEEeecCChHHhcCcc----------hhhhhhheeCcccccCchHH---HHHHHHhcccceEEEEEe
Confidence 57779999999999999993 24799999999999988773 333333333447887763
No 84
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue.
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=81.18 E-value=3.3 Score=32.55 Aligned_cols=32 Identities=34% Similarity=0.425 Sum_probs=24.4
Q ss_pred ceEEEeeccCCCCccceeecCCCcEEEEEeeee
Q 019504 183 GGIQTDAAINPGNSGGPLLDSKGNLIGINTAII 215 (340)
Q Consensus 183 ~~i~~d~~i~~G~SGGPl~d~~G~VVGi~~~~~ 215 (340)
+++....+..||+-||+|+ .+--||||++++-
T Consensus 79 ~~l~g~Gp~~PGdCGg~L~-C~HGViGi~Tagg 110 (127)
T PF00947_consen 79 NLLIGEGPAEPGDCGGILR-CKHGVIGIVTAGG 110 (127)
T ss_dssp CEEEEE-SSSTT-TCSEEE-ETTCEEEEEEEEE
T ss_pred CceeecccCCCCCCCceeE-eCCCeEEEEEeCC
Confidence 4566677889999999999 5666999999863
No 85
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=80.98 E-value=1.2 Score=42.42 Aligned_cols=24 Identities=25% Similarity=0.512 Sum_probs=21.0
Q ss_pred eccCCCCccceeecCCCcEEEEEe
Q 019504 189 AAINPGNSGGPLLDSKGNLIGINT 212 (340)
Q Consensus 189 ~~i~~G~SGGPl~d~~G~VVGi~~ 212 (340)
..+..|.||+.++|.+|++|||..
T Consensus 350 ~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 350 YSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred cCCCCCCCcCeEECCCCCEEEEeC
Confidence 355689999999999999999965
No 86
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=80.79 E-value=1.2 Score=42.02 Aligned_cols=32 Identities=41% Similarity=0.334 Sum_probs=30.0
Q ss_pred eeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 274 QVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 274 ~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
.+..+++|..+|+ ++||.|+++|++++.+.+|
T Consensus 135 ~v~~~s~a~~a~l-----------~~Gd~iv~~~~~~i~~~~~ 166 (375)
T COG0750 135 EVAPKSAAALAGL-----------RPGDRIVAVDGEKVASWDD 166 (375)
T ss_pred ecCCCCHHHHcCC-----------CCCCEEEeECCEEccCHHH
Confidence 6788999999999 9999999999999999986
No 87
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=79.33 E-value=1.7 Score=41.92 Aligned_cols=39 Identities=26% Similarity=0.256 Sum_probs=32.2
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
.|+||.++.+++.-+.-| .+++||.||.||.....++..
T Consensus 277 ggIYVgsImkgGAVA~DG----------RIe~GDMiLQVNevsFENmSN 315 (626)
T KOG3571|consen 277 GGIYVGSIMKGGAVALDG----------RIEPGDMILQVNEVSFENMSN 315 (626)
T ss_pred CceEEeeeccCceeeccC----------ccCccceEEEeeecchhhcCc
Confidence 799999999988666555 249999999999988877764
No 88
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=79.25 E-value=1.3 Score=43.05 Aligned_cols=37 Identities=24% Similarity=0.413 Sum_probs=33.9
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
-++|.++..|+-+.+.|+ |..||.|+++||..|.+..
T Consensus 147 ~~~vARI~~GG~~~r~gl----------L~~GD~i~EvNGi~v~~~~ 183 (542)
T KOG0609|consen 147 KVVVARIMHGGMADRQGL----------LHVGDEILEVNGISVANKS 183 (542)
T ss_pred ccEEeeeccCCcchhccc----------eeeccchheecCeecccCC
Confidence 589999999999999998 6899999999999999773
No 89
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=77.17 E-value=0.36 Score=44.83 Aligned_cols=38 Identities=24% Similarity=0.339 Sum_probs=32.2
Q ss_pred CcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 269 GALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 269 g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
-++|+++.+|-.|+..+- |..||.|++|||....+...
T Consensus 111 PIlISKIFkGlAADQt~a----------L~~gDaIlSVNG~dL~~AtH 148 (506)
T KOG3551|consen 111 PILISKIFKGLAADQTGA----------LFLGDAILSVNGEDLRDATH 148 (506)
T ss_pred ceehhHhccccccccccc----------eeeccEEEEecchhhhhcch
Confidence 588999999888887764 58999999999999987764
No 90
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=63.30 E-value=9.3 Score=33.00 Aligned_cols=56 Identities=18% Similarity=0.411 Sum_probs=40.7
Q ss_pred cceEEEeeccCCCCccceeecC----CCcEEEEEeeeeeCCCCcCceEEEEeh--HhHHHHHHHHH
Q 019504 182 GGGIQTDAAINPGNSGGPLLDS----KGNLIGINTAIITQTGTSAGVGFAIPS--STVLKIVPQLI 241 (340)
Q Consensus 182 ~~~i~~d~~i~~G~SGGPl~d~----~G~VVGi~~~~~~~~~~~~~~~~aip~--~~i~~~l~~l~ 241 (340)
...+++..+...|+=|||++-. .-+++||+.++... .+.+||-++ +.+++.+..|.
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~~----~~~gYAe~itQEDL~~A~~~l~ 229 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSAN----HAMGYAESITQEDLMRAINKLE 229 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeeccccc----ccceehhhhhHHHHHHHHHhhc
Confidence 3457888888999999999843 35899999987653 356777666 55666666553
No 91
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=61.16 E-value=26 Score=23.45 Aligned_cols=32 Identities=31% Similarity=0.428 Sum_probs=28.2
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.||+ .+.+++..+|...++.|-....
T Consensus 8 ~V~V~l~~g~--~~~G~L~~~D~~~Ni~L~~~~~ 39 (63)
T cd00600 8 TVRVELKDGR--VLEGVLVAFDKYMNLVLDDVEE 39 (63)
T ss_pred EEEEEECCCc--EEEEEEEEECCCCCEEECCEEE
Confidence 7899999997 8999999999999998877654
No 92
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=60.99 E-value=33 Score=29.02 Aligned_cols=28 Identities=36% Similarity=0.213 Sum_probs=25.0
Q ss_pred CCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEE
Q 019504 268 NGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAV 306 (340)
Q Consensus 268 ~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i 306 (340)
..++|+.|..||||+++|+ .-++.|+++
T Consensus 122 ~~~~Vd~v~fgS~A~~~g~-----------d~d~~I~~v 149 (183)
T PF11874_consen 122 GKVIVDEVEFGSPAEKAGI-----------DFDWEITEV 149 (183)
T ss_pred CEEEEEecCCCCHHHHcCC-----------CCCcEEEEE
Confidence 6789999999999999999 888877776
No 93
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=59.44 E-value=27 Score=24.15 Aligned_cols=32 Identities=22% Similarity=0.295 Sum_probs=28.8
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
++.|.+.+|+ .+.+++.++|+..++.|-....
T Consensus 12 ~V~V~l~~g~--~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 12 PVLVKLKGGK--EVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred EEEEEECCCC--EEEEEEEEECCcceEEEeeEEE
Confidence 7999999997 8999999999999999887754
No 94
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.18 E-value=26 Score=24.14 Aligned_cols=32 Identities=25% Similarity=0.372 Sum_probs=28.1
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|+ .+.+++.++|+..++.|=....
T Consensus 12 ~V~V~Lk~g~--~~~G~L~~~D~~mNlvL~~~~~ 43 (67)
T cd01726 12 PVVVKLNSGV--DYRGILACLDGYMNIALEQTEE 43 (67)
T ss_pred eEEEEECCCC--EEEEEEEEEccceeeEEeeEEE
Confidence 7899999997 8999999999999998876643
No 95
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=58.16 E-value=22 Score=25.71 Aligned_cols=30 Identities=17% Similarity=0.289 Sum_probs=26.6
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEE
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKI 124 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v 124 (340)
.+.|.+.+|+ .+.+++.++|.+.+|.|=..
T Consensus 13 ~V~V~l~~gr--~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 13 RVYVKLRGDR--ELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred EEEEEECCCC--EEEEEEEEEccceEEeccce
Confidence 7899999997 89999999999999987544
No 96
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=57.49 E-value=23 Score=25.45 Aligned_cols=47 Identities=26% Similarity=0.393 Sum_probs=32.1
Q ss_pred EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEE-EecC
Q 019504 106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLA-IGNP 154 (340)
Q Consensus 106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~-iG~p 154 (340)
++++++..|...++|++.+-.-...+.---+. ..++.|+.|++ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkreV~l~Lv~--~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKREVNLDLVG--EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEEEEeeeec--CccccCCEEEEEeeEE
Confidence 57888899988789999987653322222222 26789999987 5653
No 97
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=57.05 E-value=30 Score=24.26 Aligned_cols=32 Identities=28% Similarity=0.364 Sum_probs=28.4
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|+ .+.+++.++|+..++.|-....
T Consensus 16 ~V~V~lk~g~--~~~G~L~~~D~~mNlvL~d~~e 47 (72)
T PRK00737 16 PVLVRLKGGR--EFRGELQGYDIHMNLVLDNAEE 47 (72)
T ss_pred EEEEEECCCC--EEEEEEEEEcccceeEEeeEEE
Confidence 7899999997 8999999999999999887654
No 98
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=56.78 E-value=27 Score=24.14 Aligned_cols=31 Identities=26% Similarity=0.385 Sum_probs=27.6
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
.+.|.+.+|+ .+.+++.++|...++.|=...
T Consensus 13 ~V~V~Lk~g~--~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 13 PVIVKLKWGM--EYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred EEEEEECCCc--EEEEEEEEECCCEEEEEeeEE
Confidence 7899999997 899999999999999886654
No 99
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm5 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.60 E-value=27 Score=24.87 Aligned_cols=30 Identities=23% Similarity=0.438 Sum_probs=26.8
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEE
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKI 124 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v 124 (340)
.+.|.+.+|+ .+.+++.++|.+.++.|=..
T Consensus 15 ~V~V~l~~gr--~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 15 RIWIVMKSDK--EFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred EEEEEECCCe--EEEEEEEEeccceEEEEccE
Confidence 7899999997 89999999999999987654
No 100
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.09 E-value=29 Score=25.51 Aligned_cols=32 Identities=16% Similarity=0.280 Sum_probs=28.0
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|+ .+.+++.++|.+.++.|=....
T Consensus 16 ~V~V~lr~~r--~~~G~L~~fD~hmNlvL~d~~E 47 (87)
T cd01720 16 QVLINCRNNK--KLLGRVKAFDRHCNMVLENVKE 47 (87)
T ss_pred EEEEEEcCCC--EEEEEEEEecCccEEEEcceEE
Confidence 7999999997 8999999999999999866543
No 101
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=56.06 E-value=28 Score=24.92 Aligned_cols=31 Identities=26% Similarity=0.506 Sum_probs=27.3
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
.+.|.+.+|+ .+.+++.++|.+.++.|=...
T Consensus 12 ~V~V~l~dgR--~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 12 RLRVTLQDGR--QFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred EEEEEECCCc--EEEEEEEEEcCccCEEcCCEE
Confidence 7899999997 899999999999999876554
No 102
>PF00571 CBS: CBS domain. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.
In humans, mutations in conserved residues within CBS domains cause a variety of hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=55.58 E-value=14 Score=23.83 Aligned_cols=21 Identities=38% Similarity=0.627 Sum_probs=17.6
Q ss_pred CCCccceeecCCCcEEEEEee
Q 019504 193 PGNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 193 ~G~SGGPl~d~~G~VVGi~~~ 213 (340)
.+.+.-|++|.+|+++|+.+.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 467789999999999999763
No 103
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=55.42 E-value=32 Score=24.44 Aligned_cols=31 Identities=19% Similarity=0.332 Sum_probs=27.2
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
.+.|.+.||+ .+.+++.++|...+|.|=...
T Consensus 12 ~v~V~l~dgR--~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 12 TMRIHMTDGR--TLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred eEEEEEcCCe--EEEEEEEEEcCCCcEEecCcE
Confidence 7899999997 899999999999999876553
No 104
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=54.35 E-value=33 Score=24.76 Aligned_cols=31 Identities=19% Similarity=0.290 Sum_probs=27.1
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
++.|.+.+|+ .+.+++.++|...+|.|=...
T Consensus 14 ~V~V~l~~gr--~~~G~L~~~D~~mNlvL~~~~ 44 (81)
T cd01729 14 KIRVKFQGGR--EVTGILKGYDQLLNLVLDDTV 44 (81)
T ss_pred eEEEEECCCc--EEEEEEEEEcCcccEEecCEE
Confidence 7899999997 899999999999999876553
No 105
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=52.98 E-value=54 Score=22.33 Aligned_cols=32 Identities=19% Similarity=0.281 Sum_probs=27.4
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.+....|. .++++++.+|....+.+|+-+.
T Consensus 8 ~V~~kTc~g~--~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 8 QVSCRTCFEQ--RLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred EEEEEecCCc--eEEEEEEEecCCCcEEEEECcc
Confidence 6777777786 8999999999999999998655
No 106
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=51.87 E-value=6.9 Score=39.17 Aligned_cols=35 Identities=20% Similarity=0.326 Sum_probs=27.2
Q ss_pred cEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC
Q 019504 270 ALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS 314 (340)
Q Consensus 270 ~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~ 314 (340)
++|..+..++||+|.|- |..||-|++|||..+.-.
T Consensus 675 VViAnmm~~GpAarsgk----------LnIGDQiiaING~SLVGL 709 (829)
T KOG3605|consen 675 VVIANMMHGGPAARSGK----------LNIGDQIMSINGTSLVGL 709 (829)
T ss_pred HHHHhcccCChhhhcCC----------ccccceeEeecCceeccc
Confidence 33555667899999974 689999999999877543
No 107
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=51.16 E-value=42 Score=23.59 Aligned_cols=31 Identities=16% Similarity=0.198 Sum_probs=27.1
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
++.|.+.+|+ .+.+++.++|...+|.|=...
T Consensus 12 ~V~V~L~~g~--~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 12 KLSLKLNGNR--KVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred eEEEEECCCe--EEEEEEEEEcccccEEeccEE
Confidence 7899999997 899999999999999886553
No 108
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.45 E-value=42 Score=23.75 Aligned_cols=31 Identities=26% Similarity=0.341 Sum_probs=27.3
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
++.|.+.+|+ .+.+.+.++|++.++.|=...
T Consensus 14 ~v~V~l~~gr--~~~G~L~~fD~~~NlvL~d~~ 44 (74)
T cd01728 14 KVVVLLRDGR--KLIGILRSFDQFANLVLQDTV 44 (74)
T ss_pred EEEEEEcCCe--EEEEEEEEECCcccEEecceE
Confidence 7999999997 899999999999999886553
No 109
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=49.57 E-value=46 Score=22.58 Aligned_cols=32 Identities=28% Similarity=0.478 Sum_probs=27.9
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.||+ .+.+++..+|...++-|=....
T Consensus 10 ~V~V~l~~g~--~~~G~L~~~D~~~NlvL~~~~e 41 (67)
T smart00651 10 RVLVELKNGR--EYRGTLKGFDQFMNLVLEDVEE 41 (67)
T ss_pred EEEEEECCCc--EEEEEEEEECccccEEEccEEE
Confidence 7899999997 8999999999999998876644
No 110
>COG1868 FliM Flagellar motor switch protein [Cell motility and secretion]
Probab=48.68 E-value=65 Score=30.05 Aligned_cols=40 Identities=18% Similarity=0.131 Sum_probs=29.9
Q ss_pred CCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHc
Q 019504 204 KGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQY 243 (340)
Q Consensus 204 ~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~ 243 (340)
-.+++-+++....-.....-+++++|...++++.+.+...
T Consensus 191 pne~vv~i~~~i~ig~~~g~~niciP~~~le~i~~kl~~~ 230 (332)
T COG1868 191 PNEIVVLITLEVEIGNLSGMFNICIPYSMLEPIREKLSSR 230 (332)
T ss_pred CCceEEEEEEEEEECCcceEEEEEeeHHHHHHHHHHHhhh
Confidence 4566666666555444456799999999999999988764
No 111
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=47.13 E-value=48 Score=23.35 Aligned_cols=31 Identities=29% Similarity=0.432 Sum_probs=27.4
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
++.|.+.+|+ .+.+++.++|.+.++.|=...
T Consensus 11 ~V~V~l~dgr--~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 11 TVSVITVDGR--VIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred EEEEEECCCc--EEEEEEEEEccccCEEccceE
Confidence 7899999997 899999999999999887653
No 112
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function [, ]. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6) []. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker []. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing []. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA []. Hfq forms an Lsm-like fold, however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=43.99 E-value=66 Score=21.81 Aligned_cols=33 Identities=24% Similarity=0.467 Sum_probs=29.5
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecC
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEAS 127 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~ 127 (340)
.+.|.+.+|. .+.+++..+|...++.|-.....
T Consensus 10 ~V~V~l~~g~--~~~G~L~~~D~~~Nl~L~~~~~~ 42 (67)
T PF01423_consen 10 RVRVELKNGR--TYRGTLVSFDQFMNLVLSDVTET 42 (67)
T ss_dssp EEEEEETTSE--EEEEEEEEEETTEEEEEEEEEEE
T ss_pred EEEEEEeCCE--EEEEEEEEeechheEEeeeEEEE
Confidence 7999999997 89999999999999998887654
No 113
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=43.52 E-value=55 Score=23.31 Aligned_cols=32 Identities=31% Similarity=0.527 Sum_probs=28.3
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|+ .+.+++.++|...++.|--...
T Consensus 19 ~V~V~lk~g~--~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 19 RVLVKLKNGR--EYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred EEEEEECCCC--EEEEEEEEEccceeEEEeceEE
Confidence 7899999997 8999999999999998876654
No 114
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=42.79 E-value=44 Score=24.47 Aligned_cols=33 Identities=21% Similarity=0.347 Sum_probs=22.1
Q ss_pred CccceeecCCCcEEEEEeeeeeCCCCcCceEEE
Q 019504 195 NSGGPLLDSKGNLIGINTAIITQTGTSAGVGFA 227 (340)
Q Consensus 195 ~SGGPl~d~~G~VVGi~~~~~~~~~~~~~~~~a 227 (340)
..|.|+++.+|+.||.+++......-...++++
T Consensus 34 ~~g~~v~~~~g~~vG~vTS~~~sp~~~~~Iala 66 (95)
T PF08669_consen 34 RGGEPVYDEDGKPVGRVTSGAYSPTLGKNIALA 66 (95)
T ss_dssp STTCEEEETTTEEEEEEEEEEEETTTTEEEEEE
T ss_pred CCCCEEEECCCcEEeEEEEEeECCCCCceEEEE
Confidence 457899987999999988764432223444444
No 115
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=41.18 E-value=10 Score=36.18 Aligned_cols=35 Identities=34% Similarity=0.224 Sum_probs=26.4
Q ss_pred EEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCCC
Q 019504 272 VLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSCL 316 (340)
Q Consensus 272 V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~d 316 (340)
|-+|.++|||++|||+ --+|-|+-+-.......+|
T Consensus 113 vl~V~p~SPaalAgl~----------~~~DYivG~~~~~~~~~eD 147 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLR----------PYTDYIVGIWDAVMHEEED 147 (462)
T ss_pred eeecCCCCHHHhcccc----------cccceEecchhhhccchHH
Confidence 5578899999999993 3789999994444455555
No 116
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel; alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source []. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions [].; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=39.24 E-value=34 Score=24.09 Aligned_cols=32 Identities=22% Similarity=0.473 Sum_probs=24.1
Q ss_pred cceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHH
Q 019504 197 GGPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLI 241 (340)
Q Consensus 197 GGPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~ 241 (340)
.-|+.+.+|+++|+... .+..+.+.++++++.
T Consensus 18 s~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~~ 49 (81)
T PF02743_consen 18 SVPIYDDDGKIIGVVGI-------------DISLDQLSEIISNIK 49 (81)
T ss_dssp EEEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTSB
T ss_pred EEEEECCCCCEEEEEEE-------------EeccceeeeEEEeeE
Confidence 36888889999999654 577888887776653
No 117
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process []. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation [, ].; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=38.17 E-value=1.2e+02 Score=21.07 Aligned_cols=43 Identities=21% Similarity=0.346 Sum_probs=29.8
Q ss_pred EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEE
Q 019504 106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAI 151 (340)
Q Consensus 106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~i 151 (340)
++++++..+.....|++..... ...+.+.--.++++||+|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G~---~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGGV---RREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETTE---EEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCCc---EEEEEEEEeCCCCCCCEEEEe
Confidence 6888999988889999987753 244444333458999999886
No 118
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII 3.1.11.6 from EC is composed of two nonidentical subunits; one large subunit and 4 small ones []. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (IPR004365 from INTERPRO) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal. ; GO: 0008855 exodeoxyribonuclease VII activity
Probab=38.10 E-value=45 Score=30.77 Aligned_cols=38 Identities=21% Similarity=0.435 Sum_probs=31.4
Q ss_pred ceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEE
Q 019504 58 NGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGK 109 (340)
Q Consensus 58 ~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~ 109 (340)
.|-.++.+++|.++|+..-+...+ .+.+.+.||. +.++
T Consensus 281 RGYaiv~~~~g~vI~s~~~l~~gd-----------~i~i~l~DG~---~~a~ 318 (319)
T PF02601_consen 281 RGYAIVRDKDGKVITSVKQLKPGD-----------EIEIRLADGS---IKAE 318 (319)
T ss_pred CceEEEECCCCCEECCHHHCCCCC-----------EEEEEEcceE---EEEE
Confidence 377788888899999999998876 8999999994 5554
No 119
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=35.89 E-value=40 Score=25.87 Aligned_cols=18 Identities=33% Similarity=0.708 Sum_probs=13.3
Q ss_pred ceeecCCCcEEEEEeeee
Q 019504 198 GPLLDSKGNLIGINTAII 215 (340)
Q Consensus 198 GPl~d~~G~VVGi~~~~~ 215 (340)
.|++|.+|++||++.-++
T Consensus 94 ~PV~d~~g~viG~V~VG~ 111 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVGV 111 (116)
T ss_dssp EEEE-TTS-EEEEEEEEE
T ss_pred EeeECCCCcEEEEEEEEE
Confidence 688999999999987654
No 120
>COG4820 EutJ Ethanolamine utilization protein, possible chaperonin [Amino acid transport and metabolism]
Probab=35.48 E-value=1.1e+02 Score=26.47 Aligned_cols=103 Identities=19% Similarity=0.182 Sum_probs=52.4
Q ss_pred ceeecCCCcEEEEEeeeeeCCCCcCceEEEEehHhHHHHHHHHHHcCceeeeeeeEEeccHHHHhh-----------cC-
Q 019504 198 GPLLDSKGNLIGINTAIITQTGTSAGVGFAIPSSTVLKIVPQLIQYGKVVRAGLNVDIAPDLVASQ-----------LN- 265 (340)
Q Consensus 198 GPl~d~~G~VVGi~~~~~~~~~~~~~~~~aip~~~i~~~l~~l~~~~~~~~~~lg~~~~~~~~~~~-----------~~- 265 (340)
.-++|.+|+-+..+.-...--.+.--..|.=.++.++++.+.+.+. ||+++....-+=- .+
T Consensus 43 ~~vlD~d~~Pvag~~~~advVRDGiVvdf~eaveiVrrlkd~lEk~-------lGi~~tha~taiPPGt~~~~~ri~iNV 115 (277)
T COG4820 43 SMVLDRDGQPVAGCLDWADVVRDGIVVDFFEAVEIVRRLKDTLEKQ-------LGIRFTHAATAIPPGTEQGDPRISINV 115 (277)
T ss_pred EEEEcCCCCeEEEEehhhhhhccceEEehhhHHHHHHHHHHHHHHh-------hCeEeeeccccCCCCccCCCceEEEEe
Confidence 3456777776665443211111112345666778888888887653 3333321110000 00
Q ss_pred -CCCCcEEEeeCCCChhhhc--CCCcccc-----CCCC--CCcCCcEEEEEC
Q 019504 266 -VGNGALVLQVPGNSLAAKA--GILPTTR-----GFAG--NIILGDIIVAVN 307 (340)
Q Consensus 266 -~~~g~~V~~v~~~spa~~~--gl~~~~~-----~~~~--~l~~GDvi~~i~ 307 (340)
...|+-|..|..+..|+.. +|+-+-. ++.| .++.|+||...|
T Consensus 116 iESAGlevl~vlDEPTAaa~vL~l~dg~VVDiGGGTTGIsi~kkGkViy~AD 167 (277)
T COG4820 116 IESAGLEVLHVLDEPTAAADVLQLDDGGVVDIGGGTTGISIVKKGKVIYSAD 167 (277)
T ss_pred ecccCceeeeecCCchhHHHHhccCCCcEEEeCCCcceeEEEEcCcEEEecc
Confidence 1257777777655555443 3322211 1222 368999999876
No 121
>COG0260 PepB Leucyl aminopeptidase [Amino acid transport and metabolism]
Probab=33.60 E-value=43 Score=32.96 Aligned_cols=15 Identities=40% Similarity=0.612 Sum_probs=14.1
Q ss_pred cCCcEEEEECCEEcc
Q 019504 298 ILGDIIVAVNNKPVS 312 (340)
Q Consensus 298 ~~GDvi~~i~g~~v~ 312 (340)
+|||||+++||+.|.
T Consensus 316 rPGDVits~~GkTVE 330 (485)
T COG0260 316 RPGDVITSMNGKTVE 330 (485)
T ss_pred CCCCeEEecCCcEEE
Confidence 899999999999885
No 122
>PTZ00138 small nuclear ribonucleoprotein; Provisional
Probab=32.17 E-value=1e+02 Score=22.74 Aligned_cols=33 Identities=30% Similarity=0.409 Sum_probs=26.4
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIE 125 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~ 125 (340)
.+.+.+.++....+.+++.++|...++.|=...
T Consensus 28 ~V~i~l~~~~~r~~~G~L~gfD~~mNlVL~d~~ 60 (89)
T PTZ00138 28 RVQIWLYDHPNLRIEGKILGFDEYMNMVLDDAE 60 (89)
T ss_pred EEEEEEEeCCCcEEEEEEEEEcccceEEEccEE
Confidence 677777776545899999999999998876654
No 123
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=32.00 E-value=30 Score=31.78 Aligned_cols=33 Identities=42% Similarity=0.698 Sum_probs=27.9
Q ss_pred EEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCC
Q 019504 271 LVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFS 314 (340)
Q Consensus 271 ~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~ 314 (340)
-|-+|.+.+||+++|. ..||-|+-+|+-++.-+
T Consensus 66 ~~lrv~~~~~~e~~~~-----------~~~dyilg~n~Dp~~fl 98 (417)
T COG5233 66 EVLRVNPESPAEKAGM-----------VVGDYILGINEDPLRFL 98 (417)
T ss_pred hheeccccChhHhhcc-----------ccceeEEeecCCcHHHH
Confidence 3567789999999998 78999999998887643
No 124
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=30.73 E-value=29 Score=32.89 Aligned_cols=24 Identities=29% Similarity=0.432 Sum_probs=19.0
Q ss_pred CCceEEEEEcC----CC------EEEeCccccCC
Q 019504 56 EGNGSGVVWDG----KG------HIVTNFHVIGS 79 (340)
Q Consensus 56 ~~~GsGfiI~~----~G------~IlT~~Hvv~~ 79 (340)
...|||+|+|- ++ |+.||.||+..
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~~ 68 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVASN 68 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhcc
Confidence 35799999972 22 79999999984
No 125
>KOG1738 consensus Membrane-associated guanylate kinase-interacting protein/connector enhancer of KSR-like [Nucleotide transport and metabolism]
Probab=30.66 E-value=53 Score=33.00 Aligned_cols=63 Identities=16% Similarity=0.095 Sum_probs=41.0
Q ss_pred HHHHHHHcCceeeeeeeEEeccHHHHhhcCCCCCcEEEeeCCCChhhhcCCCccccCCCCCCcCCcEEEEECCEEccCCC
Q 019504 236 IVPQLIQYGKVVRAGLNVDIAPDLVASQLNVGNGALVLQVPGNSLAAKAGILPTTRGFAGNIILGDIIVAVNNKPVSFSC 315 (340)
Q Consensus 236 ~l~~l~~~~~~~~~~lg~~~~~~~~~~~~~~~~g~~V~~v~~~spa~~~gl~~~~~~~~~~l~~GDvi~~i~g~~v~~~~ 315 (340)
.++.......-+.-++|+...+.. .+-.+|.++.++|||.+... |..||-|+.||++.|....
T Consensus 200 ~Le~vqls~~kp~eglg~~I~Ssy-------dg~h~~s~~~e~Spad~~~k----------I~dgdEv~qiN~qtvVgwq 262 (638)
T KOG1738|consen 200 SLERVQLSTLSPSEGLGLYIDSSY-------DGPHVTSKIFEQSPADYRQK----------ILDGDEVLQINEQTVVGWQ 262 (638)
T ss_pred HHHHHHhccCCcccCCceEEeeec-------CCceeccccccCChHHHhhc----------ccCccceeeecccccccch
Confidence 444443333333445555543221 24556778899999998864 5899999999999977443
No 126
>COG2524 Predicted transcriptional regulator, contains C-terminal CBS domains [Transcription]
Probab=30.29 E-value=2.9e+02 Score=24.92 Aligned_cols=20 Identities=35% Similarity=0.713 Sum_probs=17.1
Q ss_pred CCCccceeecCCCcEEEEEee
Q 019504 193 PGNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 193 ~G~SGGPl~d~~G~VVGi~~~ 213 (340)
.|..|.|++|.+ ++||+.+.
T Consensus 201 ~~i~GaPVvd~d-k~vGiit~ 220 (294)
T COG2524 201 KGIRGAPVVDDD-KIVGIITL 220 (294)
T ss_pred cCccCCceecCC-ceEEEEEH
Confidence 799999999855 99999764
No 127
>cd01739 LSm11_C The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm11 is an SmD2 - like subunit which binds U7 snRNA along with LSm10 and five other Sm subunits to form a 7-member ring structure. LSm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=30.17 E-value=94 Score=21.44 Aligned_cols=35 Identities=26% Similarity=0.427 Sum_probs=27.1
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecC
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEAS 127 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~ 127 (340)
.+.++-.+|-.-.+.+.++++|.+.+++|.-++..
T Consensus 12 rV~iR~~~gvrG~~~G~lvAFDK~wNm~L~DV~E~ 46 (66)
T cd01739 12 RVHIRTFKGLRGVCSGFLVAFDKFWNMALVDVDET 46 (66)
T ss_pred EEEEecccCcccEEEEEEEeeeeehhheehhhhhh
Confidence 45555555554478999999999999999988765
No 128
>PRK06437 hypothetical protein; Provisional
Probab=28.98 E-value=1.4e+02 Score=20.54 Aligned_cols=30 Identities=23% Similarity=0.241 Sum_probs=22.0
Q ss_pred cCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 298 ILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 298 ~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
.+..+.+++||+.+. .| ..+ + .+|+|.|++
T Consensus 33 ~~~~vaV~vNg~iv~--~~-----~~L---~-dgD~Veiv~ 62 (67)
T PRK06437 33 DEEEYVVIVNGSPVL--ED-----HNV---K-KEDDVLILE 62 (67)
T ss_pred CCccEEEEECCEECC--Cc-----eEc---C-CCCEEEEEe
Confidence 678999999999997 32 111 2 458888886
No 129
>PF10049 DUF2283: Protein of unknown function (DUF2283); InterPro: IPR019270 Members of this family of hypothetical proteins have no known function.
Probab=28.09 E-value=43 Score=21.58 Aligned_cols=11 Identities=36% Similarity=0.881 Sum_probs=7.9
Q ss_pred cCCCcEEEEEe
Q 019504 202 DSKGNLIGINT 212 (340)
Q Consensus 202 d~~G~VVGi~~ 212 (340)
|.+|++|||-.
T Consensus 36 d~~G~ivGIEI 46 (50)
T PF10049_consen 36 DEDGRIVGIEI 46 (50)
T ss_pred CCCCCEEEEEE
Confidence 45788998843
No 130
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=26.93 E-value=53 Score=24.82 Aligned_cols=21 Identities=38% Similarity=0.550 Sum_probs=16.9
Q ss_pred CCCccceeecCCCcEEEEEee
Q 019504 193 PGNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 193 ~G~SGGPl~d~~G~VVGi~~~ 213 (340)
.+.+.=|++|.+|+++|+++.
T Consensus 97 ~~~~~lpVvd~~~~~vGiit~ 117 (123)
T cd04627 97 EGISSVAVVDNQGNLIGNISV 117 (123)
T ss_pred cCCceEEEECCCCcEEEEEeH
Confidence 345567999989999999875
No 131
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=26.89 E-value=2.2e+02 Score=19.69 Aligned_cols=32 Identities=19% Similarity=0.332 Sum_probs=28.8
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|. .+.+++..+|...++.|-....
T Consensus 12 ~V~VeLk~g~--~~~G~L~~~D~~MNl~L~~~~~ 43 (70)
T cd01721 12 IVTVELKTGE--VYRGKLIEAEDNMNCQLKDVTV 43 (70)
T ss_pred EEEEEECCCc--EEEEEEEEEcCCceeEEEEEEE
Confidence 7899999997 8999999999999999988753
No 132
>PF08605 Rad9_Rad53_bind: Fungal Rad9-like Rad53-binding; InterPro: IPR013914 In Saccharomyces cerevisiae (Baker's yeast), the Rad9 is a key adaptor protein in DNA damage checkpoint pathways. DNA damage induces Rad9 phosphorylation, and Rad53 specifically associates with this region of Rad9, when phosphorylated, via the Rad53 IPR000253 from INTERPRO domain []. There is no clear higher eukaryotic ortholog to Rad9.
Probab=26.57 E-value=1.4e+02 Score=23.81 Aligned_cols=56 Identities=20% Similarity=0.208 Sum_probs=39.1
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEe
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIG 152 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG 152 (340)
.++..+ +- +.|+|+++..+...+-.+++++.....++.-.+ ..-+++.||.|-+-+
T Consensus 15 avW~~~-~~--~yYPa~~~~~~~~~~~~~V~Fedg~~~i~~~dv-~~LDlRIGD~Vkv~~ 70 (131)
T PF08605_consen 15 AVWAGY-NL--KYYPATCVGSGVDRDRSLVRFEDGTYEIKNEDV-KYLDLRIGDTVKVDG 70 (131)
T ss_pred ceeecC-CC--eEeeEEEEeecCCCCeEEEEEecCceEeCcccE-eeeeeecCCEEEECC
Confidence 455544 22 478999999988888899999876433333333 234689999998876
No 133
>cd01733 LSm10 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm10 is an SmD1-like protein which is thought to bind U7 snRNA along with LSm11 and five other Sm subunits to form a 7-member ring structure. LSm10 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing.
Probab=26.57 E-value=2.4e+02 Score=20.07 Aligned_cols=32 Identities=13% Similarity=0.275 Sum_probs=28.5
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|. .+.+++..+|...++-|-....
T Consensus 21 ~V~VeLKng~--~~~G~L~~vD~~MNl~L~~~~~ 52 (78)
T cd01733 21 VVTVELRNET--TVTGRIASVDAFMNIRLAKVTI 52 (78)
T ss_pred EEEEEECCCC--EEEEEEEEEcCCceeEEEEEEE
Confidence 6899999997 8999999999999998887754
No 134
>COG2104 ThiS Sulfur transfer protein involved in thiamine biosynthesis [Coenzyme metabolism]
Probab=25.39 E-value=1.3e+02 Score=20.95 Aligned_cols=34 Identities=21% Similarity=0.166 Sum_probs=22.4
Q ss_pred cCCcEEEEECCEEccCCCCCCCceeEEEeeCCCCceEEEEe
Q 019504 298 ILGDIIVAVNNKPVSFSCLSIPSRIYLICAEPNQDHLTCLK 338 (340)
Q Consensus 298 ~~GDvi~~i~g~~v~~~~d~~~~~~~~~~~~~~~~~~~~~~ 338 (340)
.+-=+++++||+.|.... +..... ..+|.++|+|
T Consensus 30 ~~~~vav~vNg~iVpr~~----~~~~~l---~~gD~ievv~ 63 (68)
T COG2104 30 NPEGVAVAVNGEIVPRSQ----WADTIL---KEGDRIEVVR 63 (68)
T ss_pred CCceEEEEECCEEccchh----hhhccc---cCCCEEEEEE
Confidence 667789999999997543 222222 2348888876
No 135
>cd04603 CBS_pair_KefB_assoc This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with the KefB (Kef-type K+ transport systems) domain which is involved in inorganic ion transport and metabolism. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=25.22 E-value=60 Score=24.06 Aligned_cols=21 Identities=19% Similarity=0.286 Sum_probs=16.4
Q ss_pred CCCccceeecCCCcEEEEEee
Q 019504 193 PGNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 193 ~G~SGGPl~d~~G~VVGi~~~ 213 (340)
.+.+--|++|.+|+++|+++.
T Consensus 85 ~~~~~lpVvd~~~~~~Giit~ 105 (111)
T cd04603 85 TEPPVVAVVDKEGKLVGTIYE 105 (111)
T ss_pred cCCCeEEEEcCCCeEEEEEEh
Confidence 344456999988999999874
No 136
>cd04620 CBS_pair_7 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic genera
Probab=24.72 E-value=62 Score=23.93 Aligned_cols=20 Identities=45% Similarity=0.629 Sum_probs=16.3
Q ss_pred CCccceeecCCCcEEEEEee
Q 019504 194 GNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 194 G~SGGPl~d~~G~VVGi~~~ 213 (340)
+...-|++|.+|+++|+.+.
T Consensus 90 ~~~~~pVvd~~~~~~Gvit~ 109 (115)
T cd04620 90 QIRHLPVLDDQGQLIGLVTA 109 (115)
T ss_pred CCceEEEEcCCCCEEEEEEh
Confidence 44567999988999999875
No 137
>TIGR00739 yajC preprotein translocase, YajC subunit. While this protein is part of the preprotein translocase in Escherichia coli, it is not essential for viability or protein secretion. The N-terminus region contains a predicted membrane-spanning region followed by a region consisting almost entirely of residues with charged (acidic, basic, or zwitterionic) side chains. This small protein is about 100 residues in length, and is restricted to bacteria; however, this protein is absent from some lineages, including spirochetes and Mycoplasmas.
Probab=24.34 E-value=1.3e+02 Score=21.91 Aligned_cols=40 Identities=15% Similarity=0.196 Sum_probs=25.2
Q ss_pred CCcCCcEEEEECCEE--ccCCCCCCCceeEEEeeCCCCceEEEEeCC
Q 019504 296 NIILGDIIVAVNNKP--VSFSCLSIPSRIYLICAEPNQDHLTCLKSS 340 (340)
Q Consensus 296 ~l~~GDvi~~i~g~~--v~~~~d~~~~~~~~~~~~~~~~~~~~~~~~ 340 (340)
.|++||-|+-..|-- |.+.+| ........ .+..+++.|++
T Consensus 37 ~L~~Gd~VvT~gGi~G~V~~i~d----~~v~vei~-~g~~i~~~r~a 78 (84)
T TIGR00739 37 SLKKGDKVLTIGGIIGTVTKIAE----NTIVIELN-DNTEITFSKNA 78 (84)
T ss_pred hCCCCCEEEECCCeEEEEEEEeC----CEEEEEEC-CCeEEEEEhHH
Confidence 469999999998853 344544 22232333 34889988864
No 138
>cd01724 Sm_D1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D1 heterodimerizes with subunit D2 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing DB, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=22.59 E-value=3.2e+02 Score=20.07 Aligned_cols=60 Identities=15% Similarity=0.200 Sum_probs=39.8
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEEecCCC
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAIGNPFG 156 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~iG~p~g 156 (340)
.+.|.+.+|. .+.+++..+|...++.|-.+......-.+..++ .-.-.|..|..+-.|..
T Consensus 13 ~V~VeLKng~--~~~G~L~~vD~~MNl~L~~a~~~~~~~~~~~~~--~v~IRG~nI~yi~lPd~ 72 (90)
T cd01724 13 TVTIELKNGT--IVHGTITGVDPSMNTHLKNVKLTLKGRNPVPLD--TLSIRGNNIRYFILPDS 72 (90)
T ss_pred EEEEEECCCC--EEEEEEEEEcCceeEEEEEEEEEcCCCceeEcc--eEEEeCCEEEEEEcCCc
Confidence 7899999997 899999999999999998875432111223332 11234666666555543
No 139
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=21.84 E-value=41 Score=32.52 Aligned_cols=21 Identities=29% Similarity=0.363 Sum_probs=17.6
Q ss_pred EEEeeCCCChhhhcCCCcccc
Q 019504 271 LVLQVPGNSLAAKAGILPTTR 291 (340)
Q Consensus 271 ~V~~v~~~spa~~~gl~~~~~ 291 (340)
+|..|.++|||+++||+++..
T Consensus 1 ~I~~V~pgSpAe~AGLe~GD~ 21 (433)
T TIGR03279 1 LISAVLPGSIAEELGFEPGDA 21 (433)
T ss_pred CcCCcCCCCHHHHcCCCCCCE
Confidence 467899999999999977655
No 140
>cd04597 CBS_pair_DRTGG_assoc2 This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains associated with a DRTGG domain upstream. The function of the DRTGG domain, named after its conserved residues, is unknown. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually come in tandem repeats, which associate to form a so-called Bateman domain or a CBS pair which is reflected in this model. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown.
Probab=21.58 E-value=91 Score=23.44 Aligned_cols=21 Identities=29% Similarity=0.332 Sum_probs=17.4
Q ss_pred CCCccceeecCCCcEEEEEee
Q 019504 193 PGNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 193 ~G~SGGPl~d~~G~VVGi~~~ 213 (340)
.+...-|++|.+|+++|+++.
T Consensus 87 ~~~~~lpVvd~~~~l~Givt~ 107 (113)
T cd04597 87 HNIRTLPVVDDDGTPAGIITL 107 (113)
T ss_pred cCCCEEEEECCCCeEEEEEEH
Confidence 456678999999999999764
No 141
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=21.04 E-value=3.1e+02 Score=19.28 Aligned_cols=32 Identities=16% Similarity=0.332 Sum_probs=28.5
Q ss_pred EEEEEecCCceeEEEEEEEEeCCCCcEEEEEEec
Q 019504 93 RVNILASDGVQKNFEGKLVGADRAKDLAVLKIEA 126 (340)
Q Consensus 93 ~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~ 126 (340)
.+.|.+.+|. .+.+++..+|...++.+-.+..
T Consensus 13 ~V~VeLkng~--~~~G~L~~~D~~mNi~L~~~~~ 44 (76)
T cd01723 13 PMLVELKNGE--TYNGHLVNCDNWMNIHLREVIC 44 (76)
T ss_pred EEEEEECCCC--EEEEEEEEEcCCCceEEEeEEE
Confidence 7899999997 8999999999999999987643
No 142
>PRK10413 hydrogenase 2 accessory protein HypG; Provisional
Probab=20.27 E-value=2.1e+02 Score=20.68 Aligned_cols=48 Identities=15% Similarity=0.139 Sum_probs=28.1
Q ss_pred EEEEEEEeCCCC-cEEEEEEecCCCCccceeecCC-CCCCCCCEEEEE-ec
Q 019504 106 FEGKLVGADRAK-DLAVLKIEASEDLLKPINVGQS-SFLKVGQQCLAI-GN 153 (340)
Q Consensus 106 ~~a~v~~~d~~~-DlAlL~v~~~~~~~~~~~l~~~-~~~~~G~~v~~i-G~ 153 (340)
++++++..+... .+|++.+......+...-+++. ..+++||+|++- ||
T Consensus 5 iP~kVi~i~~~~~~~A~vd~~Gv~r~V~l~Lv~~~~~~~~vGDyVLVHaGf 55 (82)
T PRK10413 5 VPGQVLAVGEDIHQLAQVEVCGIKRDVNIALICEGNPADLLGQWVLVHVGF 55 (82)
T ss_pred cceEEEEECCCCCcEEEEEcCCeEEEEEeeeeccCCcccccCCEEEEecch
Confidence 577788777653 6788777654322221122221 246899999884 54
No 143
>PF05578 Peptidase_S31: Pestivirus NS3 polyprotein peptidase S31; InterPro: IPR000280 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S31 (clan PA(S)). The type example is pestivirus NS3 polyprotein peptidase from bovine viral diarrhea virus, which is Type 1 pestivirus. The pestiviruses are single-stranded RNA viruses whose genomes encode one large polyprotein []. The p80 endopeptidase resides towards the middle of the polyprotein and is responsible for processing all non-structural pestivirus proteins [, ]. The p80 enzyme is similar to other proteases in the PA(S) clan and is predicted to have a fold similar to that of chymotrypsin [, ]. An HDS catalytic triad has been identified [].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis
Probab=20.27 E-value=2.9e+02 Score=22.63 Aligned_cols=127 Identities=20% Similarity=0.263 Sum_probs=64.1
Q ss_pred CCceEEEEEcCCCEEEeCccccCCCCCCCCCCCccEEEEEEEecCCceeEEEEEEEEeCCCCcEEEEEEecCCCCcccee
Q 019504 56 EGNGSGVVWDGKGHIVTNFHVIGSALSRKPAEGQVVARVNILASDGVQKNFEGKLVGADRAKDLAVLKIEASEDLLKPIN 135 (340)
Q Consensus 56 ~~~GsGfiI~~~G~IlT~~Hvv~~~~~~~~~~~~~~~~i~v~~~~g~~~~~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~ 135 (340)
.+.-+|+.+..+|-|-.--||-.+. ++.|.-+-|+ .+++..+... +....+ + -+.
T Consensus 50 rgletgwaythqggissvdhvt~gk------------d~lvcdsmgr-----trvvcqsnnk------~tde~e-y-gvk 104 (211)
T PF05578_consen 50 RGLETGWAYTHQGGISSVDHVTAGK------------DLLVCDSMGR-----TRVVCQSNNK------MTDETE-Y-GVK 104 (211)
T ss_pred hcccccceeeccCCcccceeeecCC------------ceEEecCCCc-----eEEEEccCCc------ccchhh-c-ccc
Confidence 3456788887777777777887664 3444444443 1233322110 000000 0 011
Q ss_pred ecCCCCCCCCCEEEEEecCCCCCCceeEEEEeeeccccc-----cCCCceecceEEEeeccCCCCccceeecC-CCcEEE
Q 019504 136 VGQSSFLKVGQQCLAIGNPFGFDHTLTVGVISGLNRDIF-----SQAGVTIGGGIQTDAAINPGNSGGPLLDS-KGNLIG 209 (340)
Q Consensus 136 l~~~~~~~~G~~v~~iG~p~g~~~~~~~G~vs~~~~~~~-----~~~~~~~~~~i~~d~~i~~G~SGGPl~d~-~G~VVG 209 (340)
- ++ ....|..+|++ +|.....+-+.|.+-.+...-- ...+. --.+|..-..|.||=|+|.. .|++||
T Consensus 105 t-ds-gcp~garcyv~-npea~nisgtkga~vhlqk~ggef~cvta~gt----paf~~~knlkg~s~~pifeassgr~vg 177 (211)
T PF05578_consen 105 T-DS-GCPDGARCYVL-NPEATNISGTKGAMVHLQKTGGEFTCVTASGT----PAFFDLKNLKGWSGLPIFEASSGRVVG 177 (211)
T ss_pred c-CC-CCCCCcEEEEe-CCcccccccCcceEEEEeccCCceEEEeccCC----cceeeccccCCCCCCceeeccCCcEEE
Confidence 1 22 23567788877 5654444444443332221100 00010 02344444579999999976 799999
Q ss_pred EEeee
Q 019504 210 INTAI 214 (340)
Q Consensus 210 i~~~~ 214 (340)
=.-.+
T Consensus 178 r~k~g 182 (211)
T PF05578_consen 178 RVKVG 182 (211)
T ss_pred EEEec
Confidence 76544
No 144
>cd04592 CBS_pair_EriC_assoc_euk This cd contains two tandem repeats of the cystathionine beta-synthase (CBS pair) domains in the EriC CIC-type chloride channels in eukaryotes. These ion channels are proteins with a seemingly simple task of allowing the passive flow of chloride ions across biological membranes. CIC-type chloride channels come from all kingdoms of life, have several gene families, and can be gated by voltage. The members of the CIC-type chloride channel are double-barreled: two proteins forming homodimers at a broad interface formed by four helices from each protein. The two pores are not found at this interface, but are completely contained within each subunit, as deduced from the mutational analyses, unlike many other channels, in which four or five identical or structurally related subunits jointly form one pore. CBS is a small domain originally identified in cystathionine beta-synthase and subsequently found in a wide range of different proteins. CBS domains usually
Probab=20.21 E-value=99 Score=24.23 Aligned_cols=20 Identities=35% Similarity=0.209 Sum_probs=16.9
Q ss_pred CCccceeecCCCcEEEEEee
Q 019504 194 GNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 194 G~SGGPl~d~~G~VVGi~~~ 213 (340)
+.++-|++|.+|+++|+++.
T Consensus 23 ~~~~~~VvD~~g~l~Givt~ 42 (133)
T cd04592 23 KQSCVLVVDSDDFLEGILTL 42 (133)
T ss_pred CCCEEEEECCCCeEEEEEEH
Confidence 45678999999999999874
No 145
>TIGR00074 hypC_hupF hydrogenase assembly chaperone HypC/HupF. An additional proposed function is to shuttle the iron atom that has been liganded at the HypC/HypD complex to the precursor of the large hydrogenase (HycE) subunit. PubMed:12441107.
Probab=20.03 E-value=2.5e+02 Score=19.97 Aligned_cols=41 Identities=20% Similarity=0.370 Sum_probs=26.1
Q ss_pred EEEEEEEeCCCCcEEEEEEecCCCCccceeecCCCCCCCCCEEEEE
Q 019504 106 FEGKLVGADRAKDLAVLKIEASEDLLKPINVGQSSFLKVGQQCLAI 151 (340)
Q Consensus 106 ~~a~v~~~d~~~DlAlL~v~~~~~~~~~~~l~~~~~~~~G~~v~~i 151 (340)
++++++..+. +.|++.+.... ..+.+.--.++++||+|++-
T Consensus 5 iP~~V~~i~~--~~A~v~~~G~~---~~v~l~lv~~~~vGD~VLVH 45 (76)
T TIGR00074 5 IPGQVVEIDE--NIALVEFCGIK---RDVSLDLVGEVKVGDYVLVH 45 (76)
T ss_pred cceEEEEEcC--CEEEEEcCCeE---EEEEEEeeCCCCCCCEEEEe
Confidence 5677777765 46888776432 23333222467899999874
No 146
>smart00116 CBS Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
Probab=20.02 E-value=97 Score=17.93 Aligned_cols=20 Identities=40% Similarity=0.664 Sum_probs=15.1
Q ss_pred CCccceeecCCCcEEEEEee
Q 019504 194 GNSGGPLLDSKGNLIGINTA 213 (340)
Q Consensus 194 G~SGGPl~d~~G~VVGi~~~ 213 (340)
+.+.-|+++.+++++|+.+.
T Consensus 22 ~~~~~~v~~~~~~~~g~i~~ 41 (49)
T smart00116 22 GIRRLPVVDEEGRLVGIVTR 41 (49)
T ss_pred CCCcccEECCCCeEEEEEEH
Confidence 44566888888999998763
Done!