Query 013014
Match_columns 451
No_of_seqs 412 out of 3201
Neff 7.7
Searched_HMMs 46136
Date Fri Mar 29 08:39:40 2013
Command hhsearch -i /work/01045/syshi/csienesis_hhblits_a3m/013014.a3m -d /work/01045/syshi/HHdatabase/Cdd.hhm -o /work/01045/syshi/hhsearch_cdd/013014hhsearch_cdd -cpu 12 -v 0
No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 PRK10139 serine endoprotease; 100.0 4.7E-53 1E-57 439.3 39.3 300 117-431 41-360 (455)
2 PRK10898 serine endoprotease; 100.0 2.5E-51 5.5E-56 414.4 39.3 300 113-431 42-349 (353)
3 TIGR02038 protease_degS peripl 100.0 2.3E-51 5E-56 414.8 39.0 300 112-430 41-347 (351)
4 PRK10942 serine endoprotease; 100.0 1.1E-50 2.3E-55 423.6 37.7 301 117-432 39-382 (473)
5 TIGR02037 degP_htrA_DO peripla 100.0 4.5E-49 9.8E-54 409.2 36.1 300 118-432 3-328 (428)
6 COG0265 DegQ Trypsin-like seri 100.0 7.8E-40 1.7E-44 331.3 31.0 302 116-431 33-340 (347)
7 KOG1320 Serine protease [Postt 100.0 5.8E-28 1.3E-32 245.8 19.6 303 116-429 128-466 (473)
8 KOG1421 Predicted signaling-as 99.9 3.4E-24 7.4E-29 220.1 20.0 315 117-441 53-381 (955)
9 KOG1421 Predicted signaling-as 99.7 2.2E-16 4.7E-21 163.1 17.8 304 123-442 525-840 (955)
10 PF13365 Trypsin_2: Trypsin-li 99.7 3.8E-16 8.3E-21 132.7 12.5 109 154-293 1-120 (120)
11 PF13180 PDZ_2: PDZ domain; PD 99.6 1.5E-14 3.3E-19 115.7 9.7 82 332-429 1-82 (82)
12 PF00089 Trypsin: Trypsin; In 99.5 3.6E-12 7.8E-17 119.1 19.4 168 151-320 24-220 (220)
13 KOG1320 Serine protease [Postt 99.5 1.3E-13 2.8E-18 141.3 8.5 276 122-420 56-351 (473)
14 cd00190 Tryp_SPc Trypsin-like 99.3 1.4E-10 3.1E-15 109.1 18.7 169 151-321 24-230 (232)
15 cd00991 PDZ_archaeal_metallopr 99.3 4E-11 8.6E-16 95.2 9.7 68 350-428 10-77 (79)
16 cd00987 PDZ_serine_protease PD 99.2 5E-11 1.1E-15 96.4 9.4 84 332-426 1-89 (90)
17 smart00020 Tryp_SPc Trypsin-li 99.2 8.4E-10 1.8E-14 104.1 18.5 165 151-317 25-226 (229)
18 TIGR01713 typeII_sec_gspC gene 99.2 4.8E-11 1E-15 115.6 9.8 101 314-429 159-259 (259)
19 cd00986 PDZ_LON_protease PDZ d 99.2 1.7E-10 3.6E-15 91.5 10.2 70 350-431 8-77 (79)
20 cd00990 PDZ_glycyl_aminopeptid 99.2 1.5E-10 3.2E-15 91.8 9.0 78 332-430 1-78 (80)
21 cd00989 PDZ_metalloprotease PD 99.1 8.6E-10 1.9E-14 87.0 9.2 67 350-428 12-78 (79)
22 cd00988 PDZ_CTP_protease PDZ d 99.0 2.8E-09 6.1E-14 85.3 9.0 78 333-429 3-83 (85)
23 COG3591 V8-like Glu-specific e 99.0 2.1E-08 4.5E-13 95.7 16.3 158 153-324 65-250 (251)
24 TIGR02037 degP_htrA_DO peripla 98.8 1.2E-08 2.7E-13 106.4 9.8 85 331-426 337-427 (428)
25 cd00136 PDZ PDZ domain, also c 98.8 2.3E-08 5.1E-13 76.8 6.8 67 333-417 2-70 (70)
26 PRK10779 zinc metallopeptidase 98.7 1.5E-08 3.2E-13 106.3 7.4 67 353-430 129-195 (449)
27 TIGR00054 RIP metalloprotease 98.6 1E-07 2.2E-12 99.1 9.3 69 350-430 203-271 (420)
28 PRK10779 zinc metallopeptidase 98.6 1.6E-07 3.5E-12 98.6 10.6 70 351-432 222-291 (449)
29 TIGR02860 spore_IV_B stage IV 98.5 4.7E-07 1E-11 92.1 10.0 69 350-430 105-181 (402)
30 TIGR00225 prc C-terminal pepti 98.5 3.5E-07 7.6E-12 92.4 9.2 82 332-432 51-134 (334)
31 smart00228 PDZ Domain present 98.5 3.4E-07 7.4E-12 72.6 6.7 74 332-420 12-85 (85)
32 PRK10139 serine endoprotease; 98.4 5.4E-07 1.2E-11 94.6 9.0 65 350-427 390-454 (455)
33 cd00992 PDZ_signaling PDZ doma 98.3 1.9E-06 4.1E-11 68.1 7.3 69 331-416 11-81 (82)
34 PF00863 Peptidase_C4: Peptida 98.3 6.5E-05 1.4E-09 71.3 18.8 162 124-313 15-184 (235)
35 PRK10942 serine endoprotease; 98.3 1.7E-06 3.7E-11 91.2 9.0 65 350-427 408-472 (473)
36 PLN00049 carboxyl-terminal pro 98.3 2.5E-06 5.3E-11 88.0 9.9 69 350-430 102-172 (389)
37 PF00595 PDZ: PDZ domain (Also 98.3 8.5E-07 1.8E-11 70.4 4.4 71 331-417 9-81 (81)
38 PF14685 Tricorn_PDZ: Tricorn 98.3 4.7E-06 1E-10 67.2 8.4 78 333-427 2-88 (88)
39 TIGR03279 cyano_FeS_chp putati 98.2 2.6E-06 5.6E-11 87.4 7.4 63 353-430 1-64 (433)
40 TIGR00054 RIP metalloprotease 98.2 2E-06 4.4E-11 89.5 6.4 67 350-429 128-194 (420)
41 KOG3627 Trypsin [Amino acid tr 98.2 6.4E-05 1.4E-09 72.5 16.5 168 153-322 39-252 (256)
42 COG0793 Prc Periplasmic protea 98.2 7.7E-06 1.7E-10 84.5 9.4 81 330-428 98-182 (406)
43 COG3480 SdrC Predicted secrete 98.2 9.8E-06 2.1E-10 78.8 9.4 72 350-433 130-202 (342)
44 PRK09681 putative type II secr 98.0 1.9E-05 4.2E-10 76.6 8.2 58 361-429 218-275 (276)
45 PF04495 GRASP55_65: GRASP55/6 97.8 4.1E-05 9E-10 67.2 6.8 87 332-430 26-114 (138)
46 KOG3129 26S proteasome regulat 97.8 5.1E-05 1.1E-09 69.6 7.1 74 351-435 140-215 (231)
47 PRK11186 carboxy-terminal prot 97.7 0.00011 2.4E-09 80.1 8.7 79 332-429 244-333 (667)
48 PF05579 Peptidase_S32: Equine 97.7 0.00038 8.2E-09 66.4 10.6 116 151-297 111-228 (297)
49 COG3975 Predicted protease wit 97.7 5.7E-05 1.2E-09 78.2 5.4 63 350-431 462-524 (558)
50 COG5640 Secreted trypsin-like 97.4 0.008 1.7E-07 59.8 15.8 55 272-326 223-280 (413)
51 COG3031 PulC Type II secretory 97.1 0.00089 1.9E-08 62.8 6.2 59 359-428 216-274 (275)
52 PF12812 PDZ_1: PDZ-like domai 97.0 0.0018 4E-08 51.1 5.9 64 332-406 9-75 (78)
53 KOG3553 Tax interaction protei 96.9 0.00082 1.8E-08 54.5 3.0 32 350-392 59-90 (124)
54 PF05580 Peptidase_S55: SpoIVB 96.7 0.021 4.6E-07 53.3 11.4 166 145-315 13-214 (218)
55 PF03761 DUF316: Domain of unk 96.6 0.077 1.7E-06 52.1 15.4 91 197-297 159-254 (282)
56 PF00548 Peptidase_C3: 3C cyst 96.4 0.077 1.7E-06 48.4 12.8 137 150-297 23-170 (172)
57 KOG3580 Tight junction protein 95.6 0.013 2.8E-07 61.6 4.5 58 350-418 429-488 (1027)
58 KOG3580 Tight junction protein 95.6 0.02 4.3E-07 60.2 5.6 84 334-429 200-288 (1027)
59 PF00949 Peptidase_S7: Peptida 95.6 0.021 4.5E-07 49.6 4.8 34 267-300 87-120 (132)
60 PF08192 Peptidase_S64: Peptid 95.4 0.095 2.1E-06 56.3 9.9 117 198-323 542-688 (695)
61 PF10459 Peptidase_S46: Peptid 95.2 0.015 3.2E-07 64.1 3.6 28 268-295 624-651 (698)
62 KOG3209 WW domain-containing p 95.1 0.025 5.4E-07 60.7 4.5 54 354-419 782-837 (984)
63 TIGR02860 spore_IV_B stage IV 94.1 0.46 9.9E-06 49.0 10.9 39 271-313 354-392 (402)
64 PF02122 Peptidase_S39: Peptid 93.5 0.28 6.1E-06 45.9 7.5 129 152-297 30-166 (203)
65 KOG3834 Golgi reassembly stack 93.4 0.16 3.5E-06 51.8 6.1 70 349-429 14-85 (462)
66 COG0750 Predicted membrane-ass 93.4 0.26 5.7E-06 50.3 7.9 56 355-422 134-193 (375)
67 KOG3532 Predicted protein kina 93.3 0.14 3.1E-06 54.8 5.6 46 350-406 398-443 (1051)
68 KOG3542 cAMP-regulated guanine 92.7 0.085 1.8E-06 56.4 3.0 57 350-418 562-618 (1283)
69 PF10459 Peptidase_S46: Peptid 91.7 0.099 2.1E-06 57.7 2.2 22 152-173 47-68 (698)
70 KOG3549 Syntrophins (type gamm 91.2 0.33 7.1E-06 48.2 4.9 55 351-417 81-137 (505)
71 PF09342 DUF1986: Domain of un 91.0 5.9 0.00013 38.0 12.8 88 149-237 25-131 (267)
72 KOG3606 Cell polarity protein 90.8 0.65 1.4E-05 44.7 6.4 57 349-417 193-251 (358)
73 PF00944 Peptidase_S3: Alphavi 90.3 0.49 1.1E-05 40.9 4.6 31 269-299 98-128 (158)
74 KOG3209 WW domain-containing p 90.1 0.59 1.3E-05 50.6 6.0 58 350-420 923-982 (984)
75 KOG3550 Receptor targeting pro 90.1 0.6 1.3E-05 41.0 5.1 55 350-417 115-172 (207)
76 KOG1892 Actin filament-binding 90.0 0.39 8.5E-06 53.4 4.7 61 350-420 960-1020(1629)
77 KOG3605 Beta amyloid precursor 89.6 0.38 8.3E-06 51.5 4.1 116 272-410 675-806 (829)
78 PF00947 Pico_P2A: Picornaviru 87.9 8.1 0.00018 33.2 10.3 32 265-297 78-109 (127)
79 KOG3552 FERM domain protein FR 87.8 0.65 1.4E-05 51.7 4.5 56 350-419 75-132 (1298)
80 KOG0609 Calcium/calmodulin-dep 87.5 0.82 1.8E-05 48.1 5.0 68 333-418 135-204 (542)
81 KOG3834 Golgi reassembly stack 87.5 0.89 1.9E-05 46.6 5.0 65 354-429 113-179 (462)
82 KOG3571 Dishevelled 3 and rela 87.2 0.82 1.8E-05 47.7 4.6 74 332-418 261-338 (626)
83 PF02907 Peptidase_S29: Hepati 87.1 0.42 9.2E-06 41.3 2.2 131 153-314 13-144 (148)
84 KOG3551 Syntrophins (type beta 86.3 0.7 1.5E-05 46.6 3.5 70 332-417 96-167 (506)
85 PF02395 Peptidase_S6: Immunog 85.1 3.6 7.9E-05 46.1 8.7 64 152-218 65-130 (769)
86 KOG3651 Protein kinase C, alph 83.3 2.2 4.8E-05 41.8 5.4 56 350-417 30-87 (429)
87 KOG2921 Intramembrane metallop 81.8 1.9 4.2E-05 43.8 4.4 45 350-405 220-265 (484)
88 PF03510 Peptidase_C24: 2C end 80.4 7.1 0.00015 32.5 6.6 54 155-220 2-55 (105)
89 KOG0606 Microtubule-associated 79.6 2.5 5.4E-05 48.4 4.8 50 353-415 661-712 (1205)
90 KOG3605 Beta amyloid precursor 79.4 3 6.5E-05 45.0 5.1 68 351-428 674-743 (829)
91 PF01732 DUF31: Putative pepti 73.1 2.6 5.5E-05 43.3 2.6 25 271-295 349-373 (374)
92 PF05416 Peptidase_C37: Southa 63.5 22 0.00047 36.8 6.8 135 151-298 378-527 (535)
93 PF11874 DUF3394: Domain of un 54.7 61 0.0013 29.8 7.6 61 302-388 89-149 (183)
94 cd00600 Sm_like The eukaryotic 51.0 42 0.00091 24.5 5.1 32 176-207 7-38 (63)
95 cd01720 Sm_D2 The eukaryotic S 48.1 34 0.00073 27.5 4.3 37 171-207 10-46 (87)
96 cd01735 LSm12_N LSm12 belongs 46.6 68 0.0015 24.0 5.4 34 175-208 6-39 (61)
97 cd01731 archaeal_Sm1 The archa 44.2 55 0.0012 24.6 4.9 33 176-208 11-43 (68)
98 PRK00737 small nuclear ribonuc 43.8 56 0.0012 25.0 4.8 32 176-207 15-46 (72)
99 PF00571 CBS: CBS domain CBS d 43.4 20 0.00044 25.2 2.2 21 276-296 28-48 (57)
100 cd01722 Sm_F The eukaryotic Sm 43.2 48 0.001 25.1 4.4 32 176-207 12-43 (68)
101 cd01726 LSm6 The eukaryotic Sm 42.8 55 0.0012 24.6 4.6 32 176-207 11-42 (67)
102 cd01730 LSm3 The eukaryotic Sm 39.4 52 0.0011 25.9 4.2 31 176-206 12-42 (82)
103 cd06168 LSm9 The eukaryotic Sm 39.4 73 0.0016 24.8 4.9 32 176-207 11-42 (75)
104 cd01717 Sm_B The eukaryotic Sm 38.9 66 0.0014 25.1 4.7 32 176-207 11-42 (79)
105 cd01729 LSm7 The eukaryotic Sm 38.3 70 0.0015 25.2 4.7 31 176-206 13-43 (81)
106 cd01732 LSm5 The eukaryotic Sm 38.1 64 0.0014 25.1 4.4 31 176-206 14-44 (76)
107 cd01719 Sm_G The eukaryotic Sm 36.7 85 0.0018 24.1 4.9 32 176-207 11-42 (72)
108 KOG3938 RGS-GAIP interacting p 36.4 32 0.00069 33.4 2.9 40 378-417 167-208 (334)
109 smart00651 Sm snRNP Sm protein 35.4 91 0.002 23.0 4.8 32 176-207 9-40 (67)
110 cd01728 LSm1 The eukaryotic Sm 35.0 86 0.0019 24.3 4.6 31 176-206 13-43 (74)
111 TIGR03000 plancto_dom_1 Planct 34.3 49 0.0011 25.8 3.1 48 382-429 11-63 (75)
112 PF01423 LSM: LSM domain ; In 33.4 67 0.0015 23.8 3.8 33 176-208 9-41 (67)
113 cd01721 Sm_D3 The eukaryotic S 32.9 1E+02 0.0022 23.4 4.8 32 176-207 11-42 (70)
114 COG1958 LSM1 Small nuclear rib 30.5 95 0.0021 24.1 4.3 33 176-208 18-50 (79)
115 cd01727 LSm8 The eukaryotic Sm 30.3 1.1E+02 0.0023 23.6 4.5 32 176-207 10-41 (74)
116 COG0298 HypC Hydrogenase matur 29.4 87 0.0019 24.7 3.8 47 188-236 5-52 (82)
117 COG5233 GRH1 Peripheral Golgi 29.0 31 0.00067 34.4 1.5 30 354-394 67-96 (417)
118 PF09465 LBR_tudor: Lamin-B re 28.5 2.3E+02 0.005 20.8 5.5 35 174-208 8-43 (55)
119 COG4956 Integral membrane prot 28.4 51 0.0011 32.9 2.9 42 385-426 269-311 (356)
120 PF09122 DUF1930: Domain of un 26.5 2E+02 0.0044 21.6 5.0 45 382-427 19-64 (68)
121 PF14827 Cache_3: Sensory doma 25.9 59 0.0013 27.1 2.6 17 281-297 94-110 (116)
122 PF02601 Exonuc_VII_L: Exonucl 25.2 81 0.0018 31.4 3.9 35 152-186 280-314 (319)
123 COG0061 nadF NAD kinase [Coenz 24.7 30 0.00065 34.0 0.6 32 2-33 179-210 (281)
124 cd01723 LSm4 The eukaryotic Sm 24.3 1.9E+02 0.0041 22.3 5.0 32 176-207 12-43 (76)
125 PF01455 HupF_HypC: HupF/HypC 24.3 2.2E+02 0.0048 21.6 5.2 43 188-233 5-47 (68)
126 PF08669 GCV_T_C: Glycine clea 23.8 1E+02 0.0022 24.5 3.5 20 278-297 34-53 (95)
127 PF12381 Peptidase_C3G: Tungro 23.4 60 0.0013 30.6 2.3 45 265-313 168-216 (231)
128 PRK14420 acylphosphatase; Prov 23.0 1.1E+02 0.0023 24.5 3.5 47 336-406 15-63 (91)
129 PF02743 Cache_1: Cache domain 22.3 61 0.0013 24.8 1.9 30 281-323 19-48 (81)
130 PRK14440 acylphosphatase; Prov 22.3 1.2E+02 0.0027 24.3 3.7 42 341-406 23-64 (90)
131 PF14275 DUF4362: Domain of un 21.9 2.5E+02 0.0055 23.1 5.4 47 381-429 2-62 (98)
132 cd04627 CBS_pair_14 The CBS do 21.4 67 0.0014 26.3 2.0 20 277-296 98-117 (123)
133 PF11325 DUF3127: Domain of un 20.6 2.6E+02 0.0057 22.3 5.1 64 350-424 3-71 (84)
134 PF11948 DUF3465: Protein of u 20.6 5.7E+02 0.012 22.2 9.5 12 223-234 85-96 (131)
135 PF01732 DUF31: Putative pepti 20.5 63 0.0014 33.1 2.0 23 151-173 35-67 (374)
No 1
>PRK10139 serine endoprotease; Provisional
Probab=100.00 E-value=4.7e-53 Score=439.31 Aligned_cols=300 Identities=38% Similarity=0.600 Sum_probs=262.2
Q ss_pred hHHHHHHHhCCceEEEEeeeeccC------cc---c-c---cc-ccccCeEEEEEEEcC-CCeEEecccccCCCCcEEEE
Q 013014 117 ATVRLFQENTPSVVNITNLAARQD------AF---T-L---DV-LEVPQGSGSGFVWDS-KGHVVTNYHVIRGASDIRVT 181 (451)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~------~~---~-~---~~-~~~~~~~GSGfiI~~-~G~ILT~aHVv~~~~~i~V~ 181 (451)
++.++++++.||||.|.+...... .| . . +. .....+.||||||++ +||||||+|||+++..+.|+
T Consensus 41 ~~~~~~~~~~pavV~i~~~~~~~~~~~~~~~~~~~f~~~~~~~~~~~~~~~GSG~ii~~~~g~IlTn~HVv~~a~~i~V~ 120 (455)
T PRK10139 41 SLAPMLEKVLPAVVSVRVEGTASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIIDAAKGYVLTNNHVINQAQKISIQ 120 (455)
T ss_pred cHHHHHHHhCCcEEEEEEEEeecccccCchhHHHhccccCCccccccccceEEEEEEECCCCEEEeChHHhCCCCEEEEE
Confidence 588999999999999987643211 11 1 0 00 112347899999985 79999999999999999999
Q ss_pred eCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCC
Q 013014 182 FADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG 261 (451)
Q Consensus 182 ~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~ 261 (451)
+.||+.++|++++.|+.+||||||++.+ ..+++++|+++..+++||+|+++|||++...+++.|+|++..+.....
T Consensus 121 ~~dg~~~~a~vvg~D~~~DlAvlkv~~~-~~l~~~~lg~s~~~~~G~~V~aiG~P~g~~~tvt~GivS~~~r~~~~~--- 196 (455)
T PRK10139 121 LNDGREFDAKLIGSDDQSDIALLQIQNP-SKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIISALGRSGLNL--- 196 (455)
T ss_pred ECCCCEEEEEEEEEcCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEecCCCCCCceEEEEEccccccccCC---
Confidence 9999999999999999999999999854 468999999999999999999999999999999999999987752211
Q ss_pred CCcccEEEEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccccccccccceecc-
Q 013014 262 RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP- 340 (451)
Q Consensus 262 ~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~- 340 (451)
..+.+++|+|+++++|+|||||+|.+|+||||+++...+.++..+++|+||++.+++++++|+++|++.|+|||+.+++
T Consensus 197 ~~~~~~iqtda~in~GnSGGpl~n~~G~vIGi~~~~~~~~~~~~gigfaIP~~~~~~v~~~l~~~g~v~r~~LGv~~~~l 276 (455)
T PRK10139 197 EGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEM 276 (455)
T ss_pred CCcceEEEECCccCCCCCcceEECCCCeEEEEEEEEEcCCCCccceEEEEEhHHHHHHHHHHhhcCcccccceeEEEEEC
Confidence 1235789999999999999999999999999999988776677899999999999999999999999999999999886
Q ss_pred -chhhhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEE
Q 013014 341 -DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQG 416 (451)
Q Consensus 341 -~~~~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v 416 (451)
.+.++.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.++.|+.+.+...++|+++.++|
T Consensus 277 ~~~~~~~lgl~~~~Gv~V~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V~s~~dl~~~l~~~~~g~~v~l~V 345 (455)
T PRK10139 277 SADIAKAFNLDVQRGAFVSEVLPNSGSAKAGVKA-----------GDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGL 345 (455)
T ss_pred CHHHHHhcCCCCCCceEEEEECCCChHHHCCCCC-----------CCEEEEECCEECCCHHHHHHHHHhcCCCCEEEEEE
Confidence 335666775 699999999999999999999 99999999999999999999998878899999999
Q ss_pred EECCeEEEEEEEeee
Q 013014 417 IKQPPVLSDNLRLLW 431 (451)
Q Consensus 417 ~R~g~~~~~~v~l~~ 431 (451)
+|+|+.+++++++..
T Consensus 346 ~R~G~~~~l~v~~~~ 360 (455)
T PRK10139 346 LRNGKPLEVEVTLDT 360 (455)
T ss_pred EECCEEEEEEEEECC
Confidence 999999999998743
No 2
>PRK10898 serine endoprotease; Provisional
Probab=100.00 E-value=2.5e-51 Score=414.37 Aligned_cols=300 Identities=33% Similarity=0.502 Sum_probs=257.5
Q ss_pred cchhhHHHHHHHhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEEE
Q 013014 113 TDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI 192 (451)
Q Consensus 113 ~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~v 192 (451)
..+.++.++++++.||||.|......... .......+.||||+|+++||||||+|||.++..+.|++.||+.++|++
T Consensus 42 ~~~~~~~~~~~~~~psvV~v~~~~~~~~~---~~~~~~~~~GSGfvi~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~v 118 (353)
T PRK10898 42 ETPASYNQAVRRAAPAVVNVYNRSLNSTS---HNQLEIRTLGSGVIMDQRGYILTNKHVINDADQIIVALQDGRVFEALL 118 (353)
T ss_pred cccchHHHHHHHhCCcEEEEEeEeccccC---cccccccceeeEEEEeCCeEEEecccEeCCCCEEEEEeCCCCEEEEEE
Confidence 33457889999999999999886432211 011123478999999999999999999999999999999999999999
Q ss_pred EEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 013014 193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA 272 (451)
Q Consensus 193 v~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~ 272 (451)
++.|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|++..+..... .....++++|+
T Consensus 119 v~~d~~~DlAvl~v~~~--~l~~~~l~~~~~~~~G~~V~aiG~P~g~~~~~t~Giis~~~r~~~~~---~~~~~~iqtda 193 (353)
T PRK10898 119 VGSDSLTDLAVLKINAT--NLPVIPINPKRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIGLSP---TGRQNFLQTDA 193 (353)
T ss_pred EEEcCCCCEEEEEEcCC--CCCeeeccCcCcCCCCCEEEEEeCCCCcCCCcceeEEEeccccccCC---ccccceEEecc
Confidence 99999999999999863 58899999888899999999999999988899999999887753221 12247899999
Q ss_pred cCCCCCCCCceeCCCceEEEEEeeeeCCCC---CccceeEEEeccCchhhHHHhhhccccccccccceeccc--hhhhhh
Q 013014 273 AINPGNSGGPLLDSSGSLIGINTAIYSPSG---ASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPD--QSVEQL 347 (451)
Q Consensus 273 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~---~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~~--~~~~~~ 347 (451)
++++|+|||||+|.+|+||||+++.+...+ ...+++|+||++.+++++++|+++|++.++|||+.+++. ..++.+
T Consensus 194 ~i~~GnSGGPl~n~~G~vvGI~~~~~~~~~~~~~~~g~~faIP~~~~~~~~~~l~~~G~~~~~~lGi~~~~~~~~~~~~~ 273 (353)
T PRK10898 194 SINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPTQLATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGG 273 (353)
T ss_pred ccCCCCCcceEECCCCeEEEEEEEEecccCCCCcccceEEEEchHHHHHHHHHHhhcCcccccccceEEEECCHHHHHhc
Confidence 999999999999999999999998765432 236899999999999999999999999999999998753 223344
Q ss_pred Cc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEE
Q 013014 348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLS 424 (451)
Q Consensus 348 ~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 424 (451)
++ .|++|.+|.+++||+++||++ ||+|++|||++|.++.|+.+.+...++|+++.++|.|+|+.++
T Consensus 274 ~~~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~~l~~~l~~~~~g~~v~l~v~R~g~~~~ 342 (353)
T PRK10898 274 GIDQLQGIVVNEVSPDGPAAKAGIQV-----------NDLIISVNNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLT 342 (353)
T ss_pred CCCCCCeEEEEEECCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 43 799999999999999999999 9999999999999999999999887899999999999999999
Q ss_pred EEEEeee
Q 013014 425 DNLRLLW 431 (451)
Q Consensus 425 ~~v~l~~ 431 (451)
+++++..
T Consensus 343 ~~v~l~~ 349 (353)
T PRK10898 343 LQVTIQE 349 (353)
T ss_pred EEEEecc
Confidence 9988753
No 3
>TIGR02038 protease_degS periplasmic serine peptidase DegS. This family consists of the periplasmic serine protease DegS (HhoB), a shorter paralog of protease DO (HtrA, DegP) and DegQ (HhoA). It is found in E. coli and several other Proteobacteria of the gamma subdivision. It contains a trypsin domain and a single copy of the PDZ domain (in contrast to DegP with two copies). A critical role of DegS is to sense stress in the periplasm and partially degrade an inhibitor of sigma(E).
Probab=100.00 E-value=2.3e-51 Score=414.84 Aligned_cols=300 Identities=35% Similarity=0.578 Sum_probs=260.3
Q ss_pred CcchhhHHHHHHHhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEE
Q 013014 112 QTDELATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK 191 (451)
Q Consensus 112 ~~~~~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~ 191 (451)
...+.++.++++++.||||.|.+.....+. .......+.||||+|+++||||||+|||.+++.+.|.+.||+.++|+
T Consensus 41 ~~~~~~~~~~~~~~~psVV~I~~~~~~~~~---~~~~~~~~~GSG~vi~~~G~IlTn~HVV~~~~~i~V~~~dg~~~~a~ 117 (351)
T TIGR02038 41 NTVEISFNKAVRRAAPAVVNIYNRSISQNS---LNQLSIQGLGSGVIMSKEGYILTNYHVIKKADQIVVALQDGRKFEAE 117 (351)
T ss_pred cccchhHHHHHHhcCCcEEEEEeEeccccc---cccccccceEEEEEEeCCeEEEecccEeCCCCEEEEEECCCCEEEEE
Confidence 344457889999999999999876433321 11122457899999999999999999999999999999999999999
Q ss_pred EEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEc
Q 013014 192 IVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTD 271 (451)
Q Consensus 192 vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d 271 (451)
+++.|+.+||||||++.. .+++++++++..+++|++|+++|||++...+++.|+|++..+..... .....++++|
T Consensus 118 vv~~d~~~DlAvlkv~~~--~~~~~~l~~s~~~~~G~~V~aiG~P~~~~~s~t~GiIs~~~r~~~~~---~~~~~~iqtd 192 (351)
T TIGR02038 118 LVGSDPLTDLAVLKIEGD--NLPTIPVNLDRPPHVGDVVLAIGNPYNLGQTITQGIISATGRNGLSS---VGRQNFIQTD 192 (351)
T ss_pred EEEecCCCCEEEEEecCC--CCceEeccCcCccCCCCEEEEEeCCCCCCCcEEEEEEEeccCcccCC---CCcceEEEEC
Confidence 999999999999999864 48899998888899999999999999998999999999887753211 1235789999
Q ss_pred ccCCCCCCCCceeCCCceEEEEEeeeeCCC--CCccceeEEEeccCchhhHHHhhhccccccccccceecc--chhhhhh
Q 013014 272 AAINPGNSGGPLLDSSGSLIGINTAIYSPS--GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQL 347 (451)
Q Consensus 272 ~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~--~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~--~~~~~~~ 347 (451)
+.+++|+|||||+|.+|+||||+++.+... +...+++|+||++.+++++++|+++|++.|+|||+.+++ ...++.+
T Consensus 193 a~i~~GnSGGpl~n~~G~vIGI~~~~~~~~~~~~~~g~~faIP~~~~~~vl~~l~~~g~~~r~~lGv~~~~~~~~~~~~l 272 (351)
T TIGR02038 193 AAINAGNSGGALINTNGELVGINTASFQKGGDEGGEGINFAIPIKLAHKIMGKIIRDGRVIRGYIGVSGEDINSVVAQGL 272 (351)
T ss_pred CccCCCCCcceEECCCCeEEEEEeeeecccCCCCccceEEEecHHHHHHHHHHHhhcCcccceEeeeEEEECCHHHHHhc
Confidence 999999999999999999999999776432 234689999999999999999999999999999999886 3345667
Q ss_pred Cc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEE
Q 013014 348 GV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLS 424 (451)
Q Consensus 348 ~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 424 (451)
|+ .|++|.++.+++||+++||++ ||+|++|||++|.+++|+.+.+...++|++++++|.|+|+.++
T Consensus 273 gl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Ing~~V~s~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 341 (351)
T TIGR02038 273 GLPDLRGIVITGVDPNGPAARAGILV-----------RDVILKYDGKDVIGAEELMDRIAETRPGSKVMVTVLRQGKQLE 341 (351)
T ss_pred CCCccccceEeecCCCChHHHCCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCEEEEEEEECCEEEE
Confidence 76 599999999999999999999 9999999999999999999999887889999999999999999
Q ss_pred EEEEee
Q 013014 425 DNLRLL 430 (451)
Q Consensus 425 ~~v~l~ 430 (451)
+++++.
T Consensus 342 ~~v~l~ 347 (351)
T TIGR02038 342 LPVTID 347 (351)
T ss_pred EEEEec
Confidence 988874
No 4
>PRK10942 serine endoprotease; Provisional
Probab=100.00 E-value=1.1e-50 Score=423.57 Aligned_cols=301 Identities=38% Similarity=0.588 Sum_probs=262.5
Q ss_pred hHHHHHHHhCCceEEEEeeeeccC---c--------cccc--------------------------cccccCeEEEEEEE
Q 013014 117 ATVRLFQENTPSVVNITNLAARQD---A--------FTLD--------------------------VLEVPQGSGSGFVW 159 (451)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~---~--------~~~~--------------------------~~~~~~~~GSGfiI 159 (451)
++.++++++.||||.|.+...... . |..+ ......+.||||||
T Consensus 39 ~~~~~~~~~~pavv~i~~~~~~~~~~~~~~~~~~~ff~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSG~ii 118 (473)
T PRK10942 39 SLAPMLEKVMPSVVSINVEGSTTVNTPRMPRQFQQFFGDNSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVII 118 (473)
T ss_pred cHHHHHHHhCCceEEEEEEEeccccCCCCChhHHHhhcccccccccccccccccccccccccccccccccccceEEEEEE
Confidence 588999999999999987653211 0 1100 00122468999999
Q ss_pred cC-CCeEEecccccCCCCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCC
Q 013014 160 DS-KGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFG 238 (451)
Q Consensus 160 ~~-~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g 238 (451)
++ +||||||+|||.+++++.|++.|++.|+|++++.|+.+||||||++.. ..+++++|+++..+++|++|+++|+|++
T Consensus 119 ~~~~G~IlTn~HVv~~a~~i~V~~~dg~~~~a~vv~~D~~~DlAvlki~~~-~~l~~~~lg~s~~l~~G~~V~aiG~P~g 197 (473)
T PRK10942 119 DADKGYVVTNNHVVDNATKIKVQLSDGRKFDAKVVGKDPRSDIALIQLQNP-KNLTAIKMADSDALRVGDYTVAIGNPYG 197 (473)
T ss_pred ECCCCEEEeChhhcCCCCEEEEEECCCCEEEEEEEEecCCCCEEEEEecCC-CCCceeEecCccccCCCCEEEEEcCCCC
Confidence 96 599999999999999999999999999999999999999999999753 4689999999999999999999999999
Q ss_pred CCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchh
Q 013014 239 LDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNG 318 (451)
Q Consensus 239 ~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~ 318 (451)
...+++.|+|++..+.... ...+.+++++|+++++|+|||||+|.+|+||||+++.+.+.++..+++|+||++.+++
T Consensus 198 ~~~tvt~GiVs~~~r~~~~---~~~~~~~iqtda~i~~GnSGGpL~n~~GeviGI~t~~~~~~g~~~g~gfaIP~~~~~~ 274 (473)
T PRK10942 198 LGETVTSGIVSALGRSGLN---VENYENFIQTDAAINRGNSGGALVNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKN 274 (473)
T ss_pred CCcceeEEEEEEeecccCC---cccccceEEeccccCCCCCcCccCCCCCeEEEEEEEEEcCCCCcccEEEEEEHHHHHH
Confidence 9999999999998775211 1124578999999999999999999999999999998887777788999999999999
Q ss_pred hHHHhhhccccccccccceecc--chhhhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEc
Q 013014 319 IVDQLVKFGKVTRPILGIKFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (451)
Q Consensus 319 ~l~~l~~~g~~~~~~lGi~~~~--~~~~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i 393 (451)
++++|+++|++.|+|||+.+++ ...++.+++ .|++|.+|.++|||+++||++ ||+|++|||++|
T Consensus 275 v~~~l~~~g~v~rg~lGv~~~~l~~~~a~~~~l~~~~GvlV~~V~~~SpA~~AGL~~-----------GDvIl~InG~~V 343 (473)
T PRK10942 275 LTSQMVEYGQVKRGELGIMGTELNSELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKA-----------GDVITSLNGKPI 343 (473)
T ss_pred HHHHHHhccccccceeeeEeeecCHHHHHhcCCCCCCceEEEEECCCChHHHcCCCC-----------CCEEEEECCEEC
Confidence 9999999999999999999886 335667776 599999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEeeec
Q 013014 394 SNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRLLWS 432 (451)
Q Consensus 394 ~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~~~ 432 (451)
.+++|+.+.+....+|++++++|.|+|+.+++++++...
T Consensus 344 ~s~~dl~~~l~~~~~g~~v~l~v~R~G~~~~v~v~l~~~ 382 (473)
T PRK10942 344 SSFAALRAQVGTMPVGSKLTLGLLRDGKPVNVNVELQQS 382 (473)
T ss_pred CCHHHHHHHHHhcCCCCEEEEEEEECCeEEEEEEEeCcC
Confidence 999999999988888999999999999999999887543
No 5
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DeqQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=100.00 E-value=4.5e-49 Score=409.15 Aligned_cols=300 Identities=43% Similarity=0.640 Sum_probs=263.1
Q ss_pred HHHHHHHhCCceEEEEeeeeccC-------------cccc--c------cccccCeEEEEEEEcCCCeEEecccccCCCC
Q 013014 118 TVRLFQENTPSVVNITNLAARQD-------------AFTL--D------VLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS 176 (451)
Q Consensus 118 ~~~~~~~~~~SVV~I~~~~~~~~-------------~~~~--~------~~~~~~~~GSGfiI~~~G~ILT~aHVv~~~~ 176 (451)
+.++++++.||||.|.+...... .|.. . ......+.||||+|+++||||||+||+.++.
T Consensus 3 ~~~~~~~~~p~vv~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~GSGfii~~~G~IlTn~Hvv~~~~ 82 (428)
T TIGR02037 3 FAPLVEKVAPAVVNISVEGTVKRRNRPPALPPFFRQFFGDDMPNFPRQQRERKVRGLGSGVIISADGYILTNNHVVDGAD 82 (428)
T ss_pred HHHHHHHhCCceEEEEEEEEecccCCCcccchhHHHhhcccccCcccccccccccceeeEEEECCCCEEEEcHHHcCCCC
Confidence 67899999999999988652211 1110 0 0123457899999999999999999999999
Q ss_pred cEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeec
Q 013014 177 DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREIS 256 (451)
Q Consensus 177 ~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~ 256 (451)
++.|++.|++.++|++++.|+.+||||||++.. ..+++++|+++..+++|++|+++|||++...+++.|+|++..+...
T Consensus 83 ~i~V~~~~~~~~~a~vv~~d~~~DlAllkv~~~-~~~~~~~l~~~~~~~~G~~v~aiG~p~g~~~~~t~G~vs~~~~~~~ 161 (428)
T TIGR02037 83 EITVTLSDGREFKAKLVGKDPRTDIAVLKIDAK-KNLPVIKLGDSDKLRVGDWVLAIGNPFGLGQTVTSGIVSALGRSGL 161 (428)
T ss_pred eEEEEeCCCCEEEEEEEEecCCCCEEEEEecCC-CCceEEEccCCCCCCCCCEEEEEECCCcCCCcEEEEEEEecccCcc
Confidence 999999999999999999999999999999864 4699999998888999999999999999999999999998876531
Q ss_pred cCCCCCCcccEEEEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccccccccccc
Q 013014 257 SAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGI 336 (451)
Q Consensus 257 ~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi 336 (451)
....+..++++|+++++|+|||||+|.+|+||||+++.....++..+++|+||++.+++++++|+++|++.++|||+
T Consensus 162 ---~~~~~~~~i~tda~i~~GnSGGpl~n~~G~viGI~~~~~~~~g~~~g~~faiP~~~~~~~~~~l~~~g~~~~~~lGi 238 (428)
T TIGR02037 162 ---GIGDYENFIQTDAAINPGNSGGPLVNLRGEVIGINTAIYSPSGGNVGIGFAIPSNMAKNVVDQLIEGGKVQRGWLGV 238 (428)
T ss_pred ---CCCCccceEEECCCCCCCCCCCceECCCCeEEEEEeEEEcCCCCccceEEEEEhHHHHHHHHHHHhcCcCcCCcCce
Confidence 11234568999999999999999999999999999998877666788999999999999999999999999999999
Q ss_pred eecc--chhhhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE
Q 013014 337 KFAP--DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (451)
Q Consensus 337 ~~~~--~~~~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~ 411 (451)
.+++ .+.++.+|+ .|++|.+|.++|||+++||++ ||+|++|||++|.++.++.+.+....+|++
T Consensus 239 ~~~~~~~~~~~~lgl~~~~Gv~V~~V~~~spA~~aGL~~-----------GDvI~~Vng~~i~~~~~~~~~l~~~~~g~~ 307 (428)
T TIGR02037 239 TIQEVTSDLAKSLGLEKQRGALVAQVLPGSPAEKAGLKA-----------GDVILSVNGKPISSFADLRRAIGTLKPGKK 307 (428)
T ss_pred EeecCCHHHHHHcCCCCCCceEEEEccCCCChHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhcCCCCE
Confidence 9987 345777887 799999999999999999999 999999999999999999999988888999
Q ss_pred EEEEEEECCeEEEEEEEeeec
Q 013014 412 LLLQGIKQPPVLSDNLRLLWS 432 (451)
Q Consensus 412 v~l~v~R~g~~~~~~v~l~~~ 432 (451)
++++|.|+|+.+++++++...
T Consensus 308 v~l~v~R~g~~~~~~v~l~~~ 328 (428)
T TIGR02037 308 VTLGILRKGKEKTITVTLGAS 328 (428)
T ss_pred EEEEEEECCEEEEEEEEECcC
Confidence 999999999999999987644
No 6
>COG0265 DegQ Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain [Posttranslational modification, protein turnover, chaperones]
Probab=100.00 E-value=7.8e-40 Score=331.33 Aligned_cols=302 Identities=44% Similarity=0.641 Sum_probs=261.2
Q ss_pred hhHHHHHHHhCCceEEEEeeeeccC-cccccc--ccccCeEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEEE
Q 013014 116 LATVRLFQENTPSVVNITNLAARQD-AFTLDV--LEVPQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKI 192 (451)
Q Consensus 116 ~~~~~~~~~~~~SVV~I~~~~~~~~-~~~~~~--~~~~~~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~v 192 (451)
..+..+++++.|+||.|........ .|.... .....+.||||+++++|||+|+.||+.++.++.+.+.||+.+++++
T Consensus 33 ~~~~~~~~~~~~~vV~~~~~~~~~~~~~~~~~~~~~~~~~~gSg~i~~~~g~ivTn~hVi~~a~~i~v~l~dg~~~~a~~ 112 (347)
T COG0265 33 LSFATAVEKVAPAVVSIATGLTAKLRSFFPSDPPLRSAEGLGSGFIISSDGYIVTNNHVIAGAEEITVTLADGREVPAKL 112 (347)
T ss_pred cCHHHHHHhcCCcEEEEEeeeeecchhcccCCcccccccccccEEEEcCCeEEEecceecCCcceEEEEeCCCCEEEEEE
Confidence 5788899999999999987543321 110000 0001489999999989999999999999999999999999999999
Q ss_pred EEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcc
Q 013014 193 VGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDA 272 (451)
Q Consensus 193 v~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~ 272 (451)
++.|+..|+|+||++.... ++.+.++++..+++|++++++|+|++...+++.|+++...+. . ........++||+|+
T Consensus 113 vg~d~~~dlavlki~~~~~-~~~~~~~~s~~l~vg~~v~aiGnp~g~~~tvt~Givs~~~r~-~-v~~~~~~~~~IqtdA 189 (347)
T COG0265 113 VGKDPISDLAVLKIDGAGG-LPVIALGDSDKLRVGDVVVAIGNPFGLGQTVTSGIVSALGRT-G-VGSAGGYVNFIQTDA 189 (347)
T ss_pred EecCCccCEEEEEeccCCC-CceeeccCCCCcccCCEEEEecCCCCcccceeccEEeccccc-c-ccCcccccchhhccc
Confidence 9999999999999997533 888899999999999999999999999999999999999886 1 111112568899999
Q ss_pred cCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccccccccccceeccchhhhhhC---c
Q 013014 273 AINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLG---V 349 (451)
Q Consensus 273 ~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~~~~~~~~~---~ 349 (451)
++++|+||||++|.+|++|||+++.....++..+++|+||++.++.+++++++.|++.++|+|+.+.+......+| .
T Consensus 190 ain~gnsGgpl~n~~g~~iGint~~~~~~~~~~gigfaiP~~~~~~v~~~l~~~G~v~~~~lgv~~~~~~~~~~~g~~~~ 269 (347)
T COG0265 190 AINPGNSGGPLVNIDGEVVGINTAIIAPSGGSSGIGFAIPVNLVAPVLDELISKGKVVRGYLGVIGEPLTADIALGLPVA 269 (347)
T ss_pred ccCCCCCCCceEcCCCcEEEEEEEEecCCCCcceeEEEecHHHHHHHHHHHHHcCCccccccceEEEEcccccccCCCCC
Confidence 9999999999999999999999999887766677999999999999999999988999999999988633211144 3
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.|++|.++.+++||+++|++. ||+|+++||+++.+..++...+....+|+++.+++.|+|++.++.+++
T Consensus 270 ~G~~V~~v~~~spa~~agi~~-----------Gdii~~vng~~v~~~~~l~~~v~~~~~g~~v~~~~~r~g~~~~~~v~l 338 (347)
T COG0265 270 AGAVVLGVLPGSPAAKAGIKA-----------GDIITAVNGKPVASLSDLVAAVASNRPGDEVALKLLRGGKERELAVTL 338 (347)
T ss_pred CceEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCHHHHHHHHhccCCCCEEEEEEEECCEEEEEEEEe
Confidence 799999999999999999999 999999999999999999999998889999999999999999999998
Q ss_pred ee
Q 013014 430 LW 431 (451)
Q Consensus 430 ~~ 431 (451)
..
T Consensus 339 ~~ 340 (347)
T COG0265 339 GD 340 (347)
T ss_pred cC
Confidence 64
No 7
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.96 E-value=5.8e-28 Score=245.81 Aligned_cols=303 Identities=35% Similarity=0.497 Sum_probs=242.7
Q ss_pred hhHHHHHHHhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCCCeEEecccccCCCC-----------cEEEEeCC
Q 013014 116 LATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRGAS-----------DIRVTFAD 184 (451)
Q Consensus 116 ~~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHVv~~~~-----------~i~V~~~d 184 (451)
.....++++-.+|+|.|+...-..........+.+...|||||++.+|+++||+||+.... .+.+...+
T Consensus 128 ~~v~~~~~~cd~Avv~Ie~~~f~~~~~~~e~~~ip~l~~S~~Vv~gd~i~VTnghV~~~~~~~y~~~~~~l~~vqi~aa~ 207 (473)
T KOG1320|consen 128 AFVAAVFEECDLAVVYIESEEFWKGMNPFELGDIPSLNGSGFVVGGDGIIVTNGHVVRVEPRIYAHSSTVLLRVQIDAAI 207 (473)
T ss_pred hhHHHhhhcccceEEEEeeccccCCCcccccCCCcccCccEEEEcCCcEEEEeeEEEEEEeccccCCCcceeeEEEEEee
Confidence 3456788899999999987443322222333455778999999999999999999997432 36777766
Q ss_pred C--cEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCC-
Q 013014 185 Q--SAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG- 261 (451)
Q Consensus 185 g--~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~- 261 (451)
+ ..+++.+.+.|+..|+|+++++.+..-.++++++.+..+..|+++.++|.|++..+..+.|.+++..|........
T Consensus 208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg~~~ 287 (473)
T KOG1320|consen 208 GPGNSGEPVIVGVDKVAGVAFLKIKTPENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLGLET 287 (473)
T ss_pred cCCccCCCeEEccccccceEEEEEecCCcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccCccc
Confidence 6 8899999999999999999997554347888888888899999999999999999999999999888875543333
Q ss_pred -CCcccEEEEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccc---c------cc
Q 013014 262 -RPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGK---V------TR 331 (451)
Q Consensus 262 -~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~---~------~~ 331 (451)
....+++|+|++++.|+||||++|.+|++||+++......+-..+++|++|.+.+..++.+..+... . .+
T Consensus 288 g~~i~~~~qtd~ai~~~nsg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~e~~~~lr~~~~~~p~~ 367 (473)
T KOG1320|consen 288 GVLISKINQTDAAINPGNSGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLGEFQISLRPVKPLVPVH 367 (473)
T ss_pred ceeeeeecccchhhhcccCCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhhhhceeeccccCccccc
Confidence 4457889999999999999999999999999998876544445789999999999999988744332 2 23
Q ss_pred ccccceecc-------chhhhhh----C-ccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHH
Q 013014 332 PILGIKFAP-------DQSVEQL----G-VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDL 399 (451)
Q Consensus 332 ~~lGi~~~~-------~~~~~~~----~-~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl 399 (451)
.|+|....- ....+.+ + ..+|+|.+|.+++++...++++ ||+|.+|||++|.|..++
T Consensus 368 ~~~g~~s~~i~~g~vf~~~~~~~~~~~~~~q~v~is~Vlp~~~~~~~~~~~-----------g~~V~~vng~~V~n~~~l 436 (473)
T KOG1320|consen 368 QYIGLPSYYIFAGLVFVPLTKSYIFPSGVVQLVLVSQVLPGSINGGYGLKP-----------GDQVVKVNGKPVKNLKHL 436 (473)
T ss_pred ccCCceeEEEecceEEeecCCCccccccceeEEEEEEeccCCCcccccccC-----------CCEEEEECCEEeechHHH
Confidence 466654221 0011111 1 2689999999999999999999 999999999999999999
Q ss_pred HHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 400 YRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 400 ~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.+++.++..+++|.+..+|..|..++.+..
T Consensus 437 ~~~i~~~~~~~~v~vl~~~~~e~~tl~Il~ 466 (473)
T KOG1320|consen 437 YELIEECSTEDKVAVLDRRSAEDATLEILP 466 (473)
T ss_pred HHHHHhcCcCceEEEEEecCccceeEEecc
Confidence 999999988899999999988888888764
No 8
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.92 E-value=3.4e-24 Score=220.12 Aligned_cols=315 Identities=24% Similarity=0.270 Sum_probs=242.8
Q ss_pred hHHHHHHHhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCC-CeEEecccccCCC-CcEEEEeCCCcEEEEEEEE
Q 013014 117 ATVRLFQENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGA-SDIRVTFADQSAYDAKIVG 194 (451)
Q Consensus 117 ~~~~~~~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHVv~~~-~~i~V~~~dg~~~~a~vv~ 194 (451)
.....+.++-+|||.|...... .| +....+.+.+|||++++. ||||||+||+... -...+.+.+..+.+.-.++
T Consensus 53 ~w~~~ia~VvksvVsI~~S~v~--~f--dtesag~~~atgfvvd~~~gyiLtnrhvv~pgP~va~avf~n~ee~ei~pvy 128 (955)
T KOG1421|consen 53 DWRNTIANVVKSVVSIRFSAVR--AF--DTESAGESEATGFVVDKKLGYILTNRHVVAPGPFVASAVFDNHEEIEIYPVY 128 (955)
T ss_pred hhhhhhhhhcccEEEEEehhee--ec--ccccccccceeEEEEecccceEEEeccccCCCCceeEEEecccccCCccccc
Confidence 6677889999999999875432 12 222345678999999976 9999999999754 4566777777788888889
Q ss_pred EcCCCCEEEEEEcCCC---CCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCC---CCcccEE
Q 013014 195 FDQDKDVAVLRIDAPK---DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATG---RPIQDVI 268 (451)
Q Consensus 195 ~d~~~DlAlLkv~~~~---~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~---~~~~~~i 268 (451)
.|+-+|+.+++.+... ..+..+.++. +..++|.++.++|+..+...++..|.++.+.+....|... .....++
T Consensus 129 rDpVhdfGf~r~dps~ir~s~vt~i~lap-~~akvgseirvvgNDagEklsIlagflSrldr~apdyg~~~yndfnTfy~ 207 (955)
T KOG1421|consen 129 RDPVHDFGFFRYDPSTIRFSIVTEICLAP-ELAKVGSEIRVVGNDAGEKLSILAGFLSRLDRNAPDYGEDTYNDFNTFYI 207 (955)
T ss_pred CCchhhcceeecChhhcceeeeeccccCc-cccccCCceEEecCCccceEEeehhhhhhccCCCccccccccccccceee
Confidence 9999999999998543 1234445542 3467999999999988888899999999998887665421 1224567
Q ss_pred EEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccccccccccceecc--chhhhh
Q 013014 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAP--DQSVEQ 346 (451)
Q Consensus 269 ~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~--~~~~~~ 346 (451)
|..+....|.||+|++|.+|..|.++..+.. ..+.+|++|++.+.+-+.-++++..+.|+.|.++|.. .+.++.
T Consensus 208 QaasstsggssgspVv~i~gyAVAl~agg~~----ssas~ffLpLdrV~RaL~clq~n~PItRGtLqvefl~k~~de~rr 283 (955)
T KOG1421|consen 208 QAASSTSGGSSGSPVVDIPGYAVALNAGGSI----SSASDFFLPLDRVVRALRCLQNNTPITRGTLQVEFLHKLFDECRR 283 (955)
T ss_pred eehhcCCCCCCCCceecccceEEeeecCCcc----cccccceeeccchhhhhhhhhcCCCcccceEEEEEehhhhHHHHh
Confidence 8888888999999999999999999987643 3456899999999999999998888889999999986 446778
Q ss_pred hCccceEEEecCCCCcccccCceeeecccCC----CCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeE
Q 013014 347 LGVSGVLVLDAPPNGPAGKAGLLSTKRDAYG----RLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPV 422 (451)
Q Consensus 347 ~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~----~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~ 422 (451)
+|++.-|.+.+....|.+...|....+.+++ +|+.||++++||+.-+.++.++.++|++. .|+.+.++|+|.|++
T Consensus 284 lGL~sE~eqv~r~k~P~~tgmLvV~~vL~~gpa~k~Le~GDillavN~t~l~df~~l~~iLDeg-vgk~l~LtI~Rggqe 362 (955)
T KOG1421|consen 284 LGLSSEWEQVVRTKFPERTGMLVVETVLPEGPAEKKLEPGDILLAVNSTCLNDFEALEQILDEG-VGKNLELTIQRGGQE 362 (955)
T ss_pred cCCcHHHHHHHHhcCcccceeEEEEEeccCCchhhccCCCcEEEEEcceehHHHHHHHHHHhhc-cCceEEEEEEeCCEE
Confidence 8874444444444444433334443333332 36669999999999999999999999985 899999999999999
Q ss_pred EEEEEEeeecCCCccccee
Q 013014 423 LSDNLRLLWSEERPVRQML 441 (451)
Q Consensus 423 ~~~~v~l~~~~~~~~~~~~ 441 (451)
.+++++.+..+.....+++
T Consensus 363 lel~vtvqdlh~itp~R~l 381 (955)
T KOG1421|consen 363 LELTVTVQDLHGITPDRFL 381 (955)
T ss_pred EEEEEEeccccCCCCceEE
Confidence 9999998877766555543
No 9
>KOG1421 consensus Predicted signaling-associated protein (contains a PDZ domain) [General function prediction only]
Probab=99.71 E-value=2.2e-16 Score=163.14 Aligned_cols=304 Identities=17% Similarity=0.194 Sum_probs=217.5
Q ss_pred HHhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCC-CeEEecccccC-CCCcEEEEeCCCcEEEEEEEEEcCCCC
Q 013014 123 QENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSK-GHVVTNYHVIR-GASDIRVTFADQSAYDAKIVGFDQDKD 200 (451)
Q Consensus 123 ~~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~-G~ILT~aHVv~-~~~~i~V~~~dg~~~~a~vv~~d~~~D 200 (451)
+....+.|.+.... ++.+++.......|||.|++.+ |++++...++. ++.+.+|+..|.....|.+.+.|+..+
T Consensus 525 ~~i~~~~~~v~~~~----~~~l~g~s~~i~kgt~~i~d~~~g~~vvsr~~vp~d~~d~~vt~~dS~~i~a~~~fL~~t~n 600 (955)
T KOG1421|consen 525 ADISNCLVDVEPMM----PVNLDGVSSDIYKGTALIMDTSKGLGVVSRSVVPSDAKDQRVTEADSDGIPANVSFLHPTEN 600 (955)
T ss_pred hHHhhhhhhheece----eeccccchhhhhcCceEEEEccCCceeEecccCCchhhceEEeecccccccceeeEecCccc
Confidence 44555566665432 2334444444567999999955 89999999986 678899999999999999999999999
Q ss_pred EEEEEEcCCCCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeE---EeeeeeeeccCCC-CCCcccEEEEcccCCC
Q 013014 201 VAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGV---ISGLRREISSAAT-GRPIQDVIQTDAAINP 276 (451)
Q Consensus 201 lAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~---Vs~~~~~~~~~~~-~~~~~~~i~~d~~i~~ 276 (451)
+|.+|.+... ...+.|.+ ..+..||++...|+......-...-. ++.+.......+. .....+.|.+++.+..
T Consensus 601 ~a~~kydp~~--~~~~kl~~-~~v~~gD~~~f~g~~~~~r~ltaktsv~dvs~~~~ps~~~pr~r~~n~e~Is~~~nlsT 677 (955)
T KOG1421|consen 601 VASFKYDPAL--EVQLKLTD-TTVLRGDECTFEGFTEDLRALTAKTSVTDVSVVIIPSSVMPRFRATNLEVISFMDNLST 677 (955)
T ss_pred eeEeccChhH--hhhhccce-eeEecCCceeEecccccchhhcccceeeeeEEEEecCCCCcceeecceEEEEEeccccc
Confidence 9999998642 34455543 45788999999998755432111111 2111111111111 1222467777777776
Q ss_pred CCCCCceeCCCceEEEEEeeeeCC--CCCccceeEEEeccCchhhHHHhhhccccccccccceeccch--hhhhhCccce
Q 013014 277 GNSGGPLLDSSGSLIGINTAIYSP--SGASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQ--SVEQLGVSGV 352 (451)
Q Consensus 277 G~SGGPlvd~~G~VVGI~s~~~~~--~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~~~--~~~~~~~~Gv 352 (451)
++--|-+.|.+|+|+|++-..... .+.+...-|.+.+.++++.+++|+..++...-.+|++|.... .|+.+|+.--
T Consensus 678 ~c~sg~ltdddg~vvalwl~~~ge~~~~kd~~y~~gl~~~~~l~vl~rlk~g~~~rp~i~~vef~~i~laqar~lglp~e 757 (955)
T KOG1421|consen 678 SCLSGRLTDDDGEVVALWLSVVGEDVGGKDYTYKYGLSMSYILPVLERLKLGPSARPTIAGVEFSHITLAQARTLGLPSE 757 (955)
T ss_pred cccceEEECCCCeEEEEEeeeeccccCCceeEEEeccchHHHHHHHHHHhcCCCCCceeeccceeeEEeehhhccCCCHH
Confidence 776778899999999998766544 234445677888999999999999988887778999988744 4667787766
Q ss_pred EEEecCCCCcccccCceeeecccCC--CCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEee
Q 013014 353 LVLDAPPNGPAGKAGLLSTKRDAYG--RLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRLL 430 (451)
Q Consensus 353 ~V~~v~~~s~a~~aGl~~~~~~~~~--~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~ 430 (451)
++-+...+|.-.+.-+..+++.+.. .|..||||+++|||.|+...||.+.. .++++|+|||.++++++.+.
T Consensus 758 ~imk~e~es~~~~ql~~ishv~~~~~kil~~gdiilsvngk~itr~~dl~d~~-------eid~~ilrdg~~~~ikipt~ 830 (955)
T KOG1421|consen 758 FIMKSEEESTIPRQLYVISHVRPLLHKILGVGDIILSVNGKMITRLSDLHDFE-------EIDAVILRDGIEMEIKIPTY 830 (955)
T ss_pred HHhhhhhcCCCcceEEEEEeeccCcccccccccEEEEecCeEEeeehhhhhhh-------hhheeeeecCcEEEEEeccc
Confidence 7777777777666666665665553 47789999999999999999999632 67899999999999999876
Q ss_pred ecCCCcccceee
Q 013014 431 WSEERPVRQMLC 442 (451)
Q Consensus 431 ~~~~~~~~~~~~ 442 (451)
+.. ...+.++
T Consensus 831 p~~--et~r~vi 840 (955)
T KOG1421|consen 831 PEY--ETSRAVI 840 (955)
T ss_pred ccc--ccceEEE
Confidence 543 4444443
No 10
>PF13365 Trypsin_2: Trypsin-like peptidase domain; PDB: 1Y8T_A 2Z9I_A 3QO6_A 1L1J_A 1QY6_A 2O8L_A 3OTP_E 2ZLE_I 1KY9_A 3CS0_A ....
Probab=99.68 E-value=3.8e-16 Score=132.66 Aligned_cols=109 Identities=38% Similarity=0.594 Sum_probs=75.0
Q ss_pred EEEEEEcCCCeEEecccccC--------CCCcEEEEeCCCcEEE--EEEEEEcCC-CCEEEEEEcCCCCCCcceecCCCC
Q 013014 154 GSGFVWDSKGHVVTNYHVIR--------GASDIRVTFADQSAYD--AKIVGFDQD-KDVAVLRIDAPKDKLRPIPIGVSA 222 (451)
Q Consensus 154 GSGfiI~~~G~ILT~aHVv~--------~~~~i~V~~~dg~~~~--a~vv~~d~~-~DlAlLkv~~~~~~~~~l~l~~s~ 222 (451)
||||+|+++|+||||+||+. ....+.+...++..+. ++++..|+. .|+|||+++.
T Consensus 1 GTGf~i~~~g~ilT~~Hvv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~D~All~v~~-------------- 66 (120)
T PF13365_consen 1 GTGFLIGPDGYILTAAHVVEDWNDGKQPDNSSVEVVFPDGRRVPPVAEVVYFDPDDYDLALLKVDP-------------- 66 (120)
T ss_dssp EEEEEEETTTEEEEEHHHHTCCTT--G-TCSEEEEEETTSCEEETEEEEEEEETT-TTEEEEEESC--------------
T ss_pred CEEEEEcCCceEEEchhheecccccccCCCCEEEEEecCCCEEeeeEEEEEECCccccEEEEEEec--------------
Confidence 89999999999999999998 4567888888998888 999999999 9999999970
Q ss_pred CCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeCCCceEEEE
Q 013014 223 DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGI 293 (451)
Q Consensus 223 ~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI 293 (451)
....+...... ............ ......+ +++.+.+|+|||||||.+|+||||
T Consensus 67 ~~~~~~~~~~~------------~~~~~~~~~~~~----~~~~~~~-~~~~~~~G~SGgpv~~~~G~vvGi 120 (120)
T PF13365_consen 67 WTGVGGGVRVP------------GSTSGVSPTSTN----DNRMLYI-TDADTRPGSSGGPVFDSDGRVVGI 120 (120)
T ss_dssp EEEEEEEEEEE------------EEEEEEEEEEEE----ETEEEEE-ESSS-STTTTTSEEEETTSEEEEE
T ss_pred ccceeeeeEee------------eeccccccccCc----ccceeEe-eecccCCCcEeHhEECCCCEEEeC
Confidence 00000001100 000100010000 0001124 899999999999999999999997
No 11
>PF13180 PDZ_2: PDZ domain; PDB: 2L97_A 1Y8T_A 2Z9I_A 1LCY_A 2PZD_B 2P3W_A 1VCW_C 1TE0_B 1SOZ_C 1SOT_C ....
Probab=99.57 E-value=1.5e-14 Score=115.68 Aligned_cols=82 Identities=37% Similarity=0.549 Sum_probs=72.2
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~ 411 (451)
||||+.+..... ..|++|.++.++|||+++||++ ||+|++|||++|+++.++..++....+|++
T Consensus 1 ~~lGv~~~~~~~-----~~g~~V~~V~~~spA~~aGl~~-----------GD~I~~ing~~v~~~~~~~~~l~~~~~g~~ 64 (82)
T PF13180_consen 1 GGLGVTVQNLSD-----TGGVVVVSVIPGSPAAKAGLQP-----------GDIILAINGKPVNSSEDLVNILSKGKPGDT 64 (82)
T ss_dssp -E-SEEEEECSC-----SSSEEEEEESTTSHHHHTTS-T-----------TEEEEEETTEESSSHHHHHHHHHCSSTTSE
T ss_pred CEECeEEEEccC-----CCeEEEEEeCCCCcHHHCCCCC-----------CcEEEEECCEEcCCHHHHHHHHHhCCCCCE
Confidence 689999876441 3699999999999999999999 999999999999999999999988899999
Q ss_pred EEEEEEECCeEEEEEEEe
Q 013014 412 LLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 412 v~l~v~R~g~~~~~~v~l 429 (451)
++++|.|+|+.++++++|
T Consensus 65 v~l~v~R~g~~~~~~v~l 82 (82)
T PF13180_consen 65 VTLTVLRDGEELTVEVTL 82 (82)
T ss_dssp EEEEEEETTEEEEEEEE-
T ss_pred EEEEEEECCEEEEEEEEC
Confidence 999999999999999875
No 12
>PF00089 Trypsin: Trypsin; InterPro: IPR001254 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine proteases belongs to the MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)) and to peptidase family S6 (Hap serine peptidases). The chymotrypsin family is almost totally confined to animals, although trypsin-like enzymes are found in actinomycetes of the genera Streptomyces and Saccharopolyspora, and in the fungus Fusarium oxysporum. The enzymes are inherently secreted, being synthesised with a signal peptide that targets them to the secretory pathway. Animal enzymes are either secreted directly, packaged into vesicles for regulated secretion, or are retained in leukocyte granules. The Hap family, 'Haemophilus adhesion and penetration', are proteins that play a role in the interaction with human epithelial cells. The serine protease activity is localized at the N-terminal domain, whereas the binding domain is in the C-terminal region.
; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 1SPJ_A 1A5I_A 2ZGH_A 2ZKS_A 2ZGJ_A 2ZGC_A 2ODP_A 2I6Q_A 2I6S_A 2ODQ_A ....
Probab=99.48 E-value=3.6e-12 Score=119.15 Aligned_cols=168 Identities=24% Similarity=0.362 Sum_probs=112.6
Q ss_pred CeEEEEEEEcCCCeEEecccccCCCCcEEEEeC-------CC--cEEEEEEEEEcC-------CCCEEEEEEcCC---CC
Q 013014 151 QGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFA-------DQ--SAYDAKIVGFDQ-------DKDVAVLRIDAP---KD 211 (451)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~-------dg--~~~~a~vv~~d~-------~~DlAlLkv~~~---~~ 211 (451)
...|+|++|+++ +|||++||+.+..++.+.+. ++ ..+..+-+..++ .+|+|||+++.+ ..
T Consensus 24 ~~~C~G~li~~~-~vLTaahC~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~h~~~~~~~~~~DiAll~L~~~~~~~~ 102 (220)
T PF00089_consen 24 RFFCTGTLISPR-WVLTAAHCVDGASDIKVRLGTYSIRNSDGSEQTIKVSKIIIHPKYDPSTYDNDIALLKLDRPITFGD 102 (220)
T ss_dssp EEEEEEEEEETT-EEEEEGGGHTSGGSEEEEESESBTTSTTTTSEEEEEEEEEEETTSBTTTTTTSEEEEEESSSSEHBS
T ss_pred CeeEeEEecccc-ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 457999999987 99999999999666666443 22 345555444432 469999999976 24
Q ss_pred CCcceecCCC-CCCCCCcEEEEeeCCCCCCC----ceEEeEEeeeeeeeccCCC-CCCcccEEEEcc----cCCCCCCCC
Q 013014 212 KLRPIPIGVS-ADLLVGQKVYAIGNPFGLDH----TLTTGVISGLRREISSAAT-GRPIQDVIQTDA----AINPGNSGG 281 (451)
Q Consensus 212 ~~~~l~l~~s-~~~~~G~~V~~iG~p~g~~~----~~~~G~Vs~~~~~~~~~~~-~~~~~~~i~~d~----~i~~G~SGG 281 (451)
.+.++.+... ..+..|+.+.++||+..... ......+.-+......... .......+.... ..+.|+|||
T Consensus 103 ~~~~~~l~~~~~~~~~~~~~~~~G~~~~~~~~~~~~~~~~~~~~~~~~~c~~~~~~~~~~~~~c~~~~~~~~~~~g~sG~ 182 (220)
T PF00089_consen 103 NIQPICLPSAGSDPNVGTSCIVVGWGRTSDNGYSSNLQSVTVPVVSRKTCRSSYNDNLTPNMICAGSSGSGDACQGDSGG 182 (220)
T ss_dssp SBEESBBTSTTHTTTTTSEEEEEESSBSSTTSBTSBEEEEEEEEEEHHHHHHHTTTTSTTTEEEEETTSSSBGGTTTTTS
T ss_pred cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Confidence 5677777652 33588999999999875332 3444444433332211100 112245566555 788999999
Q ss_pred ceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhH
Q 013014 282 PLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIV 320 (451)
Q Consensus 282 Plvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l 320 (451)
||++.++.|+||++.. ..++.....++++++....+||
T Consensus 183 pl~~~~~~lvGI~s~~-~~c~~~~~~~v~~~v~~~~~WI 220 (220)
T PF00089_consen 183 PLICNNNYLVGIVSFG-ENCGSPNYPGVYTRVSSYLDWI 220 (220)
T ss_dssp EEEETTEEEEEEEEEE-SSSSBTTSEEEEEEGGGGHHHH
T ss_pred ccccceeeecceeeec-CCCCCCCcCEEEEEHHHhhccC
Confidence 9998877899999987 4444343468888888877765
No 13
>KOG1320 consensus Serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=99.46 E-value=1.3e-13 Score=141.26 Aligned_cols=276 Identities=18% Similarity=0.219 Sum_probs=194.3
Q ss_pred HHHhCCceEEEEeeeeccCcccccccc-ccCeEEEEEEEcCCCeEEecccccC---CCCcEEEEe-CCCcEEEEEEEEEc
Q 013014 122 FQENTPSVVNITNLAARQDAFTLDVLE-VPQGSGSGFVWDSKGHVVTNYHVIR---GASDIRVTF-ADQSAYDAKIVGFD 196 (451)
Q Consensus 122 ~~~~~~SVV~I~~~~~~~~~~~~~~~~-~~~~~GSGfiI~~~G~ILT~aHVv~---~~~~i~V~~-~dg~~~~a~vv~~d 196 (451)
.+....|++.+............|... .....|+||.+.- ..++|++|++. +...+.+.. ..-+.|.+++...-
T Consensus 56 ~~~~~~s~~~v~~~~~~~~~~~pw~~~~q~~~~~s~f~i~~-~~lltn~~~v~~~~~~~~v~v~~~gs~~k~~~~v~~~~ 134 (473)
T KOG1320|consen 56 VDLALQSVVKVFSVSTEPSSVLPWQRTRQFSSGGSGFAIYG-KKLLTNAHVVAPNNDHKFVTVKKHGSPRKYKAFVAAVF 134 (473)
T ss_pred ccccccceeEEEeecccccccCcceeeehhcccccchhhcc-cceeecCccccccccccccccccCCCchhhhhhHHHhh
Confidence 345566788887766555443333322 2346799999974 37999999999 555566653 23467889999888
Q ss_pred CCCCEEEEEEcCCC--CCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccC
Q 013014 197 QDKDVAVLRIDAPK--DKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (451)
Q Consensus 197 ~~~DlAlLkv~~~~--~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i 274 (451)
.++|+|++.++... ....|+.++ +-+...+.++++| |....+|.|.|....... +..+......+++++++
T Consensus 135 ~~cd~Avv~Ie~~~f~~~~~~~e~~--~ip~l~~S~~Vv~---gd~i~VTnghV~~~~~~~--y~~~~~~l~~vqi~aa~ 207 (473)
T KOG1320|consen 135 EECDLAVVYIESEEFWKGMNPFELG--DIPSLNGSGFVVG---GDGIIVTNGHVVRVEPRI--YAHSSTVLLRVQIDAAI 207 (473)
T ss_pred hcccceEEEEeeccccCCCcccccC--CCcccCccEEEEc---CCcEEEEeeEEEEEEecc--ccCCCcceeeEEEEEee
Confidence 99999999998643 122234443 3456678899998 777799999998776542 22333345679999999
Q ss_pred CCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhccccc-cccccceeccch---hhhhh--C
Q 013014 275 NPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVT-RPILGIKFAPDQ---SVEQL--G 348 (451)
Q Consensus 275 ~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~-~~~lGi~~~~~~---~~~~~--~ 348 (451)
.+|+||+|.+...+++.|+..-..... ..+++.||.-.+.++.......+... +++++...+... ..+.+ +
T Consensus 208 ~~~~s~ep~i~g~d~~~gvA~l~ik~~---~~i~~~i~~~~~~~~~~G~~~~a~~~~f~~~nt~t~g~vs~~~R~~~~lg 284 (473)
T KOG1320|consen 208 GPGNSGEPVIVGVDKVAGVAFLKIKTP---ENILYVIPLGVSSHFRTGVEVSAIGNGFGLLNTLTQGMVSGQLRKSFKLG 284 (473)
T ss_pred cCCccCCCeEEccccccceEEEEEecC---CcccceeecceeeeecccceeeccccCceeeeeeeecccccccccccccC
Confidence 999999999988789999988776432 26899999999999987766655543 566666555322 12222 2
Q ss_pred c-cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH----H--HHHHHHhcCCCCCEEEEEEEECC
Q 013014 349 V-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG----S--DLYRILDQCKVGDELLLQGIKQP 420 (451)
Q Consensus 349 ~-~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~----~--dl~~~l~~~~~g~~v~l~v~R~g 420 (451)
. .|+.+.++.+.+.|-+. ++ .||+|+.+||..|... . .+...+..+.++|++.+.+.|.+
T Consensus 285 ~~~g~~i~~~~qtd~ai~~-~n-----------sg~~ll~~DG~~IgVn~~~~~ri~~~~~iSf~~p~d~vl~~v~r~~ 351 (473)
T KOG1320|consen 285 LETGVLISKINQTDAAINP-GN-----------SGGPLLNLDGEVIGVNTRKVTRIGFSHGISFKIPIDTVLVIVLRLG 351 (473)
T ss_pred cccceeeeeecccchhhhc-cc-----------CCCcEEEecCcEeeeeeeeeEEeeccccceeccCchHhhhhhhhhh
Confidence 2 57899999887776554 33 3999999999988421 1 23345566789999999999998
No 14
>cd00190 Tryp_SPc Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues.
Probab=99.31 E-value=1.4e-10 Score=109.09 Aligned_cols=169 Identities=21% Similarity=0.263 Sum_probs=101.8
Q ss_pred CeEEEEEEEcCCCeEEecccccCCC--CcEEEEeCC---------CcEEEEEEEEEcC-------CCCEEEEEEcCCC--
Q 013014 151 QGSGSGFVWDSKGHVVTNYHVIRGA--SDIRVTFAD---------QSAYDAKIVGFDQ-------DKDVAVLRIDAPK-- 210 (451)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHVv~~~--~~i~V~~~d---------g~~~~a~vv~~d~-------~~DlAlLkv~~~~-- 210 (451)
...|+|++|+++ +|||+|||+.+. ..+.|.+.. ...+..+-+..++ ..|||||+++.+.
T Consensus 24 ~~~C~GtlIs~~-~VLTaAhC~~~~~~~~~~v~~g~~~~~~~~~~~~~~~v~~~~~hp~y~~~~~~~DiAll~L~~~~~~ 102 (232)
T cd00190 24 RHFCGGSLISPR-WVLTAAHCVYSSAPSNYTVRLGSHDLSSNEGGGQVIKVKKVIVHPNYNPSTYDNDIALLKLKRPVTL 102 (232)
T ss_pred cEEEEEEEeeCC-EEEECHHhcCCCCCccEEEEeCcccccCCCCceEEEEEEEEEECCCCCCCCCcCCEEEEEECCcccC
Confidence 468999999976 999999999875 566666532 2234444445553 5799999998653
Q ss_pred -CCCcceecCCCC-CCCCCcEEEEeeCCCCCCC-----ceEEeEEeeeeeeeccCCCC---CCcccEEEE-----cccCC
Q 013014 211 -DKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDH-----TLTTGVISGLRREISSAATG---RPIQDVIQT-----DAAIN 275 (451)
Q Consensus 211 -~~~~~l~l~~s~-~~~~G~~V~~iG~p~g~~~-----~~~~G~Vs~~~~~~~~~~~~---~~~~~~i~~-----d~~i~ 275 (451)
..+.|+.|.... .+..|+.+.++||...... ......+.-+.......... ......+.. +...+
T Consensus 103 ~~~v~picl~~~~~~~~~~~~~~~~G~g~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 182 (232)
T cd00190 103 SDNVRPICLPSSGYNLPAGTTCTVSGWGRTSEGGPLPDVLQEVNVPIVSNAECKRAYSYGGTITDNMLCAGGLEGGKDAC 182 (232)
T ss_pred CCcccceECCCccccCCCCCEEEEEeCCcCCCCCCCCceeeEEEeeeECHHHhhhhccCcccCCCceEeeCCCCCCCccc
Confidence 236777776543 5778999999998754332 12222222222211110000 001122211 34567
Q ss_pred CCCCCCceeCCC---ceEEEEEeeeeCCCCCccceeEEEeccCchhhHH
Q 013014 276 PGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVD 321 (451)
Q Consensus 276 ~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~ 321 (451)
+|+|||||+... +.++||.+.... ++.....+.+..+....+||+
T Consensus 183 ~gdsGgpl~~~~~~~~~lvGI~s~g~~-c~~~~~~~~~t~v~~~~~WI~ 230 (232)
T cd00190 183 QGDSGGPLVCNDNGRGVLVGIVSWGSG-CARPNYPGVYTRVSSYLDWIQ 230 (232)
T ss_pred cCCCCCcEEEEeCCEEEEEEEEehhhc-cCCCCCCCEEEEcHHhhHHhh
Confidence 899999999754 789999997643 332223344555555555554
No 15
>cd00991 PDZ_archaeal_metalloprotease PDZ domain of archaeal zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.26 E-value=4e-11 Score=95.20 Aligned_cols=68 Identities=25% Similarity=0.406 Sum_probs=63.3
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLR 428 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (451)
.|++|.++.+++||+++||++ ||+|++|||++|.+++|+.+.+....+|+.+.+++.|+|+..+++++
T Consensus 10 ~Gv~V~~V~~~spa~~aGL~~-----------GDiI~~Ing~~v~~~~d~~~~l~~~~~g~~v~l~v~r~g~~~~~~~~ 77 (79)
T cd00991 10 AGVVIVGVIVGSPAENAVLHT-----------GDVIYSINGTPITTLEDFMEALKPTKPGEVITVTVLPSTTKLTNVST 77 (79)
T ss_pred CcEEEEEECCCChHHhcCCCC-----------CCEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCEEEEEEEE
Confidence 699999999999999999999 99999999999999999999998776799999999999998877765
No 16
>cd00987 PDZ_serine_protease PDZ domain of trypsin-like serine proteases, such as DegP/HtrA, which are oligomeric proteins involved in heat-shock response, chaperone function, and apoptosis. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.23 E-value=5e-11 Score=96.41 Aligned_cols=84 Identities=40% Similarity=0.606 Sum_probs=71.0
Q ss_pred ccccceeccchh--hhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcC
Q 013014 332 PILGIKFAPDQS--VEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (451)
Q Consensus 332 ~~lGi~~~~~~~--~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~ 406 (451)
+|+|+.+++... .+.+++ .|++|.++.+++||+++||++ ||+|++|||++|.++.++.+++...
T Consensus 1 ~~~G~~~~~~~~~~~~~~~~~~~~g~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~i~~~~~~~~~l~~~ 69 (90)
T cd00987 1 PWLGVTVQDLTPDLAEELGLKDTKGVLVASVDPGSPAAKAGLKP-----------GDVILAVNGKPVKSVADLRRALAEL 69 (90)
T ss_pred CccceEEeECCHHHHHHcCCCCCCEEEEEEECCCCHHHHcCCCc-----------CCEEEEECCEECCCHHHHHHHHHhc
Confidence 588999887432 222333 599999999999999999999 9999999999999999999999876
Q ss_pred CCCCEEEEEEEECCeEEEEE
Q 013014 407 KVGDELLLQGIKQPPVLSDN 426 (451)
Q Consensus 407 ~~g~~v~l~v~R~g~~~~~~ 426 (451)
..++.+.+++.|+|+..++.
T Consensus 70 ~~~~~i~l~v~r~g~~~~~~ 89 (90)
T cd00987 70 KPGDKVTLTVLRGGKELTVT 89 (90)
T ss_pred CCCCEEEEEEEECCEEEEee
Confidence 67899999999999876654
No 17
>smart00020 Tryp_SPc Trypsin-like serine protease. Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues.
Probab=99.22 E-value=8.4e-10 Score=104.07 Aligned_cols=165 Identities=22% Similarity=0.291 Sum_probs=98.5
Q ss_pred CeEEEEEEEcCCCeEEecccccCCCC--cEEEEeCCC--------cEEEEEEEEEc-------CCCCEEEEEEcCCC---
Q 013014 151 QGSGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFADQ--------SAYDAKIVGFD-------QDKDVAVLRIDAPK--- 210 (451)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHVv~~~~--~i~V~~~dg--------~~~~a~vv~~d-------~~~DlAlLkv~~~~--- 210 (451)
...|+|++|+++ +|||++||+.+.. .+.|.+... ..+.+.-+..+ ..+|+|||+++.+.
T Consensus 25 ~~~C~GtlIs~~-~VLTaahC~~~~~~~~~~v~~g~~~~~~~~~~~~~~v~~~~~~p~~~~~~~~~DiAll~L~~~i~~~ 103 (229)
T smart00020 25 RHFCGGSLISPR-WVLTAAHCVYGSDPSNIRVRLGSHDLSSGEEGQVIKVSKVIIHPNYNPSTYDNDIALLKLKSPVTLS 103 (229)
T ss_pred CcEEEEEEecCC-EEEECHHHcCCCCCcceEEEeCcccCCCCCCceEEeeEEEEECCCCCCCCCcCCEEEEEECcccCCC
Confidence 458999999976 9999999998754 677776432 33444444433 35699999998752
Q ss_pred CCCcceecCCC-CCCCCCcEEEEeeCCCCCC------CceEEeEEeeeeeeeccCCCCC---CcccEEE-----EcccCC
Q 013014 211 DKLRPIPIGVS-ADLLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQ-----TDAAIN 275 (451)
Q Consensus 211 ~~~~~l~l~~s-~~~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~-----~d~~i~ 275 (451)
..+.|+.|... ..+..++.+.+.||+.... .......+.-+........... .....+. .....+
T Consensus 104 ~~~~pi~l~~~~~~~~~~~~~~~~g~g~~~~~~~~~~~~~~~~~~~~~~~~~C~~~~~~~~~~~~~~~C~~~~~~~~~~c 183 (229)
T smart00020 104 DNVRPICLPSSNYNVPAGTTCTVSGWGRTSEGAGSLPDTLQEVNVPIVSNATCRRAYSGGGAITDNMLCAGGLEGGKDAC 183 (229)
T ss_pred CceeeccCCCcccccCCCCEEEEEeCCCCCCCCCcCCCEeeEEEEEEeCHHHhhhhhccccccCCCcEeecCCCCCCccc
Confidence 23567777543 3467789999999886542 1122222222222111100000 0011111 135578
Q ss_pred CCCCCCceeCCCc--eEEEEEeeeeCCCCCccceeEEEeccCch
Q 013014 276 PGNSGGPLLDSSG--SLIGINTAIYSPSGASSGVGFSIPVDTVN 317 (451)
Q Consensus 276 ~G~SGGPlvd~~G--~VVGI~s~~~~~~~~~~~~~~aIP~~~i~ 317 (451)
+|+||||++...+ .++||++... .++.......+..+....
T Consensus 184 ~gdsG~pl~~~~~~~~l~Gi~s~g~-~C~~~~~~~~~~~i~~~~ 226 (229)
T smart00020 184 QGDSGGPLVCNDGRWVLVGIVSWGS-GCARPGKPGVYTRVSSYL 226 (229)
T ss_pred CCCCCCeeEEECCCEEEEEEEEECC-CCCCCCCCCEEEEecccc
Confidence 8999999997553 8999999765 444233444455544433
No 18
>TIGR01713 typeII_sec_gspC general secretion pathway protein C. This model represents GspC, protein C of the main terminal branch of the general secretion pathway, also called type II secretion. This system transports folded proteins across the bacterial outer membrane and is widely distributed in Gram-negative pathogens.
Probab=99.21 E-value=4.8e-11 Score=115.65 Aligned_cols=101 Identities=16% Similarity=0.146 Sum_probs=89.8
Q ss_pred cCchhhHHHhhhccccccccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEc
Q 013014 314 DTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKV 393 (451)
Q Consensus 314 ~~i~~~l~~l~~~g~~~~~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i 393 (451)
..++++++++++++++.+.|+|+...... -...|++|..+.++++++++||++ ||+|++|||+++
T Consensus 159 ~~~~~v~~~l~~~g~~~~~~lgi~p~~~~----g~~~G~~v~~v~~~s~a~~aGLr~-----------GDvIv~ING~~i 223 (259)
T TIGR01713 159 VVSRRIIEELTKDPQKMFDYIRLSPVMKN----DKLEGYRLNPGKDPSLFYKSGLQD-----------GDIAVALNGLDL 223 (259)
T ss_pred hhHHHHHHHHHHCHHhhhheEeEEEEEeC----CceeEEEEEecCCCCHHHHcCCCC-----------CCEEEEECCEEc
Confidence 45788999999999999999999875422 114799999999999999999999 999999999999
Q ss_pred CCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 394 SNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 394 ~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.+++++.+++...+++++++++|+|+|+.+++.+.+
T Consensus 224 ~~~~~~~~~l~~~~~~~~v~l~V~R~G~~~~i~v~~ 259 (259)
T TIGR01713 224 RDPEQAFQALQMLREETNLTLTVERDGQREDIYVRF 259 (259)
T ss_pred CCHHHHHHHHHhcCCCCeEEEEEEECCEEEEEEEEC
Confidence 999999999998888999999999999998888763
No 19
>cd00986 PDZ_LON_protease PDZ domain of ATP-dependent LON serine proteases. Most PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this bacterial subfamily of protease-associated PDZ domains a C-terminal beta-strand is thought to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.19 E-value=1.7e-10 Score=91.46 Aligned_cols=70 Identities=29% Similarity=0.371 Sum_probs=64.4
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.|++|.++.+++||++ ||++ ||+|++|||+++.+++++.+++....+|+.+.+++.|+|+..++++++
T Consensus 8 ~Gv~V~~V~~~s~A~~-gL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~~~~~~v~l~v~r~g~~~~~~v~l 75 (79)
T cd00986 8 HGVYVTSVVEGMPAAG-KLKA-----------GDHIIAVDGKPFKEAEELIDYIQSKKEGDTVKLKVKREEKELPEDLIL 75 (79)
T ss_pred cCEEEEEECCCCchhh-CCCC-----------CCEEEEECCEECCCHHHHHHHHHhCCCCCEEEEEEEECCEEEEEEEEE
Confidence 6899999999999986 7999 999999999999999999999987678999999999999999999987
Q ss_pred ee
Q 013014 430 LW 431 (451)
Q Consensus 430 ~~ 431 (451)
..
T Consensus 76 ~~ 77 (79)
T cd00986 76 KT 77 (79)
T ss_pred ec
Confidence 54
No 20
>cd00990 PDZ_glycyl_aminopeptidase PDZ domain associated with archaeal and bacterial M61 glycyl-aminopeptidases. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand is presumed to form the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.17 E-value=1.5e-10 Score=91.76 Aligned_cols=78 Identities=29% Similarity=0.429 Sum_probs=65.5
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~ 411 (451)
+|+|+.+.... .|+.|.++.++|+|+++||++ ||+|++|||+++.++.++ +...+.++.
T Consensus 1 ~~~G~~~~~~~-------~~~~V~~V~~~s~a~~aGl~~-----------GD~I~~Ing~~v~~~~~~---l~~~~~~~~ 59 (80)
T cd00990 1 PYLGLTLDKEE-------GLGKVTFVRDDSPADKAGLVA-----------GDELVAVNGWRVDALQDR---LKEYQAGDP 59 (80)
T ss_pred CcccEEEEccC-------CcEEEEEECCCChHHHhCCCC-----------CCEEEEECCEEhHHHHHH---HHhcCCCCE
Confidence 57888886532 579999999999999999999 999999999999986654 344357889
Q ss_pred EEEEEEECCeEEEEEEEee
Q 013014 412 LLLQGIKQPPVLSDNLRLL 430 (451)
Q Consensus 412 v~l~v~R~g~~~~~~v~l~ 430 (451)
+.+++.|+|+..++++++.
T Consensus 60 v~l~v~r~g~~~~~~v~~~ 78 (80)
T cd00990 60 VELTVFRDDRLIEVPLTLA 78 (80)
T ss_pred EEEEEEECCEEEEEEEEec
Confidence 9999999999888887753
No 21
>cd00989 PDZ_metalloprotease PDZ domain of bacterial and plant zinc metalloproteases, presumably membrane-associated or integral membrane proteases, which may be involved in signalling and regulatory mechanisms. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=99.07 E-value=8.6e-10 Score=86.97 Aligned_cols=67 Identities=28% Similarity=0.414 Sum_probs=59.7
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLR 428 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (451)
..++|.++.++++|+++||++ ||+|++|||+++.+++++...+... .++.+.+++.|+|+..+++++
T Consensus 12 ~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~l~~~-~~~~~~l~v~r~~~~~~~~l~ 78 (79)
T cd00989 12 IEPVIGEVVPGSPAAKAGLKA-----------GDRILAINGQKIKSWEDLVDAVQEN-PGKPLTLTVERNGETITLTLT 78 (79)
T ss_pred cCcEEEeECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHHC-CCceEEEEEEECCEEEEEEec
Confidence 347899999999999999999 9999999999999999999988774 478999999999987777654
No 22
>cd00988 PDZ_CTP_protease PDZ domain of C-terminal processing-, tail-specific-, and tricorn proteases, which function in posttranslational protein processing, maturation, and disassembly or degradation, in Bacteria, Archaea, and plant chloroplasts. May be responsible for substrate recognition and/or binding, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of protease-associated PDZ domains a C-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in Eumetazoan signaling proteins.
Probab=98.98 E-value=2.8e-09 Score=85.31 Aligned_cols=78 Identities=27% Similarity=0.456 Sum_probs=66.0
Q ss_pred cccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCC
Q 013014 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (451)
Q Consensus 333 ~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~ 410 (451)
-||+.+.... .+++|..+.+++||+++||++ ||+|++|||+++.++ .++.+++.. ..|+
T Consensus 3 ~lG~~~~~~~-------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~vng~~i~~~~~~~~~~~l~~-~~~~ 63 (85)
T cd00988 3 GIGLELKYDD-------GGLVITSVLPGSPAAKAGIKA-----------GDIIVAIDGEPVDGLSLEDVVKLLRG-KAGT 63 (85)
T ss_pred EEEEEEEEcC-------CeEEEEEecCCCCHHHcCCCC-----------CCEEEEECCEEcCCCCHHHHHHHhcC-CCCC
Confidence 3566665422 678999999999999999999 999999999999999 899888866 4689
Q ss_pred EEEEEEEEC-CeEEEEEEEe
Q 013014 411 ELLLQGIKQ-PPVLSDNLRL 429 (451)
Q Consensus 411 ~v~l~v~R~-g~~~~~~v~l 429 (451)
.+.+++.|+ |+..+++++.
T Consensus 64 ~i~l~v~r~~~~~~~~~~~~ 83 (85)
T cd00988 64 KVRLTLKRGDGEPREVTLTR 83 (85)
T ss_pred EEEEEEEcCCCCEEEEEEEE
Confidence 999999999 8887777653
No 23
>COG3591 V8-like Glu-specific endopeptidase [Amino acid transport and metabolism]
Probab=98.98 E-value=2.1e-08 Score=95.70 Aligned_cols=158 Identities=18% Similarity=0.233 Sum_probs=93.5
Q ss_pred EEEEEEEcCCCeEEecccccCCCC----cEEEEe----CCCc-EEE--EEEEEEcC----CCCEEEEEEcCCCCC--C--
Q 013014 153 SGSGFVWDSKGHVVTNYHVIRGAS----DIRVTF----ADQS-AYD--AKIVGFDQ----DKDVAVLRIDAPKDK--L-- 213 (451)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHVv~~~~----~i~V~~----~dg~-~~~--a~vv~~d~----~~DlAlLkv~~~~~~--~-- 213 (451)
.+++|+|+++ .+||++||+.... ++.+.. .++. .+. .......+ +.|.+...+..-... .
T Consensus 65 ~~~~~lI~pn-tvLTa~Hc~~s~~~G~~~~~~~p~g~~~~~~~~~~~~~~~~~~~~g~~~~~d~~~~~v~~~~~~~g~~~ 143 (251)
T COG3591 65 CTAATLIGPN-TVLTAGHCIYSPDYGEDDIAAAPPGVNSDGGPFYGITKIEIRVYPGELYKEDGASYDVGEAALESGINI 143 (251)
T ss_pred eeeEEEEcCc-eEEEeeeEEecCCCChhhhhhcCCcccCCCCCCCceeeEEEEecCCceeccCCceeeccHHHhccCCCc
Confidence 4466999987 9999999996433 122211 1221 121 11222222 346666666421111 1
Q ss_pred ----cceecCCCCCCCCCcEEEEeeCCCCCCCce----EEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeC
Q 013014 214 ----RPIPIGVSADLLVGQKVYAIGNPFGLDHTL----TTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLD 285 (451)
Q Consensus 214 ----~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~----~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd 285 (451)
....+......+.++.+-++|||.+..... .++.|.... ...+++++.+.+|+||+|+++
T Consensus 144 ~~~~~~~~~~~~~~~~~~d~i~v~GYP~dk~~~~~~~e~t~~v~~~~------------~~~l~y~~dT~pG~SGSpv~~ 211 (251)
T COG3591 144 GDVVNYLKRNTASEAKANDRITVIGYPGDKPNIGTMWESTGKVNSIK------------GNKLFYDADTLPGSSGSPVLI 211 (251)
T ss_pred cccccccccccccccccCceeEEEeccCCCCcceeEeeecceeEEEe------------cceEEEEecccCCCCCCceEe
Confidence 122223334577899999999997765322 233332211 246889999999999999999
Q ss_pred CCceEEEEEeeeeCCCCCccceeEE-EeccCchhhHHHhh
Q 013014 286 SSGSLIGINTAIYSPSGASSGVGFS-IPVDTVNGIVDQLV 324 (451)
Q Consensus 286 ~~G~VVGI~s~~~~~~~~~~~~~~a-IP~~~i~~~l~~l~ 324 (451)
.+.+|||+++......++ ...+++ .-...++++++++.
T Consensus 212 ~~~~vigv~~~g~~~~~~-~~~n~~vr~t~~~~~~I~~~~ 250 (251)
T COG3591 212 SKDEVIGVHYNGPGANGG-SLANNAVRLTPEILNFIQQNI 250 (251)
T ss_pred cCceEEEEEecCCCcccc-cccCcceEecHHHHHHHHHhh
Confidence 988999999987654332 233333 34466677776654
No 24
>TIGR02037 degP_htrA_DO periplasmic serine protease, Do/DegQ family. This family consists of a set of proteins variously designated DegP, heat shock protein HtrA, and protease DO. The ortholog in Pseudomonas aeruginosa is designated MucD and is found in an operon that controls mucoid phenotype. This family also includes the DegQ (HhoA) paralog in E. coli which can rescue a DegP mutant, but not the smaller DegS paralog, which cannot. Members of this family are located in the periplasm and have separable functions as both protease and chaperone. Members have a trypsin domain and two copies of a PDZ domain. This protein protects bacteria from thermal and other stresses and may be important for the survival of bacterial pathogens. The chaperone function is dominant at low temperatures, whereas the proteolytic activity is turned on at elevated temperatures.
Probab=98.83 E-value=1.2e-08 Score=106.37 Aligned_cols=85 Identities=32% Similarity=0.484 Sum_probs=73.5
Q ss_pred cccccceeccch--hhhhhCc----cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHh
Q 013014 331 RPILGIKFAPDQ--SVEQLGV----SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (451)
Q Consensus 331 ~~~lGi~~~~~~--~~~~~~~----~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~ 404 (451)
+.|+|+.+.+.. ..+.+++ .|++|.+|.++|||+++||++ ||+|++|||++|.+++|+.+++.
T Consensus 337 ~~~lGi~~~~l~~~~~~~~~l~~~~~Gv~V~~V~~~SpA~~aGL~~-----------GDvI~~Ing~~V~s~~d~~~~l~ 405 (428)
T TIGR02037 337 NPFLGLTVANLSPEIRKELRLKGDVKGVVVTKVVSGSPAARAGLQP-----------GDVILSVNQQPVSSVAELRKVLD 405 (428)
T ss_pred ccccceEEecCCHHHHHHcCCCcCcCceEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHH
Confidence 568999888632 3344554 599999999999999999999 99999999999999999999998
Q ss_pred cCCCCCEEEEEEEECCeEEEEE
Q 013014 405 QCKVGDELLLQGIKQPPVLSDN 426 (451)
Q Consensus 405 ~~~~g~~v~l~v~R~g~~~~~~ 426 (451)
..+.|+.++++|+|+|+...+.
T Consensus 406 ~~~~g~~v~l~v~R~g~~~~~~ 427 (428)
T TIGR02037 406 RAKKGGRVALLILRGGATIFVT 427 (428)
T ss_pred hcCCCCEEEEEEEECCEEEEEE
Confidence 8778999999999999987664
No 25
>cd00136 PDZ PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(pos
Probab=98.75 E-value=2.3e-08 Score=76.84 Aligned_cols=67 Identities=37% Similarity=0.586 Sum_probs=56.9
Q ss_pred cccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCC
Q 013014 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (451)
Q Consensus 333 ~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~ 410 (451)
++|+.+..... .|++|.++.+++||+++||++ ||+|++|||+++.++ +++.+.+... .|+
T Consensus 2 ~~G~~~~~~~~------~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~~-~g~ 63 (70)
T cd00136 2 GLGFSIRGGTE------GGVVVLSVEPGSPAERAGLQA-----------GDVILAVNGTDVKNLTLEDVAELLKKE-VGE 63 (70)
T ss_pred CccEEEecCCC------CCEEEEEeCCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhhC-CCC
Confidence 56777665331 489999999999999999999 999999999999999 8999999875 488
Q ss_pred EEEEEEE
Q 013014 411 ELLLQGI 417 (451)
Q Consensus 411 ~v~l~v~ 417 (451)
.++++++
T Consensus 64 ~v~l~v~ 70 (70)
T cd00136 64 KVTLTVR 70 (70)
T ss_pred eEEEEEC
Confidence 8888763
No 26
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.75 E-value=1.5e-08 Score=106.32 Aligned_cols=67 Identities=15% Similarity=0.090 Sum_probs=62.8
Q ss_pred EEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEee
Q 013014 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRLL 430 (451)
Q Consensus 353 ~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~ 430 (451)
+|.+|.++|||++|||++ ||+|++|||++|.+++|+...+....+|+++++++.|+|+++++++++.
T Consensus 129 lV~~V~~~SpA~kAGLk~-----------GDvI~~vnG~~V~~~~~l~~~v~~~~~g~~v~v~v~R~gk~~~~~v~l~ 195 (449)
T PRK10779 129 VVGEIAPNSIAAQAQIAP-----------GTELKAVDGIETPDWDAVRLALVSKIGDESTTITVAPFGSDQRRDKTLD 195 (449)
T ss_pred cccccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhccCCceEEEEEeCCccceEEEEec
Confidence 689999999999999999 9999999999999999999999888888999999999999888888774
No 27
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of peptidase family S2C. The region of match appears not to overlap the active site domain.
Probab=98.63 E-value=1e-07 Score=99.12 Aligned_cols=69 Identities=30% Similarity=0.423 Sum_probs=63.8
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.|++|.+|.++|||+++||++ ||+|++|||++|.+++|+.+.+.. .+++++.++++|+|+..++++++
T Consensus 203 ~g~vV~~V~~~SpA~~aGL~~-----------GD~Iv~Vng~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~ 270 (420)
T TIGR00054 203 IEPVLSDVTPNSPAEKAGLKE-----------GDYIQSINGEKLRSWTDFVSAVKE-NPGKSMDIKVERNGETLSISLTP 270 (420)
T ss_pred cCcEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHh-CCCCceEEEEEECCEEEEEEEEE
Confidence 378999999999999999999 999999999999999999999987 47888999999999999888887
Q ss_pred e
Q 013014 430 L 430 (451)
Q Consensus 430 ~ 430 (451)
.
T Consensus 271 ~ 271 (420)
T TIGR00054 271 E 271 (420)
T ss_pred c
Confidence 4
No 28
>PRK10779 zinc metallopeptidase RseP; Provisional
Probab=98.62 E-value=1.6e-07 Score=98.55 Aligned_cols=70 Identities=27% Similarity=0.342 Sum_probs=63.9
Q ss_pred ceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEee
Q 013014 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRLL 430 (451)
Q Consensus 351 Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l~ 430 (451)
+++|.+|.++|||+++||++ ||+|++|||++|.+++|+.+.+.. .+|+.+.+++.|+|+..++++++.
T Consensus 222 ~~vV~~V~~~SpA~~AGL~~-----------GDvIl~Ing~~V~s~~dl~~~l~~-~~~~~v~l~v~R~g~~~~~~v~~~ 289 (449)
T PRK10779 222 EPVLAEVQPNSAASKAGLQA-----------GDRIVKVDGQPLTQWQTFVTLVRD-NPGKPLALEIERQGSPLSLTLTPD 289 (449)
T ss_pred CcEEEeeCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-CCCCEEEEEEEECCEEEEEEEEee
Confidence 57899999999999999999 999999999999999999999877 578899999999999999988875
Q ss_pred ec
Q 013014 431 WS 432 (451)
Q Consensus 431 ~~ 432 (451)
..
T Consensus 290 ~~ 291 (449)
T PRK10779 290 SK 291 (449)
T ss_pred ee
Confidence 33
No 29
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=98.51 E-value=4.7e-07 Score=92.14 Aligned_cols=69 Identities=28% Similarity=0.488 Sum_probs=60.1
Q ss_pred cceEEEecC--------CCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCe
Q 013014 350 SGVLVLDAP--------PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPP 421 (451)
Q Consensus 350 ~Gv~V~~v~--------~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~ 421 (451)
+||+|.... .++||+++||++ ||+|++|||++|.+++|+.+++.... ++.+.++|.|+|+
T Consensus 105 ~GVlVvg~~~v~~~~g~~~SPAa~AGLq~-----------GDiIvsING~~V~s~~DL~~iL~~~~-g~~V~LtV~R~Ge 172 (402)
T TIGR02860 105 KGVLVVGFSDIETEKGKIHSPGEEAGIQI-----------GDRILKINGEKIKNMDDLANLINKAG-GEKLTLTIERGGK 172 (402)
T ss_pred CEEEEEEEEcccccCCCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHHHhCC-CCeEEEEEEECCE
Confidence 688875542 368999999999 99999999999999999999998754 8999999999999
Q ss_pred EEEEEEEee
Q 013014 422 VLSDNLRLL 430 (451)
Q Consensus 422 ~~~~~v~l~ 430 (451)
..++++++.
T Consensus 173 ~~tv~V~Pv 181 (402)
T TIGR02860 173 IIETVIKPV 181 (402)
T ss_pred EEEEEEEEe
Confidence 988888754
No 30
>TIGR00225 prc C-terminal peptidase (prc). A C-terminal peptidase with different substrates in different species, including processing of D1 protein of the photosystem II reaction center in higher plants and cleavage of a peptide of 11 residues from the precursor form of penicillin-binding protein in E. coli. E. coli and H. influenzae have the most distal branch of the tree, and their proteins have an N-terminal 200 amino acids that show no homology to other proteins in the database.
Probab=98.51 E-value=3.5e-07 Score=92.39 Aligned_cols=82 Identities=24% Similarity=0.368 Sum_probs=65.8
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCC
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g 409 (451)
..+|+.+.... .+++|.+|.++|||+++||++ ||+|++|||++|.++ .++...+.. ..|
T Consensus 51 ~~lG~~~~~~~-------~~~~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~~~l~~-~~g 111 (334)
T TIGR00225 51 EGIGIQVGMDD-------GEIVIVSPFEGSPAEKAGIKP-----------GDKIIKINGKSVAGMSLDDAVALIRG-KKG 111 (334)
T ss_pred EEEEEEEEEEC-------CEEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHhccC-CCC
Confidence 45666664322 478999999999999999999 999999999999987 466666654 568
Q ss_pred CEEEEEEEECCeEEEEEEEeeec
Q 013014 410 DELLLQGIKQPPVLSDNLRLLWS 432 (451)
Q Consensus 410 ~~v~l~v~R~g~~~~~~v~l~~~ 432 (451)
+++.+++.|+|+..++++++...
T Consensus 112 ~~v~l~v~R~g~~~~~~v~l~~~ 134 (334)
T TIGR00225 112 TKVSLEILRAGKSKPLTFTLKRD 134 (334)
T ss_pred CEEEEEEEeCCCCceEEEEEEEE
Confidence 99999999999877777776543
No 31
>smart00228 PDZ Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. Different PDZs possess different binding specificities.
Probab=98.49 E-value=3.4e-07 Score=72.62 Aligned_cols=74 Identities=31% Similarity=0.403 Sum_probs=57.3
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~ 411 (451)
..+|+.+....... .|++|..+.++++|+++||++ ||+|++|||+++.++.+..........++.
T Consensus 12 ~~~G~~~~~~~~~~----~~~~i~~v~~~s~a~~~gl~~-----------GD~I~~In~~~v~~~~~~~~~~~~~~~~~~ 76 (85)
T smart00228 12 GGLGFSLVGGKDEG----GGVVVSSVVPGSPAAKAGLKV-----------GDVILEVNGTSVEGLTHLEAVDLLKKAGGK 76 (85)
T ss_pred CcccEEEECCCCCC----CCEEEEEECCCCHHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHHhCCCe
Confidence 57777776532111 689999999999999999999 999999999999988665544333334679
Q ss_pred EEEEEEECC
Q 013014 412 LLLQGIKQP 420 (451)
Q Consensus 412 v~l~v~R~g 420 (451)
+.+++.|++
T Consensus 77 ~~l~i~r~~ 85 (85)
T smart00228 77 VTLTVLRGG 85 (85)
T ss_pred EEEEEEeCC
Confidence 999999875
No 32
>PRK10139 serine endoprotease; Provisional
Probab=98.45 E-value=5.4e-07 Score=94.58 Aligned_cols=65 Identities=23% Similarity=0.329 Sum_probs=59.6
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNL 427 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (451)
.|++|.++.+++||+++||++ ||+|++|||++|.+|+|+.+++.+. . +++.++|+|+|+...+.+
T Consensus 390 ~Gv~V~~V~~~spA~~aGL~~-----------GD~I~~Ing~~v~~~~~~~~~l~~~-~-~~v~l~v~R~g~~~~~~~ 454 (455)
T PRK10139 390 KGIKIDEVVKGSPAAQAGLQK-----------DDVIIGVNRDRVNSIAEMRKVLAAK-P-AIIALQIVRGNESIYLLL 454 (455)
T ss_pred CceEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhC-C-CeEEEEEEECCEEEEEEe
Confidence 589999999999999999999 9999999999999999999999863 3 789999999999877765
No 33
>cd00992 PDZ_signaling PDZ domain found in a variety of Eumetazoan signaling molecules, often in tandem arrangements. May be responsible for specific protein-protein interactions, as most PDZ domains bind C-terminal polypeptides, and binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. In this subfamily of PDZ domains an N-terminal beta-strand forms the peptide-binding groove base, a circular permutation with respect to PDZ domains found in proteases.
Probab=98.32 E-value=1.9e-06 Score=68.07 Aligned_cols=69 Identities=32% Similarity=0.490 Sum_probs=54.6
Q ss_pred cccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcC--CHHHHHHHHhcCCC
Q 013014 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS--NGSDLYRILDQCKV 408 (451)
Q Consensus 331 ~~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~--~~~dl~~~l~~~~~ 408 (451)
...+|+.+...... ..|++|.++.+++||+++||++ ||+|++|||+++. ++.++.+.+...
T Consensus 11 ~~~~G~~~~~~~~~----~~~~~V~~v~~~s~a~~~gl~~-----------GD~I~~ing~~i~~~~~~~~~~~l~~~-- 73 (82)
T cd00992 11 GGGLGFSLRGGKDS----GGGIFVSRVEPGGPAERGGLRV-----------GDRILEVNGVSVEGLTHEEAVELLKNS-- 73 (82)
T ss_pred CCCcCEEEeCcccC----CCCeEEEEECCCChHHhCCCCC-----------CCEEEEECCEEcCccCHHHHHHHHHhC--
Confidence 35677777653211 3689999999999999999999 9999999999999 889999888763
Q ss_pred CCEEEEEE
Q 013014 409 GDELLLQG 416 (451)
Q Consensus 409 g~~v~l~v 416 (451)
+..+.+++
T Consensus 74 ~~~v~l~v 81 (82)
T cd00992 74 GDEVTLTV 81 (82)
T ss_pred CCeEEEEE
Confidence 23666654
No 34
>PF00863 Peptidase_C4: Peptidase family C4 This family belongs to family C4 of the peptidase classification.; InterPro: IPR001730 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad. Nuclear inclusion A (NIA) proteases from potyviruses are cysteine peptidases belonging to the MEROPS peptidase family C4 (NIa protease family, clan PA(C)). Potyviruses include plant viruses in which the single-stranded RNA encodes a polyprotein with NIA protease activity, where proteolytic cleavage is specific for Gln+Gly sites. The NIA protease acts on the polyprotein, releasing itself by Gln+Gly cleavage at both the N- and C-termini. It further processes the polyprotein by cleavage at five similar sites in the C-terminal half of the sequence. In addition to its C-terminal protease activity, the NIA protease contains an N-terminal domain that has been implicated in the transcription process. This peptidase is present in the nuclear inclusion protein of potyviruses.; GO: 0008234 cysteine-type peptidase activity, 0006508 proteolysis; PDB: 3MMG_B 1Q31_B 1LVB_A 1LVM_A.
Probab=98.32 E-value=6.5e-05 Score=71.35 Aligned_cols=162 Identities=16% Similarity=0.266 Sum_probs=83.8
Q ss_pred HhCCceEEEEeeeeccCccccccccccCeEEEEEEEcCCCeEEecccccCC-CCcEEEEeCCCcEEEEE-----EEEEcC
Q 013014 124 ENTPSVVNITNLAARQDAFTLDVLEVPQGSGSGFVWDSKGHVVTNYHVIRG-ASDIRVTFADQSAYDAK-----IVGFDQ 197 (451)
Q Consensus 124 ~~~~SVV~I~~~~~~~~~~~~~~~~~~~~~GSGfiI~~~G~ILT~aHVv~~-~~~i~V~~~dg~~~~a~-----vv~~d~ 197 (451)
-+...|++|...+... ...=-|+... .+|+|++|..+. ...++|...-|.- ... -+..-+
T Consensus 15 ~Ia~~ic~l~n~s~~~-----------~~~l~gigyG--~~iItn~HLf~~nng~L~i~s~hG~f-~v~nt~~lkv~~i~ 80 (235)
T PF00863_consen 15 PIASNICRLTNESDGG-----------TRSLYGIGYG--SYIITNAHLFKRNNGELTIKSQHGEF-TVPNTTQLKVHPIE 80 (235)
T ss_dssp HHHTTEEEEEEEETTE-----------EEEEEEEEET--TEEEEEGGGGSSTTCEEEEEETTEEE-EECEGGGSEEEE-T
T ss_pred hhhheEEEEEEEeCCC-----------eEEEEEEeEC--CEEEEChhhhccCCCeEEEEeCceEE-EcCCccccceEEeC
Confidence 3455678887533211 1223466665 389999999964 4567777766632 221 223346
Q ss_pred CCCEEEEEEcCCCCCCcceecC-CCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCC
Q 013014 198 DKDVAVLRIDAPKDKLRPIPIG-VSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINP 276 (451)
Q Consensus 198 ~~DlAlLkv~~~~~~~~~l~l~-~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~ 276 (451)
..||.++|+.. ++||.+-. .-..++.+|.|+++|.-+..... ...|+......... ...+-..-.....
T Consensus 81 ~~DiviirmPk---DfpPf~~kl~FR~P~~~e~v~mVg~~fq~k~~--~s~vSesS~i~p~~-----~~~fWkHwIsTk~ 150 (235)
T PF00863_consen 81 GRDIVIIRMPK---DFPPFPQKLKFRAPKEGERVCMVGSNFQEKSI--SSTVSESSWIYPEE-----NSHFWKHWISTKD 150 (235)
T ss_dssp CSSEEEEE--T---TS----S---B----TT-EEEEEEEECSSCCC--EEEEEEEEEEEEET-----TTTEEEE-C---T
T ss_pred CccEEEEeCCc---ccCCcchhhhccCCCCCCEEEEEEEEEEcCCe--eEEECCceEEeecC-----CCCeeEEEecCCC
Confidence 88999999964 46666432 23458899999999975543332 11222221111111 1244555666678
Q ss_pred CCCCCceeC-CCceEEEEEeeeeCCCCCccceeEEEec
Q 013014 277 GNSGGPLLD-SSGSLIGINTAIYSPSGASSGVGFSIPV 313 (451)
Q Consensus 277 G~SGGPlvd-~~G~VVGI~s~~~~~~~~~~~~~~aIP~ 313 (451)
|+=|.|+++ .+|++|||++..... ...+|+.|+
T Consensus 151 G~CG~PlVs~~Dg~IVGiHsl~~~~----~~~N~F~~f 184 (235)
T PF00863_consen 151 GDCGLPLVSTKDGKIVGIHSLTSNT----SSRNYFTPF 184 (235)
T ss_dssp T-TT-EEEETTT--EEEEEEEEETT----TSSEEEEE-
T ss_pred CccCCcEEEcCCCcEEEEEcCccCC----CCeEEEEcC
Confidence 999999998 599999999976543 345677766
No 35
>PRK10942 serine endoprotease; Provisional
Probab=98.32 E-value=1.7e-06 Score=91.24 Aligned_cols=65 Identities=32% Similarity=0.434 Sum_probs=59.3
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNL 427 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (451)
.|++|.+|.++++|+++||++ ||+|++|||++|.+++|+.+++.. + ++.+.++|.|+|+.+.+.+
T Consensus 408 ~gvvV~~V~~~S~A~~aGL~~-----------GDvIv~VNg~~V~s~~dl~~~l~~-~-~~~v~l~V~R~g~~~~v~~ 472 (473)
T PRK10942 408 KGVVVDNVKPGTPAAQIGLKK-----------GDVIIGANQQPVKNIAELRKILDS-K-PSVLALNIQRGDSSIYLLM 472 (473)
T ss_pred CCeEEEEeCCCChHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHh-C-CCeEEEEEEECCEEEEEEe
Confidence 489999999999999999999 999999999999999999999987 3 3789999999998877664
No 36
>PLN00049 carboxyl-terminal processing protease; Provisional
Probab=98.32 E-value=2.5e-06 Score=88.00 Aligned_cols=69 Identities=25% Similarity=0.395 Sum_probs=59.0
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDELLLQGIKQPPVLSDNL 427 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (451)
.|++|..|.++|||+++||++ ||+|++|||++|.++ .++...+.. ..|+.+.++|.|+|+..++++
T Consensus 102 ~g~~V~~V~~~SPA~~aGl~~-----------GD~Iv~InG~~v~~~~~~~~~~~l~g-~~g~~v~ltv~r~g~~~~~~l 169 (389)
T PLN00049 102 AGLVVVAPAPGGPAARAGIRP-----------GDVILAIDGTSTEGLSLYEAADRLQG-PEGSSVELTLRRGPETRLVTL 169 (389)
T ss_pred CcEEEEEeCCCChHHHcCCCC-----------CCEEEEECCEECCCCCHHHHHHHHhc-CCCCEEEEEEEECCEEEEEEE
Confidence 489999999999999999999 999999999999865 566666654 578999999999998877777
Q ss_pred Eee
Q 013014 428 RLL 430 (451)
Q Consensus 428 ~l~ 430 (451)
+-.
T Consensus 170 ~r~ 172 (389)
T PLN00049 170 TRE 172 (389)
T ss_pred Eee
Confidence 543
No 37
>PF00595 PDZ: PDZ domain (Also known as DHR or GLGF) Coordinates are not yet available; InterPro: IPR001478 PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 micromolar. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.; GO: 0005515 protein binding; PDB: 3AXA_A 1WF8_A 1QAV_B 1QAU_A 1B8Q_A 1MC7_A 2KAW_A 1I16_A 1VB7_A 1WI4_A ....
Probab=98.28 E-value=8.5e-07 Score=70.35 Aligned_cols=71 Identities=27% Similarity=0.429 Sum_probs=55.1
Q ss_pred cccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCC
Q 013014 331 RPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKV 408 (451)
Q Consensus 331 ~~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~ 408 (451)
...+|+.+....... ..|++|.++.++++|+++||++ ||.|++|||+.+.++ .++.+++...
T Consensus 9 ~~~lG~~l~~~~~~~---~~~~~V~~v~~~~~a~~~gl~~-----------GD~Il~INg~~v~~~~~~~~~~~l~~~-- 72 (81)
T PF00595_consen 9 NGPLGFTLRGGSDND---EKGVFVSSVVPGSPAERAGLKV-----------GDRILEINGQSVRGMSHDEVVQLLKSA-- 72 (81)
T ss_dssp TSBSSEEEEEESTSS---SEEEEEEEECTTSHHHHHTSST-----------TEEEEEETTEESTTSBHHHHHHHHHHS--
T ss_pred CCCcCEEEEecCCCC---cCCEEEEEEeCCChHHhcccch-----------hhhhheeCCEeCCCCCHHHHHHHHHCC--
Confidence 457788877533111 3589999999999999999999 999999999999987 4566666653
Q ss_pred CCEEEEEEE
Q 013014 409 GDELLLQGI 417 (451)
Q Consensus 409 g~~v~l~v~ 417 (451)
+++++|+|+
T Consensus 73 ~~~v~L~V~ 81 (81)
T PF00595_consen 73 SNPVTLTVQ 81 (81)
T ss_dssp TSEEEEEEE
T ss_pred CCcEEEEEC
Confidence 348888874
No 38
>PF14685 Tricorn_PDZ: Tricorn protease PDZ domain; PDB: 1N6F_D 1N6D_C 1N6E_C 1K32_A.
Probab=98.27 E-value=4.7e-06 Score=67.18 Aligned_cols=78 Identities=28% Similarity=0.516 Sum_probs=49.7
Q ss_pred cccceeccchhhhhhCccceEEEecCCC--------CcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHh
Q 013014 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPN--------GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILD 404 (451)
Q Consensus 333 ~lGi~~~~~~~~~~~~~~Gv~V~~v~~~--------s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~ 404 (451)
.||..+.... .+..|.++.++ ||-.+.|+. +++||+|++|||+++....++..+|.
T Consensus 2 ~LGAd~~~~~-------~~y~I~~I~~gd~~~~~~~sPL~~pGv~---------v~~GD~I~aInG~~v~~~~~~~~lL~ 65 (88)
T PF14685_consen 2 LLGADFSYDN-------GGYRIARIYPGDPWNPNARSPLAQPGVD---------VREGDYILAINGQPVTADANPYRLLE 65 (88)
T ss_dssp B-SEEEEEET-------TEEEEEEE-BS-TTSSS-B-GGGGGS-------------TT-EEEEETTEE-BTTB-HHHHHH
T ss_pred ccceEEEEcC-------CEEEEEEEeCCCCCCccccCCccCCCCC---------CCCCCEEEEECCEECCCCCCHHHHhc
Confidence 5666665432 45667777764 555555544 46799999999999999999999988
Q ss_pred cCCCCCEEEEEEEECC-eEEEEEE
Q 013014 405 QCKVGDELLLQGIKQP-PVLSDNL 427 (451)
Q Consensus 405 ~~~~g~~v~l~v~R~g-~~~~~~v 427 (451)
. +.|+.|.|+|.+.+ +.+++.|
T Consensus 66 ~-~agk~V~Ltv~~~~~~~R~v~V 88 (88)
T PF14685_consen 66 G-KAGKQVLLTVNRKPGGARTVVV 88 (88)
T ss_dssp T-TTTSEEEEEEE-STT-EEEEEE
T ss_pred c-cCCCEEEEEEecCCCCceEEEC
Confidence 6 57999999999976 4555543
No 39
>TIGR03279 cyano_FeS_chp putative FeS-containing Cyanobacterial-specific oxidoreductase. Members of this protein family are predicted FeS-containing oxidoreductases of unknown function, apparently restricted to and universal across the Cyanobacteria. The high trusted cutoff score for this model, 700 bits, excludes homologs from other lineages. This exclusion seems justified because a significant number of sequence positions are simultaneously unique to and invariant across the Cyanobacteria, suggesting a specialized, conserved function, perhaps related to photosynthesis. A distantly related protein family, TIGR03278, is universal in and restricted to archaeal methanogens, and may be linked to methanogenesis.
Probab=98.22 E-value=2.6e-06 Score=87.40 Aligned_cols=63 Identities=21% Similarity=0.215 Sum_probs=54.5
Q ss_pred EEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEE-ECCeEEEEEEEee
Q 013014 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGI-KQPPVLSDNLRLL 430 (451)
Q Consensus 353 ~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~-R~g~~~~~~v~l~ 430 (451)
+|.+|.++|+|+++||++ ||+|++|||++|.+|.|+...+. ++.+.++|. |+|+..++++...
T Consensus 1 ~I~~V~pgSpAe~AGLe~-----------GD~IlsING~~V~Dw~D~~~~l~----~e~l~L~V~~rdGe~~~l~Ie~~ 64 (433)
T TIGR03279 1 LISAVLPGSIAEELGFEP-----------GDALVSINGVAPRDLIDYQFLCA----DEELELEVLDANGESHQIEIEKD 64 (433)
T ss_pred CcCCcCCCCHHHHcCCCC-----------CCEEEEECCEECCCHHHHHHHhc----CCcEEEEEEcCCCeEEEEEEecC
Confidence 367889999999999999 99999999999999999887774 467899997 8998888877653
No 40
>TIGR00054 RIP metalloprotease RseP. A model that also detects fragments matches a number of members of the PEPTIDASE FAMILY S2C. The region of match appears not to overlap the active site domain.
Probab=98.20 E-value=2e-06 Score=89.51 Aligned_cols=67 Identities=22% Similarity=0.225 Sum_probs=59.6
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.|++|.+|.++|||+++||++ ||+|++|||+++.++.|+.+.+.... +++.+++.|+++..++++++
T Consensus 128 ~g~~V~~V~~~SpA~~AGL~~-----------GDvI~~vng~~v~~~~dl~~~ia~~~--~~v~~~I~r~g~~~~l~v~l 194 (420)
T TIGR00054 128 VGPVIELLDKNSIALEAGIEP-----------GDEILSVNGNKIPGFKDVRQQIADIA--GEPMVEILAERENWTFEVMK 194 (420)
T ss_pred CCceeeccCCCCHHHHcCCCC-----------CCEEEEECCEEcCCHHHHHHHHHhhc--ccceEEEEEecCceEecccc
Confidence 588999999999999999999 99999999999999999999888755 68899999998887755443
No 41
>KOG3627 consensus Trypsin [Amino acid transport and metabolism]
Probab=98.20 E-value=6.4e-05 Score=72.47 Aligned_cols=168 Identities=23% Similarity=0.280 Sum_probs=92.3
Q ss_pred EEEEEEEcCCCeEEecccccCCCC--cEEEEeCC---------C---cEEEE-EEEEEcC-------C-CCEEEEEEcCC
Q 013014 153 SGSGFVWDSKGHVVTNYHVIRGAS--DIRVTFAD---------Q---SAYDA-KIVGFDQ-------D-KDVAVLRIDAP 209 (451)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHVv~~~~--~i~V~~~d---------g---~~~~a-~vv~~d~-------~-~DlAlLkv~~~ 209 (451)
.+.|.+|+++ +|+|++||+.+.. .+.|.+.. + ..... +++ .|+ . +|||+|+++.+
T Consensus 39 ~Cggsli~~~-~vltaaHC~~~~~~~~~~V~~G~~~~~~~~~~~~~~~~~~v~~~i-~H~~y~~~~~~~nDiall~l~~~ 116 (256)
T KOG3627|consen 39 LCGGSLISPR-WVLTAAHCVKGASASLYTVRLGEHDINLSVSEGEEQLVGDVEKII-VHPNYNPRTLENNDIALLRLSEP 116 (256)
T ss_pred eeeeEEeeCC-EEEEChhhCCCCCCcceEEEECccccccccccCchhhhceeeEEE-ECCCCCCCCCCCCCEEEEEECCC
Confidence 6778788655 9999999999876 66666531 1 11111 232 332 3 79999999864
Q ss_pred C---CCCcceecCCCCC---CCCCcEEEEeeCCCCCC------CceEEeEEeeeeeeeccCCCCC---CcccEEEEc---
Q 013014 210 K---DKLRPIPIGVSAD---LLVGQKVYAIGNPFGLD------HTLTTGVISGLRREISSAATGR---PIQDVIQTD--- 271 (451)
Q Consensus 210 ~---~~~~~l~l~~s~~---~~~G~~V~~iG~p~g~~------~~~~~G~Vs~~~~~~~~~~~~~---~~~~~i~~d--- 271 (451)
. ..+.|+.|..... ...+..+++.||+.... .......+.-+........... .....+...
T Consensus 117 v~~~~~i~piclp~~~~~~~~~~~~~~~v~GWG~~~~~~~~~~~~L~~~~v~i~~~~~C~~~~~~~~~~~~~~~Ca~~~~ 196 (256)
T KOG3627|consen 117 VTFSSHIQPICLPSSADPYFPPGGTTCLVSGWGRTESGGGPLPDTLQEVDVPIISNSECRRAYGGLGTITDTMLCAGGPE 196 (256)
T ss_pred cccCCcccccCCCCCcccCCCCCCCEEEEEeCCCcCCCCCCCCceeEEEEEeEcChhHhcccccCccccCCCEEeeCccC
Confidence 3 3456666642322 34458888899754321 1222222222222111111110 001223322
Q ss_pred --ccCCCCCCCCceeCCC---ceEEEEEeeeeCCCCCccceeEEEeccCchhhHHH
Q 013014 272 --AAINPGNSGGPLLDSS---GSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQ 322 (451)
Q Consensus 272 --~~i~~G~SGGPlvd~~---G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~ 322 (451)
...+.|+|||||+-.+ ..++||++++...++....-+....+....+++++
T Consensus 197 ~~~~~C~GDSGGPLv~~~~~~~~~~GivS~G~~~C~~~~~P~vyt~V~~y~~WI~~ 252 (256)
T KOG3627|consen 197 GGKDACQGDSGGPLVCEDNGRWVLVGIVSWGSGGCGQPNYPGVYTRVSSYLDWIKE 252 (256)
T ss_pred CCCccccCCCCCeEEEeeCCcEEEEEEEEecCCCCCCCCCCeEEeEhHHhHHHHHH
Confidence 2357899999999764 69999999876544432122334445555555544
No 42
>COG0793 Prc Periplasmic protease [Cell envelope biogenesis, outer membrane]
Probab=98.15 E-value=7.7e-06 Score=84.51 Aligned_cols=81 Identities=26% Similarity=0.413 Sum_probs=64.3
Q ss_pred ccccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCC
Q 013014 330 TRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCK 407 (451)
Q Consensus 330 ~~~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~ 407 (451)
....+|+++..... .++.|.++.+++||+++|+++ ||+|++|||+++.... +..+.+.. +
T Consensus 98 ~~~GiG~~i~~~~~------~~~~V~s~~~~~PA~kagi~~-----------GD~I~~IdG~~~~~~~~~~av~~irG-~ 159 (406)
T COG0793 98 EFGGIGIELQMEDI------GGVKVVSPIDGSPAAKAGIKP-----------GDVIIKIDGKSVGGVSLDEAVKLIRG-K 159 (406)
T ss_pred cccceeEEEEEecC------CCcEEEecCCCChHHHcCCCC-----------CCEEEEECCEEccCCCHHHHHHHhCC-C
Confidence 56778888775331 678999999999999999999 9999999999999884 45555554 6
Q ss_pred CCCEEEEEEEECCe--EEEEEEE
Q 013014 408 VGDELLLQGIKQPP--VLSDNLR 428 (451)
Q Consensus 408 ~g~~v~l~v~R~g~--~~~~~v~ 428 (451)
+|..|+|++.|.+. ..++++.
T Consensus 160 ~Gt~V~L~i~r~~~~k~~~v~l~ 182 (406)
T COG0793 160 PGTKVTLTILRAGGGKPFTVTLT 182 (406)
T ss_pred CCCeEEEEEEEcCCCceeEEEEE
Confidence 89999999999744 4444443
No 43
>COG3480 SdrC Predicted secreted protein containing a PDZ domain [Signal transduction mechanisms]
Probab=98.15 E-value=9.8e-06 Score=78.79 Aligned_cols=72 Identities=25% Similarity=0.369 Sum_probs=64.6
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE-CCeEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIK-QPPVLSDNLR 428 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R-~g~~~~~~v~ 428 (451)
.||++..+..++++.. +|..||.|++|||+++.+.+|+.+.+.+.++|++|++++.| +++...++++
T Consensus 130 ~gvyv~~v~~~~~~~g------------kl~~gD~i~avdg~~f~s~~e~i~~v~~~k~Gd~VtI~~~r~~~~~~~~~~t 197 (342)
T COG3480 130 AGVYVLSVIDNSPFKG------------KLEAGDTIIAVDGEPFTSSDELIDYVSSKKPGDEVTIDYERHNETPEIVTIT 197 (342)
T ss_pred eeEEEEEccCCcchhc------------eeccCCeEEeeCCeecCCHHHHHHHHhccCCCCeEEEEEEeccCCCceEEEE
Confidence 7999999999998853 34459999999999999999999999999999999999996 8888999999
Q ss_pred eeecC
Q 013014 429 LLWSE 433 (451)
Q Consensus 429 l~~~~ 433 (451)
+...+
T Consensus 198 l~~~~ 202 (342)
T COG3480 198 LIKND 202 (342)
T ss_pred EEeec
Confidence 87764
No 44
>PRK09681 putative type II secretion protein GspC; Provisional
Probab=98.00 E-value=1.9e-05 Score=76.65 Aligned_cols=58 Identities=16% Similarity=0.243 Sum_probs=52.5
Q ss_pred CcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 361 GPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 361 s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.--+++||++ ||++++|||..+++.++..++++..+....++++|+|||+..++.+.|
T Consensus 218 ~lF~~~GLq~-----------GDva~sING~dL~D~~qa~~l~~~L~~~tei~ltVeRdGq~~~i~i~l 275 (276)
T PRK09681 218 SLFDASGFKE-----------GDIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 275 (276)
T ss_pred HHHHHcCCCC-----------CCEEEEeCCeeCCCHHHHHHHHHHhccCCeEEEEEEECCEEEEEEEEc
Confidence 3456789999 999999999999999999999988888899999999999999988875
No 45
>PF04495 GRASP55_65: GRASP55/65 PDZ-like domain ; InterPro: IPR007583 GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.; PDB: 3RLE_A 4EDJ_A.
Probab=97.85 E-value=4.1e-05 Score=67.16 Aligned_cols=87 Identities=25% Similarity=0.337 Sum_probs=58.6
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE 411 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~ 411 (451)
+.||+.++-.... .....+.-|.+|.++|||++|||++ -.|.|+.+|+..+.+.++|.+.+.. ..++.
T Consensus 26 g~LG~sv~~~~~~-~~~~~~~~Vl~V~p~SPA~~AGL~p----------~~DyIig~~~~~l~~~~~l~~~v~~-~~~~~ 93 (138)
T PF04495_consen 26 GLLGISVRFESFE-GAEEEGWHVLRVAPNSPAAKAGLEP----------FFDYIIGIDGGLLDDEDDLFELVEA-NENKP 93 (138)
T ss_dssp SSS-EEEEEEE-T-TGCCCEEEEEEE-TTSHHHHTT--T----------TTEEEEEETTCE--STCHHHHHHHH-TTTS-
T ss_pred CCCcEEEEEeccc-ccccceEEEeEecCCCHHHHCCccc----------cccEEEEccceecCCHHHHHHHHHH-cCCCc
Confidence 6788877643211 0112577899999999999999997 2599999999999999999999987 46899
Q ss_pred EEEEEEEC--CeEEEEEEEee
Q 013014 412 LLLQGIKQ--PPVLSDNLRLL 430 (451)
Q Consensus 412 v~l~v~R~--g~~~~~~v~l~ 430 (451)
+.+.|... .+.+++++++.
T Consensus 94 l~L~Vyns~~~~vR~V~i~P~ 114 (138)
T PF04495_consen 94 LQLYVYNSKTDSVREVTITPS 114 (138)
T ss_dssp EEEEEEETTTTCEEEEEE---
T ss_pred EEEEEEECCCCeEEEEEEEcC
Confidence 99999863 44556666653
No 46
>KOG3129 consensus 26S proteasome regulatory complex, subunit PSMD9 [Posttranslational modification, protein turnover, chaperones]
Probab=97.82 E-value=5.1e-05 Score=69.58 Aligned_cols=74 Identities=23% Similarity=0.164 Sum_probs=63.1
Q ss_pred ceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHH--HhcCCCCCEEEEEEEECCeEEEEEEE
Q 013014 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRI--LDQCKVGDELLLQGIKQPPVLSDNLR 428 (451)
Q Consensus 351 Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~--l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (451)
=++|..|.++|||+++||+. ||.|++++...-.|+..|..+ +.+...++.+.+++.|.|+...+.++
T Consensus 140 Fa~V~sV~~~SPA~~aGl~~-----------gD~il~fGnV~sgn~~~lq~i~~~v~~~e~~~v~v~v~R~g~~v~L~lt 208 (231)
T KOG3129|consen 140 FAVVDSVVPGSPADEAGLCV-----------GDEILKFGNVHSGNFLPLQNIAAVVQSNEDQIVSVTVIREGQKVVLSLT 208 (231)
T ss_pred eEEEeecCCCChhhhhCccc-----------CceEEEecccccccchhHHHHHHHHHhccCcceeEEEecCCCEEEEEeC
Confidence 36799999999999999999 999999999888888766653 33456789999999999999999999
Q ss_pred eeecCCC
Q 013014 429 LLWSEER 435 (451)
Q Consensus 429 l~~~~~~ 435 (451)
+..|..+
T Consensus 209 P~~W~Gr 215 (231)
T KOG3129|consen 209 PKKWQGR 215 (231)
T ss_pred cccccCC
Confidence 8877654
No 47
>PRK11186 carboxy-terminal protease; Provisional
Probab=97.70 E-value=0.00011 Score=80.07 Aligned_cols=79 Identities=22% Similarity=0.276 Sum_probs=58.6
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCccccc-CceeeecccCCCCCCCcEEEEEC--CEEcCC-----HHHHHHHH
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVN--GKKVSN-----GSDLYRIL 403 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~a-Gl~~~~~~~~~~L~~GDiIl~vn--G~~i~~-----~~dl~~~l 403 (451)
.-+|+.+.... .+++|.++.+++||+++ ||++ ||+|++|| |+++.+ .+++...|
T Consensus 244 ~GIGa~l~~~~-------~~~~V~~vipGsPA~ka~gLk~-----------GD~IlaVn~~g~~~~dv~g~~~~~vv~li 305 (667)
T PRK11186 244 EGIGAVLQMDD-------DYTVINSLVAGGPAAKSKKLSV-----------GDKIVGVGQDGKPIVDVIGWRLDDVVALI 305 (667)
T ss_pred eEEEEEEEEeC-------CeEEEEEccCCChHHHhCCCCC-----------CCEEEEECCCCCcccccccCCHHHHHHHh
Confidence 45666665432 46889999999999998 9999 99999999 554433 24677666
Q ss_pred hcCCCCCEEEEEEEEC---CeEEEEEEEe
Q 013014 404 DQCKVGDELLLQGIKQ---PPVLSDNLRL 429 (451)
Q Consensus 404 ~~~~~g~~v~l~v~R~---g~~~~~~v~l 429 (451)
.. ..|.+|.|+|.|+ ++..+++++-
T Consensus 306 rG-~~Gt~V~LtV~r~~~~~~~~~vtl~R 333 (667)
T PRK11186 306 KG-PKGSKVRLEILPAGKGTKTRIVTLTR 333 (667)
T ss_pred cC-CCCCEEEEEEEeCCCCCceEEEEEEe
Confidence 54 5799999999994 4556666653
No 48
>PF05579 Peptidase_S32: Equine arteritis virus serine endopeptidase S32; InterPro: IPR008760 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S32 (clan PA(S)). The type example is equine arteritis virus serine endopeptidase (equine arteritis virus), which is involved in processing of nidovirus polyproteins.; GO: 0004252 serine-type endopeptidase activity, 0016032 viral reproduction, 0019082 viral protein processing; PDB: 3FAN_A 3FAO_A 1MBM_A.
Probab=97.67 E-value=0.00038 Score=66.36 Aligned_cols=116 Identities=26% Similarity=0.391 Sum_probs=61.5
Q ss_pred CeEEEEEEEcCCC--eEEecccccCCCCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCc
Q 013014 151 QGSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQ 228 (451)
Q Consensus 151 ~~~GSGfiI~~~G--~ILT~aHVv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~ 228 (451)
.+.|||=+...+| .|+|+.||+. .+...|.. .+... ..-++..-|+|.-.++.-...+|.++++... .|.
T Consensus 111 ss~Gsggvft~~~~~vvvTAtHVlg-~~~a~v~~-~g~~~---~~tF~~~GDfA~~~~~~~~G~~P~~k~a~~~---~Gr 182 (297)
T PF05579_consen 111 SSVGSGGVFTIGGNTVVVTATHVLG-GNTARVSG-VGTRR---MLTFKKNGDFAEADITNWPGAAPKYKFAQNY---TGR 182 (297)
T ss_dssp SSEEEEEEEECTTEEEEEEEHHHCB-TTEEEEEE-TTEEE---EEEEEEETTEEEEEETTS-S---B--B-TT----SEE
T ss_pred ecccccceEEECCeEEEEEEEEEcC-CCeEEEEe-cceEE---EEEEeccCcEEEEECCCCCCCCCceeecCCc---ccc
Confidence 4456555555444 5999999998 44445544 33222 2244556699999995433467777775221 232
Q ss_pred EEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeCCCceEEEEEeee
Q 013014 229 KVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (451)
Q Consensus 229 ~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~ 297 (451)
--+.- ..-+..|.|..- ..+ +-..+|+||+|+++.+|.+||||++.
T Consensus 183 AyW~t------~tGvE~G~ig~~--------------~~~---~fT~~GDSGSPVVt~dg~liGVHTGS 228 (297)
T PF05579_consen 183 AYWLT------STGVEPGFIGGG--------------GAV---CFTGPGDSGSPVVTEDGDLIGVHTGS 228 (297)
T ss_dssp EEEEE------TTEEEEEEEETT--------------EEE---ESS-GGCTT-EEEETTC-EEEEEEEE
T ss_pred eEEEc------ccCcccceecCc--------------eEE---EEcCCCCCCCccCcCCCCEEEEEecC
Confidence 22111 122334444211 112 23357999999999999999999975
No 49
>COG3975 Predicted protease with the C-terminal PDZ domain [General function prediction only]
Probab=97.66 E-value=5.7e-05 Score=78.19 Aligned_cols=63 Identities=30% Similarity=0.340 Sum_probs=55.6
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEEe
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.+.+|..|.++|||++|||.+ ||.|++|||. .+.+...++++++++++.|.|+.+++.+++
T Consensus 462 g~~~i~~V~~~gPA~~AGl~~-----------Gd~ivai~G~--------s~~l~~~~~~d~i~v~~~~~~~L~e~~v~~ 522 (558)
T COG3975 462 GHEKITFVFPGGPAYKAGLSP-----------GDKIVAINGI--------SDQLDRYKVNDKIQVHVFREGRLREFLVKL 522 (558)
T ss_pred CeeEEEecCCCChhHhccCCC-----------ccEEEEEcCc--------cccccccccccceEEEEccCCceEEeeccc
Confidence 467899999999999999999 9999999999 334566788999999999999999998887
Q ss_pred ee
Q 013014 430 LW 431 (451)
Q Consensus 430 ~~ 431 (451)
..
T Consensus 523 ~~ 524 (558)
T COG3975 523 GG 524 (558)
T ss_pred CC
Confidence 53
No 50
>COG5640 Secreted trypsin-like serine protease [Posttranslational modification, protein turnover, chaperones]
Probab=97.37 E-value=0.008 Score=59.84 Aligned_cols=55 Identities=18% Similarity=0.239 Sum_probs=41.2
Q ss_pred ccCCCCCCCCceeCC--Cc-eEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhc
Q 013014 272 AAINPGNSGGPLLDS--SG-SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKF 326 (451)
Q Consensus 272 ~~i~~G~SGGPlvd~--~G-~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~ 326 (451)
...|.|+||||+|-. +| +-+||++++...|+...-.+...-++....||+..++.
T Consensus 223 ~daCqGDSGGPi~~~g~~G~vQ~GVvSwG~~~Cg~t~~~gVyT~vsny~~WI~a~~~~ 280 (413)
T COG5640 223 KDACQGDSGGPIFHKGEEGRVQRGVVSWGDGGCGGTLIPGVYTNVSNYQDWIAAMTNG 280 (413)
T ss_pred cccccCCCCCceEEeCCCccEEEeEEEecCCCCCCCCcceeEEehhHHHHHHHHHhcC
Confidence 456889999999963 35 46999999988777655555556677888888886654
No 51
>COG3031 PulC Type II secretory pathway, component PulC [Intracellular trafficking and secretion]
Probab=97.14 E-value=0.00089 Score=62.82 Aligned_cols=59 Identities=25% Similarity=0.347 Sum_probs=52.6
Q ss_pred CCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 013014 359 PNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQPPVLSDNLR 428 (451)
Q Consensus 359 ~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (451)
+.+.-++.||+. |||.+++|+..+++.+++..++++...-+.++++|+|+|+..++.|.
T Consensus 216 d~slF~~sglq~-----------GDIavaiNnldltdp~~m~~llq~l~~m~s~qlTv~R~G~rhdInV~ 274 (275)
T COG3031 216 DGSLFYKSGLQR-----------GDIAVAINNLDLTDPEDMFRLLQMLRNMPSLQLTVIRRGKRHDINVR 274 (275)
T ss_pred CcchhhhhcCCC-----------cceEEEecCcccCCHHHHHHHHHhhhcCcceEEEEEecCccceeeec
Confidence 456677789998 99999999999999999999999877778899999999999988875
No 52
>PF12812 PDZ_1: PDZ-like domain
Probab=97.00 E-value=0.0018 Score=51.07 Aligned_cols=64 Identities=28% Similarity=0.390 Sum_probs=50.4
Q ss_pred ccccceecc--chhhhhhCc-cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcC
Q 013014 332 PILGIKFAP--DQSVEQLGV-SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (451)
Q Consensus 332 ~~lGi~~~~--~~~~~~~~~-~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~ 406 (451)
-|.|..|.+ .+.++++++ -|+++.....++++...|+.. |-+|++|||+++.+.++|.+++.+.
T Consensus 9 ~~~Ga~f~~Ls~q~aR~~~~~~~gv~v~~~~g~~~~~~~i~~-----------g~iI~~Vn~kpt~~Ld~f~~vvk~i 75 (78)
T PF12812_consen 9 EVCGAVFHDLSYQQARQYGIPVGGVYVAVSGGSLAFAGGISK-----------GFIITSVNGKPTPDLDDFIKVVKKI 75 (78)
T ss_pred EEcCeecccCCHHHHHHhCCCCCEEEEEecCCChhhhCCCCC-----------CeEEEeECCcCCcCHHHHHHHHHhC
Confidence 477888886 446777776 344555666778877766888 9999999999999999999998764
No 53
>KOG3553 consensus Tax interaction protein TIP1 [Cell wall/membrane/envelope biogenesis]
Probab=96.88 E-value=0.00082 Score=54.48 Aligned_cols=32 Identities=38% Similarity=0.453 Sum_probs=30.4
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKK 392 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~ 392 (451)
.|+||++|.++|||+.|||+. +|.|+.|||..
T Consensus 59 ~GiYvT~V~eGsPA~~AGLri-----------hDKIlQvNG~D 90 (124)
T KOG3553|consen 59 KGIYVTRVSEGSPAEIAGLRI-----------HDKILQVNGWD 90 (124)
T ss_pred ccEEEEEeccCChhhhhccee-----------cceEEEecCce
Confidence 799999999999999999999 99999999953
No 54
>PF05580 Peptidase_S55: SpoIVB peptidase S55; InterPro: IPR008763 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold. Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity.
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to the MEROPS peptidase family S55 (SpoIVB peptidase family, clan PA(S)). The protein SpoIVB plays a key role in signalling in the final sigma-K checkpoint of Bacillus subtilis.
Probab=96.72 E-value=0.021 Score=53.26 Aligned_cols=166 Identities=18% Similarity=0.252 Sum_probs=89.5
Q ss_pred ccccccCeEEEEEEEcCC-CeEEecccccCCCCc-EEEEeCCCcEEEEEEEEEcCC----------------CCEEEEEE
Q 013014 145 DVLEVPQGSGSGFVWDSK-GHVVTNYHVIRGASD-IRVTFADQSAYDAKIVGFDQD----------------KDVAVLRI 206 (451)
Q Consensus 145 ~~~~~~~~~GSGfiI~~~-G~ILT~aHVv~~~~~-i~V~~~dg~~~~a~vv~~d~~----------------~DlAlLkv 206 (451)
|......+.||=.+++++ +..--=.|.+.+.+. ..+.+.+|+.|++++....+. .-+.-+.-
T Consensus 13 wVRD~~aGiGTlTf~dp~~~~fgALGH~I~D~dt~~~~~i~~G~I~~a~I~~I~kg~~G~PGe~~G~~~~~~~~~G~I~~ 92 (218)
T PF05580_consen 13 WVRDSTAGIGTLTFYDPETGTFGALGHGISDVDTGQLIPIKNGEIYEASITSIKKGKKGQPGEKIGVFDNESNILGTIEK 92 (218)
T ss_pred EEEeCCcCeEEEEEEECCCCcEEecCCeEEcCCCCceeEecCCEEEEEEEEEEecCCCcCCceEEEEECCCCceEEEEEe
Confidence 344455788999999874 556556898877654 566778888888887655421 11222222
Q ss_pred cC----------C----CCCCcceecCCCCCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCC----CcccEE
Q 013014 207 DA----------P----KDKLRPIPIGVSADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGR----PIQDVI 268 (451)
Q Consensus 207 ~~----------~----~~~~~~l~l~~s~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~----~~~~~i 268 (451)
+. . .....+++++...+++.|..-+..-. .|.....-.-.|..+.+.......+. ...+.+
T Consensus 93 Nt~~GI~G~~~~~~~~~~~~~~~~pva~~~evk~G~A~i~Tv~-~G~~ie~f~ieI~~v~~~~~~~~k~~vi~vtd~~Ll 171 (218)
T PF05580_consen 93 NTQFGIYGTLDQDDISNPSYNEPIPVAPKQEVKPGPAYILTVI-DGTKIEEFDIEIEKVLPQSSPSGKGMVIKVTDPRLL 171 (218)
T ss_pred ccccceeEEeccccccccccCceeEEEEHHHceEccEEEEEEE-cCCeEEEeEEEEEEEccCCCCCCCcEEEEECCcchh
Confidence 11 1 01234555655566777753321110 11111111112222222211100000 001223
Q ss_pred EEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEeccC
Q 013014 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDT 315 (451)
Q Consensus 269 ~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~ 315 (451)
.....+..||||+|++ .+|++||=++..+.+ +...||.++++.
T Consensus 172 ~~TGGIvqGMSGSPI~-qdGKLiGAVthvf~~---dp~~Gygi~ie~ 214 (218)
T PF05580_consen 172 EKTGGIVQGMSGSPII-QDGKLIGAVTHVFVN---DPTKGYGIFIEW 214 (218)
T ss_pred hhhCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCceeeecHHH
Confidence 3344577899999999 699999998877643 456788887643
No 55
>PF03761 DUF316: Domain of unknown function (DUF316) ; InterPro: IPR005514 This is a family of uncharacterised proteins from Caenorhabditis elegans.
Probab=96.60 E-value=0.077 Score=52.08 Aligned_cols=91 Identities=19% Similarity=0.212 Sum_probs=56.2
Q ss_pred CCCCEEEEEEcCC-CCCCcceecCCCC-CCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccC
Q 013014 197 QDKDVAVLRIDAP-KDKLRPIPIGVSA-DLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAI 274 (451)
Q Consensus 197 ~~~DlAlLkv~~~-~~~~~~l~l~~s~-~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i 274 (451)
...+++||+++.+ .....|+-|+++. ....|+.+-+.|+.. ........+.-..... ....+..+...
T Consensus 159 ~~~~~mIlEl~~~~~~~~~~~Cl~~~~~~~~~~~~~~~yg~~~--~~~~~~~~~~i~~~~~--------~~~~~~~~~~~ 228 (282)
T PF03761_consen 159 RPYSPMILELEEDFSKNVSPPCLADSSTNWEKGDEVDVYGFNS--TGKLKHRKLKITNCTK--------CAYSICTKQYS 228 (282)
T ss_pred cccceEEEEEcccccccCCCEEeCCCccccccCceEEEeecCC--CCeEEEEEEEEEEeec--------cceeEeccccc
Confidence 3469999999875 2367888887653 367889999888721 1122222221111100 12345556667
Q ss_pred CCCCCCCceeC---CCceEEEEEeee
Q 013014 275 NPGNSGGPLLD---SSGSLIGINTAI 297 (451)
Q Consensus 275 ~~G~SGGPlvd---~~G~VVGI~s~~ 297 (451)
+.|++|||++. .+..||||.+..
T Consensus 229 ~~~d~Gg~lv~~~~gr~tlIGv~~~~ 254 (282)
T PF03761_consen 229 CKGDRGGPLVKNINGRWTLIGVGASG 254 (282)
T ss_pred CCCCccCeEEEEECCCEEEEEEEccC
Confidence 78999999983 344699997654
No 56
>PF00548 Peptidase_C3: 3C cysteine protease (picornain 3C); InterPro: IPR000199 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This signature defines cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral C3 cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SJO_E 2H6M_A 1QA7_C 1HAV_B 2HAL_A 2H9H_A 3QZQ_B 3QZR_A 3R0F_B 3SJ9_A ....
Probab=96.38 E-value=0.077 Score=48.45 Aligned_cols=137 Identities=18% Similarity=0.274 Sum_probs=78.9
Q ss_pred cCeEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEE--EEEEcC---CCCEEEEEEcCCCCCCccee--cCCCC
Q 013014 150 PQGSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAK--IVGFDQ---DKDVAVLRIDAPKDKLRPIP--IGVSA 222 (451)
Q Consensus 150 ~~~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~--vv~~d~---~~DlAlLkv~~~~~~~~~l~--l~~s~ 222 (451)
+...++++.|-.+ ++|-..| -.....+.+ +|+.++.. +.-.+. ..|+++++++.. .+++-+. +.+ .
T Consensus 23 g~~t~l~~gi~~~-~~lvp~H-~~~~~~i~i---~g~~~~~~d~~~lv~~~~~~~Dl~~v~l~~~-~kfrDIrk~~~~-~ 95 (172)
T PF00548_consen 23 GEFTMLALGIYDR-YFLVPTH-EEPEDTIYI---DGVEYKVDDSVVLVDRDGVDTDLTLVKLPRN-PKFRDIRKFFPE-S 95 (172)
T ss_dssp EEEEEEEEEEEBT-EEEEEGG-GGGCSEEEE---TTEEEEEEEEEEEEETTSSEEEEEEEEEESS-S-B--GGGGSBS-S
T ss_pred ceEEEecceEeee-EEEEECc-CCCcEEEEE---CCEEEEeeeeEEEecCCCcceeEEEEEccCC-cccCchhhhhcc-c
Confidence 3557888888755 9999999 222333333 45555433 223443 459999999753 3443332 111 1
Q ss_pred CCCCCcEEEEeeCCCCCCC-ceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeCC---CceEEEEEeee
Q 013014 223 DLLVGQKVYAIGNPFGLDH-TLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDS---SGSLIGINTAI 297 (451)
Q Consensus 223 ~~~~G~~V~~iG~p~g~~~-~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~---~G~VVGI~s~~ 297 (451)
.....+...++-.. .... ....+.+...... .. .+......+.++++..+|+-||||+.. .++++||+.++
T Consensus 96 ~~~~~~~~l~v~~~-~~~~~~~~v~~v~~~~~i-~~--~g~~~~~~~~Y~~~t~~G~CG~~l~~~~~~~~~i~GiHvaG 170 (172)
T PF00548_consen 96 IPEYPECVLLVNST-KFPRMIVEVGFVTNFGFI-NL--SGTTTPRSLKYKAPTKPGMCGSPLVSRIGGQGKIIGIHVAG 170 (172)
T ss_dssp GGTEEEEEEEEESS-SSTCEEEEEEEEEEEEEE-EE--TTEEEEEEEEEESEEETTGTTEEEEESCGGTTEEEEEEEEE
T ss_pred cccCCCcEEEEECC-CCccEEEEEEEEeecCcc-cc--CCCEeeEEEEEccCCCCCccCCeEEEeeccCccEEEEEecc
Confidence 12344555555332 2222 3334444443332 11 233346788899999999999999952 67999999985
No 57
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.64 E-value=0.013 Score=61.58 Aligned_cols=58 Identities=26% Similarity=0.347 Sum_probs=48.9
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCCEEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDELLLQGIK 418 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~~v~l~v~R 418 (451)
-|++|..|.+++||++.||+. ||.|+.||.++..+. ++....|....+|+.+++.-++
T Consensus 429 VGIFVaGvqegspA~~eGlqE-----------GDQIL~VN~vdF~nl~REeAVlfLL~lPkGEevtilaQ~ 488 (1027)
T KOG3580|consen 429 VGIFVAGVQEGSPAEQEGLQE-----------GDQILKVNTVDFRNLVREEAVLFLLELPKGEEVTILAQS 488 (1027)
T ss_pred eeEEEeecccCCchhhccccc-----------cceeEEeccccchhhhHHHHHHHHhcCCCCcEEeehhhh
Confidence 589999999999999999999 999999999998887 3445556677889998886554
No 58
>KOG3580 consensus Tight junction proteins [Signal transduction mechanisms]
Probab=95.57 E-value=0.02 Score=60.22 Aligned_cols=84 Identities=23% Similarity=0.359 Sum_probs=59.5
Q ss_pred ccceeccchhhhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCC
Q 013014 334 LGIKFAPDQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKV 408 (451)
Q Consensus 334 lGi~~~~~~~~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~ 408 (451)
+++.+......+.||+ .-++|.++...+.|++ +++|++||+|++|||....|+ .|..+++.+. .
T Consensus 200 ~kv~LvKsR~nEEyGlrLgSqIFvKeit~~gLAar----------dgnlqEGDiiLkINGtvteNmSLtDar~LIEkS-~ 268 (1027)
T KOG3580|consen 200 IKVLLVKSRANEEYGLRLGSQIFVKEITRTGLAAR----------DGNLQEGDIILKINGTVTENMSLTDARKLIEKS-R 268 (1027)
T ss_pred ceEEEEeeccchhhcccccchhhhhhhcccchhhc----------cCCcccccEEEEECcEeeccccchhHHHHHHhc-c
Confidence 3444444344566776 4577888877766654 345666999999999987776 4777777763 3
Q ss_pred CCEEEEEEEECCeEEEEEEEe
Q 013014 409 GDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 409 g~~v~l~v~R~g~~~~~~v~l 429 (451)
.++++.|+||.+..-++|..
T Consensus 269 -GKL~lvVlRD~~qtLiNiP~ 288 (1027)
T KOG3580|consen 269 -GKLQLVVLRDSQQTLINIPS 288 (1027)
T ss_pred -CceEEEEEecCCceeeecCC
Confidence 46899999998877777753
No 59
>PF00949 Peptidase_S7: Peptidase S7, Flavivirus NS3 serine protease ; InterPro: IPR001850 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This signature identifies serine peptidases that belong to MEROPS peptidase family S7 (flavivirin family, clan PA(S)). The protein fold of the peptidase domain for members of this family resembles that of chymotrypsin, the type example for clan PA. Flaviviruses produce a polyprotein from the ssRNA genome. The N terminus of the NS3 protein (approx. 180 aa) is required for the processing of the polyprotein. NS3 also has conserved homology with NTP-binding proteins and the DEAD family of RNA helicases [, , ].; GO: 0003723 RNA binding, 0003724 RNA helicase activity, 0005524 ATP binding; PDB: 2IJO_B 3E90_D 2GGV_B 2FP7_B 2WV9_A 3U1I_B 3U1J_B 2WZQ_A 2WHX_A 3L6P_A ....
Probab=95.55 E-value=0.021 Score=49.61 Aligned_cols=34 Identities=21% Similarity=0.442 Sum_probs=23.4
Q ss_pred EEEEcccCCCCCCCCceeCCCceEEEEEeeeeCC
Q 013014 267 VIQTDAAINPGNSGGPLLDSSGSLIGINTAIYSP 300 (451)
Q Consensus 267 ~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~ 300 (451)
+...+..+.+|.||+|+||.+|+||||.......
T Consensus 87 ~~~~~~d~~~GsSGSpi~n~~g~ivGlYg~g~~~ 120 (132)
T PF00949_consen 87 IGAIDLDFPKGSSGSPIFNQNGEIVGLYGNGVEV 120 (132)
T ss_dssp EEEE---S-TTGTT-EEEETTSCEEEEEEEEEE-
T ss_pred EEeeecccCCCCCCCceEcCCCcEEEEEccceee
Confidence 3445556788999999999999999998876543
No 60
>PF08192 Peptidase_S64: Peptidase family S64; InterPro: IPR012985 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This family of fungal proteins is involved in the processing of the membrane-bound transcription factor Stp1 [] and belongs to MEROPS peptidase family S64 (clan PA). The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS peptidase family S1) and to have a typical trypsin-like catalytic triad [].
Probab=95.37 E-value=0.095 Score=56.33 Aligned_cols=117 Identities=19% Similarity=0.412 Sum_probs=71.0
Q ss_pred CCCEEEEEEcCCC-------CCC------cceecCCC------CCCCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccC
Q 013014 198 DKDVAVLRIDAPK-------DKL------RPIPIGVS------ADLLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSA 258 (451)
Q Consensus 198 ~~DlAlLkv~~~~-------~~~------~~l~l~~s------~~~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~ 258 (451)
-.|+||++++... +++ |.+.+.+. ..+..|..|+-+|...+ .|.|.+.++.-....
T Consensus 542 LsD~AIIkV~~~~~~~N~LGddi~f~~~dP~l~f~NlyV~~~~~~~~~G~~VfK~GrTTg----yT~G~lNg~klvyw~- 616 (695)
T PF08192_consen 542 LSDWAIIKVNKERKCQNYLGDDIQFNEPDPTLMFQNLYVREVVSNLVPGMEVFKVGRTTG----YTTGILNGIKLVYWA- 616 (695)
T ss_pred ccceEEEEeCCCceecCCCCccccccCCCccccccccchhhhhhccCCCCeEEEecccCC----ccceEecceEEEEec-
Confidence 3599999997532 111 22222211 23567999999986644 567777765433221
Q ss_pred CCCCC-cccEEEEc----ccCCCCCCCCceeCCCc------eEEEEEeeeeCCCCCccceeEEEeccCchhhHHHh
Q 013014 259 ATGRP-IQDVIQTD----AAINPGNSGGPLLDSSG------SLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL 323 (451)
Q Consensus 259 ~~~~~-~~~~i~~d----~~i~~G~SGGPlvd~~G------~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l 323 (451)
.+.. ..+++... .-...|+||+=|++.-+ .|+||.++.. +....+|.+.|+..|.+=+++.
T Consensus 617 -dG~i~s~efvV~s~~~~~Fa~~GDSGS~VLtk~~d~~~gLgvvGMlhsyd---ge~kqfglftPi~~il~rl~~v 688 (695)
T PF08192_consen 617 -DGKIQSSEFVVSSDNNPAFASGGDSGSWVLTKLEDNNKGLGVVGMLHSYD---GEQKQFGLFTPINEILDRLEEV 688 (695)
T ss_pred -CCCeEEEEEEEecCCCccccCCCCcccEEEecccccccCceeeEEeeecC---CccceeeccCcHHHHHHHHHHh
Confidence 1221 13333333 22457999999998643 4999998763 3456799999987776655553
No 61
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, of which dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=95.24 E-value=0.015 Score=64.09 Aligned_cols=28 Identities=36% Similarity=0.719 Sum_probs=16.6
Q ss_pred EEEcccCCCCCCCCceeCCCceEEEEEe
Q 013014 268 IQTDAAINPGNSGGPLLDSSGSLIGINT 295 (451)
Q Consensus 268 i~~d~~i~~G~SGGPlvd~~G~VVGI~s 295 (451)
+.++..|..||||+|++|.+|+|||++.
T Consensus 624 FlstnDitGGNSGSPvlN~~GeLVGl~F 651 (698)
T PF10459_consen 624 FLSTNDITGGNSGSPVLNAKGELVGLAF 651 (698)
T ss_pred EEeccCcCCCCCCCccCCCCceEEEEee
Confidence 4455555566666666666666666654
No 62
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=95.09 E-value=0.025 Score=60.67 Aligned_cols=54 Identities=30% Similarity=0.443 Sum_probs=43.8
Q ss_pred EEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEEEEEC
Q 013014 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQGIKQ 419 (451)
Q Consensus 354 V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~v~R~ 419 (451)
|..|.++|||++. ++|++||.|++|||+.|.+.. |+..++.. .|-+|+|+|.-.
T Consensus 782 iGrIieGSPAdRC----------gkLkVGDrilAVNG~sI~~lsHadiv~LIKd--aGlsVtLtIip~ 837 (984)
T KOG3209|consen 782 IGRIIEGSPADRC----------GKLKVGDRILAVNGQSILNLSHADIVSLIKD--AGLSVTLTIIPP 837 (984)
T ss_pred ccccccCChhHhh----------ccccccceEEEecCeeeeccCchhHHHHHHh--cCceEEEEEcCh
Confidence 6778889999876 445569999999999999885 77777775 588999998753
No 63
>TIGR02860 spore_IV_B stage IV sporulation protein B. SpoIVB, the stage IV sporulation protein B of endospore-forming bacteria such as Bacillus subtilis, is a serine proteinase, expressed in the spore (rather than mother cell) compartment, that participates in a proteolytic activation cascade for Sigma-K. It appears to be universal among endospore-forming bacteria and occurs nowhere else.
Probab=94.12 E-value=0.46 Score=49.00 Aligned_cols=39 Identities=26% Similarity=0.587 Sum_probs=29.9
Q ss_pred cccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEEec
Q 013014 271 DAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSIPV 313 (451)
Q Consensus 271 d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~ 313 (451)
...+..||||+|++ .+|++||=++..+-+ ++..||+|-+
T Consensus 354 tgGivqGMSGSPi~-q~gkliGAvtHVfvn---dpt~GYGi~i 392 (402)
T TIGR02860 354 TGGIVQGMSGSPII-QNGKVIGAVTHVFVN---DPTSGYGVYI 392 (402)
T ss_pred hCCEEecccCCCEE-ECCEEEEEEEEEEec---CCCcceeehH
Confidence 44567899999999 699999988877654 3456777744
No 64
>PF02122 Peptidase_S39: Peptidase S39; InterPro: IPR000382 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein []. The nucleotide sequence for the RNA of PLrV has been determined [, ]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. 
Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].; GO: 0004252 serine-type endopeptidase activity, 0022415 viral reproductive process, 0016021 integral to membrane; PDB: 1ZYO_A.
Probab=93.50 E-value=0.28 Score=45.89 Aligned_cols=129 Identities=15% Similarity=0.151 Sum_probs=49.8
Q ss_pred eEEEEEEEcCCC--eEEecccccCCCCcEEEEeCCCcEEE---EEEEEEcCCCCEEEEEEcCCC---CCCcceecCCCCC
Q 013014 152 GSGSGFVWDSKG--HVVTNYHVIRGASDIRVTFADQSAYD---AKIVGFDQDKDVAVLRIDAPK---DKLRPIPIGVSAD 223 (451)
Q Consensus 152 ~~GSGfiI~~~G--~ILT~aHVv~~~~~i~V~~~dg~~~~---a~vv~~d~~~DlAlLkv~~~~---~~~~~l~l~~s~~ 223 (451)
+.++.+-. .+| .++|+.||......+.. ..+|+..+ .+.+..+...|++||++.... .....+.+.....
T Consensus 30 Gya~cv~l-~~g~~~L~ta~Hv~~~~~~~~~-~k~g~kipl~~f~~~~~~~~~D~~il~~P~n~~s~Lg~k~~~~~~~~~ 107 (203)
T PF02122_consen 30 GYATCVRL-FDGEDALLTARHVWSRPSKVTS-LKTGEKIPLAEFTDLLESRIADFVILRGPPNWESKLGVKAAQLSQNSQ 107 (203)
T ss_dssp ----EEEE-----EEEEE-HHHHTSSS---E-EETTEEEE--S-EEEEE-TTT-EEEEE--HHHHHHHT-----B----S
T ss_pred ccceEEEC-cCCccceecccccCCCccceeE-cCCCCcccchhChhhhCCCccCEEEEecCcCHHHHhCcccccccchhh
Confidence 44455332 344 69999999998665543 33444443 345566788899999996321 1122223321111
Q ss_pred CCCCcEEEEeeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEEEEcccCCCCCCCCceeCCCceEEEEEeee
Q 013014 224 LLVGQKVYAIGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (451)
Q Consensus 224 ~~~G~~V~~iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~ 297 (451)
+. --.+.. +....+........+.. . ...+...-+...+|.||.|+++.+ +++|++...
T Consensus 108 ~~----~g~~~~-----y~~~~~~~~~~sa~i~g----~-~~~~~~vls~T~~G~SGtp~y~g~-~vvGvH~G~ 166 (203)
T PF02122_consen 108 LA----KGPVSF-----YGFSSGEWPCSSAKIPG----T-EGKFASVLSNTSPGWSGTPYYSGK-NVVGVHTGS 166 (203)
T ss_dssp EE----EEESST-----TSEEEEEEEEEE-S---------STTEEEE-----TT-TT-EEE-SS--EEEEEEEE
T ss_pred hC----CCCeee-----eeecCCCceeccCcccc----c-cCcCCceEcCCCCCCCCCCeEECC-CceEeecCc
Confidence 10 001111 11122111111111111 0 124556667888999999999988 999999975
No 65
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=93.40 E-value=0.16 Score=51.81 Aligned_cols=70 Identities=26% Similarity=0.335 Sum_probs=52.6
Q ss_pred ccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE--CCeEEEEE
Q 013014 349 VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIK--QPPVLSDN 426 (451)
Q Consensus 349 ~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R--~g~~~~~~ 426 (451)
..|.-|.+|.++|++.++||.+ --|.|++|||..++..+|..+.+.+... ++|++++.- -...++++
T Consensus 14 teg~hvlkVqedSpa~~aglep----------ffdFIvSI~g~rL~~dnd~Lk~llk~~s-ekVkltv~n~kt~~~R~v~ 82 (462)
T KOG3834|consen 14 TEGYHVLKVQEDSPAHKAGLEP----------FFDFIVSINGIRLNKDNDTLKALLKANS-EKVKLTVYNSKTQEVRIVE 82 (462)
T ss_pred ceeEEEEEeecCChHHhcCcch----------hhhhhheeCcccccCchHHHHHHHHhcc-cceEEEEEecccceeEEEE
Confidence 3677799999999999999997 3799999999999988776666555333 449999874 23344555
Q ss_pred EEe
Q 013014 427 LRL 429 (451)
Q Consensus 427 v~l 429 (451)
|+.
T Consensus 83 I~p 85 (462)
T KOG3834|consen 83 IVP 85 (462)
T ss_pred ecc
Confidence 554
No 66
>COG0750 Predicted membrane-associated Zn-dependent proteases 1 [Cell envelope biogenesis, outer membrane]
Probab=93.39 E-value=0.26 Score=50.31 Aligned_cols=56 Identities=34% Similarity=0.487 Sum_probs=48.2
Q ss_pred EecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCE---EEEEEEE-CCeE
Q 013014 355 LDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDE---LLLQGIK-QPPV 422 (451)
Q Consensus 355 ~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~---v~l~v~R-~g~~ 422 (451)
.++..++++..+|+++ ||.|+++|++++.++++..+.+... .+.. +.+.+.| +++.
T Consensus 134 ~~v~~~s~a~~a~l~~-----------Gd~iv~~~~~~i~~~~~~~~~~~~~-~~~~~~~~~i~~~~~~~~~ 193 (375)
T COG0750 134 GEVAPKSAAALAGLRP-----------GDRIVAVDGEKVASWDDVRRLLVAA-AGDVFNLLTILVIRLDGEA 193 (375)
T ss_pred eecCCCCHHHHcCCCC-----------CCEEEeECCEEccCHHHHHHHHHhc-cCCcccceEEEEEecccee
Confidence 3678899999999999 9999999999999999998887753 4555 8899999 7776
No 67
>KOG3532 consensus Predicted protein kinase [General function prediction only]
Probab=93.28 E-value=0.14 Score=54.82 Aligned_cols=46 Identities=24% Similarity=0.345 Sum_probs=42.2
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcC
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~ 406 (451)
.-|.|..|.++++|.++-|++ ||++++|||.+|++..+..+.++..
T Consensus 398 ~~v~v~tv~~ns~a~k~~~~~-----------gdvlvai~~~pi~s~~q~~~~~~s~ 443 (1051)
T KOG3532|consen 398 RAVKVCTVEDNSLADKAAFKP-----------GDVLVAINNVPIRSERQATRFLQST 443 (1051)
T ss_pred eEEEEEEecCCChhhHhcCCC-----------cceEEEecCccchhHHHHHHHHHhc
Confidence 457789999999999999999 9999999999999999999999874
No 68
>KOG3542 consensus cAMP-regulated guanine nucleotide exchange factor [Signal transduction mechanisms]
Probab=92.69 E-value=0.085 Score=56.38 Aligned_cols=57 Identities=28% Similarity=0.351 Sum_probs=43.8
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIK 418 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R 418 (451)
.|++|.+|.+++.|++.|++. ||.|++|||+...+.. +.++..-...+..+.+++.-
T Consensus 562 fgifV~~V~pgskAa~~GlKR-----------gDqilEVNgQnfenis-~~KA~eiLrnnthLtltvKt 618 (1283)
T KOG3542|consen 562 FGIFVAEVFPGSKAAREGLKR-----------GDQILEVNGQNFENIS-AKKAEEILRNNTHLTLTVKT 618 (1283)
T ss_pred ceeEEeeecCCchHHHhhhhh-----------hhhhhhccccchhhhh-HHHHHHHhcCCceEEEEEec
Confidence 589999999999999999999 9999999999888764 33333322334566666653
No 69
>PF10459 Peptidase_S46: Peptidase S46; InterPro: IPR019500 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. 
Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This entry represents S46 peptidases, where dipeptidyl-peptidase 7 (DPP-7) is the best-characterised member of this family. It is a serine peptidase that is located on the cell surface and is predicted to have two N-terminal transmembrane domains.
Probab=91.70 E-value=0.099 Score=57.71 Aligned_cols=22 Identities=36% Similarity=0.359 Sum_probs=20.2
Q ss_pred eEEEEEEEcCCCeEEecccccC
Q 013014 152 GSGSGFVWDSKGHVVTNYHVIR 173 (451)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHVv~ 173 (451)
+-|||-+|+++|.|+||.||..
T Consensus 47 gGCSgsfVS~~GLvlTNHHC~~ 68 (698)
T PF10459_consen 47 GGCSGSFVSPDGLVLTNHHCGY 68 (698)
T ss_pred CceeEEEEcCCceEEecchhhh
Confidence 3599999999999999999985
No 70
>KOG3549 consensus Syntrophins (type gamma) [Extracellular structures]
Probab=91.16 E-value=0.33 Score=48.22 Aligned_cols=55 Identities=31% Similarity=0.407 Sum_probs=44.7
Q ss_pred ceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCCEEEEEEE
Q 013014 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDELLLQGI 417 (451)
Q Consensus 351 Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~~v~l~v~ 417 (451)
-|+|+++.++..|+..|+- -+||-|++|||.-|+.- +|+..+|.. .|+.|+++|.
T Consensus 81 PvviSkI~kdQaAd~tG~L----------FvGDAilqvNGi~v~~c~HeevV~iLRN--AGdeVtlTV~ 137 (505)
T KOG3549|consen 81 PVVISKIYKDQAADITGQL----------FVGDAILQVNGIYVTACPHEEVVNILRN--AGDEVTLTVK 137 (505)
T ss_pred cEEeehhhhhhhhhhcCce----------EeeeeeEEeccEEeecCChHHHHHHHHh--cCCEEEEEeH
Confidence 4688899898888887764 24999999999988876 578888864 6999999885
No 71
>PF09342 DUF1986: Domain of unknown function (DUF1986); InterPro: IPR015420 This domain is found in serine endopeptidases belonging to MEROPS peptidase family S1A (clan PA). It is found in unusual mosaic proteins, which are encoded by the Drosophila nudel gene (see P98159 from SWISSPROT). Nudel is involved in defining embryonic dorsoventral polarity. Three proteases; ndl, gd and snk process easter to create active easter. Active easter defines cell identities along the dorsal-ventral continuum by activating the spz ligand for the Tl receptor in the ventral region of the embryo. Nudel, pipe and windbeutel together trigger the protease cascade within the extraembryonic perivitelline compartment which induces dorsoventral polarity of the Drosophila embryo [].
Probab=90.97 E-value=5.9 Score=37.99 Aligned_cols=88 Identities=18% Similarity=0.263 Sum_probs=59.5
Q ss_pred ccCeEEEEEEEcCCCeEEecccccCCCC----cEEEEeCCCcEEE------EEEEEEc-----CCCCEEEEEEcCCCC--
Q 013014 149 VPQGSGSGFVWDSKGHVVTNYHVIRGAS----DIRVTFADQSAYD------AKIVGFD-----QDKDVAVLRIDAPKD-- 211 (451)
Q Consensus 149 ~~~~~GSGfiI~~~G~ILT~aHVv~~~~----~i~V~~~dg~~~~------a~vv~~d-----~~~DlAlLkv~~~~~-- 211 (451)
.+.-.++|++||++ |+|++..|+.+-. -+.+.++.++.+. -++..+| ++.++++|.++.+..
T Consensus 25 dG~~~CsgvLlD~~-WlLvsssCl~~I~L~~~YvsallG~~Kt~~~v~Gp~EQI~rVD~~~~V~~S~v~LLHL~~~~~fT 103 (267)
T PF09342_consen 25 DGRYWCSGVLLDPH-WLLVSSSCLRGISLSHHYVSALLGGGKTYLSVDGPHEQISRVDCFKDVPESNVLLLHLEQPANFT 103 (267)
T ss_pred cCeEEEEEEEeccc-eEEEeccccCCcccccceEEEEecCcceecccCCChheEEEeeeeeeccccceeeeeecCcccce
Confidence 35678999999987 9999999998743 3667777776543 1244444 678999999987541
Q ss_pred -CCcceecCC-CCCCCCCcEEEEeeCCC
Q 013014 212 -KLRPIPIGV-SADLLVGQKVYAIGNPF 237 (451)
Q Consensus 212 -~~~~l~l~~-s~~~~~G~~V~~iG~p~ 237 (451)
-..|.-+.+ .......+.++++|.-.
T Consensus 104 r~VlP~flp~~~~~~~~~~~CVAVg~d~ 131 (267)
T PF09342_consen 104 RYVLPTFLPETSNENESDDECVAVGHDD 131 (267)
T ss_pred eeecccccccccCCCCCCCceEEEEccc
Confidence 123433432 23455566899999654
No 72
>KOG3606 consensus Cell polarity protein PAR6 [Signal transduction mechanisms]
Probab=90.78 E-value=0.65 Score=44.72 Aligned_cols=57 Identities=25% Similarity=0.413 Sum_probs=44.1
Q ss_pred ccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCC--HHHHHHHHhcCCCCCEEEEEEE
Q 013014 349 VSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSN--GSDLYRILDQCKVGDELLLQGI 417 (451)
Q Consensus 349 ~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~--~~dl~~~l~~~~~g~~v~l~v~ 417 (451)
+.|++|....+++.|+..||-+ +.|.|++|||.+|.. .+++.+++-.. ...+-++|.
T Consensus 193 vpGIFISRlVpGGLAeSTGLLa----------VnDEVlEVNGIEVaGKTLDQVTDMMvAN--shNLIiTVk 251 (358)
T KOG3606|consen 193 VPGIFISRLVPGGLAESTGLLA----------VNDEVLEVNGIEVAGKTLDQVTDMMVAN--SHNLIITVK 251 (358)
T ss_pred cCceEEEeecCCccccccceee----------ecceeEEEcCEEeccccHHHHHHHHhhc--ccceEEEec
Confidence 4899999999999999998865 599999999998865 45777665432 244556655
No 73
>PF00944 Peptidase_S3: Alphavirus core protein ; InterPro: IPR000930 Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses []. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes []. Togavirin is only active while part of the polyprotein, cleavage at a Trp-Ser bond resulting in total lack of activity []. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin [, ].; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis, 0016020 membrane; PDB: 2YEW_D 1EP5_A 3J0C_F 1EP6_C 1WYK_D 1DYL_A 1VCQ_B 1VCP_B 1LD4_D 1KXA_A ....
Probab=90.34 E-value=0.49 Score=40.86 Aligned_cols=31 Identities=29% Similarity=0.501 Sum_probs=25.0
Q ss_pred EEcccCCCCCCCCceeCCCceEEEEEeeeeC
Q 013014 269 QTDAAINPGNSGGPLLDSSGSLIGINTAIYS 299 (451)
Q Consensus 269 ~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~ 299 (451)
.-...-.+|+||-|++|..|+||||+-.+.+
T Consensus 98 ip~g~g~~GDSGRpi~DNsGrVVaIVLGG~n 128 (158)
T PF00944_consen 98 IPTGVGKPGDSGRPIFDNSGRVVAIVLGGAN 128 (158)
T ss_dssp EETTS-STTSTTEEEESTTSBEEEEEEEEEE
T ss_pred eccCCCCCCCCCCccCcCCCCEEEEEecCCC
Confidence 3355667999999999999999999987643
No 74
>KOG3209 consensus WW domain-containing protein [General function prediction only]
Probab=90.12 E-value=0.59 Score=50.60 Aligned_cols=58 Identities=26% Similarity=0.465 Sum_probs=43.3
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEEEEECC
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQGIKQP 420 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~v~R~g 420 (451)
.+++|..+.+++||.+-| .+++||.|++|||+...++. +..++++ .|....++++|+|
T Consensus 923 M~LfVLRlAeDGPA~rdG----------rm~VGDqi~eINGesTkgmtH~rAIelIk---~gg~~vll~Lr~g 982 (984)
T KOG3209|consen 923 MDLFVLRLAEDGPAIRDG----------RMRVGDQITEINGESTKGMTHDRAIELIK---QGGRRVLLLLRRG 982 (984)
T ss_pred cceEEEEeccCCCccccC----------ceeecceEEEecCcccCCCcHHHHHHHHH---hCCeEEEEEeccC
Confidence 578999999999998654 44459999999999998885 4444554 3555666677765
No 75
>KOG3550 consensus Receptor targeting protein Lin-7 [Extracellular structures]
Probab=90.07 E-value=0.6 Score=41.00 Aligned_cols=55 Identities=31% Similarity=0.406 Sum_probs=38.5
Q ss_pred cceEEEecCCCCccccc-CceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKA-GLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQGI 417 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~a-Gl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~v~ 417 (451)
+-+||+.+.|++-|++. ||+. ||.+++|||..|..-. ...+.|.. ..| .|++.|.
T Consensus 115 spiyisriipggvadrhgglkr-----------gdqllsvngvsvege~hekavellka-a~g-svklvvr 172 (207)
T KOG3550|consen 115 SPIYISRIIPGGVADRHGGLKR-----------GDQLLSVNGVSVEGEHHEKAVELLKA-AVG-SVKLVVR 172 (207)
T ss_pred CceEEEeecCCccccccCcccc-----------cceeEeecceeecchhhHHHHHHHHH-hcC-cEEEEEe
Confidence 56899999999999874 5666 9999999999887653 23334443 233 4565543
No 76
>KOG1892 consensus Actin filament-binding protein Afadin [Cytoskeleton]
Probab=89.97 E-value=0.39 Score=53.43 Aligned_cols=61 Identities=28% Similarity=0.291 Sum_probs=45.7
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECC
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQP 420 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g 420 (451)
-|+||..|.++++|+. +|.|+.||.+++|||+..-...+-..+-.+.+.|..|.+.|-..|
T Consensus 960 lGIYvKsVV~GgaAd~----------DGRL~aGDQLLsVdG~SLiGisQErAA~lmtrtg~vV~leVaKqg 1020 (1629)
T KOG1892|consen 960 LGIYVKSVVEGGAADH----------DGRLEAGDQLLSVDGHSLIGISQERAARLMTRTGNVVHLEVAKQG 1020 (1629)
T ss_pred cceEEEEeccCCcccc----------ccccccCceeeeecCcccccccHHHHHHHHhccCCeEEEehhhhh
Confidence 5899999999999864 456667999999999987766543332223356888999887655
No 77
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=89.56 E-value=0.38 Score=51.47 Aligned_cols=116 Identities=23% Similarity=0.385 Sum_probs=76.9
Q ss_pred ccCCCCCCCCcee-----CCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHhhhcccccc------ccccceecc
Q 013014 272 AAINPGNSGGPLL-----DSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQLVKFGKVTR------PILGIKFAP 340 (451)
Q Consensus 272 ~~i~~G~SGGPlv-----d~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~------~~lGi~~~~ 340 (451)
.-|..-|+|||.- |..-+++.|+-.. -..+|...++.+++.+++.-.++. |..-+.+..
T Consensus 675 VViAnmm~~GpAarsgkLnIGDQiiaING~S----------LVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~R 744 (829)
T KOG3605|consen 675 VVIANMMHGGPAARSGKLNIGDQIMSINGTS----------LVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIRR 744 (829)
T ss_pred HHHHhcccCChhhhcCCccccceeEeecCce----------eccccHHHHHHHHhcccccceEEEEEecCCCceEEEeec
Confidence 3344557888874 4444677775432 235899999999999887665532 333333433
Q ss_pred chhhhhhCc---cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCC
Q 013014 341 DQSVEQLGV---SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (451)
Q Consensus 341 ~~~~~~~~~---~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~ 410 (451)
.+...++|. .| +|-....++.|++-|+++ |-.|++|||+.|--. +-+.++|.. .+|+
T Consensus 745 Pd~kyQLGFSVQNG-iICSLlRGGIAERGGVRV-----------GHRIIEINgQSVVA~pHekIV~lLs~-aVGE 806 (829)
T KOG3605|consen 745 PDLRYQLGFSVQNG-IICSLLRGGIAERGGVRV-----------GHRIIEINGQSVVATPHEKIVQLLSN-AVGE 806 (829)
T ss_pred ccchhhccceeeCc-EeehhhcccchhccCcee-----------eeeEEEECCceEEeccHHHHHHHHHH-hhhh
Confidence 334455664 56 566788999999999999 999999999876443 345555554 4554
No 78
>PF00947 Pico_P2A: Picornavirus core protein 2A; InterPro: IPR000081 Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain for members of this family resembles that of the serine peptidase, chymotrypsin [], the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein which is cleaved by the viral 3C cysteine protease []. The poliovirus polyprotein is selectively cleaved between the Gln-|-Gly bond. In other picornavirus reactions Glu may be substituted for Gln, and Ser or Thr for Gly. ; GO: 0008233 peptidase activity, 0006508 proteolysis, 0016032 viral reproduction; PDB: 2HRV_B 1Z8R_A.
Probab=87.87 E-value=8.1 Score=33.22 Aligned_cols=32 Identities=31% Similarity=0.471 Sum_probs=24.6
Q ss_pred ccEEEEcccCCCCCCCCceeCCCceEEEEEeee
Q 013014 265 QDVIQTDAAINPGNSGGPLLDSSGSLIGINTAI 297 (451)
Q Consensus 265 ~~~i~~d~~i~~G~SGGPlvd~~G~VVGI~s~~ 297 (451)
.+++.-..+..||+-||+|+- +.-||||++++
T Consensus 78 ~~~l~g~Gp~~PGdCGg~L~C-~HGViGi~Tag 109 (127)
T PF00947_consen 78 YNLLIGEGPAEPGDCGGILRC-KHGVIGIVTAG 109 (127)
T ss_dssp ECEEEEE-SSSTT-TCSEEEE-TTCEEEEEEEE
T ss_pred cCceeecccCCCCCCCceeEe-CCCeEEEEEeC
Confidence 455666788999999999995 55699999986
No 79
>KOG3552 consensus FERM domain protein FRM-8 [General function prediction only]
Probab=87.77 E-value=0.65 Score=51.65 Aligned_cols=56 Identities=27% Similarity=0.423 Sum_probs=43.3
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCCEEEEEEEEC
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDELLLQGIKQ 419 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~~v~l~v~R~ 419 (451)
.-|+|..|.+++|+. |+|+.||.|++|||.+|.+. +.+.+++..+ .+.|.++|.+-
T Consensus 75 rPviVr~VT~GGps~------------GKL~PGDQIl~vN~Epv~daprervIdlvRac--e~sv~ltV~qP 132 (1298)
T KOG3552|consen 75 RPVIVRFVTEGGPSI------------GKLQPGDQILAVNGEPVKDAPRERVIDLVRAC--ESSVNLTVCQP 132 (1298)
T ss_pred CceEEEEecCCCCcc------------ccccCCCeEEEecCcccccccHHHHHHHHHHH--hhhcceEEecc
Confidence 458899999999874 66677999999999999875 4566666654 46677888763
No 80
>KOG0609 consensus Calcium/calmodulin-dependent serine protein kinase/membrane-associated guanylate kinase [Signal transduction mechanisms]
Probab=87.55 E-value=0.82 Score=48.15 Aligned_cols=68 Identities=29% Similarity=0.416 Sum_probs=51.6
Q ss_pred cccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCC
Q 013014 333 ILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGD 410 (451)
Q Consensus 333 ~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~ 410 (451)
.||+.+.... ..-++|..+..++.+.+.|+- .+||.|++|||..+.+. .++.+++.... .
T Consensus 135 plG~Tik~~e------~~~~~vARI~~GG~~~r~glL----------~~GD~i~EvNGi~v~~~~~~e~q~~l~~~~--G 196 (542)
T KOG0609|consen 135 PLGATIRVEE------DTKVVVARIMHGGMADRQGLL----------HVGDEILEVNGISVANKSPEELQELLRNSR--G 196 (542)
T ss_pred ccceEEEecc------CCccEEeeeccCCcchhccce----------eeccchheecCeecccCCHHHHHHHHHhCC--C
Confidence 4666665432 124789999999999998864 35999999999999886 68888888765 4
Q ss_pred EEEEEEEE
Q 013014 411 ELLLQGIK 418 (451)
Q Consensus 411 ~v~l~v~R 418 (451)
.+++++.-
T Consensus 197 ~itfkiiP 204 (542)
T KOG0609|consen 197 SITFKIIP 204 (542)
T ss_pred cEEEEEcc
Confidence 67777763
No 81
>KOG3834 consensus Golgi reassembly stacking protein GRASP65, contains PDZ domain [Intracellular trafficking, secretion, and vesicular transport]
Probab=87.51 E-value=0.89 Score=46.62 Aligned_cols=65 Identities=25% Similarity=0.354 Sum_probs=50.3
Q ss_pred EEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcCCCCCEEEEEEEECC--eEEEEEEEe
Q 013014 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQCKVGDELLLQGIKQP--PVLSDNLRL 429 (451)
Q Consensus 354 V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~~~g~~v~l~v~R~g--~~~~~~v~l 429 (451)
|.+|.+++||+.|||++ .+|.|+-+-.......+||...|.++ .++.+++-|+.-. ..++++++.
T Consensus 113 vl~V~p~SPaalAgl~~----------~~DYivG~~~~~~~~~eDl~~lIesh-e~kpLklyVYN~D~d~~ReVti~p 179 (462)
T KOG3834|consen 113 VLSVEPNSPAALAGLRP----------YTDYIVGIWDAVMHEEEDLFTLIESH-EGKPLKLYVYNHDTDSCREVTITP 179 (462)
T ss_pred eeecCCCCHHHhccccc----------ccceEecchhhhccchHHHHHHHHhc-cCCCcceeEeecCCCccceEEeec
Confidence 77889999999999996 39999999555566778899888874 6899999887533 345566553
No 82
>KOG3571 consensus Dishevelled 3 and related proteins [General function prediction only]
Probab=87.17 E-value=0.82 Score=47.71 Aligned_cols=74 Identities=28% Similarity=0.385 Sum_probs=44.9
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhc--CC
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQ--CK 407 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~--~~ 407 (451)
++|||.+.....+ -|-.|++|.+|.++++.+.- |.+..||+||.||.....++. |..+.|.+ .+
T Consensus 261 nfLGiSivgqsn~--rgDggIYVgsImkgGAVA~D----------GRIe~GDMiLQVNevsFENmSNd~AVrvLREaV~~ 328 (626)
T KOG3571|consen 261 NFLGISIVGQSNA--RGDGGIYVGSIMKGGAVALD----------GRIEPGDMILQVNEVSFENMSNDQAVRVLREAVSR 328 (626)
T ss_pred ccceeEeecccCc--CCCCceEEeeeccCceeecc----------CccCccceEEEeeecchhhcCchHHHHHHHHHhcc
Confidence 5666665442222 13378999999998776543 444559999999998777663 33333332 12
Q ss_pred CCCEEEEEEEE
Q 013014 408 VGDELLLQGIK 418 (451)
Q Consensus 408 ~g~~v~l~v~R 418 (451)
+| .++++|-.
T Consensus 329 ~g-Pi~ltvAk 338 (626)
T KOG3571|consen 329 PG-PIKLTVAK 338 (626)
T ss_pred CC-CeEEEEee
Confidence 33 35555543
No 83
>PF02907 Peptidase_S29: Hepatitis C virus NS3 protease; InterPro: IPR004109 This signature identifies the Hepatitis C virus NS3 protein as a serine protease which belongs to MEROPS peptidase family S29 (hepacivirin family, clan PA(S)), which has a trypsin-like fold. The non-structural (NS) protein NS3 is one of the NS proteins involved in replication of the HCV genome. The NS2 proteinase (IPR002518 from INTERPRO), a zinc-dependent enzyme, performs a single proteolytic cut to release the N terminus of NS3. The action of NS3 proteinase (NS3P), which resides in the N-terminal one-third of the NS3 protein, then yields all remaining non-structural proteins. The C-terminal two-thirds of the NS3 protein contain a helicase. The functional relationship between the proteinase and helicase domains is unknown. NS3 has a structural zinc-binding site and requires cofactor NS4. 
It has been suggested that the NS3 serine protease of hepatitus C is involved in cell transformation and that the ability to transform requires an active enzyme [].; GO: 0008236 serine-type peptidase activity, 0006508 proteolysis, 0019087 transformation of host cell by virus; PDB: 2QV1_B 3LOX_C 2OBQ_C 2OC1_C 2OC0_A 3LON_A 3KNX_A 2O8M_A 2OBO_A 2OC8_A ....
Probab=87.10 E-value=0.42 Score=41.26 Aligned_cols=131 Identities=22% Similarity=0.308 Sum_probs=61.8
Q ss_pred EEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEE
Q 013014 153 SGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA 232 (451)
Q Consensus 153 ~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~ 232 (451)
.--|+.|+ |-.-|.+|--... ++.-+.| +..-.+.+...|+..-....-...+.|-.-+ -..+|+
T Consensus 13 ~fmgt~vn--GV~wT~~HGagsr---tlAgp~G---pv~q~~~s~~~Dlv~~p~P~Ga~SL~pCtCg-------~~dlyl 77 (148)
T PF02907_consen 13 SFMGTCVN--GVMWTVYHGAGSR---TLAGPKG---PVNQMYTSVDDDLVGWPAPPGARSLTPCTCG-------SSDLYL 77 (148)
T ss_dssp EEEEEEET--TEEEEEHHHHTTS---EEEBTTS---EB-ESEEETTTTEEEEE-STTB--BBB-SSS-------SSEEEE
T ss_pred ceehhEEc--cEEEEEEecCCcc---cccCCCC---cceEeEEcCCCCCcccccccccccCCccccC-------CccEEE
Confidence 34577784 7888888844321 1111222 1122356677788877664322334433332 135666
Q ss_pred eeCCCCCCCceEEeEEeeeeeeeccCCCCCCcccEE-EEcccCCCCCCCCceeCCCceEEEEEeeeeCCCCCccceeEEE
Q 013014 233 IGNPFGLDHTLTTGVISGLRREISSAATGRPIQDVI-QTDAAINPGNSGGPLLDSSGSLIGINTAIYSPSGASSGVGFSI 311 (451)
Q Consensus 233 iG~p~g~~~~~~~G~Vs~~~~~~~~~~~~~~~~~~i-~~d~~i~~G~SGGPlvd~~G~VVGI~s~~~~~~~~~~~~~~aI 311 (451)
+-+-. .+..+ ++. +.....++ -.......|.||||++-.+|.+|||..+.....+....+-|.
T Consensus 78 Vtr~~----~v~p~-----rr~------gd~~~~L~sp~pis~lkGSSGgPiLC~~GH~vG~f~aa~~trgvak~i~f~- 141 (148)
T PF02907_consen 78 VTRDA----DVIPV-----RRR------GDSRASLLSPRPISDLKGSSGGPILCPSGHAVGMFRAAVCTRGVAKAIDFI- 141 (148)
T ss_dssp E-TTS-----EEEE-----EEE------STTEEEEEEEEEHHHHTT-TT-EEEETTSEEEEEEEEEEEETTEEEEEEEE-
T ss_pred EeccC----cEeee-----EEc------CCCceEecCCceeEEEecCCCCcccCCCCCEEEEEEEEEEcCCceeeEEEE-
Confidence 64321 11111 111 00000110 111223479999999999999999988776554433333333
Q ss_pred ecc
Q 013014 312 PVD 314 (451)
Q Consensus 312 P~~ 314 (451)
|.+
T Consensus 142 P~e 144 (148)
T PF02907_consen 142 PVE 144 (148)
T ss_dssp EHH
T ss_pred eee
Confidence 543
No 84
>KOG3551 consensus Syntrophins (type beta) [Extracellular structures]
Probab=86.25 E-value=0.7 Score=46.65 Aligned_cols=70 Identities=29% Similarity=0.403 Sum_probs=46.7
Q ss_pred ccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCC
Q 013014 332 PILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVG 409 (451)
Q Consensus 332 ~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g 409 (451)
+-|||.+.--...+ .-++|+++.++-.|++++ .|..||.|++|||....+. ++..++|+. .|
T Consensus 96 gGLGISIKGGreNk----MPIlISKIFkGlAADQt~----------aL~~gDaIlSVNG~dL~~AtHdeAVqaLKr--aG 159 (506)
T KOG3551|consen 96 GGLGISIKGGRENK----MPILISKIFKGLAADQTG----------ALFLGDAILSVNGEDLRDATHDEAVQALKR--AG 159 (506)
T ss_pred CcceEEeecCcccC----CceehhHhcccccccccc----------ceeeccEEEEecchhhhhcchHHHHHHHHh--hC
Confidence 55666665321111 347888998888887753 3445999999999988776 355556654 68
Q ss_pred CEEEEEEE
Q 013014 410 DELLLQGI 417 (451)
Q Consensus 410 ~~v~l~v~ 417 (451)
+.|.+.|+
T Consensus 160 keV~levK 167 (506)
T KOG3551|consen 160 KEVLLEVK 167 (506)
T ss_pred ceeeeeee
Confidence 87766553
No 85
>PF02395 Peptidase_S6: Immunoglobulin A1 protease Serine protease Prosite pattern; InterPro: IPR000710 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. 
They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to the MEROPS peptidase family S6 (clan PA(S)). The type example is the IgA1-specific serine endopeptidase from Neisseria gonorrhoeae []. These cleave prolyl bonds in the hinge regions of immunoglobulin A heavy chains. Similar specificity is shown by the unrelated family of M26 metalloendopeptidases.; GO: 0004252 serine-type endopeptidase activity, 0006508 proteolysis; PDB: 3SZE_A 3H09_B 3SYJ_A 1WXR_A 3AK5_B.
Probab=85.07 E-value=3.6 Score=46.12 Aligned_cols=64 Identities=19% Similarity=0.256 Sum_probs=35.8
Q ss_pred eEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEEEEEE--cCCCCEEEEEEcCCCCCCcceec
Q 013014 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGF--DQDKDVAVLRIDAPKDKLRPIPI 218 (451)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~vv~~--d~~~DlAlLkv~~~~~~~~~l~l 218 (451)
..|...+|++. ||+|.+|+..+...+..-..++..| +++.. ++..|+.+-|++.-..+..|+..
T Consensus 65 ~~G~aTLigpq-YiVSV~HN~~gy~~v~FG~~g~~~Y--~iV~RNn~~~~Df~~pRLnK~VTEvaP~~~ 130 (769)
T PF02395_consen 65 NKGVATLIGPQ-YIVSVKHNGKGYNSVSFGNEGQNTY--KIVDRNNYPSGDFHMPRLNKFVTEVAPAEM 130 (769)
T ss_dssp TTSS-EEEETT-EEEBETTG-TSCCEECESCSSTCEE--EEEEEEBETTSTEBEEEESS---SS----B
T ss_pred CCceEEEecCC-eEEEEEccCCCcCceeecccCCceE--EEEEccCCCCcccceeecCceEEEEecccc
Confidence 34789999975 9999999996554443333234444 34433 34469999999864444555544
No 86
>KOG3651 consensus Protein kinase C, alpha binding protein [Signal transduction mechanisms]
Probab=83.35 E-value=2.2 Score=41.76 Aligned_cols=56 Identities=23% Similarity=0.375 Sum_probs=41.5
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQGI 417 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~v~ 417 (451)
.=+||..|..++||.+-| .++.||.|++|||..|.... ++.++++.. -.+|++.+.
T Consensus 30 PClYiVQvFD~tPAa~dG----------~i~~GDEi~avNg~svKGktKveVAkmIQ~~--~~eV~IhyN 87 (429)
T KOG3651|consen 30 PCLYIVQVFDKTPAAKDG----------RIRCGDEIVAVNGISVKGKTKVEVAKMIQVS--LNEVKIHYN 87 (429)
T ss_pred CeEEEEEeccCCchhccC----------ccccCCeeEEecceeecCccHHHHHHHHHHh--ccceEEEeh
Confidence 447899999999998765 23349999999999998764 666777653 345677664
No 87
>KOG2921 consensus Intramembrane metalloprotease (sterol-regulatory element-binding protein (SREBP) protease) [Posttranslational modification, protein turnover, chaperones]
Probab=81.76 E-value=1.9 Score=43.80 Aligned_cols=45 Identities=38% Similarity=0.499 Sum_probs=38.5
Q ss_pred cceEEEecCCCCcccc-cCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhc
Q 013014 350 SGVLVLDAPPNGPAGK-AGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQ 405 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~-aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~ 405 (451)
.||.|.+|...||+.. -||.+ ||+|+++||-+|++.+|..+.++.
T Consensus 220 ~gV~Vtev~~~Spl~gprGL~v-----------gdvitsldgcpV~~v~dW~ecl~t 265 (484)
T KOG2921|consen 220 EGVTVTEVPSVSPLFGPRGLSV-----------GDVITSLDGCPVHKVSDWLECLAT 265 (484)
T ss_pred ceEEEEeccccCCCcCcccCCc-----------cceEEecCCcccCCHHHHHHHHHh
Confidence 7999999999988764 36666 999999999999999998877664
No 88
>PF03510 Peptidase_C24: 2C endopeptidase (C24) cysteine protease family; InterPro: IPR000317 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. 
These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. The two signatures that define this group of calicivirus polyproteins identify a cysteine peptidase signature that belongs to MEROPS peptidase family C24 (clan PA(C)). Caliciviruses are positive-stranded ssRNA viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF2 encodes a structural protein []; while ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely those classified as small round structured viruses (SRSVs) and those classed as non-SRSVs. Calicivirus proteases from the non-SRSV group, which are members of the PA protease clan, constitute family C24 of the cysteine proteases (proteases from SRSVs belong to the C37 family). As mentioned above, the protease activity resides within a polyprotein. The enzyme cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis
Probab=80.42 E-value=7.1 Score=32.50 Aligned_cols=54 Identities=22% Similarity=0.313 Sum_probs=34.4
Q ss_pred EEEEEcCCCeEEecccccCCCCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCCCCCcceecCC
Q 013014 155 SGFVWDSKGHVVTNYHVIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGV 220 (451)
Q Consensus 155 SGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~ 220 (451)
-++-|. +|..+|+.||.+..+.+. |..+ +++. ...|+|+++.+.. .++..++++
T Consensus 2 ~avHIG-nG~~vt~tHva~~~~~v~-----g~~f--~~~~--~~ge~~~v~~~~~--~~p~~~ig~ 55 (105)
T PF03510_consen 2 WAVHIG-NGRYVTVTHVAKSSDSVD-----GQPF--KIVK--TDGELCWVQSPLV--HLPAAQIGT 55 (105)
T ss_pred ceEEeC-CCEEEEEEEEeccCceEc-----CcCc--EEEE--eccCEEEEECCCC--CCCeeEecc
Confidence 356665 589999999998876542 2111 2222 3449999999753 366666653
No 89
>KOG0606 consensus Microtubule-associated serine/threonine kinase and related proteins [Signal transduction mechanisms; General function prediction only]
Probab=79.62 E-value=2.5 Score=48.38 Aligned_cols=50 Identities=34% Similarity=0.483 Sum_probs=39.9
Q ss_pred EEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEE
Q 013014 353 LVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQ 415 (451)
Q Consensus 353 ~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~ 415 (451)
.|-.|.+++||..+|+++ ||.|+.|||+++.... ++.+.+.+ -|..+.+.
T Consensus 661 ~v~sv~egsPA~~agls~-----------~DlIthvnge~v~gl~H~ev~~Lll~--~gn~v~~~ 712 (1205)
T KOG0606|consen 661 SVGSVEEGSPAFEAGLSA-----------GDLITHVNGEPVHGLVHTEVMELLLK--SGNKVTLR 712 (1205)
T ss_pred eeeeecCCCCccccCCCc-----------cceeEeccCcccchhhHHHHHHHHHh--cCCeeEEE
Confidence 467888999999999999 9999999999998874 67777664 34554443
No 90
>KOG3605 consensus Beta amyloid precursor-binding protein [General function prediction only]
Probab=79.40 E-value=3 Score=44.98 Aligned_cols=68 Identities=29% Similarity=0.436 Sum_probs=47.5
Q ss_pred ceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCH--HHHHHHHhcCCCCCEEEEEEEECCeEEEEEEE
Q 013014 351 GVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNG--SDLYRILDQCKVGDELLLQGIKQPPVLSDNLR 428 (451)
Q Consensus 351 Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~--~dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v~ 428 (451)
-|+|.+...++||++.| +|-.||.|++|||...-.. ..-+.++...+.-..|+++|.+=--..++.|+
T Consensus 674 TVViAnmm~~GpAarsg----------kLnIGDQiiaING~SLVGLPLstcQs~Ik~~KnQT~VkltiV~cpPV~~V~I~ 743 (829)
T KOG3605|consen 674 TVVIANMMHGGPAARSG----------KLNIGDQIMSINGTSLVGLPLSTCQSIIKGLKNQTAVKLNIVSCPPVTTVLIR 743 (829)
T ss_pred HHHHHhcccCChhhhcC----------CccccceeEeecCceeccccHHHHHHHHhcccccceEEEEEecCCCceEEEee
Confidence 34556677888888864 3445999999999876554 34456677666667788988876555555554
No 91
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=73.09 E-value=2.6 Score=43.32 Aligned_cols=25 Identities=24% Similarity=0.513 Sum_probs=21.7
Q ss_pred cccCCCCCCCCceeCCCceEEEEEe
Q 013014 271 DAAINPGNSGGPLLDSSGSLIGINT 295 (451)
Q Consensus 271 d~~i~~G~SGGPlvd~~G~VVGI~s 295 (451)
+..+..|.||+.|+|.+|++|||..
T Consensus 349 ~~~l~gGaSGS~V~n~~~~lvGIy~ 373 (374)
T PF01732_consen 349 NYSLGGGASGSMVINQNNELVGIYF 373 (374)
T ss_pred ccCCCCCCCcCeEECCCCCEEEEeC
Confidence 3456789999999999999999975
No 92
>PF05416 Peptidase_C37: Southampton virus-type processing peptidase; InterPro: IPR001665 In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold: Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, N-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins. Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; N, asparagine; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule. In the case of the asparagine endopeptidases, the nucleophile is asparagine and all are self-processing endopeptidases. In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding. Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. 
Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad []. This group of cysteine peptidases belongs to the MEROPS peptidase family C37 (clan PA(C)). The type example is calicivirin from Southampton virus, an endopeptidase that cleaves the polyprotein at sites N-terminal to itself, liberating the polyprotein helicase. Southampton virus is a positive-stranded ssRNA virus belonging to the Caliciviruses, which are viruses that cause gastroenteritis. The calicivirus genome contains two open reading frames, ORF1 and ORF2. ORF1 encodes a non-structural polypeptide, which has RNA helicase, cysteine protease and RNA polymerase activity []. The regions of the polyprotein in which these activities lie are similar to proteins produced by the picornaviruses []. ORF2 encodes a structural, capsid protein. Two different families of caliciviruses can be distinguished on the basis of sequence similarity, namely the Norwalk-like viruses or small round structured viruses (SRSVs), and those classed as non-SRSVs.; GO: 0004197 cysteine-type endopeptidase activity, 0006508 proteolysis; PDB: 2FYQ_A 2FYR_A 1WQS_D 4ASH_A 2IPH_B.
Probab=63.51 E-value=22 Score=36.80 Aligned_cols=135 Identities=19% Similarity=0.273 Sum_probs=67.5
Q ss_pred CeEEEEEEEcCCCeEEecccccCCCC-cEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcCCC-CCCcceecCCCCCCCCCc
Q 013014 151 QGSGSGFVWDSKGHVVTNYHVIRGAS-DIRVTFADQSAYDAKIVGFDQDKDVAVLRIDAPK-DKLRPIPIGVSADLLVGQ 228 (451)
Q Consensus 151 ~~~GSGfiI~~~G~ILT~aHVv~~~~-~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~~~-~~~~~l~l~~s~~~~~G~ 228 (451)
-++|-||-|++. .++|+.||+.... ++. | .+..-+.++..-+++-+++..+. .+++.+-|. +-...|.
T Consensus 378 fGsGWGfWVS~~-lfITttHViP~g~~E~F-----G--v~i~~i~vh~sGeF~~~rFpk~iRPDvtgmiLE--eGapEGt 447 (535)
T PF05416_consen 378 FGSGWGFWVSPT-LFITTTHVIPPGAKEAF-----G--VPISQIQVHKSGEFCRFRFPKPIRPDVTGMILE--EGAPEGT 447 (535)
T ss_dssp ETTEEEEESSSS-EEEEEGGGS-STTSEET-----T--EECGGEEEEEETTEEEEEESS-SSTTS---EE---SS--TT-
T ss_pred cCCceeeeecce-EEEEeeeecCCcchhhh-----C--CChhHeEEeeccceEEEecCCCCCCCccceeec--cCCCCce
Confidence 367999999987 9999999997532 211 0 00011233444577888887543 234555553 2234465
Q ss_pred EEEE-eeCCCCCC--CceEEeEEeeeeeeeccCCCCCCcccEEEE-------cccCCCCCCCCceeCCCc---eEEEEEe
Q 013014 229 KVYA-IGNPFGLD--HTLTTGVISGLRREISSAATGRPIQDVIQT-------DAAINPGNSGGPLLDSSG---SLIGINT 295 (451)
Q Consensus 229 ~V~~-iG~p~g~~--~~~~~G~Vs~~~~~~~~~~~~~~~~~~i~~-------d~~i~~G~SGGPlvd~~G---~VVGI~s 295 (451)
-+.+ |-.+.|.- ..+..|...++...-... .....++.+ |-...||+-|.|-+-..| -|+||+.
T Consensus 448 V~siLiKR~sGEllpLAvRMgt~AsmkIqgr~v---~GQ~GMLLTGaNAK~mDLGT~PGDCGcPYvyKrgNd~VV~GVH~ 524 (535)
T PF05416_consen 448 VCSILIKRPSGELLPLAVRMGTHASMKIQGRTV---HGQMGMLLTGANAKGMDLGTIPGDCGCPYVYKRGNDWVVIGVHA 524 (535)
T ss_dssp EEEEEEE-TTSBEEEEEEEEEEEEEEEETTEEE---EEEEEEETTSTT-SSTTTS--TTGTT-EEEEEETTEEEEEEEEE
T ss_pred EEEEEEEcCCccchhhhhhhccceeEEEcceee---cceeeeeeecCCccccccCCCCCCCCCceeeecCCcEEEEEEEe
Confidence 4443 45565532 345566554433210000 001233333 334568999999997655 4999999
Q ss_pred eee
Q 013014 296 AIY 298 (451)
Q Consensus 296 ~~~ 298 (451)
+..
T Consensus 525 AAt 527 (535)
T PF05416_consen 525 AAT 527 (535)
T ss_dssp EE-
T ss_pred hhc
Confidence 764
No 93
>PF11874 DUF3394: Domain of unknown function (DUF3394); InterPro: IPR021814 This domain is functionally uncharacterised. This domain is found in bacteria. This presumed domain is about 190 amino acids in length. This domain is found associated with PF06808 from PFAM.
Probab=54.70 E-value=61 Score=29.83 Aligned_cols=61 Identities=23% Similarity=0.236 Sum_probs=41.1
Q ss_pred CCccceeEEEeccCchhhHHHhhhccccccccccceeccchhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCC
Q 013014 302 GASSGVGFSIPVDTVNGIVDQLVKFGKVTRPILGIKFAPDQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLIL 381 (451)
Q Consensus 302 ~~~~~~~~aIP~~~i~~~l~~l~~~g~~~~~~lGi~~~~~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~ 381 (451)
|........+|+..-..--+++.+ .|+.+.++. +.++|..+..+|+|+++|+.-
T Consensus 89 G~~~~k~v~lpl~~~~~g~eRL~~--------~GL~l~~e~-------~~~~Vd~v~fgS~A~~~g~d~----------- 142 (183)
T PF11874_consen 89 GDPVTKTVLLPLGDGADGEERLEA--------AGLTLMEEG-------GKVIVDEVEFGSPAEKAGIDF----------- 142 (183)
T ss_pred CCceEEEEEEEcCCCCCHHHHHHh--------CCCEEEeeC-------CEEEEEecCCCCHHHHcCCCC-----------
Confidence 333445566676544444444433 355555432 457899999999999999998
Q ss_pred CcEEEEE
Q 013014 382 GDIITSV 388 (451)
Q Consensus 382 GDiIl~v 388 (451)
|+.|++|
T Consensus 143 d~~I~~v 149 (183)
T PF11874_consen 143 DWEITEV 149 (183)
T ss_pred CcEEEEE
Confidence 8888877
No 94
>cd00600 Sm_like The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=50.98 E-value=42 Score=24.47 Aligned_cols=32 Identities=19% Similarity=0.428 Sum_probs=28.2
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.||+.+.+++..+|...++.+-...
T Consensus 7 ~~V~V~l~~g~~~~G~L~~~D~~~Ni~L~~~~ 38 (63)
T cd00600 7 KTVRVELKDGRVLEGVLVAFDKYMNLVLDDVE 38 (63)
T ss_pred CEEEEEECCCcEEEEEEEEECCCCCEEECCEE
Confidence 46889999999999999999999988877665
No 95
>cd01720 Sm_D2 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D2 heterodimerizes with subunit D1 and three such heterodimers form a hexameric ring structure with alternating D1 and D2 subunits. The D1 - D2 heterodimer also assembles into a heptameric ring containing D2, D3, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=48.12 E-value=34 Score=27.50 Aligned_cols=37 Identities=5% Similarity=0.297 Sum_probs=31.0
Q ss_pred ccCCCCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 171 VIRGASDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 171 Vv~~~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
++.....+.|.+.+++.+.+++.++|...++.+=...
T Consensus 10 ~~~~~~~V~V~lr~~r~~~G~L~~fD~hmNlvL~d~~ 46 (87)
T cd01720 10 AVKNNTQVLINCRNNKKLLGRVKAFDRHCNMVLENVK 46 (87)
T ss_pred HHcCCCEEEEEEcCCCEEEEEEEEecCccEEEEcceE
Confidence 3445578999999999999999999999998876654
No 96
>cd01735 LSm12_N LSm12 belongs to a family of Sm-like proteins that associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet that associates with other Sm proteins to form hexameric and heptameric ring structures. In addition to the N-terminal Sm-like domain, LSm12 has a novel methyltransferase domain.
Probab=46.61 E-value=68 Score=24.00 Aligned_cols=34 Identities=15% Similarity=0.300 Sum_probs=29.1
Q ss_pred CCcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcC
Q 013014 175 ASDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (451)
Q Consensus 175 ~~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (451)
...+.+.+..|..++++++.+|....+.+|+...
T Consensus 6 Gs~V~~kTc~g~~ieGEV~afD~~tk~lIlk~~s 39 (61)
T cd01735 6 GSQVSCRTCFEQRLQGEVVAFDYPSKMLILKCPS 39 (61)
T ss_pred ccEEEEEecCCceEEEEEEEecCCCcEEEEECcc
Confidence 3456778888999999999999999999998654
No 97
>cd01731 archaeal_Sm1 The archaeal sm1 proteins: The Sm proteins are conserved in all three domains of life and are always associated with U-rich RNA sequences. They function to mediate RNA-RNA interactions and RNA biogenesis. All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Eukaryotic Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Since archaebacteria do not have any splicing apparatus, Sm proteins of archaebacteria may play a more general role. Archaeal Lsm proteins are likely to represent the ancestral Sm domain.
Probab=44.22 E-value=55 Score=24.64 Aligned_cols=33 Identities=9% Similarity=0.245 Sum_probs=29.2
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcC
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (451)
..+.|.+.+|+.+.+++.++|+..++.+-....
T Consensus 11 ~~V~V~l~~g~~~~G~L~~~D~~mNlvL~~~~e 43 (68)
T cd01731 11 KPVLVKLKGGKEVRGRLKSYDQHMNLVLEDAEE 43 (68)
T ss_pred CEEEEEECCCCEEEEEEEEECCcceEEEeeEEE
Confidence 468899999999999999999999998887753
No 98
>PRK00737 small nuclear ribonucleoprotein; Provisional
Probab=43.81 E-value=56 Score=25.04 Aligned_cols=32 Identities=13% Similarity=0.330 Sum_probs=28.4
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|+.|.+++.++|+..++.+=...
T Consensus 15 k~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~ 46 (72)
T PRK00737 15 SPVLVRLKGGREFRGELQGYDIHMNLVLDNAE 46 (72)
T ss_pred CEEEEEECCCCEEEEEEEEEcccceeEEeeEE
Confidence 46889999999999999999999998887764
No 99
>PF00571 CBS: CBS domain CBS domain web page. Mutations in the CBS domain of Swiss:P35520 lead to homocystinuria.; InterPro: IPR000644 CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes [, ]. Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains []. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP []. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking [], while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking []. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites [, ]. Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern []. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations []. 
In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).; GO: 0005515 protein binding; PDB: 3JTF_A 3TE5_C 3TDH_C 3T4N_C 2QLV_C 3OI8_A 3LV9_A 2QH1_B 1PVM_B 3LQN_A ....
Probab=43.40 E-value=20 Score=25.20 Aligned_cols=21 Identities=38% Similarity=0.616 Sum_probs=17.7
Q ss_pred CCCCCCceeCCCceEEEEEee
Q 013014 276 PGNSGGPLLDSSGSLIGINTA 296 (451)
Q Consensus 276 ~G~SGGPlvd~~G~VVGI~s~ 296 (451)
.+.+.-|++|.+|+++|+++.
T Consensus 28 ~~~~~~~V~d~~~~~~G~is~ 48 (57)
T PF00571_consen 28 NGISRLPVVDEDGKLVGIISR 48 (57)
T ss_dssp HTSSEEEEESTTSBEEEEEEH
T ss_pred cCCcEEEEEecCCEEEEEEEH
Confidence 356788999999999999874
No 100
>cd01722 Sm_F The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit F is capable of forming both homo- and hetero-heptamer ring structures. To form the hetero-heptamer, Sm subunit F initially binds subunits E and G to form a trimer which then assembles onto snRNA along with the D3/B and D1/D2 heterodimers.
Probab=43.23 E-value=48 Score=25.05 Aligned_cols=32 Identities=13% Similarity=0.217 Sum_probs=27.8
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|+.+.+++.++|...++.+=.+.
T Consensus 12 ~~V~V~Lk~g~~~~G~L~~~D~~mNi~L~~~~ 43 (68)
T cd01722 12 KPVIVKLKWGMEYKGTLVSVDSYMNLQLANTE 43 (68)
T ss_pred CEEEEEECCCcEEEEEEEEECCCEEEEEeeEE
Confidence 46889999999999999999999888876554
No 101
>cd01726 LSm6 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm6 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=42.84 E-value=55 Score=24.63 Aligned_cols=32 Identities=13% Similarity=0.233 Sum_probs=27.9
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|+.|.+++.++|+..++.+=...
T Consensus 11 ~~V~V~Lk~g~~~~G~L~~~D~~mNlvL~~~~ 42 (67)
T cd01726 11 RPVVVKLNSGVDYRGILACLDGYMNIALEQTE 42 (67)
T ss_pred CeEEEEECCCCEEEEEEEEEccceeeEEeeEE
Confidence 46889999999999999999999988886654
No 102
>cd01730 LSm3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm3 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.42 E-value=52 Score=25.91 Aligned_cols=31 Identities=10% Similarity=0.271 Sum_probs=26.8
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEE
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (451)
..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus 12 k~V~V~l~~gr~~~G~L~~fD~~mNlvL~d~ 42 (82)
T cd01730 12 ERVYVKLRGDRELRGRLHAYDQHLNMILGDV 42 (82)
T ss_pred CEEEEEECCCCEEEEEEEEEccceEEeccce
Confidence 4688999999999999999999998876444
No 103
>cd06168 LSm9 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm9 proteins have a single Sm-like domain structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=39.39 E-value=73 Score=24.77 Aligned_cols=32 Identities=13% Similarity=0.215 Sum_probs=27.5
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.||+.+.+++.++|...+|.+=...
T Consensus 11 ~~v~V~l~dgR~~~G~l~~~D~~~NivL~~~~ 42 (75)
T cd06168 11 RTMRIHMTDGRTLVGVFLCTDRDCNIILGSAQ 42 (75)
T ss_pred CeEEEEEcCCeEEEEEEEEEcCCCcEEecCcE
Confidence 46889999999999999999999988765553
No 104
>cd01717 Sm_B The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit B heterodimerizes with subunit D3 and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.89 E-value=66 Score=25.06 Aligned_cols=32 Identities=19% Similarity=0.429 Sum_probs=27.5
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.||+.+.+.+.++|...++.+=...
T Consensus 11 ~~V~V~l~dgR~~~G~L~~~D~~~NlVL~~~~ 42 (79)
T cd01717 11 YRLRVTLQDGRQFVGQFLAFDKHMNLVLSDCE 42 (79)
T ss_pred CEEEEEECCCcEEEEEEEEEcCccCEEcCCEE
Confidence 46889999999999999999999988865554
No 105
>cd01729 LSm7 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm7 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.30 E-value=70 Score=25.20 Aligned_cols=31 Identities=23% Similarity=0.334 Sum_probs=26.8
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEE
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (451)
..+.|.+.+|+.+.+++.++|...+|.+=..
T Consensus 13 k~V~V~l~~gr~~~G~L~~~D~~mNlvL~~~ 43 (81)
T cd01729 13 KKIRVKFQGGREVTGILKGYDQLLNLVLDDT 43 (81)
T ss_pred CeEEEEECCCcEEEEEEEEEcCcccEEecCE
Confidence 4688999999999999999999988877554
No 106
>cd01732 LSm5 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=38.08 E-value=64 Score=25.14 Aligned_cols=31 Identities=16% Similarity=0.409 Sum_probs=27.1
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEE
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (451)
..+.|.+.+|+.+.+++.++|...++.+=..
T Consensus 14 ~~V~V~l~~gr~~~G~L~g~D~~mNlvL~da 44 (76)
T cd01732 14 SRIWIVMKSDKEFVGTLLGFDDYVNMVLEDV 44 (76)
T ss_pred CEEEEEECCCeEEEEEEEEeccceEEEEccE
Confidence 5788999999999999999999998886554
No 107
>cd01719 Sm_G The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit G binds subunits E and F to form a trimer which then assembles onto snRNA along with the D1/D2 and D3/B heterodimers forming a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=36.67 E-value=85 Score=24.09 Aligned_cols=32 Identities=9% Similarity=0.165 Sum_probs=27.3
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|+.+.+++.++|...+|.+=...
T Consensus 11 k~V~V~L~~g~~~~G~L~~~D~~mNlvL~~~~ 42 (72)
T cd01719 11 KKLSLKLNGNRKVSGILRGFDPFMNLVLDDAV 42 (72)
T ss_pred CeEEEEECCCeEEEEEEEEEcccccEEeccEE
Confidence 46788999999999999999999888875553
No 108
>KOG3938 consensus RGS-GAIP interacting protein GIPC, contains PDZ domain [Signal transduction mechanisms; Intracellular trafficking, secretion, and vesicular transport]
Probab=36.45 E-value=32 Score=33.44 Aligned_cols=40 Identities=25% Similarity=0.498 Sum_probs=33.9
Q ss_pred CCCCCcEEEEECCEEcCCHH--HHHHHHhcCCCCCEEEEEEE
Q 013014 378 RLILGDIITSVNGKKVSNGS--DLYRILDQCKVGDELLLQGI 417 (451)
Q Consensus 378 ~L~~GDiIl~vnG~~i~~~~--dl~~~l~~~~~g~~v~l~v~ 417 (451)
..++||.|-+|||+.|-.+. ++.++|.+.+.|++.++.+.
T Consensus 167 ~i~VGd~IEaiNge~ivG~RHYeVArmLKel~rge~ftlrLi 208 (334)
T KOG3938|consen 167 AICVGDHIEAINGESIVGKRHYEVARMLKELPRGETFTLRLI 208 (334)
T ss_pred heeHHhHHHhhcCccccchhHHHHHHHHHhcccCCeeEEEee
Confidence 46679999999999998885 77888988888998888766
No 109
>smart00651 Sm snRNP Sm proteins. small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing
Probab=35.40 E-value=91 Score=23.02 Aligned_cols=32 Identities=19% Similarity=0.413 Sum_probs=27.6
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.||+.+.+++..+|+..++-+=...
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~NlvL~~~~ 40 (67)
T smart00651 9 KRVLVELKNGREYRGTLKGFDQFMNLVLEDVE 40 (67)
T ss_pred cEEEEEECCCcEEEEEEEEECccccEEEccEE
Confidence 46889999999999999999999888876654
No 110
>cd01728 LSm1 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm1 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=35.05 E-value=86 Score=24.30 Aligned_cols=31 Identities=16% Similarity=0.185 Sum_probs=27.0
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEE
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRI 206 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv 206 (451)
..+.|.+.||+.+.+.+.++|+..++.+=..
T Consensus 13 k~v~V~l~~gr~~~G~L~~fD~~~NlvL~d~ 43 (74)
T cd01728 13 KKVVVLLRDGRKLIGILRSFDQFANLVLQDT 43 (74)
T ss_pred CEEEEEEcCCeEEEEEEEEECCcccEEecce
Confidence 4688999999999999999999988877554
No 111
>TIGR03000 plancto_dom_1 Planctomycetes uncharacterized domain TIGR03000. Domains described by this model are found, so far, only in the Planctomycetes (Pirellula sp. strain 1 and Gemmata obscuriglobus), in up to six proteins per genome, and may be duplicated within a protein. The function is unknown.
Probab=34.26 E-value=49 Score=25.81 Aligned_cols=48 Identities=15% Similarity=0.214 Sum_probs=33.2
Q ss_pred CcEEEEECCEEcCCHHHHHHHHh-cCCCCCE----EEEEEEECCeEEEEEEEe
Q 013014 382 GDIITSVNGKKVSNGSDLYRILD-QCKVGDE----LLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 382 GDiIl~vnG~~i~~~~dl~~~l~-~~~~g~~----v~l~v~R~g~~~~~~v~l 429 (451)
-|-.+.+||++.++......... ....|.. ++.++.|||+..+.+-++
T Consensus 11 adAkl~v~G~~t~~~G~~R~F~T~~L~~G~~y~Y~v~a~~~~dG~~~t~~~~V 63 (75)
T TIGR03000 11 ADAKLKVDGKETNGTGTVRTFTTPPLEAGKEYEYTVTAEYDRDGRILTRTRTV 63 (75)
T ss_pred CCCEEEECCeEcccCccEEEEECCCCCCCCEEEEEEEEEEecCCcEEEEEEEE
Confidence 58889999999999876554332 2345554 666778999877665444
No 112
>PF01423 LSM: LSM domain ; InterPro: IPR001163 This family is found in Lsm (like-Sm) proteins and in bacterial Lsm-related Hfq proteins. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology. Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. In other snRNPs, certain Sm proteins are replaced with different Lsm proteins, such as with U7 snRNPs, in which the D1 and D2 Sm proteins are replaced with U7-specific Lsm10 and Lsm11 proteins, where Lsm11 plays a role in histone U7-specific RNA processing. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq forms an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.; PDB: 1D3B_K 2Y9D_D 2Y9A_D 2Y9C_R 3VRI_C 2Y9B_K 3QUI_D 3M4G_H 3INZ_E 1U1S_C ....
Probab=33.44 E-value=67 Score=23.79 Aligned_cols=33 Identities=21% Similarity=0.444 Sum_probs=29.2
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcC
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (451)
..+.|.+.+|+.+.+++..+|...++.+-....
T Consensus 9 ~~V~V~l~~g~~~~G~L~~~D~~~Nl~L~~~~~ 41 (67)
T PF01423_consen 9 KRVRVELKNGRTYRGTLVSFDQFMNLVLSDVTE 41 (67)
T ss_dssp SEEEEEETTSEEEEEEEEEEETTEEEEEEEEEE
T ss_pred cEEEEEEeCCEEEEEEEEEeechheEEeeeEEE
Confidence 568899999999999999999999988887764
No 113
>cd01721 Sm_D3 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm subunit D3 heterodimerizes with subunit B and three such heterodimers form a hexameric ring structure with alternating B and D3 subunits. The D3 - B heterodimer also assembles into a heptameric ring containing D1, D2, E, F, and G subunits. Sm-like proteins exist in archaea as well as prokaryotes which form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=32.90 E-value=1e+02 Score=23.41 Aligned_cols=32 Identities=9% Similarity=0.272 Sum_probs=28.7
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|..|.+++..+|...++.+-.+.
T Consensus 11 ~~V~VeLk~g~~~~G~L~~~D~~MNl~L~~~~ 42 (70)
T cd01721 11 HIVTVELKTGEVYRGKLIEAEDNMNCQLKDVT 42 (70)
T ss_pred CEEEEEECCCcEEEEEEEEEcCCceeEEEEEE
Confidence 46889999999999999999999999888774
No 114
>COG1958 LSM1 Small nuclear ribonucleoprotein (snRNP) homolog [Transcription]
Probab=30.47 E-value=95 Score=24.10 Aligned_cols=33 Identities=21% Similarity=0.459 Sum_probs=28.7
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEcC
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRIDA 208 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~~ 208 (451)
..+.|.+.+|+.+.+++.++|...++.+--+..
T Consensus 18 ~~V~V~lk~g~~~~G~L~~~D~~mNlvL~d~~e 50 (79)
T COG1958 18 KRVLVKLKNGREYRGTLVGFDQYMNLVLDDVEE 50 (79)
T ss_pred CEEEEEECCCCEEEEEEEEEccceeEEEeceEE
Confidence 578899999999999999999998888776653
No 115
>cd01727 LSm8 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm8 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=30.26 E-value=1.1e+02 Score=23.55 Aligned_cols=32 Identities=19% Similarity=0.267 Sum_probs=27.5
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.||+.+.+++.++|...++.+=...
T Consensus 10 ~~V~V~l~dgr~~~G~L~~~D~~~NlvL~~~~ 41 (74)
T cd01727 10 KTVSVITVDGRVIVGTLKGFDQATNLILDDSH 41 (74)
T ss_pred CEEEEEECCCcEEEEEEEEEccccCEEccceE
Confidence 46788999999999999999999888776653
No 116
>COG0298 HypC Hydrogenase maturation factor [Posttranslational modification, protein turnover, chaperones]
Probab=29.44 E-value=87 Score=24.74 Aligned_cols=47 Identities=19% Similarity=0.349 Sum_probs=31.1
Q ss_pred EEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEE-eeCC
Q 013014 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYA-IGNP 236 (451)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~-iG~p 236 (451)
++++++..+...++|++.+-.-... --+.|-. .+++.|+.|++ +||.
T Consensus 5 iPgqI~~I~~~~~~A~Vd~gGvkre-V~l~Lv~-~~v~~GdyVLVHvGfA 52 (82)
T COG0298 5 IPGQIVEIDDNNHLAIVDVGGVKRE-VNLDLVG-EEVKVGDYVLVHVGFA 52 (82)
T ss_pred cccEEEEEeCCCceEEEEeccEeEE-EEeeeec-CccccCCEEEEEeeEE
Confidence 4678889998878999998643211 1122211 26889999987 7764
No 117
>COG5233 GRH1 Peripheral Golgi membrane protein [Intracellular trafficking and secretion]
Probab=28.97 E-value=31 Score=34.38 Aligned_cols=30 Identities=40% Similarity=0.684 Sum_probs=26.0
Q ss_pred EEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcC
Q 013014 354 VLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVS 394 (451)
Q Consensus 354 V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~ 394 (451)
+..|.+.+||+++|+.. ||.|+-+|+-++.
T Consensus 67 ~lrv~~~~~~e~~~~~~-----------~dyilg~n~Dp~~ 96 (417)
T COG5233 67 VLRVNPESPAEKAGMVV-----------GDYILGINEDPLR 96 (417)
T ss_pred heeccccChhHhhcccc-----------ceeEEeecCCcHH
Confidence 56778999999999999 9999999976553
No 118
>PF09465 LBR_tudor: Lamin-B receptor of TUDOR domain; InterPro: IPR019023 The Lamin-B receptor is a chromatin and lamin binding protein in the inner nuclear membrane. It is one of the integral inner nuclear envelope membrane proteins responsible for targeting nuclear membranes to chromatin, being a downstream effector of Ran, a small Ras-like nuclear GTPase which regulates NE assembly. Lamin-B receptor interacts with importin beta, a Ran-binding protein, thereby directly contributing to the fusion of membrane vesicles and the formation of the nuclear envelope.; PDB: 2L8D_A 2DIG_A.
Probab=28.46 E-value=2.3e+02 Score=20.76 Aligned_cols=35 Identities=17% Similarity=0.335 Sum_probs=27.4
Q ss_pred CCCcEEEEeCCCcE-EEEEEEEEcCCCCEEEEEEcC
Q 013014 174 GASDIRVTFADQSA-YDAKIVGFDQDKDVAVLRIDA 208 (451)
Q Consensus 174 ~~~~i~V~~~dg~~-~~a~vv~~d~~~DlAlLkv~~ 208 (451)
....+.++.++... |++++..+|...++.-++++.
T Consensus 8 ~Ge~V~~rWP~s~lYYe~kV~~~d~~~~~y~V~Y~D 43 (55)
T PF09465_consen 8 IGEVVMVRWPGSSLYYEGKVLSYDSKSDRYTVLYED 43 (55)
T ss_dssp SS-EEEEE-TTTS-EEEEEEEEEETTTTEEEEEETT
T ss_pred CCCEEEEECCCCCcEEEEEEEEecccCceEEEEEcC
Confidence 34568888888766 599999999999999999975
No 119
>COG4956 Integral membrane protein (PIN domain superfamily) [General function prediction only]
Probab=28.38 E-value=51 Score=32.87 Aligned_cols=42 Identities=26% Similarity=0.299 Sum_probs=33.9
Q ss_pred EEEECCEEcCCHHHHHHHHhc-CCCCCEEEEEEEECCeEEEEE
Q 013014 385 ITSVNGKKVSNGSDLYRILDQ-CKVGDELLLQGIKQPPVLSDN 426 (451)
Q Consensus 385 Il~vnG~~i~~~~dl~~~l~~-~~~g~~v~l~v~R~g~~~~~~ 426 (451)
+-++.|.+|-|.+|+..++.- .-+||++++++.++||+..--
T Consensus 269 Vae~qgV~vLNINDLAnAVkP~vlpGe~l~v~iiK~GkE~~QG 311 (356)
T COG4956 269 VAELQGVQVLNINDLANAVKPVVLPGEELTVQIIKDGKEPGQG 311 (356)
T ss_pred HHhhcCCceecHHHHHHHhCCcccCCCeeEEEEeecCcccCCc
Confidence 346678889999999998873 458999999999999876533
No 120
>PF09122 DUF1930: Domain of unknown function (DUF1930); InterPro: IPR015206 This entry represents a domain found in 3-mercaptopyruvate sulphurtransferase which has no known function. This domain adopts a structure consisting of a four-stranded antiparallel beta-sheet and an alpha-helix, arranged in a beta(2)-alpha-beta(2) fashion, and bearing a remarkable structural similarity to the FK506-binding protein class of peptidylprolyl cis/trans-isomerase.; PDB: 1OKG_A.
Probab=26.48 E-value=2e+02 Score=21.61 Aligned_cols=45 Identities=18% Similarity=0.167 Sum_probs=27.1
Q ss_pred CcEEEEECCEEcCCHH-HHHHHHhcCCCCCEEEEEEEECCeEEEEEE
Q 013014 382 GDIITSVNGKKVSNGS-DLYRILDQCKVGDELLLQGIKQPPVLSDNL 427 (451)
Q Consensus 382 GDiIl~vnG~~i~~~~-dl~~~l~~~~~g~~v~l~v~R~g~~~~~~v 427 (451)
.-.-+.|||..+.+.+ ++..++.-.-.|+..++.+. .++...+++
T Consensus 19 ~~~tl~vDg~~v~~PD~El~sA~~HlH~GEkA~V~Fk-S~Rv~~iEv 64 (68)
T PF09122_consen 19 DNATLIVDGEIVENPDAELKSALVHLHIGEKAQVFFK-SQRVAVIEV 64 (68)
T ss_dssp TT--EEETTEEESS--HHHHHHHTT-BTT-EEEEEET-TS-EEEEE-
T ss_pred cceEEEEcCeEcCCCCHHHHHHHHHhhcCceeEEEEe-cCcEEEEEc
Confidence 5667889999999996 78888776678998877654 344444443
No 121
>PF14827 Cache_3: Sensory domain of two-component sensor kinase; PDB: 1OJG_A 3BY8_A 1P0Z_I 2V9A_A 2J80_B.
Probab=25.90 E-value=59 Score=27.09 Aligned_cols=17 Identities=35% Similarity=0.675 Sum_probs=12.6
Q ss_pred CceeCCCceEEEEEeee
Q 013014 281 GPLLDSSGSLIGINTAI 297 (451)
Q Consensus 281 GPlvd~~G~VVGI~s~~ 297 (451)
.|++|.+|++||++..+
T Consensus 94 ~PV~d~~g~viG~V~VG 110 (116)
T PF14827_consen 94 APVYDSDGKVIGVVSVG 110 (116)
T ss_dssp EEEE-TTS-EEEEEEEE
T ss_pred EeeECCCCcEEEEEEEE
Confidence 57888999999998754
No 122
>PF02601 Exonuc_VII_L: Exonuclease VII, large subunit; InterPro: IPR020579 Exonuclease VII (EC 3.1.11.6) is composed of two nonidentical subunits: one large subunit and four small ones. Exonuclease VII catalyses exonucleolytic cleavage in either 5'-3' or 3'-5' direction to yield 5'-phosphomononucleotides. The large subunit also contains the OB-fold domains (InterPro: IPR004365) that bind to nucleic acids at the N terminus. This entry represents Exonuclease VII, large subunit, C-terminal.; GO: 0008855 exodeoxyribonuclease VII activity
Probab=25.24 E-value=81 Score=31.39 Aligned_cols=35 Identities=29% Similarity=0.499 Sum_probs=30.9
Q ss_pred eEEEEEEEcCCCeEEecccccCCCCcEEEEeCCCc
Q 013014 152 GSGSGFVWDSKGHVVTNYHVIRGASDIRVTFADQS 186 (451)
Q Consensus 152 ~~GSGfiI~~~G~ILT~aHVv~~~~~i~V~~~dg~ 186 (451)
..|-.++-+++|.++|+..-+...+.+.+.+.||.
T Consensus 280 ~RGYaiv~~~~g~vI~s~~~l~~gd~i~i~l~DG~ 314 (319)
T PF02601_consen 280 KRGYAIVRDKDGKVITSVKQLKPGDEIEIRLADGS 314 (319)
T ss_pred hCceEEEECCCCCEECCHHHCCCCCEEEEEEcceE
Confidence 35667788788999999999999999999999995
No 123
>COG0061 nadF NAD kinase [Coenzyme metabolism]
Probab=24.65 E-value=30 Score=34.05 Aligned_cols=32 Identities=28% Similarity=0.525 Sum_probs=29.3
Q ss_pred cccccccceeeeecCCCCcccCCccccccCCc
Q 013014 2 AYSLISSSTFLLSRSPNTTLAPLNKHNFPLRP 33 (451)
Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 33 (451)
||||+---|.+.-+.+.+.+.|+++|++++||
T Consensus 179 AY~lSAGGPIv~P~l~ai~ltpi~p~~l~~Rp 210 (281)
T COG0061 179 AYNLSAGGPILHPGLDAIQLTPICPHSLSFRP 210 (281)
T ss_pred HHhhhcCCCccCCCCCeEEEeecCCCcccCCC
Confidence 79999989999999999999999999998764
No 124
>cd01723 LSm4 The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. LSm4 is one of at least seven subunits that assemble onto U6 snRNA to form a seven-membered ring structure. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes.
Probab=24.34 E-value=1.9e+02 Score=22.33 Aligned_cols=32 Identities=13% Similarity=0.270 Sum_probs=28.5
Q ss_pred CcEEEEeCCCcEEEEEEEEEcCCCCEEEEEEc
Q 013014 176 SDIRVTFADQSAYDAKIVGFDQDKDVAVLRID 207 (451)
Q Consensus 176 ~~i~V~~~dg~~~~a~vv~~d~~~DlAlLkv~ 207 (451)
..+.|.+.+|..+.+++..+|...++.+-.+.
T Consensus 12 ~~V~VeLkng~~~~G~L~~~D~~mNi~L~~~~ 43 (76)
T cd01723 12 HPMLVELKNGETYNGHLVNCDNWMNIHLREVI 43 (76)
T ss_pred CEEEEEECCCCEEEEEEEEEcCCCceEEEeEE
Confidence 46889999999999999999999999887764
No 125
>PF01455 HupF_HypC: HupF/HypC family; InterPro: IPR001109 The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesised as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins. The hydrogenase expression/formation proteins (HupF/HypC) form a family of small proteins that are hydrogenase precursor-specific chaperones required for this maturation process. They are believed to keep the hydrogenase precursor in a conformation accessible for metal incorporation.; PDB: 3D3R_A 2Z1C_C 2OT2_A.
Probab=24.30 E-value=2.2e+02 Score=21.61 Aligned_cols=43 Identities=23% Similarity=0.396 Sum_probs=30.0
Q ss_pred EEEEEEEEcCCCCEEEEEEcCCCCCCcceecCCCCCCCCCcEEEEe
Q 013014 188 YDAKIVGFDQDKDVAVLRIDAPKDKLRPIPIGVSADLLVGQKVYAI 233 (451)
Q Consensus 188 ~~a~vv~~d~~~DlAlLkv~~~~~~~~~l~l~~s~~~~~G~~V~~i 233 (451)
++++++..+.....|++.+.. ....+.+.--.++++||+|++-
T Consensus 5 iP~~Vv~v~~~~~~A~v~~~G---~~~~V~~~lv~~v~~Gd~VLVH 47 (68)
T PF01455_consen 5 IPGRVVEVDEDGGMAVVDFGG---VRREVSLALVPDVKVGDYVLVH 47 (68)
T ss_dssp EEEEEEEEETTTTEEEEEETT---EEEEEEGTTCTSB-TT-EEEEE
T ss_pred ccEEEEEEeCCCCEEEEEcCC---cEEEEEEEEeCCCCCCCEEEEe
Confidence 578899998788999998864 2445555444558999999874
No 126
>PF08669 GCV_T_C: Glycine cleavage T-protein C-terminal barrel domain; InterPro: IPR013977 This entry shows glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase. ; PDB: 3ADA_A 1VRQ_A 1X31_A 3AD9_A 3AD8_A 3AD7_A 3GIR_A 1WOO_A 1WOS_A 1WOR_A ....
Probab=23.78 E-value=1e+02 Score=24.49 Aligned_cols=20 Identities=30% Similarity=0.499 Sum_probs=16.8
Q ss_pred CCCCceeCCCceEEEEEeee
Q 013014 278 NSGGPLLDSSGSLIGINTAI 297 (451)
Q Consensus 278 ~SGGPlvd~~G~VVGI~s~~ 297 (451)
..|.|+++.+|+.||.++..
T Consensus 34 ~~g~~v~~~~g~~vG~vTS~ 53 (95)
T PF08669_consen 34 RGGEPVYDEDGKPVGRVTSG 53 (95)
T ss_dssp STTCEEEETTTEEEEEEEEE
T ss_pred CCCCEEEECCCcEEeEEEEE
Confidence 45899998899999998865
No 127
>PF12381 Peptidase_C3G: Tungro spherical virus-type peptidase; InterPro: IPR024387 This entry represents a rice tungro spherical waikavirus-type peptidase that belongs to MEROPS peptidase family C3G. It is a picornain 3C-type protease, and is responsible for the self-cleavage of the positive single-stranded polyproteins of a number of plant viral genomes. The location of the protease activity of the polyprotein is at the C-terminal end, adjacent and N-terminal to the putative RNA polymerase.
Probab=23.44 E-value=60 Score=30.64 Aligned_cols=45 Identities=16% Similarity=0.420 Sum_probs=33.1
Q ss_pred ccEEEEcccCCCCCCCCceeCC----CceEEEEEeeeeCCCCCccceeEEEec
Q 013014 265 QDVIQTDAAINPGNSGGPLLDS----SGSLIGINTAIYSPSGASSGVGFSIPV 313 (451)
Q Consensus 265 ~~~i~~d~~i~~G~SGGPlvd~----~G~VVGI~s~~~~~~~~~~~~~~aIP~ 313 (451)
...+.+..+...|+=|||++-. --|++||+.++.. ..+.+||-++
T Consensus 168 r~gleY~~~t~~GdCGs~i~~~~t~~~RKIvGiHVAG~~----~~~~gYAe~i 216 (231)
T PF12381_consen 168 RQGLEYQMPTMNGDCGSPIVRNNTQMVRKIVGIHVAGSA----NHAMGYAESI 216 (231)
T ss_pred eeeeeEECCCcCCCccceeeEcchhhhhhhheeeecccc----cccceehhhh
Confidence 4456777888899999999742 3589999998643 2457777555
No 128
>PRK14420 acylphosphatase; Provisional
Probab=23.02 E-value=1.1e+02 Score=24.53 Aligned_cols=47 Identities=11% Similarity=0.211 Sum_probs=31.0
Q ss_pred ceecc--chhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcC
Q 013014 336 IKFAP--DQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (451)
Q Consensus 336 i~~~~--~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~ 406 (451)
+.|++ ...|..+|+.| ||.+... |++-+.+.|.+ ...++|.+.|.+.
T Consensus 15 VGFR~~~~~~A~~~gl~G-~V~N~~d----------------------G~Vei~~qG~~-~~i~~f~~~l~~~ 63 (91)
T PRK14420 15 VGFRYFVQMEADKRKLTG-WVKNRDD----------------------GTVEIEAEGPE-EALQLFLDAIEKG 63 (91)
T ss_pred cCChHHHHHHHHHcCCEE-EEEECCC----------------------CcEEEEEEECH-HHHHHHHHHHHhC
Confidence 44444 45678888888 4544322 88999999965 4566777777654
No 129
>PF02743 Cache_1: Cache domain; InterPro: IPR004010 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions.; GO: 0016020 membrane; PDB: 3C8C_A 3LIB_D 3LIA_A 3LI8_A 3LI9_A.
Probab=22.28 E-value=61 Score=24.79 Aligned_cols=30 Identities=27% Similarity=0.626 Sum_probs=21.3
Q ss_pred CceeCCCceEEEEEeeeeCCCCCccceeEEEeccCchhhHHHh
Q 013014 281 GPLLDSSGSLIGINTAIYSPSGASSGVGFSIPVDTVNGIVDQL 323 (451)
Q Consensus 281 GPlvd~~G~VVGI~s~~~~~~~~~~~~~~aIP~~~i~~~l~~l 323 (451)
-|+.+.+|+++|++.. .+..+.+.++++++
T Consensus 19 ~pi~~~~g~~~Gvv~~-------------di~l~~l~~~i~~~ 48 (81)
T PF02743_consen 19 VPIYDDDGKIIGVVGI-------------DISLDQLSEIISNI 48 (81)
T ss_dssp EEEEETTTEEEEEEEE-------------EEEHHHHHHHHTTS
T ss_pred EEEECCCCCEEEEEEE-------------EeccceeeeEEEee
Confidence 4677789999999653 46666666666654
No 130
>PRK14440 acylphosphatase; Provisional
Probab=22.25 E-value=1.2e+02 Score=24.28 Aligned_cols=42 Identities=21% Similarity=0.330 Sum_probs=29.6
Q ss_pred chhhhhhCccceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCEEcCCHHHHHHHHhcC
Q 013014 341 DQSVEQLGVSGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGKKVSNGSDLYRILDQC 406 (451)
Q Consensus 341 ~~~~~~~~~~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~~i~~~~dl~~~l~~~ 406 (451)
...|..+|+.| ||.+... |++-+.+.|.+ .+.+++...|.+.
T Consensus 23 ~~~A~~~gl~G-~V~N~~d----------------------G~Vei~~~G~~-~~v~~f~~~l~~g 64 (90)
T PRK14440 23 QIHAIRLGIKG-YAKNLPD----------------------GSVEVVAEGYE-EALSKLLERIKQG 64 (90)
T ss_pred HHHHHHcCCEE-EEEECCC----------------------CCEEEEEEcCH-HHHHHHHHHHhhC
Confidence 44688889998 6655533 77888888865 5567777777653
No 131
>PF14275 DUF4362: Domain of unknown function (DUF4362)
Probab=21.91 E-value=2.5e+02 Score=23.08 Aligned_cols=47 Identities=15% Similarity=0.211 Sum_probs=28.9
Q ss_pred CCcEEEEECCEEcCCHHHHHHHHhcCC--------------CCCEEEEEEEECCeEEEEEEEe
Q 013014 381 LGDIITSVNGKKVSNGSDLYRILDQCK--------------VGDELLLQGIKQPPVLSDNLRL 429 (451)
Q Consensus 381 ~GDiIl~vnG~~i~~~~dl~~~l~~~~--------------~g~~v~l~v~R~g~~~~~~v~l 429 (451)
.||||.+ .|. |.|.+.|...+..-. .|+++...+.-+|+.+.+++.-
T Consensus 2 ~~DVi~~-~~~-i~Nl~kl~~Fi~nv~~~k~d~IrIv~yT~EGdPI~~~L~~~G~~I~y~~Dn 62 (98)
T PF14275_consen 2 NNDVINK-HGE-IENLDKLDQFIENVEQGKPDKIRIVQYTIEGDPIFQDLEYDGNQIKYTSDN 62 (98)
T ss_pred CCCEEEe-CCe-EEeHHHHHHHHHHHhcCCCCEEEEEEecCCCCCEEEEEEECCCEEEEEECC
Confidence 3899988 444 777776666554221 3555555666677666666653
No 132
>cd04627 CBS_pair_14 The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic gener
Probab=21.36 E-value=67 Score=26.27 Aligned_cols=20 Identities=35% Similarity=0.516 Sum_probs=16.4
Q ss_pred CCCCCceeCCCceEEEEEee
Q 013014 277 GNSGGPLLDSSGSLIGINTA 296 (451)
Q Consensus 277 G~SGGPlvd~~G~VVGI~s~ 296 (451)
+.+.=|++|.+|+++|+++.
T Consensus 98 ~~~~lpVvd~~~~~vGiit~ 117 (123)
T cd04627 98 GISSVAVVDNQGNLIGNISV 117 (123)
T ss_pred CCceEEEECCCCcEEEEEeH
Confidence 44567899999999999875
No 133
>PF11325 DUF3127: Domain of unknown function (DUF3127); InterPro: IPR021474 This bacterial family of proteins has no known function.
Probab=20.63 E-value=2.6e+02 Score=22.32 Aligned_cols=64 Identities=17% Similarity=0.198 Sum_probs=38.2
Q ss_pred cceEEEecCCCCcccccCceeeecccCCCCCCCcEEEEECCE-----EcCCHHHHHHHHhcCCCCCEEEEEEEECCeEEE
Q 013014 350 SGVLVLDAPPNGPAGKAGLLSTKRDAYGRLILGDIITSVNGK-----KVSNGSDLYRILDQCKVGDELLLQGIKQPPVLS 424 (451)
Q Consensus 350 ~Gv~V~~v~~~s~a~~aGl~~~~~~~~~~L~~GDiIl~vnG~-----~i~~~~dl~~~l~~~~~g~~v~l~v~R~g~~~~ 424 (451)
.|-+|..+.+.+--.+.|.+. -|+|++-+++ .+.=+.|=.+.|...++|+.|++.+.=+|.+.+
T Consensus 3 ~Gkii~~l~~~~g~s~~Gw~K-----------re~Vlet~~qYP~~i~f~~~~dk~~~l~~~~~Gd~V~Vsf~i~~RE~~ 71 (84)
T PF11325_consen 3 TGKIIKVLPEQQGVSKNGWKK-----------REFVLETEEQYPQKICFEFWGDKIDLLDNFQVGDEVKVSFNIEGREWN 71 (84)
T ss_pred ccEEEEEecCcccCcCCCcEE-----------EEEEEeCCCcCCceEEEEEEcchhhhhccCCCCCEEEEEEEeeccEec
Confidence 465565554444444466776 7888887663 223333444455667788888887765555443
No 134
>PF11948 DUF3465: Protein of unknown function (DUF3465); InterPro: IPR021856 This family of proteins is functionally uncharacterised. This protein is found in bacteria. Proteins in this family are typically between 131 and 151 amino acids in length. This protein has a conserved HWTH sequence motif.
Probab=20.57 E-value=5.7e+02 Score=22.22 Aligned_cols=12 Identities=33% Similarity=0.246 Sum_probs=10.2
Q ss_pred CCCCCcEEEEee
Q 013014 223 DLLVGQKVYAIG 234 (451)
Q Consensus 223 ~~~~G~~V~~iG 234 (451)
.++.||.|.+.|
T Consensus 85 ~l~~GD~V~f~G 96 (131)
T PF11948_consen 85 WLQKGDQVEFYG 96 (131)
T ss_pred CcCCCCEEEEEE
Confidence 478899999988
No 135
>PF01732 DUF31: Putative peptidase (DUF31); InterPro: IPR022382 This domain has no known function. It is found in various hypothetical proteins and putative lipoproteins from mycoplasmas.
Probab=20.53 E-value=63 Score=33.10 Aligned_cols=23 Identities=35% Similarity=0.525 Sum_probs=18.5
Q ss_pred CeEEEEEEEcC----CC------eEEecccccC
Q 013014 151 QGSGSGFVWDS----KG------HVVTNYHVIR 173 (451)
Q Consensus 151 ~~~GSGfiI~~----~G------~ILT~aHVv~ 173 (451)
...|||.|+|- ++ |+.||.||+.
T Consensus 35 ~~~GT~WIlDy~~~~~~~~p~k~y~ATNlHVa~ 67 (374)
T PF01732_consen 35 SVSGTGWILDYKKPEDNKYPTKWYFATNLHVAS 67 (374)
T ss_pred cCcceEEEEEEeccCCCCCCeEEEEEechhhhc
Confidence 36799999982 22 5999999998
Done!